Outdated Version

You are viewing an older version of this section. View current production version.

COLLECT

The COLLECT function executes a SELECT statement having the QUERY type and stores the result as an array of records. After COLLECT executes you will typically iterate through the array to perform processing on its elements.

Syntax

variable_name = COLLECT(query_type_value)

variable_name = COLLECT(select_statement, QUERY(field_definition [, ...]))

  field_definition:
    field_name data_type [data_type_modifier]

Arguments

query_type_value

A query type value.

field_definition

The number of fields in the field_definition must be the same as the number of columns that the select_statement will return. The data_type(s) in the field_definition do not have to be the same as the data types of the columns that the select_statement will return. If the data types do not match, the :> operator will automatically be used to convert the column type to the field type. The field_name(s) in the field_definition also need not be the same as the field names in the select_statement.

field_name

The name of the field.

data_type

Any scalar-valued data type. For a complete list of data types, see the Data Types topic.

data_type_modifier

If not present or NULL, indicates that field_name may contain NULL values in the resulting array.

If NOT NULL, indicates that field_name may not contain NULL values in the resulting array.

select_statement

A string containing a SQL SELECT statement.

Remarks

Some database systems implement cursors, which allow programmers to iterate through each row of a recordset and perform an action on the row. MemSQL does not use the term “cursor.” However, you can achieve read-only cursor functionality by calling COLLECT and iterating over the values in the resulting array. The array may be processed forwards, backwards or in an arbitrary order.

Expressions of the form record.field for a record type value cannot be substituted directly into a SQL statement in a stored procedure. To include a record field’s value in an SQL statement, assign the value into a variable, and use the variable in the SQL statement instead. This approach is used in example 1 below, with the variables _id and _name.

Example 1: Using COLLECT with Static Queries

In the following example, COLLECT uses a query type variable, whose definition SELECT * from t is static.

DROP DATABASE IF EXISTS memsql_docs_example;
CREATE DATABASE memsql_docs_example;
USE memsql_docs_example;
CREATE TABLE t(id INT, name TEXT);
CREATE TABLE output_log(msg TEXT);

INSERT INTO t VALUES (1, 'red'), (2, 'green'), (3, 'blue');

DELIMITER //

CREATE OR REPLACE PROCEDURE p() AS
DECLARE  
  qry QUERY(id INT, name TEXT) = SELECT id, name FROM t;
  arr ARRAY(RECORD(id INT, name TEXT));
  _id INT;
  _name TEXT;
BEGIN
  arr = COLLECT(qry);
  FOR x in arr LOOP
    _id = x.id;
    _name = x.name;
    INSERT INTO output_log VALUES(CONCAT('[', _id, ', ', _name, ']'));
  END LOOP;
END //

DELIMITER ;

CALL p();

SELECT * FROM output_log ORDER BY msg;
****
+------------+
| msg        |
+------------+
| [1, red]   |
| [2, green] |
| [3, blue]  |
+------------+
Info

Instead of using variable assignments and verbose definitions, you can shorten the procedure definition (and body) by inserting the COLLECT function directly inside the loop as:

DELIMITER //

CREATE OR REPLACE PROCEDURE p() AS
DECLARE
  qry QUERY(id INT, name TEXT) = SELECT id, name FROM t;
BEGIN
  FOR x IN COLLECT(qry) LOOP
    INSERT INTO output_log VALUES(CONCAT('[', x.id, ', ', x.name, ']'));
  END LOOP;
END //

DELIMITER ;

The next example demonstrates the use of this style of code.

Example 2: Using COLLECT with Dynamic Queries (Method 1)

Here, you call COLLECT using a query type variable, whose value is populated by the TO_QUERY function. TO_QUERY allows you to write dynamic queries.

Warning

The TO_QUERY function does not parameterize dynamic SQL queries, which results in unique plans being added to the in-memory and on-disk plancache. If you use this function extensively, care should be taken to manage the size of your plancache. See Managing Plancache Memory Usage for more details on how to drop a plan from the in-memory plancache and how to delete the plancache files on disk.

If you need to define a query-type variable from a static query that use stored procedure parameters (e.g. SELECT t.a, t.b FROM t WHERE t.c = x with x being the input parameter value), you can do that directly instead of using TO_QUERY. This will parameterize any literal values and re-use the same plan from the plancache the next time. For an example on how to do this, see Binding Variables to a Query Type Value.


DROP DATABASE IF EXISTS memsql_docs_example;
CREATE DATABASE memsql_docs_example;
USE memsql_docs_example;
CREATE TABLE t1(id INT, name TEXT);
CREATE TABLE t2(id INT, name TEXT);
CREATE TABLE output_log(msg TEXT);

INSERT INTO t1 VALUES (1, 'red'), (2, 'green'), (3, 'blue');
INSERT INTO t2 VALUES (1, 'orange'), (2, 'blue'), (3, 'purple');

DELIMITER //

CREATE OR REPLACE PROCEDURE p (tbl TEXT) AS
DECLARE  
  qry QUERY(id INT, name TEXT) = TO_QUERY(CONCAT('SELECT id, name FROM ' , tbl));
BEGIN
  FOR x IN COLLECT(qry) LOOP
    INSERT INTO output_log VALUES(CONCAT('[', x.id, ', ', x.name, ']'));
  END LOOP;
END //

DELIMITER ;

CALL p('t2');

SELECT * FROM output_log ORDER BY msg;
****
+-------------+
| msg         |
+-------------+
| [1, orange] |
| [2, blue]   |
| [3, purple] |
+-------------+

Example 3: Using COLLECT with Dynamic Queries (Method 2)

Here, you specify a dynamic query directly in a COLLECT statement. The first argument is the query, in text form. The second argument is a QUERY type which specifies the output schema of the query, so it is known at the time the COLLECT expression is compiled.


DROP DATABASE IF EXISTS memsql_docs_example;
CREATE DATABASE memsql_docs_example;
use memsql_docs_example;
CREATE TABLE t1(id INT, name TEXT);
CREATE TABLE t2(id INT, name TEXT);
CREATE TABLE output_log(msg TEXT);

INSERT INTO t1 VALUES (1, 'red'), (2, 'green'), (3, 'blue');
INSERT INTO t2 VALUES (1, 'orange'), (2, 'blue'), (3, 'purple');

DELIMITER //
CREATE or REPLACE PROCEDURE p(tbl TEXT) AS
DECLARE  
BEGIN
  FOR x in COLLECT(CONCAT('SELECT id, name FROM ' , tbl), QUERY(id INT, name TEXT)) LOOP
    INSERT INTO output_log VALUES(CONCAT('[', x.id, ', ', x.name, ']'));
  END LOOP;
END //
DELIMITER ;

CALL p('t2');

SELECT * FROM output_log ORDER BY msg;
****
+-------------+
| msg         |
+-------------+
| [1, orange] |
| [2, blue]   |
| [3, purple] |
+-------------+

Related Topics

  • CREATE PROCEDURE: The CREATE PROCEDURE command creates a stored procedure.
  • QUERY: A data type representing a SELECT statement.
  • TO_QUERY: Converts a SQL string to a query type value.