Outdated Version

You are viewing an older version of this section. View current production version.

Using Persistent Computed Columns min read


A computed column is a column defined by an expression that uses other columns in the table. SingleStore DB allows users to create persistent computed columns defined by an expression that combines other columns, constants, built-in functions, and operators. SingleStore DB’s computed columns are fully materialized and can be indexed like a standard column.

There are storage and performance tradeoffs to consider when using computed columns. At a high level, computed columns are a way to optimize computationally expensive read queries that use built-in functions or require additional data processing. They allow users to precompute values that would otherwise have to be computed as part of execution of a read query. However, computed columns also consume additional storage and require more computation on writes. Some example use cases for computed columns include:

  • Parsing JSON blobs for improved read performance
  • Precomputing a value using an expression that includes values from other columns in the table
  • Extracting values from a column, such as a year from a timestamp or a domain from a URL
  • Pre-materializing common expressions in queries allowing them to be used in high-order operations, such as segment elimination or encoded GROUP BYs
  • Using computed columns as sort keys, such as a timestamp type computed column with coarser precision to make subsequent columns useful in a sort key (see Is (insert_datetime6, region_id) a good columnstore key?)
Info

Persistent computed columns cannot be initialized with the NOT NULL constraint.

Computed columns can be created as part of a CREATE TABLE statement, or can be added later using ALTER TABLE.

Computed Column Examples

The following statement creates a table t with an INT column a and a persistent computed column b that is defined using an expression that takes the value in a and adds 1 to it.

CREATE TABLE t (a INT PRIMARY KEY, b AS a + 1 PERSISTED INT);

Inserting values into column a will automatically populate column b using the column definition expression.

INSERT INTO t (a) VALUES (1), (2);

SELECT * FROM t;
****
+---+------+
| a | b    |
+---+------+
| 2 |    3 |
| 1 |    2 |
+---+------+

It is also possible to add a computed column to an existing table using ALTER TABLE.

ALTER TABLE t ADD COLUMN c AS a * 2 - b PERSISTED INT;

SELECT * FROM t;
****
+---+------+------+
| a | b    | c    |
+---+------+------+
| 2 |    3 |    1 |
| 1 |    2 |    0 |
+---+------+------+

Attempts to insert values directly into a computed column will result in an error, even if the values are “correct” with respect to the column definition.

INSERT INTO t (a, b, c) VALUES (3, 4, 2);
****
ERROR 1822 (HY000): Cannot insert into computed column 'b'

A computed column can be created using any expression that outputs a value that is a valid SingleStore DB data type, composed of operators and built-in functions. For example, computed columns allow you to extract values from a JSON blob so they can be scanned and read more quickly. However, note that aggregations, user-defined functions, and non-deterministic functions such as rand() and now() cannot be used in computed columns.

SELECT * FROM j;
****
+---+-------------------------------------------------+
| a | b                                               |
+---+-------------------------------------------------+
| 1 | {"name":"Sandy Beach","occupation":"lifeguard"} |
+---+-------------------------------------------------+
ALTER TABLE j ADD COLUMN occupation AS b::$occupation PERSISTED TEXT;

SELECT  * FROM j;
****
+---+-------------------------------------------------+------------+
| a | b                                               | occupation |
+---+-------------------------------------------------+------------+
| 1 | {"name":"Sandy Beach","occupation":"lifeguard"} | lifeguard  |
+---+-------------------------------------------------+------------+

Modifying values in a column used to define a computed column will also update the affected values in the computed column.

UPDATE j SET b::$occupation = 'circus clown';

SELECT * FROM j;
****
+---+----------------------------------------------------+--------------+
| a | b                                                  | occupation   |
+---+----------------------------------------------------+--------------+
| 1 | {"name":"Sandy Beach","occupation":"circus clown"} | circus clown |
+---+----------------------------------------------------+--------------+