You are viewing an older version of this section. View current production version.
Persistent Computed Columns
A computed column is a column defined by an expression that uses other columns in the table. MemSQL allows users to create persistent computed columns defined by an expression that combines other columns, constants, built-in functions, and operators. MemSQL’s computed columns are fully materialized and can be indexed like a standard column.
There are storage and performance tradeoffs to consider when using computed columns. At a high level, computed columns are a way to optimize computationally expensive read queries that use built-in functions or require additional data processing. They allow users to precompute values that would otherwise have to be computed as part of execution of a read query. However, computed columns also consume additional storage and require more computation on writes. Some example use cases for computed columns include:
- Parsing JSON blobs for improved read performance
- Precomputing a value using an expression that includes values from other columns in the table
- Extracting values from a column, such as a year from a timestamp or a domain from a URL
Persistent computed columns cannot be initialized with the NOT NULL
constraint.
Computed columns can be created as part of a CREATE TABLE statement, or can be added later using ALTER TABLE.
Computed Column Examples
The following statement creates a table t
with an INT
column a
and a persistent computed column b
that is defined using an expression that takes the value in a
and adds 1
to it.
CREATE TABLE t (a INT PRIMARY KEY, b AS a + 1 PERSISTED INT);
Inserting values into column a
will automatically populate column b
using the column definition expression.
INSERT INTO t (a) VALUES (1), (2);
SELECT * FROM t;
****
+---+------+
| a | b |
+---+------+
| 2 | 3 |
| 1 | 2 |
+---+------+
It is also possible to add a computed column to an existing table using ALTER TABLE.
ALTER TABLE t ADD COLUMN c AS a * 2 - b PERSISTED INT;
SELECT * FROM t;
****
+---+------+------+
| a | b | c |
+---+------+------+
| 2 | 3 | 1 |
| 1 | 2 | 0 |
+---+------+------+
Attempts to insert values directly into a computed column will result in an error, even if the values are “correct” with respect to the column definition.
INSERT INTO t (a, b, c) VALUES (3, 4, 2);
****
ERROR 1822 (HY000): Cannot insert into computed column 'b'
A computed column can be created using any expression that outputs a value that is a valid MemSQL data type, composed of operators and built-in functions. For example, computed columns allow you to extract values from a JSON blob so they can be scanned and read more quickly. However, note that aggregations, user-defined functions, and non-deterministic functions such as rand()
and now()
cannot be used in computed columns.
SELECT * FROM j;
****
+---+-------------------------------------------------+
| a | b |
+---+-------------------------------------------------+
| 1 | {"name":"Sandy Beach","occupation":"lifeguard"} |
+---+-------------------------------------------------+
ALTER TABLE j ADD COLUMN occupation AS b::$occupation PERSISTED TEXT;
SELECT * FROM j;
****
+---+-------------------------------------------------+------------+
| a | b | occupation |
+---+-------------------------------------------------+------------+
| 1 | {"name":"Sandy Beach","occupation":"lifeguard"} | lifeguard |
+---+-------------------------------------------------+------------+
Modifying values in a column used to define a computed column will also update the affected values in the computed column.
UPDATE j SET b::$occupation = 'circus clown';
SELECT * FROM j;
****
+---+----------------------------------------------------+--------------+
| a | b | occupation |
+---+----------------------------------------------------+--------------+
| 1 | {"name":"Sandy Beach","occupation":"circus clown"} | circus clown |
+---+----------------------------------------------------+--------------+