Outdated Version
You are viewing an older version of this section. View current production version.
Query Plan Operations
This topic describes the operations that a query plan may use. These operations are displayed when you use SingleStore DB Studio Visual Explain, run EXPLAIN, or run PROFILE to show a query plan. The examples in this topic use two tables : t
, a rowstore table with a primary key, and ct
a columnstore table. These tables are both in database db1
.
CREATE TABLE t(id INT PRIMARY KEY, a INT, b INT, KEY(a));
CREATE TABLE ct(a INT, b INT, KEY(a) USING CLUSTERED COLUMNSTORE, SHARD KEY(a));
Table access methods
TableScan
- scans every row in a table using an indexIndexSeek
- navigates to a particular row using an indexIndexRangeScan
scans a range of rows using an indexColumnStoreScan
- scans a columnstore tableOrderedColumnStoreScan
- scans a table using the columnstore sort key in key order
EXPLAIN SELECT * FROM t WHERE t.a = 5;
****
+---------------------------------------------+
| EXPLAIN |
+---------------------------------------------+
| Project [t.id, t.a, t.b] |
| Gather partitions:all |
| Project [t.id, t.a, t.b] |
| IndexRangeScan db.t, KEY a (a) scan:[a = 5] |
+---------------------------------------------+
EXPLAIN SELECT * FROM ct;
****
+---------------------------------------------------------------+
| EXPLAIN |
+---------------------------------------------------------------+
| Project [ct.a, ct.b] |
| Gather partitions:all |
| Project [ct.a, ct.b] |
| ColumnStoreScan db1.ct, KEY a (a) USING CLUSTERED COLUMNSTORE |
+---------------------------------------------------------------+
ColumnstoreFilter
table access method: Applies a filter to a columnstore table
The following example demonstrates the ColumnstoreFilter query operation, using the table articles
:
CREATE TABLE articles (
id INT UNSIGNED,
year int UNSIGNED,
title VARCHAR(200),
body TEXT,
SHARD KEY(id),
KEY (id) USING CLUSTERED COLUMNSTORE,
KEY (id) USING HASH,
KEY (title) USING HASH,
KEY (year) USING HASH);
The EXPLAIN
statement shows the ColumnStoreFilter operation with index
, because a hash index is used to apply the filter.
EXPLAIN SELECT * FROM articles WHERE title = 'Interesting title here';
****
+------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0 |
| Project [articles.id, articles.year, articles.title, articles.body] |
| ColumnStoreFilter [articles.title = 'Interesting title here' index] |
| ColumnStoreScan d.articles, KEY id_2 (id) USING CLUSTERED COLUMNSTORE table_type:sharded_columnstore |
+------------------------------------------------------------------------------------------------------+
Group by and aggregations
Aggregate
- computes an aggregate.HashGroupBy
- uses a hash table to compute group by resultsStreamingGroupBy
- leverages the fact that the underlying operation produces rows in order to compute group by results. This advantage ofStreamingGroupBy
is that it only uses a constant amount of memoryDistinct
- removes duplicate rows
EXPLAIN SELECT SUM(id) FROM t;
****
+-----------------------------------------+
| EXPLAIN |
+-----------------------------------------+
| Project [`sum(id)`] |
| Aggregate [SUM(`sum(id)`) AS `sum(id)`] |
| Gather partitions:all |
| Project [`sum(id)`] |
| Aggregate [SUM(t.id) AS `sum(id)`] |
| TableScan db1.t, PRIMARY KEY (id) |
+-----------------------------------------+
EXPLAIN SELECT SUM(id) FROM t GROUP BY a+1;
****
+------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------+
| Project [`sum(id)`] |
| HashGroupBy [SUM(`sum(id)`) AS `sum(id)`] groups:[t.a + 1] |
| Gather partitions:all |
| Project [`sum(id)`, t.a + 1 AS op, t.a, 1 AS op_1] |
| HashGroupBy [SUM(t.id) AS `sum(id)`] groups:[t.a + 1] |
| TableScan db1.t, PRIMARY KEY (id) |
+------------------------------------------------------------+
Distributed data movement
Gather
- collects all the results from the leaf nodes to the aggregator node. When a query can be routed to a single partition it has the attributepartitions:single
. IfGather
collects data from all the partitions the attribute is set topartitions:all
. Queries that havepartitions:single
are called single partition queries. An advantage of single partition queries is that they can scale to much higher concurrency and throughput because they only need to execute on a single partition.GatherMerge
- collects ordered streams of rows from the leaves and merges them to output an ordered stream.
EXPLAIN SELECT * FROM t WHERE id = 1;
****
+-------------------------------------------------+
| EXPLAIN |
+-------------------------------------------------+
| Gather partitions:single |
| Project [t.id, t.a, t.b] |
| IndexSeek db1.t, PRIMARY KEY (id) scan:[id = 1] |
+-------------------------------------------------+
EXPLAIN SELECT * FROM t WHERE id > 1;
****
+------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------+
| Project [t.id, t.a, t.b] |
| Gather partitions:all |
| Project [t.id, t.a, t.b] |
| IndexRangeScan db1.t, PRIMARY KEY (id) scan:[id > 1] |
+------------------------------------------------------+
EXPLAIN SELECT * FROM t ORDER BY id;
****
+-----------------------------------+
| EXPLAIN |
+-----------------------------------+
| Project [t.id, t.a, t.b] |
| GatherMerge [t.id] partitions:all |
| Project [t.id, t.a, t.b] |
| TableScan db.t, PRIMARY KEY (id) |
+-----------------------------------+
Repartition
- redistributes a dataset to hash-partition it on a particular keyBroadcast
- broadcasts a dataset to every node in a cluster
EXPLAIN SELECT * FROM t,ct WHERE t.id = ct.b;
****
+-----------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------+
| Project [t.id, t.a, t.b, r0.a_1, r0.b_2] |
| Gather partitions:all est_rows:1 |
| Project [t.id, t.a, t.b, r0.a_1, r0.b_2] est_rows:1 est_select_cost:3 |
| NestedLoopJoin |
| |---IndexSeek db1.t, PRIMARY KEY (id) scan:[id = r0.b_2] est_table_rows:1 est_filtered:1 |
| TableScan r0 storage:list stream:no |
| Repartition [ct.a AS a_1, ct.b AS b_2] AS r0 shard_key:[b_2] est_rows:1 |
| ColumnStoreScan db1.ct, KEY a (a) USING CLUSTERED COLUMNSTORE est_table_rows:1 est_filtered:1 |
+-----------------------------------------------------------------------------------------------+
ChoosePlan
indicates that SingleStore DB will choose one of the listed plans at runtime based on cost estimates.estimate
illustrates the statistics that are being estimated, but note that these SQL statements are not actually estimated. Instead, SingleStore DB uses index information to estimate these statistics.
EXPLAIN SELECT * FROM t WHERE id > 5 AND a > 5;
****
+-----------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------+
| Project [t.id, t.a, t.b] |
| Gather partitions:all |
| Project [t.id, t.a, t.b] |
| ChoosePlan |
| | :estimate |
| | SELECT COUNT(*) AS cost FROM db1.t WHERE t.id > 5 |
| | SELECT COUNT(*) AS cost FROM db1.t WHERE t.a > 5 |
| |---Filter [t.a > 5] |
| | IndexRangeScan db1.t, PRIMARY KEY (id) scan:[id > 5] |
| +---Filter [t.id > 5] |
| IndexRangeScan db1.t, KEY a (a) scan:[a > 5] |
+-----------------------------------------------------------+
Joins
HashJoin
- performs a hash join: SingleStore DB builds a hash table from the results of the inner side of the join and probes into it while scanning the outer part of the joinMergeJoin
- performs a merge join: SingleStore DB scans both inner and outer sides of the join at the same time and merges matching rowsNestedLoopJoin
- performs a NestedLoop join: for every row on the outer side of the join SingleStore DB seeks or scans into the inner table to find all the matching rows
EXPLAIN SELECT * FROM t t1, t t2 WHERE t1.id = t2.a;
****
+------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------+
| Project [t1.id, t1.a, t1.b, r0.id_1, r0.a_2, r0.b_3] |
| Gather partitions:all est_rows:1 |
| Project [t1.id, t1.a, t1.b, r0.id_1, r0.a_2, r0.b_3] est_rows:1 est_select_cost:3 |
| NestedLoopJoin |
| |---IndexSeek db1.t AS t1, PRIMARY KEY (id) scan:[id = r0.a_2] est_table_rows:1 est_filtered:1 |
| TableScan r0 storage:list stream:no |
| Repartition [t2.id AS id_1, t2.a AS a_2, t2.b AS b_3] AS r0 shard_key:[a_2] est_rows:1 |
| TableScan db1.t AS t2, PRIMARY KEY (id) est_table_rows:1 est_filtered:1 |
+------------------------------------------------------------------------------------------------+
EXPLAIN SELECT * FROM ct t1, ct t2 WHERE t1.a = t2.a;
****
+--------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------+
| Project [t1.a, t1.b, t2.a, t2.b] |
| Gather partitions:all |
| Project [t1.a, t1.b, t2.a, t2.b] |
| MergeJoin condition:[t2.a = t1.a] |
| |---OrderedColumnStoreScan db1.ct AS t2, KEY a (a) USING CLUSTERED COLUMNSTORE |
| +---OrderedColumnStoreScan db1.ct AS t1, KEY a (a) USING CLUSTERED COLUMNSTORE |
+--------------------------------------------------------------------------------+
EXPLAIN SELECT * FROM t, ct WHERE t.b = ct.b;
****
+-----------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------+
| Project [r1.id, r1.a, r1.b, ct.a AS a_1, ct.b AS b_2] |
| Gather partitions:all est_rows:1 |
| Project [r1.id, r1.a, r1.b, ct.a AS a_1, ct.b AS b_2] est_rows:1 est_select_cost:4 |
| HashJoin [r1.b = ct.b] |
| |---Broadcast [t.id, t.a, t.b] AS r1 est_rows:1 |
| | TableScan db1.t, PRIMARY KEY (id) est_table_rows:1 est_filtered:1 |
| ColumnStoreScan db1.ct, KEY a (a) USING CLUSTERED COLUMNSTORE est_table_rows:1 est_filtered:1 |
+-----------------------------------------------------------------------------------------------+