Outdated Version

You are viewing an older version of this section. View current production version.

EXPLAIN

Returns detailed information on how the query is executed.

EXPLAIN displays query execution operations as a tree, where the operator at the top is the last executed before returning the result to the client, and operators below are executed before their parents.

See the examples section below for descriptions of the most common types of information in EXPLAIN output.

EXPLAIN [EXTENDED | JSON] select_statement

Notes

  • select_statement is an arbitrary SELECT statement
  • EXPLAIN displays the plan MemSQL would generate if the query were reoptimized and recompiled now. If the query has already been run and a plan has already been saved in the plancache, that plan will be reused, even though it may differ from the plan displayed by EXPLAIN. You may run DROP … FROM PLANCACHE to force the query to be reoptimized and recompiled. Note that running ANALYZE will automatically drop and recompile query plans if data statistics have changed significantly.
  • In most cases when the optimizer makes a cost based decision, EXPLAIN output displays estimated statistics and costs.

Explain Modes

  • EXPLAIN outputs the explain information in a human readable format
  • EXPLAIN EXTENDED outputs more detailed information for the query plan in addition to the information displayed by EXPLAIN
  • EXPLAIN JSON outputs explain information in JSON format, which may be more machine readable

Additionally, the following EXPLAIN modes are provided for informational and debugging purposes only and may be changed at any time:

  • EXPLAIN MBC outputs MemSQL Byte Code generated for the query
  • EXPLAIN MPL outputs MemSQL Programming Language code generated for the query
  • EXPLAIN LLVM output LLVM byte code generated for the query
  • EXPLAIN ASM outputs assembly code generated for the query

EXPLAIN

memsql> explain select * from t;
+-----------------------------------+
| EXPLAIN                           |
+-----------------------------------+
| Project [t.id, t.a, t.b]          |
| Gather partitions:all             |
| Project [t.id, t.a, t.b]          |
| TableScan db1.t, PRIMARY KEY (id) |
+-----------------------------------+

EXPLAIN EXTENDED

memsql> explain extended select * from t;
+----------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                        |
+----------------------------------------------------------------------------------------------------------------+
| Project [t.id, t.a, t.b]                                                                                       |
| Gather partitions:all query:[SELECT `t`.`id` AS `id`, `t`.`a` AS `a`, `t`.`b` AS `b` FROM `db1_0`.`t` as `t` ] |
| Project [t.id, t.a, t.b]                                                                                       |
| TableScan db1.t, PRIMARY KEY (id)                                                                              |
+----------------------------------------------------------------------------------------------------------------+

EXPLAIN JSON

memsql> explain json select * from t;
----------------------------------------------
{
    "executor":"Project",
    "out":{
        "":"t.id",
        "":"t.a",
        "":"t.b"
    },
    "inputs":[
        {
            "executor":"Gather",
            "partitions":"all",
            "query":"SELECT `t`.`id` AS `id`, `t`.`a` AS `a`, `t`.`b` AS `b` FROM `db1_0`.`t` as `t` ",
            "inputs":[
                {
                    "executor":"Project",
                    "out":{
                        "":"t.id",
                        "":"t.a",
                        "":"t.b"
                    },
                    "subselects":[],
                    "inputs":[
                        {
                            "executor":"TableScan",
                            "db":"db1",
                            "table":"t",
                            "alias":"t",
                            "index":"PRIMARY KEY (id)",
                            "storage":"lf_skiplist",
                            "dir":"forward",
                            "inputs":[]
                        }
                    ]
                }
            ]
        }
    ]
}

EXPLAIN MBC

memsql> explain mbc select * from t;
----------------------------------------------
EXPLAIN
# explain mbc select * from t

Data:

                   static:  offset=0         size=125       align=1         ata="USING `\?`::`\?` AGGREGATOR_ID = \?, CONNECTION_ID = \? SELECT `t`.`id` AS `id`, `t`.`a` AS `a`, `t`.`b` AS `b` FROM `t` as `t` \x00"""

Function 0 <100000:CreateProjectState>:

     Locals:
         local      rvToReturn:  offset=0         size=8         align=8         type=int64_t

         local     rvToReceive:  offset=16        size=16        align=16        type=uint128_t
         param   hiddenRvParam:  offset=32        size=8         align=8         type={ state1: { RemoteElement: RemoteElement, a: { value: int32_t, opt: Opt, pack(4) }, b: { value: int32_t, opt: Opt, pack(4) }, : int64_t, id: int32_t } }*
|         local    state1Addr_1:  offset=40        size=8         align=8         type={ RemoteElement: RemoteElement, a: { value: int32_t, opt: Opt, pack(4) }, b: { value: int32_t, opt: Opt, pack(4) }, : int64_t, id: int32_t }*

     0x0000  Junk                                                      local=hiddenRvParam  i32=32
     0x000c  Lea                                                       local=&state1Addr_1  local=hiddenRvParam  i32=0
     0x001c  Call                                                      func=10 <row_0tmp1_internal_ctor>  local=state1Addr_1
     0x002c  Junk                                                      local=&state1Addr_1  i32=8
     0x0038  Literal8                                                  local=&rvToReturn  i64=1
     0x0048  Return

Function 1 <100000:CreateMatchedRowContext>:
                    ........

EXPLAIN LLVM

memsql> explain llvm select * from t;
----------------------------------------------
EXPLAIN

; Function Attrs: nounwind readnone uwtable
define void @MbcTypeDummyFunc(%struct.MbcTypeDummyStruct* nocapture) #2 {
  ret void
}


; Function Attrs: noinline noreturn nounwind
define linkonce_odr hidden void @__clang_call_terminate(i8*) #6 {
  %2 = tail call i8* @__cxa_begin_catch(i8* %0) #4
  tail call void @_ZSt9terminatev() #9
  unreachable
}


; Function Attrs: noreturn nounwind
define linkonce_odr void @"100000:RowToRowElement"(%struct.InterpVars* nocapture readnone, %struct.MemSqlClientValue* nocapture readnone, i8* nocapture readnone) #9 {
EntryBlock:
	call void @__assert_fail(i8* getelementptr inbounds ([6 x i8]* @.str770, i64 0, i64 0), i8* getelementptr inbounds ([45 x i8]* @.str710, i64 0, i64 0), i32 338, i8* getelementptr inbounds ([35 x i8]* @__PRETTY_FUNCTION__.iopAssert, i64 0, i64 0)) #9, !dbg !19
	unreachable, !dbg !19
                    ........

EXPLAIN ASM

memsql> explain asm select * from t;
----------------------------------------------
EXPLAIN
       .text
       .file   "/home/vagrant/memsql/interp_ops.bc"
       .section        .debug_info,"",@progbits
 .Lsection_info:
       .section        .debug_abbrev,"",@progbits
 .Lsection_abbrev:
       .section        .debug_line,"",@progbits
 .Lsection_line:
       .section        .debug_pubnames,"",@progbits
       .section        .debug_pubtypes,"",@progbits
       .section        .debug_str,"MS",@progbits,1
 .Linfo_string:
       .section        .debug_loc,"",@progbits
 .Lsection_debug_loc:
       .section        .debug_ranges,"",@progbits
 .Ldebug_range:
       .text
       .globl  MbcTypeDummyFunc
       .align  16, 0x90
       .type   MbcTypeDummyFunc,@function
 MbcTypeDummyFunc:
       .cfi_startproc
       pushq   %rbp
 .Ltmp0:
       .cfi_def_cfa_offset 16
 .Ltmp1:
       .cfi_offset %rbp, -16
       movq    %rsp, %rbp
.Ltmp2:
       .cfi_def_cfa_register %rbp
       popq    %rbp
       retq
                    ........

Explain Examples

This section describes and provides examples of some of the most common types of information shown in explain.

The following examples use the following two tables: t, a rowstore table with a primary key, and ct a columnstore table. These tables are both in database db1.

memsql> create table t(id int primary key, a int, b int, key(a));
memsql> create table ct(a int, b int, key(a) using clustered columnstore, shard key(a));

Table access methods

  • TableScan - scans every row in a table using an index
  • IndexSeek - navigates to a particular row using an index
  • IndexRangeScan - scans a range of rows using an index
  • ColumnStoreScan scans a table using a columnstore index
  • OrderedColumnStoreScan - scans a table using a columnstore index in index order
memsql> explain select * from t where t.a = 5;
+---------------------------------------------+
| EXPLAIN                                     |
+---------------------------------------------+
| Project [t.id, t.a, t.b]                    |
| Gather partitions:all                       |
| Project [t.id, t.a, t.b]                    |
| IndexRangeScan db.t, KEY a (a) scan:[a = 5] |
+---------------------------------------------+

memsql> explain select * from ct;
+---------------------------------------------------------------+
| EXPLAIN                                                       |
+---------------------------------------------------------------+
| Project [ct.a, ct.b]                                          |
| Gather partitions:all                                         |
| Project [ct.a, ct.b]                                          |
| ColumnStoreScan db1.ct, KEY a (a) USING CLUSTERED COLUMNSTORE |
+---------------------------------------------------------------+

Group by and aggregations

  • Aggregate - computes an aggregate.
  • HashGroupBy - uses a hash table to compute group by results
  • StreamingGroupBy - leverages the fact that the underlying operation produces rows in order to compute group by results. This advantage of StreamingGroupBy is that it only uses a constant amount of memory
  • DistinctRow - removes duplicate rows
memsql> explain select sum(id) from t;
+-----------------------------------------+
| EXPLAIN                                 |
+-----------------------------------------+
| Project [`sum(id)`]                     |
| Aggregate [SUM(`sum(id)`) AS `sum(id)`] |
| Gather partitions:all                   |
| Project [`sum(id)`]                     |
| Aggregate [SUM(t.id) AS `sum(id)`]      |
| TableScan db1.t, PRIMARY KEY (id)       |
+-----------------------------------------+

memsql> explain select sum(id) from t group by a+1;
+------------------------------------------------------------+
| EXPLAIN                                                    |
+------------------------------------------------------------+
| Project [`sum(id)`]                                        |
| HashGroupBy [SUM(`sum(id)`) AS `sum(id)`] groups:[t.a + 1] |
| Gather partitions:all                                      |
| Project [`sum(id)`, t.a + 1 AS op, t.a, 1 AS op_1]         |
| HashGroupBy [SUM(t.id) AS `sum(id)`] groups:[t.a + 1]      |
| TableScan db1.t, PRIMARY KEY (id)                          |
+------------------------------------------------------------+

Distributed data movement

  • Gather - collects all the results from the leaf nodes to the aggregator node. When a query can be routed to a single partition it has the attribute partitions:single. If Gather collects data from all the partitions the attribute is set to partitions:all. Queries that have partitions:single are called single partition queries. An advantage of single partition queries is that they can scale to much higher concurrency and throughput because they only need to execute on a single partition.
  • GatherMerge - collects ordered streams of rows from the leaves and merges them to output an ordered stream.
memsql> explain select * from t where id = 1;
+-------------------------------------------------+
| EXPLAIN                                         |
+-------------------------------------------------+
| Gather partitions:single                        |
| Project [t.id, t.a, t.b]                        |
| IndexSeek db1.t, PRIMARY KEY (id) scan:[id = 1] |
+-------------------------------------------------+

mysql> explain select * from t where id > 1;
+------------------------------------------------------+
| EXPLAIN                                              |
+------------------------------------------------------+
| Project [t.id, t.a, t.b]                             |
| Gather partitions:all                                |
| Project [t.id, t.a, t.b]                             |
| IndexRangeScan db1.t, PRIMARY KEY (id) scan:[id > 1] |
+------------------------------------------------------+

memsql> explain select * from t order by id;
+-----------------------------------+
| EXPLAIN                           |
+-----------------------------------+
| Project [t.id, t.a, t.b]          |
| GatherMerge [t.id] partitions:all |
| Project [t.id, t.a, t.b]          |
| TableScan db.t, PRIMARY KEY (id)  |
+-----------------------------------+
  • Repartition - redistributes a dataset to hash-partition it on a particular key
  • Broadcast - broadcasts a dataset to every node in a cluster
memsql> explain select * from t,ct where t.id = ct.b;
+-----------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                       |
+-----------------------------------------------------------------------------------------------+
| Project [t.id, t.a, t.b, r0.a_1, r0.b_2]                                                      |
| Gather partitions:all est_rows:1                                                              |
| Project [t.id, t.a, t.b, r0.a_1, r0.b_2] est_rows:1 est_select_cost:3                         |
| NestedLoopJoin                                                                                |
| |---IndexSeek db1.t, PRIMARY KEY (id) scan:[id = r0.b_2] est_table_rows:1 est_filtered:1      |
| TableScan r0 storage:list stream:no                                                           |
| Repartition [ct.a AS a_1, ct.b AS b_2] AS r0 shard_key:[b_2] est_rows:1                       |
| ColumnStoreScan db1.ct, KEY a (a) USING CLUSTERED COLUMNSTORE est_table_rows:1 est_filtered:1 |
+-----------------------------------------------------------------------------------------------+
  • ChoosePlan indicates that MemSQL will choose one of the listed plans at runtime based on cost estimates. estimate illustrates the statistics that are being estimated, but note that these SQL statements are not actually estimated. Instead, MemSQL uses index information to estimate these statistics.
memsql> explain select * from t where id > 5 and a > 5;
+-----------------------------------------------------------+
| EXPLAIN                                                   |
+-----------------------------------------------------------+
| Project [t.id, t.a, t.b]                                  |
| Gather partitions:all                                     |
| Project [t.id, t.a, t.b]                                  |
| ChoosePlan                                                |
| |   :estimate                                             |
| |       SELECT COUNT(*) AS cost FROM db1.t WHERE t.id > 5 |
| |       SELECT COUNT(*) AS cost FROM db1.t WHERE t.a > 5  |
| |---Filter [t.a > 5]                                      |
| |   IndexRangeScan db1.t, PRIMARY KEY (id) scan:[id > 5]  |
| +---Filter [t.id > 5]                                     |
|     IndexRangeScan db1.t, KEY a (a) scan:[a > 5]          |
+-----------------------------------------------------------+

Joins

  • HashJoin - performs a hash join: MemSQL builds a hash table from the results of the inner side of the join and probes into it while scanning the outer part of the join
  • MergeJoin - performs a merge join: MemSQL scans both inner and outer sides of the join at the same time and merges matching rows
  • NestedLoopJoin - performs a NestedLoop join: for every row on the outer side of the join MemSQL seeks or scans into the inner table to find all the matching rows
memsql> explain select * from t t1, t t2 where t1.id = t2.a;
+------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                        |
+------------------------------------------------------------------------------------------------+
| Project [t1.id, t1.a, t1.b, r0.id_1, r0.a_2, r0.b_3]                                           |
| Gather partitions:all est_rows:1                                                               |
| Project [t1.id, t1.a, t1.b, r0.id_1, r0.a_2, r0.b_3] est_rows:1 est_select_cost:3              |
| NestedLoopJoin                                                                                 |
| |---IndexSeek db1.t AS t1, PRIMARY KEY (id) scan:[id = r0.a_2] est_table_rows:1 est_filtered:1 |
| TableScan r0 storage:list stream:no                                                            |
| Repartition [t2.id AS id_1, t2.a AS a_2, t2.b AS b_3] AS r0 shard_key:[a_2] est_rows:1         |
| TableScan db1.t AS t2, PRIMARY KEY (id) est_table_rows:1 est_filtered:1                        |
+------------------------------------------------------------------------------------------------+

memsql> explain select * from ct t1, ct t2 where t1.a = t2.a;
+--------------------------------------------------------------------------------+
| EXPLAIN                                                                        |
+--------------------------------------------------------------------------------+
| Project [t1.a, t1.b, t2.a, t2.b]                                               |
| Gather partitions:all                                                          |
| Project [t1.a, t1.b, t2.a, t2.b]                                               |
| MergeJoin condition:[t2.a = t1.a]                                              |
| |---OrderedColumnStoreScan db1.ct AS t2, KEY a (a) USING CLUSTERED COLUMNSTORE |
| +---OrderedColumnStoreScan db1.ct AS t1, KEY a (a) USING CLUSTERED COLUMNSTORE |
+--------------------------------------------------------------------------------+

memsql> explain select * from t, ct where t.b = ct.b;
+-----------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                       |
+-----------------------------------------------------------------------------------------------+
| Project [r1.id, r1.a, r1.b, ct.a AS a_1, ct.b AS b_2]                                         |
| Gather partitions:all est_rows:1                                                              |
| Project [r1.id, r1.a, r1.b, ct.a AS a_1, ct.b AS b_2] est_rows:1 est_select_cost:4            |
| HashJoin [r1.b = ct.b]                                                                        |
| |---Broadcast [t.id, t.a, t.b] AS r1 est_rows:1                                               |
| |   TableScan db1.t, PRIMARY KEY (id) est_table_rows:1 est_filtered:1                         |
| ColumnStoreScan db1.ct, KEY a (a) USING CLUSTERED COLUMNSTORE est_table_rows:1 est_filtered:1 |
+-----------------------------------------------------------------------------------------------+