You are viewing an older version of this section. View current production version.
Management View Reference
This topic contains reference information about each workload profiling management view.
Concepts and Terminology
Before querying these management views, it’s important to understand the concepts behind the terminology used in each view’s schema.
Task
MemSQL is instrumented to divide its execution time into tasks. Workload profiling management views in the information_schema database report the resource usage of sets of tasks. The views differ mainly in that they report on different sets over different intervals of time.
Task Examples
- A single query’s execution is instrumented with a task per partition, as well as a single task on the relevant aggregator.
- A single garbage collection pass is instrumented with a single task.
- A single replication pass against a database partition is instrumented with a single task.
- Tasks may or may not be
successful
. For example, a query task which commits a transaction is successful, but a query task which encounters rollback is not.
Activity
An activity is a set of identical tasks.
All tasks associated with the same node ID, database, partition, and name comprise the same activity. The tasks in an activity can be considered to share or be instances of an activity.
In practice, tasks share an activity when they instrument multiple repetitions of the same recurring work.
Every activity has a type of either Query, Database, or System. Tasks in a database activity are those which can be associated with an individual database, but not with any individual query. Tasks in a system activity can be associated with neither.
Every activity has a name and an aggregator activity name. We say that an activity is high-level if its aggregator activity name is equal to its name.
If an activity is not high-level, then its aggregator activity name is the name of the high-level activity to which it contributes.
In practice, we report aggregator activity name in order to link per-partition query activities with the unique aggregator activity associated with their query. The actual name of these activities may not be the same on all nodes.
Activity Examples
- All tasks instrumenting the same query against the same partition share an activity, which has Query type. Tasks instrumenting the query on different partitions have different activities.
- All garbage collection pass tasks share an activity, which has System type.
- All replication tasks against the same database partition share an activity, which has Database type.
- All activities associated with a query have the same aggregator activity name, which is the name of the unique high-level activity associated with the query on the aggregator node.
mv_activities
The mv_activities
view in the information_schema
database is a high-level summary profile of all tasks which ran recently on any node in the cluster.
It holds a row per high-level activity H. Each row describes the set of tasks which have run over a recent interval of time and which are either instances of the high level activity H, or else instances of an activity L whose aggregator activity name is the name of H. It reports the sum of the profiling statistics of each task in the set, as collected over the recent interval.
The effect of grouping the instances of multiple activities together is mainly to group all tasks associated with a query, across all partitions, into a single row.
mv_activities
determines recent resource use by computing the change in mv_activities_cumulative
over an interval of time. This interval is controlled by the value of the activities_delta_sleep_s
session variable.
mv_activities
may only be queried while connected to an aggregator node.
We recommend using this view to begin a performance investigation, as it is the most concise.
However, we recommend against the use of mv_activities
to compute the average latency or resource usage of a query. This is because its statistics include the latency and resource usage of currently running tasks, which will skew attempted average calculations. We recommend mv_finished_tasks
for this purpose instead.
Column name | Description |
---|---|
activity_type |
The type of the high-level activity. |
activity_name |
The name of the high-level activity. This column is often human-readable, but does not include the full query text for query tasks. Join with mv_queries for the query text. |
database_name |
The name of the database associated with the activity, or NULL if none could be assigned. |
run_count |
The number of instances which were running at the end of the interval. |
success_count |
The number of instances which completed successfully during the interval. |
failure_count |
The number of instances which completed unsuccessfully during the interval. |
SIMPLIFIED STATISTICS | This view contains all simplified statistics columns. |
mv_activities_extended
The mv_activities_extended
view in the information_schema
database is a detailed profile of all tasks which ran recently on any node in the cluster.
It holds a row per activity. Each row describes the instances of its activity which ran during a recent interval of time. It reports the sum of the profiling statistics of each task in the set, as collected over the recent interval.
mv_activities
determines recent resource use by computing the change in mv_activities_extended_cumulative
over an interval of time. This interval is controlled by the value of the activities_delta_sleep_s
session variable.
mv_activities_extended
may only be queried while connected to an aggregator node.
We recommend using this view to get fine-grained understanding of the system behavior described in mv_activities
. The main distinctions between mv_activities
and mv_activities_extended
are:
- For most queries,
mv_activities_extended
will report the resource usage of the query with at least a row per partition.mv_activities
aggregates these into a single row per query. mv_activities
reports simplified statistics, whilemv_activities_extended
reports extended statistics.
However, we recommend against the use of mv_activities_extended
to compute the average latency or resource usage of a query. This is because its statistics include the latency and resource usage of currently running tasks, which will skew attempted average calculations. We recommend mv_finished_tasks
for this purpose instead.
Column name | Description |
---|---|
node_id |
An ID equal to id in the row of mv_nodes describing the node on which the instances ran. |
activity_type |
The type of the activity. |
activity_name |
The name of the activity. This column is often human-readable, but does not include the full query text for query tasks. Join with mv_queries for the query text. |
aggregator_activity_name |
The aggregator activity name of the activity. |
database_name |
The name of the database associated with the activity, or NULL if none could be assigned. |
partition_id |
The unique ID of the database partition associated with the activity, or NULL if none could be assigned. |
run_count |
The number of instances which were running at the end of the interval. |
success_count |
The number of instances which completed successfully during the interval. |
failure_count |
The number of instances which completed unsuccessfully during the interval. |
EXTENDED STATISTICS | This view contains all extended statistics columns. |
mv_activities_cumulative
The mv_activities_cumulative
view in the information_schema
database is a high-level summary profile of completed and currently running tasks on all nodes of the cluster.
It holds a row per high-level activity H. Each row describes tasks which have completed or are currently running, and which are either instances of the high level activity H, or else instances of an activity L whose aggregator activity name is the name of H. It reports the sum of the profiling statistics of each task in the set, as collected over its lifetime.
The effect of grouping the instances of multiple activities together is mainly to group all tasks associated with a query, across all partitions, into a single row.
mv_activities_cumulative
may only be queried while connected to an aggregator node.
We recommend against extensive use of this table. It doesn’t accurately describe past system behavior because it doesn’t include all completed instances. This is because historical task statistics may be discarded by MemSQL at unspecified times.
Column name | Description |
---|---|
activity_type |
The type of the activities. |
activity_name |
An ID shared by all instances of the high level activity. This ID is often human-readable, but does not include the query text for query tasks. See mv_queries for the raw query text. |
database_name |
The name of the database associated with the activity, or NULL if none could be assigned. |
last_finished_timestamp |
The timestamp of the most recent completion of an instance of this activity. |
run_count |
The number of running instances of the activity. |
success_count |
The number of successfully completed instances of the activity. |
failure_count |
The number of unsuccessfully completed instances of the activity. |
SIMPLIFIED STATISTICS | This view contains all simplified statistics columns. |
mv_activities_extended_cumulative
The mv_activities_extended_cumulative
view in the information_schema
database is a detailed profile of completed and currently running tasks on all nodes of the cluster.
It holds a row per activity. Each row describes completed and currently running instances of its activity. It reports the sum of the profiling statistics of each instance, as collected over its lifetime.
mv_activities_extended_cumulative
may only be queried while connected to an aggregator node.
We recommend against extensive use of this table. It doesn’t accurately describe past system behavior because it doesn’t include all completed instances. This is because historical task statistics may be discarded by MemSQL at unspecified times.
Column name | Description |
---|---|
node_id |
An ID equal to id in the row of mv_nodes describing the node on which the activity’s instances have run. |
activity_type |
The type of the activity. |
activity_name |
An ID shared by all instances of the activity. This ID is often human-readable, but does not include the query text for query tasks. See mv_queries . |
aggregator_activity_name |
The aggregator_activity_name of the activity. |
database_name |
The name of the database associated with the activity, or NULL if none could be assigned. |
partition_id |
The unique ID of the database partition associated with the activity, or NULL if none could be assigned. |
last_finished_timestamp |
The timestamp of the most recent completion of an instance of this activity. |
run_count |
The number of running instances of this activity. |
success_count |
The number of successfully completed instances of this activity. |
failure_count |
The number of unsuccessfully completed instances of this activity. |
EXTENDED_STATISTICS | This view contains all extended statistics columns. |
mv_aggregated_column_usage
The mv_aggregated_column_usage
view in the information_schema
database provides a summary of mv_query_column_usage
, aggregating over the columns over all queries run.
Column name | Description |
---|---|
database_name |
The name of the database, or NULL if none could be assigned. |
table_name |
The name of the table to which the column belongs. |
column_name |
The aggregated column information is displayed for this column. |
equijoins |
The number of equijoins this column has been involved in over all queries run over the table. |
inquality_joins |
The number of inquality joins this column has been involved in over all queries run over the table. |
equality_preds |
The number of equality predicates in the WHERE or ON clause this column has been involved in all queries run on the table. |
inequality_preds |
The number of inequality predicates in the WHERE or ON clause this column has been involved in all queries run on the table. |
groupbys |
The number of GROUP BY clause in the column involved in all queries run on the table. |
reshuffles |
The number of reshuffle key in the column, involved in all queries run on the table. |
orderbys |
The number of ORDER BY clause in the column involved in all queries run on the table. |
outputs |
The number of times the column was returned as the result of a query. |
query_appearances |
The number of distinct queries the column appears in all queries run on the table, which is equivalent to the number of times the column appears in the first table. |
mv_finished_tasks
The mv_finished_tasks
view in the information_schema
database is a detailed profile of completed tasks on all nodes of the cluster.
It holds a row per activity. Each row describes completed instances of its activity. It reports the sum of the profiling statistics of each instance, as collected over its lifetime.
mv_finished_tasks
may only be queried while connected to an aggregator node.
We recommend against extensive use of this table. It doesn’t accurately describe past system behavior because it doesn’t include all completed instances. It will also hide the performance impact of long-running tasks which have yet to complete.
However, we do recommend using this table to estimate the average latency or resource usage of a query. This is because all other statistics tables include currently running tasks in their profiles, skewing attempted average calculations.
Column name | Description |
---|---|
node_id |
An ID equal to id in the row of mv_nodes describing the node on which the activity’s instances ran. |
activity_type |
The type of the activity. |
activity_name |
An ID shared by all instances of the activity. This ID is often human-readable, but does not include the query text for query tasks. Join with mv_queries for the query text. |
aggregator_activity_name |
The aggregator_activity_name of the activity. |
database_name |
The name of the database associated with the activity, or NULL if none could be assigned. |
partition_id |
The unique ID of the database partition associated with the activity, or NULL if none could be assigned. |
last_finished_timestamp |
The timestamp of the most recent completion of an instance of the activity. |
success_count |
The number of successfully completed instances of this activity. |
failure_count |
The number of unsuccessfully completed instances of this activity. |
EXTENDED STATISTICS | This view contains all extended statistics columns. |
mv_global_status
The mv_global_status
view in the information_schema
database contains information on the global server status variables that are set on the cluster’s nodes. Global variables which are not server status variables are found in the mv_global_variables
view.
Column name | Description |
---|---|
node_id |
The ID of the node. |
ip_addr |
The IP address of the node. |
port |
The port on which the node is listening. |
node_type |
MA , Leaf , or CA |
variable_name |
The name of the global variable. |
variable_value |
The value of the global variable. |
mv_global_variables
The mv_global_variables
view in the information_schema
database contains information about the global variables that are set on the cluster’s nodes. Global variables which are server status variables are not included and are found in the mv_global_status
view.
Column name | Description |
---|---|
node_id |
The ID of the node. |
ip_addr |
The IP address of the node. |
port |
The port on which the node is listening. |
node_type |
MA , Leaf , or CA |
variable_name |
The name of the global variable. |
variable_value |
The value of the global variable. |
mv_tasks
The mv_tasks
view in the information_schema
database is a detailed profile of all currently running tasks on all nodes of the cluster.
It holds a row per running task. Each row reports the profiling statistics of the task, as collected between the time it began and the current time.
mv_tasks
may only be queried while connected to an aggregator node.
We recommend against extensive use of this table. It will hide the performance impact of short-lived but frequent tasks, and exaggerate the impact of long-lived tasks.
Column name | Description |
---|---|
node_id |
An ID equal to id in the row of mv_nodes describing the node on which the task is running. |
activity_type |
The type of the task’s activity. |
activity_name |
An ID shared by all instances of the task’s activity. This ID is often human-readable, but does not include the query text for query tasks. |
aggregator_activity_name |
The aggregator_activity_name of the task’s activity. |
database_name |
The name of the database associated with the task’s activity, or NULL if none could be assigned. |
partition_id |
The unique ID of the database partition associated with the task’s activity, or NULL if none could be assigned. |
raw_task_name |
The full name associated with the task, or NULL if none could be assigned. In particular, this will hold the raw query text for query tasks. It differs from query_text in mv_nodes because it is fully parameterized. Note that this column is only available in mv_tasks. |
EXTENDED STATISTICS | This view contains all extended statistics columns. |
mv_nodes
The mv_nodes
view in the information_schema
database describes all MemSQL nodes in the cluster. It holds one row per node.
mv_nodes
may only be queried while connected to an aggregator node.
Column name | Description |
---|---|
id |
The ID of the node. This ID matches the node_id reported for tasks running on the node. |
ip_addr |
The IP address of the node. |
port |
The port on which the node is listening. |
type |
MA , Leaf , or CA . |
state |
Online , Offline , Detached , or Attaching . |
availability_group |
The availability group of the node. |
num_cpus |
The number of logical CPU cores on which the node may execute. |
memory_used_mb |
The value of total_server_memory on the node. |
max_memory_mb |
Maximum allowed memory for node |
mv_queries
The mv_queries
view in the information_schema
database describes queries seen by MemSQL. It holds a row for each query sent to the cluster by a user, as well as rows for artificial queries sent between nodes according to internal MemSQL logic.
Its purpose is to map the activity_name
of query activities to their associated de-parameterized query text.
mv_queries
may only be queried while connected to an aggregator node.
Column name | Description |
---|---|
activity_name |
The ID associated with the query. This will be the activity_name of all query tasks associated with the query. |
query_text |
The de-parameterized text of the query. |
mv_query_column_usage
The mv_query_column_usage
view in the information_schema
database provides information about the usage of columns by queries in the workload. Details about column that are used in joins, filters, group-bys, order-bys or reshuffles help to understand the frequency and cost of use of columns, and make better decisions about indexing and sharding of data.
Column name | Description |
---|---|
activity_name |
The ID associated with the query. This will be the activity_name of all query tasks associated with the query. |
database_name |
The name of the database, or NULL if none could be assigned. |
table_name |
The name of the table in the query. |
column_name |
The name of the column that exists in the table. |
equijoins |
The number of times this column was involved in an equijoin in the query. |
non_equijoins |
The number of times the column was involved in a non-equijoin in the query. |
equality_preds |
The number of times the column was involved in equality predicates in the WHERE or ON clause in the query. |
inequality_preds |
The number of times the column was involved in inequality predicates in the WHERE or ON clause in the query. |
groupbys |
The number of times the column was involved in a GROUP BY clause in the query. |
orderbys |
The number of times the column was involved in a ORDER BY clause in the query. |
reshuffles |
The number of times the column was member of reshuffle key in the query. |
outputs |
The number of times the column was returned as the result of the query. |