Outdated Version

You are viewing an older version of this section. View current production version.

Management View Reference

This topic contains reference information about each workload profiling management view.

Concepts and Terminology

Before querying these management views, it’s important to understand the concepts behind the terminology used in each view’s schema.

Task

MemSQL is instrumented to divide its execution time into tasks. Workload profiling management views in the information_schema database report the resource usage of sets of tasks. The views differ mainly in that they report on different sets over different intervals of time.

Task Examples

A single query’s execution is instrumented with a task per partition, as well as a single task on the relevant aggregator.
A single garbage collection pass is instrumented with a single task.
A single replication pass against a database partition is instrumented with a single task.
Tasks may or may not be successful. For example, a query task which commits a transaction is successful, but a query task which encounters rollback is not.

Activity

An activity is a set of identical tasks.

All tasks associated with the same node ID, database, partition, and name comprise the same activity. The tasks in an activity can be considered to share or be instances of an activity.

In practice, tasks share an activity when they instrument multiple repetitions of the same recurring work.

Every activity has a type of either Query, Database, or System. Tasks in a database activity are those which can be associated with an individual database, but not with any individual query. Tasks in a system activity can be associated with neither.

Every activity has a name and an aggregator activity name. We say that an activity is high-level if its aggregator activity name is equal to its name.

If an activity is not high-level, then its aggregator activity name is the name of the high-level activity to which it contributes.

In practice, we report aggregator activity name in order to link per-partition query activities with the unique aggregator activity associated with their query. The actual name of these activities may not be the same on all nodes.

Activity Examples

All tasks instrumenting the same query against the same partition share an activity, which has Query type. Tasks instrumenting the query on different partitions have different activities.
All garbage collection pass tasks share an activity, which has System type.
All replication tasks against the same database partition share an activity, which has Database type.
All activities associated with a query have the same aggregator activity name, which is the name of the unique high-level activity associated with the query on the aggregator node.

mv_activities

The mv_activities view in the information_schema database is a high-level summary profile of all tasks which ran recently on any node in the cluster.

It holds a row per high-level activity H. Each row describes the set of tasks which have run over a recent interval of time and which are either instances of the high level activity H, or else instances of an activity L whose aggregator activity name is the name of H. It reports the sum of the profiling statistics of each task in the set, as collected over the recent interval.

The effect of grouping the instances of multiple activities together is mainly to group all tasks associated with a query, across all partitions, into a single row.

mv_activities determines recent resource use by computing the change in mv_activities_cumulative over an interval of time. This interval is controlled by the value of the activities_delta_sleep_s session variable.

mv_activities may only be queried while connected to an aggregator node.

We recommend using this view to begin a performance investigation, as it is the most concise.

However, we recommend against the use of mv_activities to compute the average latency or resource usage of a query. This is because its statistics include the latency and resource usage of currently running tasks, which will skew attempted average calculations. We recommend mv_finished_tasks for this purpose instead.

Column name	Description
`activity_type`	The type of the high-level activity.
`activity_name`	The name of the high-level activity. This column is often human-readable, but does not include the full query text for query tasks. Join with `mv_queries` for the query text.
`database_name`	The name of the database associated with the activity, or `NULL` if none could be assigned.
`run_count`	The number of instances which were running at the end of the interval.
`success_count`	The number of instances which completed successfully during the interval.
`failure_count`	The number of instances which completed unsuccessfully during the interval.
SIMPLIFIED STATISTICS	This view contains all simplified statistics columns.

mv_activities_extended

The mv_activities_extended view in the information_schema database is a detailed profile of all tasks which ran recently on any node in the cluster.

It holds a row per activity. Each row describes the instances of its activity which ran during a recent interval of time. It reports the sum of the profiling statistics of each task in the set, as collected over the recent interval.

mv_activities determines recent resource use by computing the change in mv_activities_extended_cumulative over an interval of time. This interval is controlled by the value of the activities_delta_sleep_s session variable.

mv_activities_extended may only be queried while connected to an aggregator node.

We recommend using this view to get fine-grained understanding of the system behavior described in mv_activities. The main distinctions between mv_activities and mv_activities_extended are:

For most queries, mv_activities_extended will report the resource usage of the query with at least a row per partition. mv_activities aggregates these into a single row per query.
mv_activities reports simplified statistics, while mv_activities_extended reports extended statistics.

However, we recommend against the use of mv_activities_extended to compute the average latency or resource usage of a query. This is because its statistics include the latency and resource usage of currently running tasks, which will skew attempted average calculations. We recommend mv_finished_tasks for this purpose instead.

Column name	Description
`node_id`	An ID equal to `id` in the row of `mv_nodes` describing the node on which the instances ran.
`activity_type`	The type of the activity.
`activity_name`	The name of the activity. This column is often human-readable, but does not include the full query text for query tasks. Join with `mv_queries` for the query text.
`aggregator_activity_name`	The aggregator activity name of the activity.
`database_name`	The name of the database associated with the activity, or `NULL` if none could be assigned.
`partition_id`	The unique ID of the database partition associated with the activity, or `NULL` if none could be assigned.
`run_count`	The number of instances which were running at the end of the interval.
`success_count`	The number of instances which completed successfully during the interval.
`failure_count`	The number of instances which completed unsuccessfully during the interval.
EXTENDED STATISTICS	This view contains all extended statistics columns.

mv_activities_cumulative

The mv_activities_cumulative view in the information_schema database is a high-level summary profile of completed and currently running tasks on all nodes of the cluster.

It holds a row per high-level activity H. Each row describes tasks which have completed or are currently running, and which are either instances of the high level activity H, or else instances of an activity L whose aggregator activity name is the name of H. It reports the sum of the profiling statistics of each task in the set, as collected over its lifetime.

The effect of grouping the instances of multiple activities together is mainly to group all tasks associated with a query, across all partitions, into a single row.

mv_activities_cumulative may only be queried while connected to an aggregator node.

We recommend against extensive use of this table. It doesn’t accurately describe past system behavior because it doesn’t include all completed instances. This is because historical task statistics may be discarded by MemSQL at unspecified times.

Column name	Description
`activity_type`	The `type` of the activities.
`activity_name`	An ID shared by all instances of the high level activity. This ID is often human-readable, but does not include the query text for query tasks. See `mv_queries` for the raw query text.
`database_name`	The name of the database associated with the activity, or `NULL` if none could be assigned.
`last_finished_timestamp`	The timestamp of the most recent completion of an instance of this activity.
`run_count`	The number of running instances of the activity.
`success_count`	The number of successfully completed instances of the activity.
`failure_count`	The number of unsuccessfully completed instances of the activity.
SIMPLIFIED STATISTICS	This view contains all simplified statistics columns.

mv_activities_extended_cumulative

The mv_activities_extended_cumulative view in the information_schema database is a detailed profile of completed and currently running tasks on all nodes of the cluster.

It holds a row per activity. Each row describes completed and currently running instances of its activity. It reports the sum of the profiling statistics of each instance, as collected over its lifetime.

mv_activities_extended_cumulative may only be queried while connected to an aggregator node.

Column name	Description
`node_id`	An ID equal to `id` in the row of `mv_nodes` describing the node on which the activity’s instances have run.
`activity_type`	The `type` of the activity.
`activity_name`	An ID shared by all instances of the activity. This ID is often human-readable, but does not include the query text for query tasks. See `mv_queries`.
`aggregator_activity_name`	The `aggregator_activity_name` of the activity.
`database_name`	The name of the database associated with the activity, or `NULL` if none could be assigned.
`partition_id`	The unique ID of the database partition associated with the activity, or `NULL` if none could be assigned.
`last_finished_timestamp`	The timestamp of the most recent completion of an instance of this activity.
`run_count`	The number of running instances of this activity.
`success_count`	The number of successfully completed instances of this activity.
`failure_count`	The number of unsuccessfully completed instances of this activity.
EXTENDED_STATISTICS	This view contains all extended statistics columns.

mv_aggregated_column_usage

The mv_aggregated_column_usage view in the information_schema database provides a summary of mv_query_column_usage, aggregating over the columns over all queries run.

Column name	Description
`database_name`	The name of the database, or `NULL` if none could be assigned.
`table_name`	The name of the table to which the column belongs.
`column_name`	The aggregated column information is displayed for this column.
`equijoins`	The number of equijoins this column has been involved in over all queries run over the table.
`inquality_joins`	The number of inquality joins this column has been involved in over all queries run over the table.
`equality_preds`	The number of equality predicates in the `WHERE` or `ON` clause this column has been involved in all queries run on the table.
`inequality_preds`	The number of inequality predicates in the `WHERE` or `ON` clause this column has been involved in all queries run on the table.
`groupbys`	The number of `GROUP BY` clause in the column involved in all queries run on the table.
`reshuffles`	The number of reshuffle key in the column, involved in all queries run on the table.
`orderbys`	The number of `ORDER BY` clause in the column involved in all queries run on the table.
`outputs`	The number of times the column was returned as the result of a query.
`query_appearances`	The number of distinct queries the column appears in all queries run on the table, which is equivalent to the number of times the column appears in the first table.

mv_finished_tasks

The mv_finished_tasks view in the information_schema database is a detailed profile of completed tasks on all nodes of the cluster.

It holds a row per activity. Each row describes completed instances of its activity. It reports the sum of the profiling statistics of each instance, as collected over its lifetime.

mv_finished_tasks may only be queried while connected to an aggregator node.

We recommend against extensive use of this table. It doesn’t accurately describe past system behavior because it doesn’t include all completed instances. It will also hide the performance impact of long-running tasks which have yet to complete.

However, we do recommend using this table to estimate the average latency or resource usage of a query. This is because all other statistics tables include currently running tasks in their profiles, skewing attempted average calculations.

Column name	Description
`node_id`	An ID equal to `id` in the row of `mv_nodes` describing the node on which the activity’s instances ran.
`activity_type`	The `type` of the activity.
`activity_name`	An ID shared by all instances of the activity. This ID is often human-readable, but does not include the query text for query tasks. Join with `mv_queries` for the query text.
`aggregator_activity_name`	The `aggregator_activity_name` of the activity.
`database_name`	The name of the database associated with the activity, or `NULL` if none could be assigned.
`partition_id`	The unique ID of the database partition associated with the activity, or `NULL` if none could be assigned.
`last_finished_timestamp`	The timestamp of the most recent completion of an instance of the activity.
`success_count`	The number of successfully completed instances of this activity.
`failure_count`	The number of unsuccessfully completed instances of this activity.
EXTENDED STATISTICS	This view contains all extended statistics columns.

mv_global_status

The mv_global_status view in the information_schema database contains information on the global server status variables that are set on the cluster’s nodes. Global variables which are not server status variables are found in the mv_global_variables view.

Column name	Description
`node_id`	The ID of the node.
`ip_addr`	The IP address of the node.
`port`	The port on which the node is listening.
`node_type`	`MA`, `Leaf`, or `CA`
`variable_name`	The name of the global variable.
`variable_value`	The value of the global variable.

mv_global_variables

The mv_global_variables view in the information_schema database contains information about the global variables that are set on the cluster’s nodes. Global variables which are server status variables are not included and are found in the mv_global_status view.

Column name	Description
`node_id`	The ID of the node.
`ip_addr`	The IP address of the node.
`port`	The port on which the node is listening.
`node_type`	`MA`, `Leaf`, or `CA`
`variable_name`	The name of the global variable.
`variable_value`	The value of the global variable.

mv_tasks

The mv_tasks view in the information_schema database is a detailed profile of all currently running tasks on all nodes of the cluster.

It holds a row per running task. Each row reports the profiling statistics of the task, as collected between the time it began and the current time.

mv_tasks may only be queried while connected to an aggregator node.

We recommend against extensive use of this table. It will hide the performance impact of short-lived but frequent tasks, and exaggerate the impact of long-lived tasks.

Column name	Description
`node_id`	An ID equal to `id` in the row of `mv_nodes` describing the node on which the task is running.
`activity_type`	The `type` of the task’s activity.
`activity_name`	An ID shared by all instances of the task’s activity. This ID is often human-readable, but does not include the query text for query tasks.
`aggregator_activity_name`	The `aggregator_activity_name` of the task’s activity.
`database_name`	The name of the database associated with the task’s activity, or `NULL` if none could be assigned.
`partition_id`	The unique ID of the database partition associated with the task’s activity, or `NULL` if none could be assigned.
`raw_task_name`	The full name associated with the task, or `NULL` if none could be assigned. In particular, this will hold the raw query text for query tasks. It differs from `query_text` in `mv_nodes` because it is fully parameterized. Note that this column is only available in mv_tasks.
EXTENDED STATISTICS	This view contains all extended statistics columns.

mv_nodes

The mv_nodes view in the information_schema database describes all MemSQL nodes in the cluster. It holds one row per node.

mv_nodes may only be queried while connected to an aggregator node.

Column name	Description
`id`	The ID of the node. This ID matches the `node_id` reported for tasks running on the node.
`ip_addr`	The IP address of the node.
`port`	The port on which the node is listening.
`type`	`MA`, `Leaf`, or `CA`.
`state`	`Online`, `Offline`, `Detached`, or `Attaching`.
`availability_group`	The availability group of the node.
`num_cpus`	The number of logical CPU cores on which the node may execute.
`memory_used_mb`	The value of `total_server_memory` on the node.
`max_memory_mb`	Maximum allowed memory for node

mv_queries

The mv_queries view in the information_schema database describes queries seen by MemSQL. It holds a row for each query sent to the cluster by a user, as well as rows for artificial queries sent between nodes according to internal MemSQL logic.

Its purpose is to map the activity_name of query activities to their associated de-parameterized query text.

mv_queries may only be queried while connected to an aggregator node.

Column name	Description
`activity_name`	The ID associated with the query. This will be the `activity_name` of all query tasks associated with the query.
`query_text`	The de-parameterized text of the query.

mv_query_column_usage

The mv_query_column_usage view in the information_schema database provides information about the usage of columns by queries in the workload. Details about column that are used in joins, filters, group-bys, order-bys or reshuffles help to understand the frequency and cost of use of columns, and make better decisions about indexing and sharding of data.

Column name	Description
`activity_name`	The ID associated with the query. This will be the `activity_name` of all query tasks associated with the query.
`database_name`	The name of the database, or `NULL` if none could be assigned.
`table_name`	The name of the table in the query.
`column_name`	The name of the column that exists in the table.
`equijoins`	The number of times this column was involved in an equijoin in the query.
`non_equijoins`	The number of times the column was involved in a non-equijoin in the query.
`equality_preds`	The number of times the column was involved in equality predicates in the `WHERE` or `ON` clause in the query.
`inequality_preds`	The number of times the column was involved in inequality predicates in the `WHERE` or `ON` clause in the query.
`groupbys`	The number of times the column was involved in a `GROUP BY` clause in the query.
`orderbys`	The number of times the column was involved in a `ORDER BY` clause in the query.
`reshuffles`	The number of times the column was member of reshuffle key in the query.
`outputs`	The number of times the column was returned as the result of the query.

Was this Article Helpful?

Questions?

Ask the Community, find solutions in the Troubleshooting Overview, get Self-Paced Training or learn more about our Support Plans.