Outdated Version

You are viewing an older version of this section. View current production version.

Memory Management

This page discusses common memory management topics. It includes the following sections:

Configuring Memory Limits

MemSQL has two system variables that control its memory usage. Both are measured in megabytes.

maximum_memory

MemSQL will not allocate more than maximum_memory megabytes of memory. If a SELECT query’s memory allocations put the memory usage by MemSQL over maximum_memory, query execution stops and an error message is generated.

Query compilations whose allocations exceed this limit will also terminate the server. See Code Generation for more information on query compilations. By default, maximum_memory is set to 90% of the physical memory on the host machine if no swap is enabled, and 100% of physical memory on the machine if swap is enabled.

maximum_table_memory

MemSQL will not allow writes to any table once the cumulative memory in use by all tables in MemSQL reaches maximum_table_memory (MemSQL will become read-only). SELECT and DELETE queries will still be allowed even once the limit is reached. UPDATE, INSERT , CREATE TABLE, ALTER TABLE, CREATE INDEX or DROP INDEX statements will fail with an error message once the limit has been reached. Query compilations are also disallowed once the limit is reached.

This setting is designed to allow SELECT queries to allocate temporary memory for sorting, hash group-by, and so on. The maximum_table_memory must be set to a value lower then maximum_memory. By default, maximum_table_memory is set to 90% of maximum_memory, which translates to about 80% of physical memory on the host machine.

If the maximum_table_memory limit has been reached, DELETE queries can still be executed to remove data from the table; however large DELETE queries may fail if the memory used by MemSQL reaches maximum_memory.

Caution should be taken as DELETE queries allocate extra memory to mark rows as deleted. For rowstore tables, this equates to roughly 40 + 8*number_of_indexes bytes per deleted row. For columnstore tables, the memory usage will be lower because of how rows are marked to be deleted (roughly num_rows_in_table/8 bytes if you delete a row in every segment file in the table).

If the table is narrow, such as containing a small number of int columns, DELETE queries will show up as a relatively large spike in memory usage compared to the size of the table.

The memory for a deleted row is reclaimed after the transaction commits and the memory is freed asynchronously by the garbage collector. For more information, see the DELETE reference topic.

Replicating databases will pause if memory use reaches maximum_table_memory while replicating data. If memory becomes available again - say some data is deleted - replication will automatically continue.

Maximum Table Memory Reached

If you see the error message below, the leaf at 'leafhost':leafport has reached maximum_table_memory, which is the maximum cumulative memory in use by all tables on a node. MemSQL will not allow writes to any table once maximum_table_memory is reached (MemSQL will become read-only).

Leaf Error (leafhost:leafport): Memory usage by MemSQL for tables (##### MB) has reached the value of 'maximum_table_memory' global variable (##### MB). This query cannot be executed.

You can resolve this issue with the steps below.

Check for Data Skew

First, determine which nodes are reaching their memory capacity. Do this by checking for data skew across your nodes. If skew is present (significantly higher memory consumption on specific nodes), you should re-evaluate your shard keys. If no skew is present, you should increase cluster memory or delete data. There are two ways to check for skew:

  1. Execute the following query to view per-table memory consumption and row counts in your MemSQL cluster for tables with more than 10000 rows:

    SELECT
        DATABASE_NAME,
        TABLE_NAME,
        MIN(ROWS),
        MAX(ROWS),
        FLOOR(AVG(ROWS)) AS avg_rows,
        ROUND(STDDEV(ROWS)/AVG(ROWS),3) AS row_skew,
        MIN(MEMORY_USE),
        MAX(MEMORY_USE),
        SUM(MEMORY_USE)/(1024*1024) AS total_memory_mb,
        FLOOR(AVG(MEMORY_USE)) AS avg_memory,
        ROUND(STDDEV(MEMORY_USE)/AVG(MEMORY_USE),3) AS memory_skew
    FROM INFORMATION_SCHEMA.TABLE_STATISTICS
    GROUP BY 1, 2
    HAVING SUM(ROWS) > 10000;
    
  2. With any version of MemSQL, you can manually compare maximum_table_memory and Alloc_table_memory (memory used by tables) on each node:

    mysql> SHOW VARIABLES LIKE "maximum_%";
    +----------------------+-------+
    | Variable_name        | Value |
    +----------------------+-------+
    | maximum_memory       | 48291 |
    | maximum_table_memory | 43462 |
    +----------------------+-------+
       
    mysql> SHOW STATUS EXTENDED LIKE "Alloc_table_memory";
    +--------------------+----------+
    | Variable_name      | Value    |
    +--------------------+----------+
    | Alloc_table_memory | 43462 MB |
    +--------------------+----------+
    1 row in set (0.01 sec)
    

If memory consumption is significantly higher on specific nodes, skew is present. If memory consumption is fairly uniform across leaves, no skew is present.

More Examples of Finding Data Skew

  • Show skew across all partitions for a single table. Fill in <table_name>.

    SELECT database_name,
           table_name,
           ordinal AS PARTITION_ID,
           rows,
           memory_use
    FROM   information_schema.table_statistics
    WHERE  table_name = '<table_name>';
    
  • Show per-partition skew for columns in a group by. Helps identify a potential bottleneck after a subquery with a group by. If skew factor is high, that indicates that some groups have very high cardinality and some have very low cardinality and can lead to long single-threaded operations. Fill in <table_name>.

    SELECT   ROUND(STDDEV(c)/AVG(c),3)*100 AS group_skew,
             PARTITION_ID()
    FROM     (
                      SELECT   col1,
                               col2,
                               ...,
                               coln,
                               count(*)
                      FROM     <table_name>
                      GROUP BY 1, 2) sub
    GROUP BY PARTITION_ID();
    
  • Show data distribution for a table if you were to reshard on another column or set of columns. Fill in <table_name> and <potential_shard_key>.

    SELECT   WITH(leaf_pushdown=true) SUM(c) rows,
             PARTITION_ID()                  partition_id
    FROM     (
                      SELECT   count(*) c
                      FROM     <table_name>
                      GROUP BY <potential_shard_key>) reshuffle
    GROUP BY PARTITION_ID();
    

If Skew is Present

MemSQL distributes data based on the SHARD KEY specified in a table’s schema. If no SHARD KEY is explicitly specified, the PRIMARY KEY is used as the default SHARD KEY. If the SHARD KEY shards on a low cardinality column, data will accumulate on certain nodes. To resolve this, you should export your data, modify your schema, and then reload your data. To export your data, see Exporting Data From MemSQL. After you backup your data, you can DROP TABLE low_cardinality_shard_key_table, and then CREATE TABLE with a higher cardinality SHARD KEY. When you reload the data, MemSQL will automatically shard your data based on the new SHARD KEY.

Example:

$ mysqldump -h 127.0.0.1 -u root my_db low_cardinality_shard_key_table --no-create-info > databackup.sql

memsql> DROP TABLE low_cardinality_shard_key_table;

memsql> source updated_shard_key_schema.sql;

If No Skew is Present

If nodes are reaching maximum_table_memory without skew, the cluster itself is nearing maximum memory capacity. To resolve this, you need to add capacity or delete data. To add capacity, see Administering a Cluster. If additional leaf nodes are not available, use the instructions below to delete data in batches. DELETE queries use 50 bytes of memory per record deleted until the DELETE commits, so the maximum_table_memory needs to be temporarily increased on all nodes during the DELETE.

On each leaf node in your cluster, do the following:

  1. Pause write workloads from an aggregator or your application.

  2. Check current maximum_memory and maximum_table_memory.

    memsql> select @@global.maximum_memory
    | @@global.maximum_memory |
    +-------------------------+
    |                   65536 |
    +-------------------------+
    1 row in set (0.00 sec)
       
    memsql> select @@global.maximum_table_memory
    +-------------------------------+
    | @@global.maximum_table_memory |
    +-------------------------------+
    |                         57927 |
    +-------------------------------+
    1 row in set (0.00 sec)
    
  3. Increase maximum_table_memory to 95% of maximum_memory. The default value is 90% of maximum_memory.

    memsql> set @@global.maximum_table_memory=62259;
    Query OK, 0 rows affected (0.00 sec)
    
  4. Execute small batches of DELETE queries. Limit to 100,000 records or less to reduce memory used until commit.

    memsql> DELETE FROM mytable limit 100000;
    Query OK, 0 rows affected (1.64 sec)
    
  5. Restore maximum_table_memory to original value.

    memsql> set @@global.maximum_table_memory=57927;
    Query OK, 0 rows affected (0.00 sec)
    

Linux Out of Memory Behavior

Linux will kill processes using a lot of memory when the amount of free memory on the system is too low. MemSQL is often the target process killed. To check if MemSQL was killed by Linux as a result of the system running out of memory, run:

$ dmesg | grep -i "out of memory"

If the server has rebooted since MemSQL was killed, you can find the logs from the last boot by running:

$ cat /var/log/kern.log

Linux can be configured to disable this behaviour by changing the setting of vm.overcommit_memory or vm.overcommit_ratio, but this is not recommended. It is safer to configure MemSQL to use less memory by setting maximum_memory to a lower value.