Cluster Expansion Steps (MemSQL Ops)

MemSQL Ops has been deprecated

Please follow this guide to learn how to migrate to SingleStore tools.

Info

This topic does not apply to MemSQL Helios.

If your MemSQL workload changes over time, MemSQL Ops make it easy to efficiently scale cluster size to meet increasing performance and sizing demands. You can add more nodes to your MemSQL cluster by following the steps below.

Alert

Be aware when adding leaf nodes:

Once a database has been created, you cannot change the number of partitions the database was originally created with.

This means:

  1. To maintain the current number of database partitions per leaf, you must export all data, delete the database, recreate the database and all tables, and load data back into the database’s tables.
  2. To maintain an equal number of partitions per leaf, the new total number of leaves must be divisible by the total number of partitions. For example, a four-leaf cluster has has been created with four partitions per leaf (16 partitions total). If only two leaf nodes are added, the total number of leaves is six; however, 16 / 6 is not an integer, meaning there cannot be an equal number of partitions on each leaf. This will cause data skew. If four additional leaves are added instead, 16 / 8, which means there can be an equal number of partitions on each leaf (two).

Also, if you keep the existing number of partitions, then you don’t have to do any reloading; only rebalancing would be required.

Before you begin

Confirm host machine and leaf configurations are the same across new and existing leaves. Configurations for sizing (RAM, cores) and OS tuning (ulimit settings, THP settings, etc.) should match to ensure uniform query performance and cluster health.

See System Requirements and Recommendations for more details.

It’s also important to consider what non-sync engine variables have been modified when increasing your cluster size. Any modified non-sync variable settings will not be automatically applied to new nodes when they are added to the cluster; therefore, when scaling your cluster, each new node non-sync variables must be updated with the same values as other nodes.

Also, generate a cluster report using the memsql-ops report command prior to adding nodes. This report will be useful for troubleshooting in case any issues arise during the cluster expansion process.

Add more leaf nodes to your cluster

  1. Check the memsql-ops version on all existing leaves (upgrade optional) to ensure you are installing the correct memsql-ops version on the new hosts. You can get the memsql-ops version across all hosts by running:

    memsql-ops agent-list
    
  2. Un-compress the ops tar file and run the install.sh script on each new node. View the Installation Guide for installation options.

    sudo ./install.sh
    
  3. Check and compare the settings.cnf file on a new host (that will become a leaf) and an old host (that is a leaf) to make sure they are the same.

  4. From each new host, set each new host to follow the primary ops agent. View the FOLLOW topic for more information.

    memsql-ops follow -h PRIMARY_AGENT_HOST
    
    • Check to see if the new leaves are correctly following the master. You should see the new hosts when running memsql-ops agent-list.
  5. Deploy SingleStore DB from master aggregator. When deploying from the Master Aggregator, you must specify the Agent ID of the host you want to deploy a MemSQL node on. To get the Agent ID of all hosts, use the agent-list command.

    • If the default port 3306 is already being used by a MemSQL node, you must specify another port with -P PORT.

    • If you are running in high availability (i.e. your Master Aggregator is set to --redundancy-level 2) you must specify an availability group for your new leaf nodes, --availability-group {1,2}.

    memsql-ops memsql-deploy -a AGENT_ID -r {leaf,aggregator,master}
    
  6. Verify the new nodes have been added successfully by running memsql-ops memsql-list.

  7. Update the root password from Master so all new nodes have a root password. When using MemSQL 6.5 and MemSQL Ops 6.0.11 or later, this is an online operation for both aggregators and leaves. Previous versions required you to remove and re-add each leaf in your cluster. All node memsql_id values can be found by running memsql-ops memsql-list on any host.

    memsql-ops memsql-update-root-password -p PASSWORD memsql_id
    
  8. On any aggregator, connect to the database and run SHOW LEAVES; to make sure all leaves were properly added.

  9. On any host, run memsql-ops memsql-list to make sure all new leaves are ops-aware. You should be able to see all new aggregator or leaf nodes added.

  10. On the master aggregator, enter the SQL interface and run REBALANCE PARTITIONS; on all databases.

  11. Check memory and disk usage in the MemSQL Ops user interface.

  12. On any aggregator, enter the SQL interface and output SHOW CLUSTER STATUS into a file you can examine to make sure partitions are evenly distributed.

  13. Check for data skew by running the following query in the SQL interface on any aggregator:

  SELECT
    DATABASE_NAME,
    TABLE_NAME,
    FLOOR(AVG(ROWS)) AS avg_rows,
    ROUND(STDDEV(ROWS)/AVG(ROWS),3) * 100 AS row_skew,
    FLOOR(AVG(MEMORY_USE)) AS avg_memory,
    ROUND(STDDEV(MEMORY_USE)/AVG(MEMORY_USE),3) * 100 AS memory_skew
    FROM INFORMATION_SCHEMA.TABLE_STATISTICS
    GROUP BY 1, 2
    HAVING SUM(ROWS) > 10000
    ORDER BY row_skew DESC;

For more information about detecting and measuring data skew, see the Data Skew topic.