Please follow this guide to learn how to migrate to SingleStore tools.
This topic does not apply to MemSQL Helios.
MemSQL 7.0 uses more disk space than MemSQL 6.x for columnstore tables and the new replication mechanism, and introduces additional metadata for columnstore tables to improve performance for concurrent updates/deletes and selective filtering.
Specifically:
-
An
auto-incr
value for every row has been introduced, which is persisted in blobs and requires approximately 8 bytes per columnstore row. -
Database logs are now pre-allocated in MemSQL 7.0:
-
Each
system
database requires an additional 32MB of disk space for pre-allocated logs. This equates to an additional 96MB per node. -
Each reference (
ref
) database requires two pre-allocated log files. The size of each log file is 64MB, for a total of 128MB of disk space perref
database per node. See the engine variables reference for more information on changing theref
database log file size via thelog_file_size_ref_dbs
global variable. -
Each partition now requires two pre-allocated log files. The size of each log file is 256MB, for a total of 512MB of disk space for pre-allocated log space. See the engine variables reference for more information on changing the partition log file size via the
log_file_size_partitions
global variable. -
Logs will require (
snapshot_trigger_size
*snapshots_to_keep
) of space over time (approximately), or about 4GB per partition using the default values of these variables. Customers are recommended to have twice this amount of disk space available for long-term log storage. See the engine variables reference for more information onsnapshot_trigger_size
andsnapshots_to_keep
. -
WARNING: Changing the log file size will not affect existing databases. Reducing the size of existing log files or allocating new ones will have random and severe performance impacts on your workload when new log files need to be allocated.
-
If any host in the cluster is near or at disk capacity, please increase available storage before upgrading to MemSQL 7.0. Similarly, a newly deployed MemSQL 7.0 cluster will consume more disk space than a MemSQL 6.8 cluster running the same workload.
This topic describes how to upgrade MemSQL to 7.0. Please read the following information thoroughly before upgrading to MemSQL 7.0.
You can upgrade to this release directly from 6.5.x or later (e.g. 6.5.27) as well as from 7.0 Beta 3 (i.e. 7.0.6) and 7.0 RC (i.e. 7.0.7). If you want to upgrade to 7.0 from 6.0 or earlier, it is recommended that you upgrade to MemSQL 6.8 first before upgrading to 7.0.
After you have finished upgrading, please see the Post-Upgrade Considerations section for additional information on behavioral changes in MemSQL 7.0 that you should be aware of.
Important notes about upgrading
Upgrade duration and behavior
Anticipate a longer upgrade time for each node. At the end of the upgrade process, anticipate a multi-minute delay while final checks are run that ensure the upgrade completed successfully and the cluster is ready for operation. If you connect to the cluster while these post-upgrade checks are running, you will receive an error that the upgrade is still in progress.
Compound keys
In compound keys with the UNIQUE
constraint, if the type of the first column of the key is specified as NOT NULL
, that resulted in non-standard UNIQUE
constraint behavior for any NULL
-able columns in that same compound key. This behavior has been changed in MemSQL v7.0 to treat all columns the same, independent of position in the key, conforming to the SQL standard.
Delete performance
MemSQL 7.0 introduced row-level locking for columnstore update/delete so it can handle more concurrent queries, but for certain scenario it may cause performance degradation (delete on wide table with json column). Set COLUMNSTORE_TABLE_LOCK_THRESHOLD to bypass row-level locking and obtain the previous behavior.
Leaf nodes
In MemSQL 6.x, a leaf node can restart, come back online, and its databases can be queried when the Master Aggregator is offline. In MemSQL 7.0, a leaf node can restart and recover its databases, but it won’t bring them online unless the Master Aggregator is online. In all versions of MemSQL, leaf nodes can safely be restarted when the Master Aggregator is online.
Plancache
Plans in the plancache are dependent upon the specific MemSQL patch version, so when you upgrade to a new MemSQL version, all previously compiled plans will be invalidated. This means that any queries run against the upgraded cluster will force a one-time plan compilation, which results in slower query times the first time those queries are run. After the plans have been recompiled, they will be stored again in the plancache and query latency will return to nominal values.
Replication
Replication is now synchronous in MemSQL 7.0 by default. Fail-overs in MemSQL 7.0 will not occur unless a replica is in sync with its master. While you will never lose data on fail-overs, you could lose availability if your replicas are out of sync and a node goes down. Refer to Replication and Durability and Replay Configuration and Error Handling for more information.
SCHEMA_BINDING
for Views
When upgrading to MemSQL 7.0, SCHEMA_BINDING
for views is off by default. Prior to MemSQL 7.0, SCHEMA_BINDING
for views was on by default. As a result, an error will no longer be issued when deleting a table under a view.
Note that this behavior only applies to views that are created after upgrading, and does not apply to views that were created prior to upgrading.
Snapshots
The default value of snapshot_trigger_size
has been increased from 256MB to 2GB. This will not be updated automatically on upgrade, and it is recommended that customers make this change post-upgrade to reduce snapshot disk I/O.
Sync variables
A large percentage of system variables have been changed to “sync” variables. This allows variables to be set on the Master Aggregator, which will then persist and be replicated to the other nodes in the cluster. This obviates the need to manually set these variables in a .cnf
file. Use SHOW _SYNC VARIABLES
to see the current set of sync variables.
As a consequence of this, when a variable has been changed to a sync variable and has different values on two nodes (for example, a sync variable on Node X has been set to value A
, and this sync variable on Node Y has been set to value B
), the upgrade will issue a warning, and these variables must be reconciled before they can become sync variables.
To do so, manually update the .cnf
file and, using the above example, set the value of the sync variable to either A
or B
on both nodes. Next, run UPGRADE VARIABLES
on the Master Aggregator to enable all the new sync variables. Refer to Engine Variables and update-config for more information.
Verify your cluster is ready for upgrade
There must not be any disaster recovery (DR) databases present before upgrading to MemSQL 7.0. If there are any DR databases in the cluster, the database must be dropped before upgrading, and re-replicated after the upgrade. If you choose to keep a DR database, you must stop replicating to it before the upgrade. Note that this DR database cannot be used for replication after the upgrade.
Prior to upgrading your cluster, it is recommended that you take a backup as a standard precautionary measure. See Backing Up and Restoring Data.
In addition, from the master aggregator, run the following commands:
SHOW LEAVES;
SHOW AGGREGATORS;
SHOW CLUSTER STATUS;
EXPLAIN RESTORE REDUNDANCY;
EXPLAIN REBALANCE PARTITIONS;
With the output of these commands, confirm that the following are true:
-
All leaves are online
-
All aggregators are online
-
There are no partitions with ‘Orphan’ role
-
No Rebalance or restore redundancy is necessary
After you have backed up your data and verified your cluster is ready, you are ready to upgrade your cluster to the latest version of MemSQL using either the MemSQL management tools or MemSQL Ops.
You can use either SingleStore DB Tools or MemSQL Ops to upgrade from MemSQL 6.8 to MemSQL 7.0. If you are upgrading from MemSQL 6.5 or earlier, or have a cluster managed by MemSQL Ops, please see the instructions for using MemSQL Ops to perform the upgrade process.
Upgrade versions and methods
The tables below depict which versions of MemSQL can be upgraded to SingleStore DB 7.0 and the method by which the cluster can be upgraded.
-
Offline upgrade: Your cluster will be shut down and restarted over the course of the upgrade
-
Online upgrade: Your cluster will not be shut down over the course of the upgrade
Upgrade via SingleStore DB Toolbox
Upgrade from | Offline upgrade | Online upgrade |
---|---|---|
6.8 | ✔ | From 6.8.19+ |
6.7 | ✔ | From 6.7.27+ |
6.5 | ✘ | ✘ |
Upgrade via MemSQL Ops
Upgrade from | Offline upgrade | Online upgrade |
---|---|---|
6.8 | ✔ | ✘ |
6.7 | ✔ | ✘ |
6.5 | ✔ | ✘ |
Upgrade your cluster
Select an option below to upgrade your cluster.
Step 1: Upgrade SingleStore DB Tools
To upgrade to 7.0, you must have singlestoredb-toolbox
1.4.0 or later installed (1.4.4 or later for an online upgrade) prior to the MemSQL upgrade process. It is recommended that you use the latest version of singlestoredb-toolbox
when upgrading your cluster.
Run the install
command to either install singlestoredb-toolbox
if it is not already installed, or upgrade an existing version of singlestoredb-toolbox
to the latest version of the package.
RHEL/CentOS
sudo yum install singlestoredb-toolbox -y
Debian
sudo apt install singlestoredb-toolbox -y
Upgrade without Internet access
If your cluster does not have internet access, use one of the following buttons to download either the latest RPM or Debian singlestoredb-toolbox
package to a location accessible by your cluster:
Then, run the install
command for the installed package manager and specify the path to the package.
RHEL/CentOS
sudo yum install /path/to/singlestoredb-toolbox.rpm -y
Debian
sudo apt install /path/to/singlestoredb-toolbox.deb -y
Now you are ready to upgrade MemSQL.
Step 2: Upgrade MemSQL
The simplest and preferred upgrade option is an offline cluster upgrade. It is the least error-prone and easiest to execute; however, it requires downtime as all of the nodes in the cluster will be upgraded at the same time, shutting down the entire cluster for the duration of the upgrade.
If the cluster is running with High Availability, you also have the option to perform an incremental online cluster upgrade, which maintains cluster availability throughout the upgrade process. See the instructions in the Online upgrade section for more details.
Option 1: Offline upgrade
Note: Using this method to upgrade to MemSQL 7.0 is referred to as an “offline” upgrade as your MemSQL cluster will be shut down and restarted over the course of the upgrade. Do not shut down your cluster prior to starting the upgrade. If the cluster or individual nodes are offline when the upgrade is started, the upgrade will fail.
Execute the following command to start an offline upgrade.
-
Confirm that the cluster can be upgraded. The cluster will not be upgraded when running this command.
sdb-deploy upgrade --precheck-only
-
Upgrade your cluster.
Note:
sdb-deploy upgrade
will perform a snapshot of all databases prior to upgrade.sdb-deploy upgrade --version 7.0
If you do not specify a patch version, your cluster will be upgraded to the latest patch version of MemSQL 7.0.
During the upgrade process, you will be prompted to enable synchronous replication on your existing databases, or to leave those databases using the previous asynchronous replication behavior.
Synchronous replication is enabled by default on all new 7.0 databases and provides an extra layer of resiliency in clusters with high availability enabled.
Note: You cannot downgrade from your current version.
If your cluster does not have internet access, use one of the following buttons to download either the latest RPM or Debian memsql-server
package to a location accessible by your cluster:
The memsql-server
package contains both the MemSQL binary and the low-level management tool, memsqlctl.
Run the sdb-deploy upgrade
command and reference the appropriate package in the --file-path
argument. Running upgrade
(as opposed to simply upgrading the package via the package manager) will perform an offline restart of all the nodes to make sure the cluster is using the new version.
RHEL/CentOS
sdb-deploy upgrade --file-path /path/to/memsql-server.rpm
Debian
sdb-deploy upgrade --file-path /path/to/memsql-server.deb
For more information on the upgrade
command, see the SingleStore DB Tools reference documentation.
Once you have finished upgrading all of your nodes, see Post-upgrade considerations for any changes between versions that may impact your cluster.
Option 2: Online upgrade
Notes
-
This upgrade method is referred to as an “online” upgrade as your MemSQL cluster will not be shut down over the course of the upgrade. Nodes will be restarted in a specific sequence to ensure that DML-based workloads will still function. Do not shut down your cluster prior to starting the upgrade. If the cluster or individual nodes are offline when the upgrade is started, the upgrade will fail.
-
You may perform an online upgrade from MemSQL versions 6.7.29 and later, 6.8.19 and later, and 7.0.9 and later using
singlestoredb-toolbox
1.4.4 or later. -
Toolbox 1.5.0 cannot be used to perform an online upgrade from MemSQL 7.0.15 to later versions of MemSQL. Please follow the instructions in “Option 1: Offline upgrade” above.
An online upgrade may fail if a long-running workload that writes to the database is running on the target cluster. We recommend performing an online upgrade only after these workloads have completed.
Toolbox 1.11.7 and later provide the option to retry a failed online upgrade. Should the online upgrade ultimately fail, an offline upgrade will be attempted. Note that the cluster will be shut down and restarted over the course of an offline upgrade.
Execute the following command to start an online upgrade.
-
Confirm that the cluster can be upgraded. The cluster will not be upgraded when running this command.
sdb-deploy upgrade --precheck-only
-
Upgrade your cluster.
Note:
sdb-deploy upgrade
will perform a snapshot of all databases prior to upgrade.sdb-deploy upgrade --online --version 7.0
By specifying --version 7.0
, your cluster will be upgraded to the latest patch version of MemSQL 7.0.
During the upgrade process, you will be prompted to enable synchronous replication on your existing databases. While asynchronous replication can still be chosen, the data replication mechanism in MemSQL 7.0 has been enhanced to the point where synchronous replication is now the recommended choice.
Note: You cannot downgrade from your current version.
If your cluster does not have internet access, use one of the following buttons to download either the latest RPM or Debian memsql-server
package to a location accessible by your cluster:
The memsql-server
package contains both the MemSQL binary and the low-level management tool, memsqlctl.
Run the sdb-deploy upgrade --online
command and reference the appropriate package in the --file-path
argument. Running upgrade
(as opposed to simply upgrading the package via the package manager) will perform an offline restart of all the nodes to make sure the cluster is using the new version.
RHEL/CentOS
sdb-deploy upgrade --online --file-path /path/to/memsql-server.rpm
Debian
sdb-deploy upgrade --online --file-path /path/to/memsql-server.deb
For more information on the upgrade
command, see the SingleStore DB Tools reference documentation.
Once you have finished upgrading all of your nodes, see Post-upgrade considerations for any changes between versions that may impact your cluster.
For more information on the upgrade
command, see the SingleStore DB Tools reference documentation
Step 1: Upgrade MemSQL Ops
If you manage your cluster with MemSQL Ops, you must upgrade it to MemSQL Ops 7.0.0 or later before you can upgrade to MemSQL 7.0.
If you are upgrading from Ops 6.0 or later, run the following command to upgrade to the latest Ops 7.0 version.
memsql-ops agent-upgrade
Upgrade without Internet access
If your cluster does not have internet access, download the 7.0.0 version to your local machine. Then, run the agent-upgrade
command using the --file-path
argument.
sudo memsql-ops agent-upgrade --file-path /path/to/memsql-ops-XYZ.tar.gz
Now you are ready to upgrade MemSQL.
Step 2: Upgrade MemSQL
The currently supported upgrade option is an offline cluster upgrade. It requires downtime as all of the nodes in the cluster will be upgraded at the same time.
Offline upgrade
Note: Upgrading to MemSQL 7.0 is referred to as an “offline” upgrade as your MemSQL cluster will be shut down and restarted over the course of the upgrade. Do not shut down your cluster prior to starting the upgrade. If the cluster or individual nodes are offline when the upgrade is started, the upgrade will fail.
Run the following command to start an offline upgrade to the latest patch version of MemSQL 7.0.
memsql-ops memsql-upgrade
During the upgrade process, prompted to enable synchronous replication on your existing databases, or to leave those databases using the previous asynchronous replication behavior.
Synchronous replication is enabled by default on all new 7.0 databases and provides an extra layer of resiliency in clusters with high availability enabled.
Note: You cannot downgrade from your current version.
If your cluster does not have internet access, download the MemSQL binary from the following locations before running the memsql-upgrade
command with the --file-path
argument:
-
For the latest version, the download link is: http://download.memsql.com/releases/latest/memsqlbin_amd64.tar.gz
-
For any other version, the download link format is:
http://download.memsql.com/releases/version/X.Y.Z/memsqlbin_amd64.tar.gz
sudo memsql-ops memsql-upgrade --file-path /path/to/memsqlbin_amd64.tar.gz
For more information on the memsql-upgrade
command, see the CLI reference documentation.
Once you have finished upgrading all of your nodes, see Post-upgrade considerations for any changes between versions that may impact your cluster.
Post-upgrade considerations
When upgrading to MemSQL 7.0, you should be aware of the following changes to system behavior or default configuration settings. The behavior of a cluster upgraded from an earlier version to MemSQL 7.0 may differ compared to a newly installed cluster on MemSQL 7.0 as described below. Most of the changes fall into two categories:
-
In some versions, the default value for a configuration variable was changed compared to previous versions, but clusters upgraded from earlier versions retain their previous setting, both if it was set to a specific value or if it was not explicitly set and hence using the previous default. In some of these cases, it is recommended to update your configuration to the new default if you were previously using the old default, after appropriate testing.
-
Some new features are automatically enabled by default on newly installed MemSQL 7.0 clusters but not automatically enabled on clusters upgraded from an earlier version to 7.0. In some of these cases, it is recommended to enable the new features, after appropriate testing.
-
Changing the value of the
data_conversion_compatibility_level
sync variable can change the behavior of expressions in computed columns. Refer to the data conversion compatibility level section of Data Types for more information.
If you are upgrading past one or more major or minor versions, read all sections applicable to the version you are upgrading from. For example, if upgrading from 6.0 to 7.0, both the “Upgrades from 6.5 and earlier” and the “Upgrades from 6.0 and earlier” sections are applicable.
Upgrades from 5.8 and earlier to 6.0 and later
Automatic statistics
MemSQL 6.0 introduced a new automatic statistics feature for columnstore tables. By default, automatic statistics is enabled on tables created on version 6.0 or later, but remains disabled on existing tables created on version 5.8 or earlier prior to the upgrade. In most cases, we recommend enabling autostats on existing columnstore tables created prior to the upgrade using ALTER TABLE table_name ENABLE AUTOSTATS
. See autostats support upon upgrading MemSQL.
Increased columnstore_segment_rows
default
columnar_segment_rows
, which is now an alias for columnstore_segment_rows
in 6.0 or later, keeps the old default value of 102,400 after upgrading minimize the performance impact on any existing workloads; however, if 6.0 or later is installed on a new cluster, the default value will be 1,024,000. If you want to change the columnstore_segment_rows
value to the new default for 6.0 or later, set it as a global variable using the instructions here.
Changing the default columnstore_segment_rows
value may have significant performance impact on the workload. It is advisable to test on a staging environment before applying changes to the columnstore_segment_rows
setting. For more information about columnstore_segment_rows
, see Advanced Columnstore Configuration Options.
Upgrades from 6.0 and earlier to 6.5 and later
Advanced histograms and cardinality_estimation_level
MemSQL 6.5 introduced new, more advanced histograms, but old histograms from 6.0 and earlier are still supported. When upgrading from MemSQL 6.0 to 6.5 or later, there is a new engine variable introduced in 6.5, cardinality_estimation_level
, which allows you to choose whether to use newer, more advanced histograms and algorithms to perform cardinality estimation, or use the previous histogram format. The default setting on newly installed 6.7 or later clusters is 6.5
, which uses the new histograms. The default setting on clusters upgraded from 6.0 or earlier to 6.7 or later is 6.0
, which uses the old histograms to minimize performance impact on any existing workloads.
You can change the setting cardinality_estimation_level
by running SET GLOBAL cardinality_estimation_level = '6.5'
on the master aggregator. (Changes will be synchronized to the other nodes in the cluster, and restarting memsql nodes is not required.) After changing the setting, you should re-run ANALYZE TABLE on all previously analyzed tables to collect up-to-date statistics.
AUTO_INCREMENT behavior changes
When upgrading to MemSQL 6.5 or later, the behavior for AUTO_INCREMENT
has changed. Specifically, a reserved set of values are now managed by each aggregator, which are then encoded with the aggregator’s ID whenever the aggregator performs an insert into a table that uses AUTO_INCREMENT
. This range of values jumps each time the aggregator restarts. Having each aggregator manage its own set of AUTO_INCREMENT
values prevents collisions after the aggregator is restarted.
The first observable result of these changes is that after upgrading to 6.5 or later, the start value for the next insert post-upgrade will be much higher than the last value set before the upgrade to align with the new AUTO_INCREMENT
behavior. For more information on these changes, see AUTO_INCREMENT behavior in the 7.0 version of CREATE TABLE
.
Upgrades from 6.5 and earlier to 6.7 and later
Default DDL timeout update
In MemSQL 6.7, the default value of default_distributed_ddl_timeout
was changed from the 6.5 default value 18446744073709551615 to the 6.7 default value 180000. The new default applies to clusters newly installed on MemSQL 6.7 or later, but clusters upgraded from MemSQL 6.5 to 6.7 retain the previous value.
These values are in milliseconds. The MemSQL 6.5 default value is effectively infinite, so the DDL lock will never timeout. With the MemSQL 6.7 default value, the lock will timeout in three minutes.
After upgrading to MemSQL 6.7 or later, it is recommended that you set the default_distributed_ddl_timeout
value to 180000 to match the new default value. To update this variable, run SET GLOBAL default_distributed_ddl_timeout = default
on the master aggregator (the change is automatically synchronized to other nodes in the cluster).
Upgrades from 6.7 and earlier to 6.8 and later
interpreter_mode set to interpret_first by default
In previous versions of MemSQL, interpreter_mode
was set to llvm
, which meant queries were compiled first before execution. In MemSQL 6.5, a new interpreter_mode
, interpret_first
, was available as an experimental setting. This setting allowed you to start interpreting queries for execution while asynchronously compiling them in the background for later use. For complex queries, this approach can reduce overall query execution time during the first execution of the query. In MemSQL 6.8 or later, this mode is now production-ready and enabled by default.
Upgrades from 6.8 and earlier to 7.0 and later
Synchronous replication on by default
In previous versions of MemSQL, in clusters with high availability enabled, replication between master partitions and replica partitions happened asynchronously. Starting with 7.0, synchronous replication is enabled by default for all new databases. And as noted in the upgrade section of this topic, you have the option to opt-out of synchronous replication for the existing databases in your cluster. For more information on synchronous replication in 7.0, see Replication and Durability and CREATE DATABASE.
Security change for resource pools
Between MemSQL 6.5 to 6.8, all users were able to access all resource pools. Access had to be controlled externally from the database, such as in a client application. Starting in 7.0, users can only access resource pools that they have explicit permission to access when sync_permissions
is enabled. Otherwise, no permission checking is done for any user for any resource pool. Refer to Synchronizing Permissions Across Your Cluster for more information.
To ensure current users will be able to access pools immediately after upgrading to 7.0, all existing users will be granted USAGE
permissions to all existing and future resource pools if sync_permissions
was enabled prior to upgrade (i.e. for each user GRANT USAGE ON RESOURCE POOL '*' TO <user>@<host>
is run internally on upgrade to 7.0 or later) unless REVOKE USAGE ON RESOURCE POOL '*' FROM <user>@<host>
is run. You will then have to explicitly grant USAGE
permissions to specific resource pools for those users and any other new users created. Refer to Setting Resource Limits for more information.
Many existing engine variables are now sync variables
The following engine variables from 6.8 are now sync variables, which means you set the value on the master aggregator and the change is propagated across your cluster. You no longer have to set and manage these variables on each node in the cluster.
Global variables
- explain_expression_limit
- internal_columnstore_window_minimum_blob_size
- forward_aggregator_plan_hash
- read_advanced_counters
- geo_sphere_radius
- geo_query_info
- multi_insert_tuple_count
- load_data_read_size
- load_data_write_size
- load_data_max_buffer_size
- load_data_internal_compression
- columnstore_segment_rows
- columnstore_flush_bytes
- columnstore_disk_insert_threshold
- max_connect_errors
- enable_columnstore_ingest_management
- columnstore_ingest_management_queue_timeout
- plan_expiration_minutes
- disk_plan_expiration_minutes
- enable_disk_plan_expiration
- replication_timeout_ms
- snapshot_trigger_size
- sync_slave_timeout
- sync2_timeout
- synchronize_database_timeout
Session Variables
- ignore_insert_into_computed_column
- max_subselect_aggregator_rowcount
- leaf_pushdown_enable_rowcount
- reshuffle_group_by_base_cost
- max_broadcast_tree_rowcount
- enable_broadcast_left_join
- explain_joinplan_costs
- statistics_warnings
- optimizer_warnings
- optimize_constants
- optimize_huge_expressions
- optimize_expressions_larger_than
- optimize_stmt_threshold
- report_mpl_optimizations
- sql_select_limit
- lock_wait_timeout
- character_set_server
- collation_connection
- collation_database
- collation_server
- sampling_estimates_for_complex_filters
- leaf_pushdown_default
- enable_binary_protocol
- enable_multipartition_queries
- enable_local_shuffle_group_by
- enable_skiplist_sampling_for_selectivity
- inlist_precision_limit
See the List of Engine Variables for more information on these variables.