Outdated Version

You are viewing an older version of this section. View current production version.

Using Durability and Recovery min read


Although MemSQL’s primary storage is main memory, the server maintains a copy of the data on disk as well.

Durability

MemSQL runs with full durability enabled. Transactions are committed to the transaction log on disk and later compressed into full-database snapshots.

The snapshot and log files are stored in the datadir directory configured in memsql.cnf. When MemSQL is restarted, it will recover itself into the state it was in before the shutdown, by reading the snapshot and log files in datadir.

Configuring Durability

The transaction_buffer setting allows you to configure the maximum size of the in-memory, per-database buffer of database transactions. If this buffer fills up, write queries will block until some of the transactions in the in-memory buffer are committed to disk to make space for new transactions.

If transaction_buffer is set to 0, the database runs with disk-synchronous durability: every write transaction is committed to disk before the query is acknowledged.

You can audit the performance of the log by monitoring the Transaction_buffer_wait_time variable in show status extended. If you notice that this value is growing over time, then the hard disk is unable to keep up with the pace of write queries committed in-memory. To fix this, you can

  • Restart the server with a larger transaction_buffer. This will increase the size of the in-memory transaction buffer, effectively allowing more room for transactions to sit and wait while the hard drive catches up. The trade off is that the upper bound of data the server can lose in the event of an unexpected system failure is now greater, because more transactions can live in memory and not yet on disk.
  • Reconfigure the server to use a faster disk. For example, using an SSD instead of a spinning disk will dramatically improve durability write performance.

When the log reaches snapshot_trigger_size, the database kicks off a new snapshot, which is a full backup of the database.

The log files and last snapshots_to_keep snapshots are kept by MemSQL in the data directory. Old files are automatically deleted after they have rotated out of the snapshots_to_keep window. MemSQL waits for the current snapshot to be completely written to disk before deleting older snapshots. This means enough disk space for snapshots_to_keep + 1 snapshots is needed to run MemSQL (+1 for the snapshot currently being written). By default snapshots_to_keep is set to 2.

Disabling Durability

It is not possible to disable durability on a MemSQL cluster. MemSQL relies on durability to replicate data across the cluster.

Snapshot and Log File Formats

MemSQL’s snapshots and logs use an internal binary format. Because MemSQL reconstructs indexes while recovering or replicating data, these files only store row data, not indexes. Database snapshots are generally smaller than their corresponding in-memory footprint because they do not store indexes. Both snapshots and log files contain checksums to confirm data integrity. The checksums are computed with the CRC32 instruction.

Warning

If you see the following warning message

Warning: SSE4.2 is not supported. Resorting to software CRC32C. MemSQL recovery and log writing performance will be negatively impacted.

your system does not support the CRC32 instruction (part of Intel’s SSE4.2 instruction set). This is common on older processors and some virtualized environments. MemSQL can use a software implementation of CRC32; however, this will slow down reading and writing log files in MemSQL. We recommend that production deployments of MemSQL run on environments that support this instruction.

Recovering Data

When MemSQL starts, it asynchronously loads data into memory from snapshots and logs. A database that is in this loading state is referred to as recovering. Connections to a database that is recovering are not allowed.

Recovery Status

Run SHOW DATABASES with the EXTENDED option to see which databases are recovering and to get an estimate for how long recovery will take. For more information, see Database States.

Recovery Errors

MemSQL uses checksums in both the snapshot and log files to confirm data integrity. Snapshot and log recovery behave differently if an inconsistency between the data and a checksum is detected.

  • If while recovering from a log file, MemSQL discovers an inconsistency between the data and a checksum, the database temporarily enters the offline state (see Database States). No connections to the database are allowed when it’s in the offline state. The log corruption is then automatically repaired by truncating the log and loading as much data from the log as possible into memory, and changing database state back to online. Any data after the corruption in the log won’t be recovered as MemSQL can’t guarantee the consistency of this data, because if a corrupted log record manipulates the same data as a later log record, replaying the log past the corruption can result in wrong results.
  • If MemSQL discovers an inconsistency between the data and a checksum while replaying the snapshot the database enters the unrecoverable state. There is no way to repair a database in the unrecoverable state. The database must be recovered from a backup using RESTORE.

Related Topics