Replication and Durability use logs and snapshots that are replayed to restore a database’s state.
How Snapshots and Logs are Used
How Snapshots and Logs are Used with Durability
In-memory database updates you make using DDL and DML commands are written to logs on disk. When the size of the updates reaches snapshot_trigger_size, a snapshot is taken and written to disk. A snapshot is a full backup of the database. Following the creation of a snapshot, subsequent DDL and DML in-memory updates are again written to the logs, until snapshot_trigger_size is again reached.
Following a server restart, the latest snapshot and the logs containing the updates made after the snapshot are loaded from disk and replayed in memory.
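The durability cycle described above can be sketched as a toy model. This is only an illustration under stated assumptions, not MemSQL's implementation: the class ToyDurableStore and its methods are hypothetical, "disk" is modeled with plain Python objects, update size is approximated by key and value lengths, and the toy tracks only the log written since the latest snapshot.

```python
class ToyDurableStore:
    """Toy model (hypothetical): in-memory state, a log, and periodic snapshots."""

    def __init__(self, snapshot_trigger_size):
        self.snapshot_trigger_size = snapshot_trigger_size  # size of logged updates that triggers a snapshot
        self.memory = {}     # in-memory database state
        self.snapshot = {}   # stands in for the latest snapshot on disk
        self.log = []        # stands in for log entries written after that snapshot
        self.log_bytes = 0   # approximate size of the logged updates

    def write(self, key, value):
        # Every update is applied in memory and appended to the log.
        self.memory[key] = value
        self.log.append((key, value))
        self.log_bytes += len(key) + len(value)
        if self.log_bytes >= self.snapshot_trigger_size:
            self.take_snapshot()

    def take_snapshot(self):
        # A snapshot is a full copy of the state; logging then starts over.
        self.snapshot = dict(self.memory)
        self.log = []
        self.log_bytes = 0

    def recover(self):
        # After a restart: load the latest snapshot, then replay the later log entries.
        state = dict(self.snapshot)
        for key, value in self.log:
            state[key] = value
        return state


store = ToyDurableStore(snapshot_trigger_size=10)
store.write("a", "1111")   # 5 bytes logged
store.write("b", "2222")   # 10 bytes logged, so a snapshot is taken
store.write("c", "3")      # logged after the snapshot
store.memory = {}          # simulate losing in-memory state on restart
restored = store.recover()
```

In this sketch, recovery rebuilds the state from the snapshot (keys "a" and "b") plus the one log entry written afterwards (key "c").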
How Snapshots and Logs are Used with Replication
With replication, database partitions are copied from a primary host to a secondary host.
When a replica is provisioned, it receives an initial snapshot from the master. The replica replays this snapshot.
Going forward, the replica receives and replays logs from the master. These logs contain the in-memory database updates (from DDL and DML commands) made on the master. When the size of the logs reaches snapshot_trigger_size as set on the master, the replica takes a snapshot. A snapshot is a full backup of the database. Following the creation of a snapshot, subsequent DDL and DML in-memory updates on the master are again written to the logs that are sent to the replica, until snapshot_trigger_size (as set on the master) is again reached.
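The replication flow above can be illustrated with a small simulation. It is a sketch under assumptions: ToyMaster and ToyReplica are hypothetical names, log entries are delivered synchronously for simplicity, and log size is approximated by key and value lengths. It does not describe how MemSQL actually ships logs.

```python
class ToyReplica:
    """Hypothetical replica: loads an initial snapshot, then replays log entries."""

    def __init__(self, snapshot_trigger_size):
        # In the text, this value is the one set on the master.
        self.snapshot_trigger_size = snapshot_trigger_size
        self.state = {}
        self.log_bytes = 0
        self.snapshots_taken = 0

    def load_snapshot(self, snapshot):
        # Provisioning: replay the initial snapshot received from the master.
        self.state = dict(snapshot)
        self.log_bytes = 0

    def replay(self, entry):
        # Apply one log entry and take a snapshot once enough log has accumulated.
        key, value = entry
        self.state[key] = value
        self.log_bytes += len(key) + len(value)
        if self.log_bytes >= self.snapshot_trigger_size:
            self.snapshots_taken += 1
            self.log_bytes = 0


class ToyMaster:
    """Hypothetical master: applies updates and forwards them to its replicas."""

    def __init__(self):
        self.state = {}
        self.replicas = []

    def provision(self, replica):
        # A new replica first receives a full snapshot of the current state.
        replica.load_snapshot(self.state)
        self.replicas.append(replica)

    def write(self, key, value):
        # Each update is applied locally and sent to every replica as a log entry.
        self.state[key] = value
        for replica in self.replicas:
            replica.replay((key, value))


master = ToyMaster()
master.write("a", "1")                 # written before the replica exists
replica = ToyReplica(snapshot_trigger_size=6)
master.provision(replica)              # replica receives the initial snapshot
master.write("b", "22")                # 3 bytes of log replayed
master.write("c", "33")                # 6 bytes of log replayed, replica takes a snapshot
```

The replica ends up with the same state as the master even though key "a" was written before it was provisioned, because that key arrived in the initial snapshot.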
Replay Configuration
You can tune the snapshot-trigger-size and snapshots-to-keep engine variables to make efficient use of the logs and snapshots.
A large snapshot-trigger-size reduces how often snapshots are taken, but it also increases the time needed to replay the snapshot.
A large snapshots-to-keep increases the number of snapshots available, but it also increases the amount of space needed to store the snapshots and logs.
snapshots-to-keep defaults to 2.
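To get a feel for the space trade-off, a rough back-of-the-envelope estimate can help. The function below is a hypothetical helper, not a MemSQL API, and it rests on two simplifying assumptions: every retained snapshot is about the same size, and each retained snapshot is accompanied by up to snapshot-trigger-size bytes of logs. Actual disk usage may differ.

```python
GIB = 1024 ** 3


def estimate_disk_usage(snapshot_bytes, snapshot_trigger_size, snapshots_to_keep):
    """Rough estimate of space used by retained snapshots and their logs.

    Assumes (for illustration only) equally sized snapshots, each followed
    by at most snapshot_trigger_size bytes of log.
    """
    per_snapshot = snapshot_bytes + snapshot_trigger_size
    return snapshots_to_keep * per_snapshot


# Example: 10 GiB snapshots, a 2 GiB trigger size, and the default of 2 snapshots kept.
usage_default = estimate_disk_usage(10 * GIB, 2 * GIB, 2)
# Keeping 4 snapshots instead doubles the estimate.
usage_larger = estimate_disk_usage(10 * GIB, 2 * GIB, 4)
```

Under these assumptions the estimate grows linearly with snapshots-to-keep, which is the cost side of having more snapshots available.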
The datadir engine variable stores the location of the snapshots and logs.
Replay Error Handling
This section lists errors that can occur when MemSQL processes the logs and the snapshots. It also discusses how MemSQL addresses the errors.
CRC32 Instruction not Supported
If your system hardware does not support the CRC32 instruction, you will receive the following warning.
Warning: SSE4.2 is not supported. Resorting to software CRC32C. MemSQL recovery and log writing performance will be negatively impacted.
This warning is common on older processors and in some virtualized environments. In this case, MemSQL falls back to a software implementation of CRC32, which slows down reading and writing log files. We recommend that production deployments of MemSQL run on environments that support this instruction.
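Before deploying, you may want to check whether a host advertises SSE4.2. The following is a minimal sketch that assumes a Linux host, where /proc/cpuinfo lists CPU features on a "flags" line and SSE4.2 appears as sse4_2. The function name supports_sse42 is hypothetical, and this check is not part of MemSQL.

```python
def supports_sse42(cpuinfo_text):
    """Return True if the given /proc/cpuinfo text lists the sse4_2 flag."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            # The flags line is a space-separated list of CPU feature names.
            return "sse4_2" in value.split()
    # No flags line found (for example, on a non-x86 or non-Linux system).
    return False


# Example with a shortened, made-up cpuinfo excerpt.
sample = "processor\t: 0\nflags\t\t: fpu sse4_1 sse4_2 popcnt\n"
has_sse42 = supports_sse42(sample)
```

On a real Linux host, you could pass the contents of /proc/cpuinfo to the function, for example with open("/proc/cpuinfo").read().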
Data Corruption Found During Replay
During replay, if MemSQL encounters corrupted logs or snapshots, it puts the database in the unrecoverable state. Such a database usually auto-heals if the cluster runs in high availability (redundancy-2) mode and the corrupted logs or snapshots are on the replica partitions. During auto-healing, the primary host takes a snapshot and sends it to the replica. When auto-healing is complete, the secondary database resumes operation in the replicating state.