MemSQL is a distributed system. Because of this, all machines running MemSQL should be monitored to ensure smooth operation. Cluster status can be visualized through MemSQL Ops but it is also possible to programmatically query MemSQL nodes to get the status of the cluster. This section shows how to programmatically monitor a MemSQL cluster.
Monitor all MemSQL nodes
Similar to heartbeats sent by intra-cluster communication, all MemSQL nodes should be pinged with:
It is recommended to do this every minute.
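A once-a-minute ping can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`ping_nodes`, `unreachable`, `monitor`) and the `run_query` callable are hypothetical, and `run_query(host, port, sql)` stands in for whichever MySQL-compatible client you use. The ping statement is assumed here to be a trivial query such as `SELECT 1`.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Node = Tuple[str, int]  # (host, port)


def ping_nodes(
    nodes: Iterable[Node],
    run_query: Callable[[str, int, str], object],
    ping_sql: str = "SELECT 1",  # assumption: any trivial query works as a ping
) -> Dict[Node, Optional[str]]:
    """Run the ping query on every node; map each node to None or an error message."""
    status: Dict[Node, Optional[str]] = {}
    for host, port in nodes:
        try:
            run_query(host, port, ping_sql)
            status[(host, port)] = None
        except Exception as exc:  # any failure counts as a failed ping
            status[(host, port)] = str(exc)
    return status


def unreachable(status: Dict[Node, Optional[str]]) -> List[Node]:
    """Return the nodes whose ping failed, in sorted order."""
    return sorted(node for node, err in status.items() if err is not None)


def monitor(
    nodes: List[Node],
    run_query: Callable[[str, int, str], object],
    notify: Callable[[List[Node]], None],
    interval_seconds: float = 60.0,  # "every minute", per the recommendation
    rounds: Optional[int] = None,    # None means run until interrupted
) -> None:
    """Ping all nodes on a fixed interval and call notify() with any failures."""
    done = 0
    while rounds is None or done < rounds:
        down = unreachable(ping_nodes(nodes, run_query))
        if down:
            notify(down)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
```

Passing the query function in as a parameter keeps the sketch independent of any particular client library, and makes it possible to exercise the loop with a stub before pointing it at a real cluster.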
Monitor the Master Aggregator
Extra monitoring should be placed on the master aggregator as it is the central coordinator of all activities within MemSQL. Run the following commands against the master aggregator:
show aggregators;
show leaves;
show cluster status;
The results of the commands above should be compared with previous executions, and any detected changes should trigger a notification.
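One way to compare results across executions is to store each result set as a snapshot and diff it against the previous one. The Python sketch below assumes a hypothetical `run_query(sql)` callable that executes a statement on the master aggregator and returns its rows; `take_snapshot` and `diff_snapshots` are illustrative names, not part of MemSQL. If a command returns values that legitimately change between runs, those columns would need to be filtered out before comparing.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Sequence, Tuple

# The three statements listed above.
COMMANDS = ("SHOW AGGREGATORS", "SHOW LEAVES", "SHOW CLUSTER STATUS")

Row = Tuple[object, ...]
Snapshot = Dict[str, FrozenSet[Row]]


def take_snapshot(
    run_query: Callable[[str], Iterable[Sequence[object]]],
    commands: Sequence[str] = COMMANDS,
) -> Snapshot:
    """Run each command and store its rows as an order-independent set."""
    return {cmd: frozenset(tuple(row) for row in run_query(cmd)) for cmd in commands}


def diff_snapshots(previous: Snapshot, current: Snapshot) -> Dict[str, Dict[str, list]]:
    """Return the rows added and removed per command; empty dict means no change."""
    changes: Dict[str, Dict[str, list]] = {}
    for cmd in sorted(previous.keys() | current.keys()):
        before = previous.get(cmd, frozenset())
        after = current.get(cmd, frozenset())
        added = after - before
        removed = before - after
        if added or removed:
            changes[cmd] = {
                # key=repr keeps sorting safe when rows mix types such as None and str
                "added": sorted(added, key=repr),
                "removed": sorted(removed, key=repr),
            }
    return changes
```

A monitoring job would call `take_snapshot` on each run, pass the previous and current snapshots to `diff_snapshots`, and send a notification whenever the returned dictionary is non-empty.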
Monitor OS resources
If you are using third-party monitoring tools, make sure to monitor the following resources on each machine in the MemSQL cluster:
- CPU Usage
- CPU Load Average
- Memory Usage
- Memory Paging (page ins, page outs)
- Disk Utilization
- Disk Queue Time
- Network Usage
- Dropped packets / TCP retransmits
Paging refers to a technique that Linux and other operating systems use to deal with high memory usage. If your system is consistently paging, you should add more memory capacity or you will experience severe performance degradation.
When the operating system predicts that it will require more memory than it has physically available, it will move infrequently accessed pages of memory out of RAM and onto the disk to make room for more frequently accessed memory. When this memory is used later by a process, the process must wait for the page to be read off disk and into RAM. If memory used by MemSQL is moved to disk, the latency of queries that access that memory will be substantially increased.
You can measure paging on the command line by running the Linux command vmstat 1 and looking at the swap section (si and so refer to paging memory off the disk and into RAM, and out of RAM and onto disk, respectively).
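To check for paging from a script rather than by eye, the swap columns can be parsed out of the vmstat output. The Python sketch below is a minimal illustration: the function names are hypothetical, the columns are located by their header names (`si`, `so`) rather than by fixed position, and the first sample is skipped because vmstat's first line reports averages since boot.

```python
import subprocess
from typing import List, Tuple


def parse_swap_columns(vmstat_output: str) -> List[Tuple[int, int]]:
    """Extract (si, so) pairs from vmstat text, locating columns via the header row."""
    header = None
    samples: List[Tuple[int, int]] = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            header = fields  # vmstat repeats the header periodically; reuse the latest
            continue
        if header is None or len(fields) != len(header):
            continue  # skip banner lines and anything that is not a data row
        try:
            si = int(fields[header.index("si")])
            so = int(fields[header.index("so")])
        except ValueError:
            continue
        samples.append((si, so))
    return samples


def paging_fraction(samples: List[Tuple[int, int]]) -> float:
    """Fraction of samples with swap activity, ignoring the since-boot first sample."""
    recent = samples[1:]
    if not recent:
        return 0.0
    active = sum(1 for si, so in recent if si > 0 or so > 0)
    return active / len(recent)


def sample_vmstat(count: int = 5) -> str:
    """Run `vmstat 1 <count>` (one-second interval, <count> samples) and return its output."""
    result = subprocess.run(
        ["vmstat", "1", str(count)], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `paging_fraction(parse_swap_columns(sample_vmstat(10)))` gives the share of the last nine one-second samples in which memory was swapped in or out. What fraction counts as "consistently paging" is a judgment call for your environment.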