check
Checks the provided report for issues.
Usage
Checks the provided report for issues.
Available Checkers:
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| ID | DESCRIPTION |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| attachRebalanceDelay | This variable should be set to 120 (default). If it is set to another value, the |
| | cluster may experience delays in self-healing operations |
| autoAttach | This variable should be set to "ON" (default). An "OFF" value prevents |
| | nodes from reattaching after restart |
| blockedQueries | Blocked queries may lead to additional failed operations. We recommend that you |
| | reduce your workload or kill running queries |
| cgroupDisabled | Linux memory subsystems use a number of bytes of memory per physical page on |
| | x86_64 systems. These resources are consumed even when memory is not used in |
| | any hierarchy. As SingleStore DB doesn't use the memory subsystem, we recommend |
| | disabling this as it will reduce the resource consumption of the kernel |
| chronydDisabled | We recommend that chronyd is disabled so that ntpd can be used for time |
| | synchronization. Contact your administrator to disable chronyd |
| clusterMemoryUsage | As SingleStore DB is allocated the value specified in maximum_memory, query |
| | failures may result if memory usage approaches this limit. To alleviate this |
| | condition (for the short term), increase maximum_memory or delete data that |
| | is being stored in memory to allow more headroom |
| collectionErrors | Collection errors in the report typically indicate that all parts of the report |
| | could not be gathered. This could mean that some information may be missing |
| | and a thorough check could not be performed, or that Toolbox cannot access the |
| | required information |
| columnstoreSegmentRows | Inconsistent columnstore segment rows can lead to non-optimal query performance or other issues. Columnstore segment |
| | rows refers to the number of rows SingleStore DB holds in each segment. The default value is 1024000. Refer to |
| | https://docs.singlestore.com/latest/key-concepts-and-features/physical-schema-design/columnstore/managing-columnstore-segments/ |
| | for more information |
| consistentMaxMemory | Inconsistent maximum memory settings will lead to some nodes having more or less memory available for operations and can |
| | cause performance inconsistencies across the cluster. We recommend that all maximum_memory settings are consistent. Refer to |
| | https://docs.singlestore.com/latest/guides/cluster-management/maintain-your-cluster/managing-memory/configuring-memory-limits/ |
| | for more information |
| cpuFeatures | SingleStore DB can make use of AVX2 instructions for optimal performance. Refer to |
| | https://docs.singlestore.com/latest/reference/configuration-reference/cluster-configuration-reference/instruction-set-verification/ |
| | for more information |
| cpuFreqPolicy | Disabling power saving and Turbo Mode settings on all hosts will lead to more consistent performance across the cluster |
| cpuHyperThreading | A CPU with hyperthreading will ensure optimal performance. Hyperthreading allows a CPU to split a physical core into two virtual |
| | cores, or "threads." This allows each core to do two things simultaneously |
| cpuIdle | In general, SingleStore recommends utilizing all of the cores available on a host. However, if a CPU is frequently less than 5% |
| | idle, this typically indicates that your workload will not have room to grow, and more cores are likely required |
| cpuMemoryBandwidth | Low CPU-memory bandwidth can highlight potential performance issues on your hosts |
| cpuModel | Differing CPU models may lead to inconsistent performance |
| defaultVariables | We recommend keeping the default values for these variables for optimal cluster operation |
| defaultWorkloadManagement | We recommend keeping the default values for the workload management settings for optimal cluster operation |
| defunctProcesses | Defunct processes may be using system resources and preventing their use by SingleStore DB. It is recommended that you kill these |
| | processes if possible |
| delayedThreadLaunches | Delayed thread launches may indicate that a workload is too intensive for the available threads. We recommend decreasing the |
| | cluster's workload |
| detectCrashStackTraces | The presence of dmp.stack files indicates that a SingleStore DB node has crashed, which should be investigated |
| disconnectedReplicationSlaves | Disconnected replication slaves may mean that you don't have full redundancy in your system |
| diskBandwidth | Disk bandwidth, an indicator of disk performance, is computed by examining the total bytes transferred between the first request |
| | for service and the completion of the transfer |
| diskInodesUsage | Exhausting the inode capacity can lead to the inability to store and/or retrieve data. To alleviate this potential issue, either |
| | increase the inode capacity, or reduce the inode usage |
| diskLatencyRead | Disk latency is an important performance indicator when reading data. SingleStore recommends investigating potential disk |
| | performance issues when the disk's "read" latency is greater than 10 ms |
| diskLatencyWrite | Disk latency is an important performance indicator when writing data. SingleStore recommends investigating potential disk |
| | performance issues when the disk's "write" latency is greater than 10 ms |
| diskUsage | Checks free disk space and identifies if you are approaching your disk capacity limits |
| duplicatePartitionDatabase | Duplicate partitions may cause extra memory or disk usage in your system |
| explainRebalancePartitionsChecker | If the cluster isn't properly rebalanced (where EXPLAIN REBALANCE PARTITIONS is not null), partitions are |
| | not distributed evenly across the cluster. An uneven partition distribution can lead to nodes containing |
| | more data and/or performing more work (leading to "hotspots"). To remedy, run REBALANCE PARTITIONS. Refer to |
| | https://docs.singlestore.com/latest/reference/sql-reference/cluster-management-commands/rebalance-partitions/ for more information |
| failedBackgroundThreadAllocations | Failed background thread allocations can lead to further cascading cluster issues. It is recommended you scale back your workload |
| | when you see these failures |
| failedCodegen | Code generation errors indicate that your SQL was not properly compiled. We recommend that you review and correct the query that |
| | caused the code generation error |
| failureDetectionOn | SingleStore DB nodes will not properly fail over if failure detection is set to OFF. To ensure that SingleStore DB nodes will |
| | properly fail over, set failure detection to ON |
| filesystemType | Unsupported file systems may cause unpredictable results. Please ensure your cluster is deployed on a |
| | supported filesystem. Refer to https://docs.singlestore.com/latest/reference/configuration-reference/ |
| | cluster-configuration-reference/system-requirements/#recommendations-for-optimal-on-premise-columnstore-performance for more |
| | information |
| installedPermissions | Specific file ownership permissions are required to run SingleStore DB. This check ensures that the permissions are set properly so |
| | that SingleStore DB can operate without issue |
| interpreterMode | We recommend setting the interpreter mode to interpret_first. When set, SingleStore DB interprets and |
| | compiles a query shape in parallel as the query is encountered rather than compiling it first. Refer to |
| | https://docs.singlestore.com/latest/concepts/code-generation/#the-interpreter-modes-effects-on-code-generation for more information |
| kernelVersions | Inconsistent kernel versions are not recommended |
| leafAverageRoundtripLatency | If leaf round-trip latency is high, we recommend checking your network connectivity between hosts |
| leafPairs | This setting ensures that one leaf doesn't house both the replica and master partition for a given availability group, which allows |
| | the cluster to handle the leaf node's failure |
| leavesNotOnline | Offline leaf nodes may indicate a cluster issue. If high availability is not enabled, the databases will be inaccessible |
| longRunningQueries | Long-running queries may indicate that the cluster's workload is too high. We recommend checking the cluster's workload for |
| | long-running queries and killing them |
| majorPageFaults | Memory pressure is an indicator that a host's memory is unable to efficiently service processing needs. Frequent page faults on a |
| | host are a sign of memory pressure |
| mallocActiveMemory | Shows the memory allocated directly from the operating system and managed by the C runtime allocators (not SingleStore DB’s |
| | built-in memory allocators that use the Buffer Manager). In this case, the memory use should be approximately 1-2 GB for most |
| | workloads. If larger, we recommend investigating the system's memory use |
| maxMapCount | Incorrectly setting this can lead to memory errors. Refer to |
| | https://docs.singlestore.com/latest/reference/configuration-reference/cluster-configuration-reference/system-requirements/#configure-linux-vm-settings |
| | for more information |
| maxMemorySettings | We recommend setting the maximum memory to a percentage of the host's total memory, with a ceiling of 90% |
| maxOpenFiles | A setting lower than the recommended setting can significantly degrade performance and introduce connection limit errors. Refer to |
| | https://docs.singlestore.com/latest/reference/configuration-reference/cluster-configuration-reference/system-requirements/#configure-linux-vm-settings |
| | for more information |
| memoryCommitted | Virtual memory can potentially be overallocated and exceed a host's physical memory. This can lead to workload failures due to memory pressure |
| memsqlVersions | We recommend that the deployed version of SingleStore DB is consistent across all hosts and nodes |
| minFreeKbytes | Setting these to the recommended values will minimize the likelihood of memory errors on your hosts. Refer to |
| | https://docs.singlestore.com/latest/reference/configuration-reference/cluster-configuration-reference/system-requirements/#configure-linux-vm-settings |
| | for more information |
| missingClusterDb | The cluster database holds all the metadata for your cluster. A missing cluster database requires immediate intervention and potentially a refresh |
| | of your cluster via backup/restore |
| networkBuffersMax | wmem_max and rmem_max are network settings that control the send and receive socket buffer sizes, respectively. If these parameters are set too low, |
| | you may experience latency. It is recommended to set each of these values to a minimum of 8MB |
| numaConfiguration | When running SingleStore DB on hosts that support Non-Uniform Memory Access (NUMA) sockets, |
| | we recommend configuring SingleStore DB for NUMA via numactl for optimal performance. Refer to |
| | https://docs.singlestore.com/latest/reference/configuration-reference/cluster-configuration-reference/configure-numa/ for more information |
| offlineAggregators | Offline aggregators must be addressed as less work will be load-balanced across the cluster |
| orchestratorProcesses | Orchestrator processes may cause undesired actions to be taken on SingleStore DB hosts which may negatively impact the cluster |
| orphanDatabases | Orphan databases, while unused, still consume memory. Orphan databases can and should be cleared using CLEAR ORPHAN DATABASES. Refer to |
| | https://docs.singlestore.com/latest/reference/sql-reference/operational-commands/clear-orphan-databases/ for more information |
| orphanTables | Orphan tables, while unused, still consume memory. Orphan tables can and should be cleared using CLEAR ORPHAN DATABASES. Refer to |
| | https://docs.singlestore.com/latest/reference/sql-reference/operational-commands/clear-orphan-databases/ for more information |
| outOfMemory | Out-of-memory errors may indicate memory pressure on the cluster. We recommend identifying and reducing memory usage. Refer to |
| | https://docs.singlestore.com/latest/guides/cluster-management/troubleshooting/identifying-reducing-memory-usage/ for more information |
| partitionsConsistency | We recommend that SSD partitions start at a minimum of 4096 byte-sectors. Disk performance issues may result if this value is inconsistent across |
| | hosts, or if the partition starts at < 4096 byte-sectors |
| pendingDatabases | Pending databases are available for read and write queries. Databases that remain in a "pending" state for an extended period should be investigated |
| queuedQueries | A large number of queued queries may indicate a high cluster workload. We recommend reducing the workload and/or killing long-running queries |
| readyQueueSaturated | Ready Queue saturation indicates there aren't enough connection threads available to handle the workload. We recommend reducing the workload and/or |
| | killing long-running queries |
| replicationPausedDatabases | Identifies if PAUSE REPLICATION has been run and provides a status |
| runningAlterOrTruncate | A running ALTER or TRUNCATE command may explain why the cluster is experiencing issues when attempting to run queries |
| runningBackup | This informational check can help troubleshoot issues caused by running a backup |
| secondaryDatabases | This informational check can help determine if the cluster is the primary cluster, or a secondary/replicated one |
| swapEnabled | This check determines if there is adequate swap space on a host, where 10% or more of physical memory is typically allocated for swap. Swap space will |
| | be utilized when the host is under memory pressure |
| swapUsage | Your host may be under memory pressure if the swap space that is actively being used is greater than 5% |
| syncCnfVariables | If sync variables are not set in the engine, there will be discrepancies between what the cnf file contains and what the associated values actually |
| | are |
| tracelogOOD | Out of disk space |
| tracelogOOM | Out of memory |
| transparentHugepage | Disable transparent huge pages (THP) for optimal SingleStore DB performance. Refer to |
| | https://docs.singlestore.com/latest/reference/configuration-reference/cluster-configuration-reference/system-requirements/#disable-transparent-huge-pages |
| | for more information |
| unkillableQueries | Indicates that there are queries running on your cluster that can't be killed. This may be due to long-running processes that have rendered other |
| | processes unkillable. We recommend identifying long-running processes using SHOW PROCESSLIST and killing them |
| unmappedMasterPartitions | Use ATTACH PARTITIONS to reattach disconnected partitions to the cluster. Refer to |
| | https://docs.singlestore.com/latest/reference/sql-reference/attach-partition/ for more information |
| unrecoverableDatabases | An unrecoverable database is no longer readable or writeable |
| userDatabaseRedundancy | The absence of redundancy indicates that not all partitions have replicas that they can fail over to. We recommend running EXPLAIN RESTORE REDUNDANCY |
| | and restoring if possible. Refer to https://docs.singlestore.com/latest/reference/sql-reference/cluster-management-commands/restore-redundancy/ for more |
| | information |
| validLicense | A valid and properly applied license is required to comply with SingleStore DB terms and conditions |
| validateSsd | SingleStore DB must be deployed and run on SSDs |
| versionHashes | Confirms that a SingleStore DB version is a General Availability (GA) release |
| vmOvercommit | By design, Linux kills processes that are consuming large amounts of memory when the amount of free memory is deemed to be too low. Overcommit settings |
| | that are set too low may cause frequent and unnecessary failures |
| vmSwappiness | The swappiness value (0 - 100) affects system performance as it controls when swapping is activated, and how swap space is used. When set to lower values, |
| | the kernel will use less swap space. When set to higher values, the kernel will use more swap space. Swappiness should never be set to 0 |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
Examples:
# Run a single checker
memsql-report check --only orchestratorProcesses
# Run pre-SingleStore DB install environment checks only. Use this command with memsql-report collect --validate-env
memsql-report check --validate-env
# Exclude specific checkers
memsql-report check --exclude minFreeKbytes --exclude maxOpenFiles
Usage:
memsql-report check [flags]
Flags:
--exclude VALUES Exclude the specified checkers
-h, --help Help for check
--include VALUES Include the specified checkers
--include-performance Include checkers that create load on cluster (not recommended for active clusters)
--only VALUES Only run the specified checkers
-i, --report-path ABSOLUTE_PATH Read the report from the specified tarball or directory. If you do not already have a report, run 'memsql-report collect' to generate one
--show-skips Display more information about skipped checks
--validate-env Run checkers that do not require SingleStore DB installation (performance checkers included)
Global Flags:
--backup-cache FILE_PATH File path for the backup cache
--cache-file FILE_PATH File path for the Toolbox node cache
-c, --config FILE_PATH Toolbox configuration file path
--disable-spinner Disable the progress spinner, which some terminal sessions/environments may have issues with
-j, --json Enable JSON output
--parallelism POSITIVE_INTEGER Maximum number of operations to run in parallel
--runtime-dir DIRECTORY_PATH Where to store Toolbox runtime data
--ssh-max-sessions POSITIVE_INTEGER Maximum number of SSH sessions to open per host, must be at least 3
--state-file FILE_PATH Toolbox state file path
-v, --verbosity count Increase logging verbosity: valid values are 1, 2, 3. Usage -v=count or --verbosity=count
-y, --yes Enable non-interactive mode and assume the user would like to move forward with the proposed actions by default
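The examples above exclude several checkers by repeating --exclude once per checker ID. As a minimal sketch of that pattern, the helper below builds such a command from a list of IDs and prints it instead of running it. The function name build_exclude_cmd is hypothetical and not part of Toolbox; only the --exclude flag and the checker IDs come from the help output.

```shell
# Hypothetical helper (not part of Toolbox): print a `memsql-report check`
# command that repeats --exclude once per checker ID passed as an argument.
# The body runs in a subshell so its variables do not leak into the caller.
build_exclude_cmd() (
  cmd="memsql-report check"
  for id in "$@"; do
    cmd="$cmd --exclude $id"
  done
  printf '%s\n' "$cmd"
)

# Print the command for review before running it by hand.
build_exclude_cmd minFreeKbytes maxOpenFiles
```

Printing the command rather than executing it keeps the sketch safe to try on a host where Toolbox is not installed.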
Remarks
This command is interactive unless you use either the --yes or --json flag to override interactive behavior.
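For scripted use, the interactive prompt can be avoided by passing --json together with --report-path. The sketch below is one possible wrapper, under these assumptions: the function name noninteractive_check_cmd and the path /tmp/report.tar.gz are hypothetical, the path contains no whitespace, and the command is printed rather than executed. The absolute-path check mirrors the ABSOLUTE_PATH argument shown for --report-path.

```shell
# Hypothetical wrapper (not part of Toolbox): print a non-interactive
# `memsql-report check` command for an existing report tarball or directory.
# Rejects relative paths, since --report-path is documented as ABSOLUTE_PATH.
noninteractive_check_cmd() (
  report_path="$1"
  case "$report_path" in
    /*) ;;  # absolute path: accepted
    *)
      echo "report path must be absolute: $report_path" >&2
      exit 1  # exits only this function's subshell
      ;;
  esac
  # --json enables JSON output and disables interactive behavior.
  printf 'memsql-report check --report-path %s --json\n' "$report_path"
)

# /tmp/report.tar.gz is a placeholder path used for illustration only.
noninteractive_check_cmd /tmp/report.tar.gz
```

Replacing --json with --yes in the printed command would keep the default output format while still skipping the prompts.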