System Requirements and Recommendations
This topic does not apply to SingleStore Managed Service.
The following are some requirements and recommendations you should follow when provisioning and setting up your host machines to optimize the performance of your cluster.
With the exception of the hardware and software requirements, all other settings are optional.
Universal requirements
Each SingleStore node requires a host machine with an x86_64 CPU, at least four CPU cores, and eight GB of RAM available per node.
When provisioning your host machines, use Linux kernel version 3.10 or later.
Our recommended platforms are the following:
- RHEL/CentOS 6 or 7 (version 7 is preferred)
- Debian 8 or 9 (version 9 is preferred)
For cloud deployments, all instances should be geographically deployed in a single region. Instance types that support Enhanced Networking should have it (or a similar feature) enabled.
Network settings
Note: Perform the following steps on each host in the cluster.
1. As root, display the current `sysctl` settings and review the values of `rmem_max` and `wmem_max`:

```
sysctl -a | grep mem_max
```

2. Confirm that the receive buffer size (`rmem_max`) is 8 MB for all connection types. If not, add the following line to the `/etc/sysctl.conf` file:

```
net.core.rmem_max = 8388608
```

3. Confirm that the send buffer size (`wmem_max`) is 8 MB for all connection types. If not, add the following line to the `/etc/sysctl.conf` file:

```
net.core.wmem_max = 8388608
```

4. Persist these updates across reboots:

```
sysctl -p /etc/sysctl.conf
```

5. At the next system boot, confirm that the above values have persisted.
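After the reboot, you can read back just these two settings to verify them; each should report 8388608 (8 MB):

```
sysctl net.core.rmem_max net.core.wmem_max
```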
Network ports
Depending on the host machine and its function in deployment, some or all of the following port settings should be enabled on machines in your cluster. These routing and firewall settings must be configured to:
- Allow database clients (e.g. your application) to connect to the aggregators
- Allow all nodes in the cluster to talk to each other over the SingleStore DB protocol (3306)
- Allow you to connect to management and monitoring tools
Protocol | Port | Direction | Description |
---|---|---|---|
TCP | 3306 | Inbound and Outbound | Default port used by SingleStore DB. Required on all nodes for intra-cluster communication. Also required on aggregators for client connections. |
TCP | 22 | Inbound and Outbound | For host machine access. Required between nodes in SingleStore tool deployment scenarios. Also useful for remote administration and troubleshooting on the main deployment machine. |
TCP | 443 | Outbound | To get public repo key for package verification. Required for nodes downloading SingleStore APT or YUM packages. |
TCP | 8080 | Inbound and Outbound | Default port for SingleStore DB Studio. (Only required for the host machine running Studio.) |
The service port values are configurable if the default values cannot be used in your deployment environment. For more information on how to change them, see the SingleStore DB configuration file, sdb-toolbox-config register-host command, and SingleStore DB Studio Installation Guide topics.
We also highly recommend configuring your firewall to prevent other hosts on the Internet from connecting to SingleStore DB.
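As one hedged sketch of such a rule, on RHEL/CentOS 7 hosts running `firewalld` you could restrict port 3306 to your cluster's subnet (`10.0.1.0/24` below is a placeholder; substitute your own network):

```
# Allow SingleStore DB traffic (3306/tcp) only from the cluster subnet, then reload
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.0.1.0/24" port port="3306" protocol="tcp" accept'
sudo firewall-cmd --reload
```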
Hardware recommendations
The following are additional hardware recommendations for optimal performance:
Component | Recommendation |
---|---|
CPU | 8 vCPU per host machine. |
Memory | At least 4GB per core, 32GB minimum per leaf node. |
Storage | Provide a storage system for each node with at least 3 times the capacity of main memory. SSD storage is recommended for columnstore workloads. |
Here are some considerations when deciding on your hardware:
- SingleStore DB rowstore storage capacity is limited by the amount of RAM on the host machine. Increasing RAM increases the amount of available data storage.
- It is strongly recommended to run leaf nodes on machines that have the same hardware and software specifications.
- SingleStore DB is optimized for architectures supporting SSE4.2 and AVX2 instruction set extensions, but it will run successfully on x64 systems without these extensions. See our AVX2 verification topic for more information on how to verify if your system supports AVX2, or use the quick check shown after this list.
- For concurrent loads on columnstore tables, SSD storage will improve performance significantly compared to HDD storage.
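For a quick check of the instruction set extensions, you can inspect the CPU flags the kernel reports in `/proc/cpuinfo`:

```
# Check for AVX2 and SSE4.2 support; no output for a flag means it is not supported
grep -o -m1 'avx2' /proc/cpuinfo
grep -o -m1 'sse4_2' /proc/cpuinfo
```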
Enabling Cluster-on-Die (if supported)
If you are installing SingleStore DB natively and have access to the BIOS, you should enable Cluster-on-Die in the system BIOS for machines with Haswell-EP and later x86_64 CPUs. When enabled, this will result in multiple NUMA regions being exposed per processor. SingleStore DB can take advantage of NUMA nodes by binding specific SingleStore nodes to those NUMA nodes, which in turn will result in higher SingleStore DB performance.
Software recommendations
In addition to these basic OS requirements, it is helpful to configure the underlying Linux OS in the following areas to get the most performance from SingleStore DB.
These tuning instructions should be done on each host machine in your cluster.
Configure Linux vm settings
SingleStore recommends letting first-party tools, such as `sdb-admin` and `memsqlctl`, configure your `vm` settings to minimize the likelihood of getting memory errors on your host machines. The default values used by the tools are the following:
- `vm.max_map_count` is set to `1000000000`
- `vm.min_free_kbytes` is set to either 1% of system RAM or 4 GB, whichever is smaller

If the SingleStore Tools cannot set the values for you, you will get an error message stating what the value should be and how to set it. You can set the values manually using the `/sbin/sysctl` command, as shown below.

```
sudo sysctl -w vm.max_map_count=1000000000
sudo sysctl -w vm.min_free_kbytes=658096
```
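Note that `sysctl -w` only changes the running system. If you set these values manually, you can persist them across reboots by appending the equivalent lines to `/etc/sysctl.conf` (the `vm.min_free_kbytes` value shown is just the example value from above; yours may differ):

```
# /etc/sysctl.conf entries to persist the vm settings across reboots
vm.max_map_count = 1000000000
vm.min_free_kbytes = 658096
```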
Enabling NUMA support
If the CPU(s) on your host machines supports Non-Uniform Memory Access (NUMA), SingleStore DB can take advantage of that and bind SingleStore nodes to NUMA nodes. Binding SingleStore nodes to NUMA nodes allows faster access to in-memory data since individual SingleStore nodes only access data that’s collocated with their corresponding CPU.
If you do not configure SingleStore DB this way, performance will be greatly degraded due to expensive cross-NUMA-node memory access. Configuring for NUMA should be done as part of the installation process; however, you can reconfigure your deployment later, if necessary.
SingleStore Tools can do the NUMA binding for you; however, you must have `numactl` installed first. Perform the following steps on each host machine:
1. Log into each host and install the `numactl` package. For example, for Debian-based OSes:

```
sudo apt-get install numactl
```

For RedHat/CentOS, run the following:

```
sudo yum install numactl
```

2. Check the number of NUMA nodes on your machines by running `numactl --hardware`. For example:

```
numactl --hardware
available: 2 nodes (0-1)
```

The output shows that there are 2 NUMA nodes on this machine, numbered 0 and 1.
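SingleStore Tools perform the actual binding during deployment, but as an illustration of what NUMA binding means, `numactl` can pin an arbitrary process to a single NUMA node's CPUs and memory (`./my_program` is a placeholder):

```
# Run a process with both its CPU scheduling and memory allocations
# confined to NUMA node 0
numactl --cpunodebind=0 --membind=0 ./my_program
```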
For additional information, see Configuring for Non-Uniform Memory Access.
Disable Transparent Huge Pages
Linux organizes RAM into pages that are usually 4 KB in size. Using transparent huge pages (THP), Linux can instead use 2 MB pages or larger. As a background process, THP transparently reorganizes the memory a process uses inside the kernel, merging small pages into huge pages and splitting huge pages back into small pages. This reorganization can block memory operations in the memory manager for several seconds at a time, preventing the process from accessing its memory. Because SingleStore DB uses a lot of memory, we recommend that you disable THP at boot time on all nodes (master aggregator, child aggregators, and leaves) in the cluster. THP-induced stalls may result in inconsistent query run times or high system CPU (also known as red CPU).
For information on how to disable THP, see the documentation for your operating system.
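As a sketch, on many distributions you can turn THP off for the running system by writing to sysfs (exact paths and the boot-time persistence mechanism vary by distribution and kernel, so check your OS documentation):

```
# Disable THP until the next reboot (run as root)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
```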
Install and run Network Time Protocol service
Install and run `ntpd` to ensure that system time is in sync across all nodes in the cluster.
For Debian-based distributions like Ubuntu:

```
sudo apt-get install ntp
```

For RedHat/CentOS distributions:

```
sudo yum install ntp
```
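Once the service is running, you can verify that each host is synchronizing with its configured time servers:

```
# List NTP peers; the peer prefixed with '*' is the current sync source
ntpq -p
```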
Recommendations for Optimal On-Premise Columnstore Performance
We support the EXT4 and XFS filesystems. Also, many improvements have been made recently in Linux for NVMe devices, so we recommend using a 3.0+ series kernel. For example, CentOS 7.2 uses the 3.10 kernel.
If you use NVMe drives, set the following parameters in Linux (make the settings permanent in `/etc/rc.local`):

```
# Set ${DEVICE_NUMBER} for each device
echo 0 > /sys/block/nvme${DEVICE_NUMBER}n1/queue/add_random
echo 1 > /sys/block/nvme${DEVICE_NUMBER}n1/queue/rq_affinity
echo none > /sys/block/nvme${DEVICE_NUMBER}n1/queue/scheduler
echo 1023 > /sys/block/nvme${DEVICE_NUMBER}n1/queue/nr_requests
```
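As a sketch, assuming all of your NVMe namespaces follow the usual `nvme<N>n1` naming pattern, a loop in `/etc/rc.local` can apply the settings to every device without hard-coding device numbers:

```
# Apply the queue settings to every nvme*n1 block device (run as root)
for queue in /sys/block/nvme*n1/queue; do
    echo 0    > "$queue/add_random"
    echo 1    > "$queue/rq_affinity"
    echo none > "$queue/scheduler"
    echo 1023 > "$queue/nr_requests"
done
```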
Increase File Descriptor and Maximum Process Limits
A cluster uses a substantial number of client and server connections between aggregators and leaves to run queries and cluster operations. We recommend setting the Linux file descriptor and maximum process limits to the values listed below to account for these connections. Failing to increase these limits can significantly degrade performance and even cause connection limit errors. The `ulimit` settings can be configured in the `/etc/security/limits.conf` file, or directly via shell commands.
Permanently increase the open files limit and the max user processes limit for the `memsql` user by editing the `/etc/security/limits.conf` file as the `root` user and adding the following lines:

```
memsql soft NOFILE 1024000
memsql hard NOFILE 1024000
memsql soft nproc 128000
memsql hard nproc 128000
```
A SingleStore node must be restarted for the changed `ulimit` settings to take effect.
The `file-max` setting configures the maximum number of file handles (file descriptor limit) for the entire system, whereas `ulimit` settings are only enforced at the process level. Hence, the `file-max` value must be higher than the `NOFILE` setting. Increase the maximum number of file handles configured for the entire system in `/proc/sys/fs/file-max`. To make the change permanent, append or modify the `fs.file-max` line in the `/etc/sysctl.conf` file.
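For example, a system-wide limit comfortably above the `NOFILE` value could be applied and persisted as follows (`2097152` is an illustrative value, not a SingleStore-prescribed one):

```
# Set the system-wide file handle limit for the running system (run as root)
sysctl -w fs.file-max=2097152
# Persist it across reboots
echo "fs.file-max = 2097152" >> /etc/sysctl.conf
```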
Configure Linux ulimit settings
Most Linux operating systems provide ways to control the usage of system resources such as threads, files, and network connections at an individual user or process level. These per-user limitations are called ulimits, and they prevent single users from consuming too many system resources. For optimal performance, SingleStore recommends setting ulimits to higher values than the default Linux settings. The `ulimit` settings can be configured in the `/etc/security/limits.conf` file, or directly via shell commands.
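To inspect the limits currently in effect, run the `ulimit` shell built-in as the user in question (for example, the `memsql` user):

```
ulimit -n   # open files (soft limit)
ulimit -u   # max user processes (soft limit)
ulimit -a   # all current limits
```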
Configure Linux nice setting
Given how the Linux kernel calculates the maximum `nice` limit, we recommend that you modify the `/etc/security/limits.conf` file and set the maximum `nice` limit to `-10` on each Linux host in the cluster. This allows the SingleStore DB engine to run some threads at a higher priority, such as the garbage collection threads.
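As a sketch, the corresponding entry in `/etc/security/limits.conf` for the `memsql` user might look like the following (the `-` type sets both the soft and hard limit):

```
# Allow the memsql user to raise thread priority up to nice -10
memsql  -  nice  -10
```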
To apply this new `nice` limit, restart each SingleStore node in the cluster.
Alternatively, you may set the default `nice` limit to `-10` on each Linux host in the cluster prior to deploying SingleStore DB.
Create swap space
It is recommended that you create a swap partition (or swap file on a dedicated device) to serve as an emergency backing store for RAM. SingleStore DB makes extensive use of RAM (especially with rowstore tables), so it is important that the operating system does not immediately start killing processes if SingleStore DB runs out of memory. Because typical machines running SingleStore DB have a large amount of RAM (>32 GB/node), the swap space can be small (<10% of physical RAM).
For more information on setting up and configuring swap space, refer to your distribution's documentation.
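As a minimal sketch (the 4 GB size is illustrative; size the swap appropriately for your own hosts), a swap file can be created and enabled like this:

```
# Create, protect, format, and enable a 4 GB swap file (run as root)
fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Persist across reboots
echo "/swapfile none swap sw 0 0" >> /etc/fstab
```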
After enabling these settings, your machines will be configured for optimal performance when running one or more SingleStore nodes.