calibrate
Run 'calibrate' to measure cluster performance.
Usage
A 'calibrate' database is created with rowstore and columnstore tables that are populated with data. A number of queries are run against this data to measure cluster performance.
The 'calibrate' tool requires a dataset, which is loaded into the tables. By default, the dataset is downloaded from the web, unpacked, and deleted at the end of the run. However, if your cluster cannot access the internet or already has the dataset, use the '--data-path' flag to specify the absolute path to the dataset on the Master Aggregator node.
You can download the dataset here: https://s3.amazonaws.com/calibrate.singlestore.com/calibrate-dataset.tar.gz
The dataset download uses 1.5 GB of disk space, and the unpacked data takes up another 12 GB.
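Before a run, it may help to confirm that the target filesystem has room for both the archive and the unpacked data. The following is a minimal sketch and not part of 'calibrate': the function names, the example directory, and the 14 GiB threshold (the 1.5 GB download plus the 12 GB of unpacked data, rounded up as a margin) are assumptions.

```shell
# Assumed requirement: ~1.5 GB archive + ~12 GB unpacked, rounded up to 14 GiB.
REQUIRED_KB=$((14 * 1024 * 1024))

# Succeeds if the given number of free kilobytes covers the requirement.
enough_space_kb() {
    [ "$1" -ge "$REQUIRED_KB" ]
}

# Prints the available space, in kilobytes, of the filesystem holding $1.
# 'df -Pk' uses the POSIX output format, where column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Usage (hypothetical directory):
#   if enough_space_kb "$(free_kb /home/admin)"; then
#       echo "enough space for the calibrate dataset"
#   fi
```

If the check fails, one option is to place the dataset on a larger volume and point '--data-path' at it.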
The '--partition-ratio' flag tests the cluster under different concurrency conditions, for example, four CPU cores per partition. It specifies the ratio of CPU cores to partitions and defaults to 2:1.
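To make the ratio arithmetic concrete, the sketch below validates a CORES:PARTITIONS string and derives the partition count it implies for a given number of CPU cores. It is an illustration only: the function names are hypothetical, the lower bound of 1 and the rounding down are assumptions, and the source does not state whether the tool counts cores per host or across the cluster.

```shell
# Succeeds if $1 looks like CORES:PARTITIONS with both values in 1..16.
# The upper bound of 16 comes from the flag description; the lower bound
# of 1 is an assumption.
valid_ratio() {
    case "$1" in
        *:*) ;;
        *) return 1 ;;
    esac
    cores=${1%%:*}   # text before the first ':'
    parts=${1#*:}    # text after the first ':'
    for n in "$cores" "$parts"; do
        case "$n" in
            ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric parts
        esac
        [ "$n" -ge 1 ] && [ "$n" -le 16 ] || return 1
    done
    return 0
}

# Prints the partition count implied by ratio $2 for $1 CPU cores,
# using integer division (rounding down is an assumption).
partitions_for_cores() {
    valid_ratio "$2" || return 1
    echo $(( $1 * ${2#*:} / ${2%%:*} ))
}

partitions_for_cores 16 2:1   # default ratio: prints 8
partitions_for_cores 16 4:1   # four cores per partition: prints 4
```

Under these assumptions, moving from 2:1 to 4:1 on the same hardware halves the number of partitions, so each partition has more CPU cores available.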
The workload consists of the following steps:
- Download the dataset
- Create the 'calibrate' database and set up the required variables
- Load data into tables
- Run calibration queries
- Retrieve results of the run and save them in a file on the Master Aggregator node
The workload duration is divided into three main categories:
- FAST: under 12 minutes
- INTERMEDIATE: between 12 and 16 minutes
- SLOW: over 16 minutes
If the cluster receives a 'SLOW' or 'INTERMEDIATE' result, the tool prints data about the run to the terminal in addition to saving it to the file.
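The duration categories can be expressed as a small helper, for instance when post-processing timings from several runs. This is a sketch under stated assumptions, not the tool's own logic: the function name is hypothetical, the input is taken in whole seconds, and treating exactly 12 and exactly 16 minutes as INTERMEDIATE is an assumption about the boundaries.

```shell
# Prints the category for a workload duration given in whole seconds.
# Boundary handling (12 and 16 minutes count as INTERMEDIATE) is assumed.
classify_duration() {
    if [ "$1" -lt $((12 * 60)) ]; then
        echo FAST
    elif [ "$1" -le $((16 * 60)) ]; then
        echo INTERMEDIATE
    else
        echo SLOW
    fi
}

classify_duration 600    # 10 minutes: prints FAST
classify_duration 840    # 14 minutes: prints INTERMEDIATE
classify_duration 1200   # 20 minutes: prints SLOW
```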
Examples:
# A simple example that runs the 'calibrate' workload as the root user of the cluster and downloads the dataset from the internet
sdb-report calibrate
# Runs the 'calibrate' workload with the user 'usr' and password 'pass', using a dataset at the path '/home/admin/dataset.tar.gz' on the Master Aggregator node
sdb-report calibrate -u usr -p pass --data-path /home/admin/dataset.tar.gz
# Runs 'calibrate' to test a non-default partition ratio and retains the 'calibrate' database after the run
sdb-report calibrate --partition-ratio 4:1 --retain-database
Execution time of the following queries will be measured:
Load data queries:
+--------------+------------------------------------------------------------------------------------------------------+
| NAME | SQL |
+--------------+------------------------------------------------------------------------------------------------------+
| Load Queries | LOAD /*! calibrate_load_rs_1 */ DATA INFILE '{{CALIBRATE_LOAD_RS_1}}' INTO TABLE disttable2 FIELDS |
| | TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_cs_2 */ DATA INFILE '{{CALIBRATE_LOAD_CS_2}}' INTO TABLE disttable2_cs |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_rs_3 */ DATA INFILE '{{CALIBRATE_LOAD_RS_3}}' INTO TABLE foreignstring440k |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_cs_4 */ DATA INFILE '{{CALIBRATE_LOAD_CS_4}}' INTO TABLE |
| | foreignstring440k_cs FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_rs_5 */ DATA INFILE '{{CALIBRATE_LOAD_RS_5}}' INTO TABLE primarystring2 |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_cs_6 */ DATA INFILE '{{CALIBRATE_LOAD_CS_6}}' INTO TABLE primarystring2_cs |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
+--------------+------------------------------------------------------------------------------------------------------+
Calibration queries:
+--------------------------------+------------------------------------------------------------------------------------------------------+
| NAME | SQL |
+--------------------------------+------------------------------------------------------------------------------------------------------+
| Reshuffle Joins | SELECT /*! calibrate_reshuff_rs_q1 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_reshuff_rs_cs_q2 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | primarystring2_cs b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_reshuff_rs_cs_q3 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_reshuff_cs_q4 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN primarystring2_cs b ON a.a=b.a) A; |
| | |
| Broadcast Joins | SELECT /*! calibrate_bcast_rs_q1 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | WITH(broadcast_left=true) primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_bcast_rs_cs_q2 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | WITH(broadcast_left=true) primarystring2_cs b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_bcast_rs_cs_q3 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN WITH(broadcast_left=true) primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_bcast_cs_q4 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN WITH(broadcast_left=true) primarystring2_cs b ON a.a=b.a) A; |
| | |
| Local Group By | SELECT /*! calibrate_l_gby_rs_q1 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k.a),primarystring2.a FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k.a GROUP BY |
| | primarystring2.a) A; |
| | |
| | SELECT /*! calibrate_l_gby_rs_cs_q2 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2.a FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k_cs.a GROUP |
| | BY primarystring2.a) A; |
| | |
| | SELECT /*! calibrate_l_gby_rs_cs_q3 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k.a),primarystring2_cs.a FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k.a |
| | GROUP BY primarystring2_cs.a) A; |
| | |
| | SELECT /*! calibrate_l_gby_cs_q4 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2_cs.a FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k_cs.a |
| | GROUP BY primarystring2_cs.a) A; |
| | |
| Local Joins | SELECT /*! calibrate_lj_rs_q1 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k.a; |
| | |
| | SELECT /*! calibrate_lj_rs_cs_q2 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k_cs.a; |
| | |
| | SELECT /*! calibrate_lj_rs_cs_q3 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k.a; |
| | |
| | SELECT /*! calibrate_lj_cs_q4 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k_cs.a; |
| | |
| Single Part Join and Filter | SELECT /*! calibrate_spj_rs_q1 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | ON primarystring2.a=foreignstring440k.a AND primarystring2.a = 'A1000027'; |
| | |
| | SELECT /*! calibrate_spj_rs_cs_q2 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 ON primarystring2.a=foreignstring440k_cs.a AND primarystring2.a = 'A1000027'; |
| | |
| | SELECT /*! calibrate_spj_rs_cs_q3 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs ON primarystring2_cs.a=foreignstring440k.a AND primarystring2_cs.a = 'A1000027'; |
| | |
| | SELECT /*! calibrate_spj_cs_q4 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs ON primarystring2_cs.a=foreignstring440k_cs.a AND primarystring2_cs.a = |
| | 'A1000027'; |
| | |
| Reshuffle and Distributed | SELECT /*! calibrate_reshuff_dgby_rs_q1 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| Group By | COUNT(b.a) FROM disttable2 a STRAIGHT_JOIN primarystring2 b ON a.a=b.a GROUP BY a.b) A; |
| | |
| | SELECT /*! calibrate_reshuff_dgby_rs_cs_q2 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(b.a) FROM disttable2_cs a STRAIGHT_JOIN primarystring2 b ON a.a=b.a GROUP BY a.b) A; |
| | |
| | SELECT /*! calibrate_reshuff_dgby_rs_cs_q3 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(b.a) FROM disttable2 a STRAIGHT_JOIN primarystring2_cs b ON a.a=b.a GROUP BY a.b) A; |
| | |
| | SELECT /*! calibrate_reshuff_dgby_cs_q4 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(b.a) FROM disttable2_cs a STRAIGHT_JOIN primarystring2_cs b ON a.a=b.a GROUP BY a.b) A; |
| | |
| Local Join and Distributed | SELECT /*! calibrate_lj_dgby_rs_q1 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| Group By | COUNT(foreignstring440k.a),primarystring2.id FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k.a GROUP BY |
| | primarystring2.id) A; |
| | |
| | SELECT /*! calibrate_lj_dgby_rs_cs_q2 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2.id FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k_cs.a GROUP |
| | BY primarystring2.id) A; |
| | |
| | SELECT /*! calibrate_lj_dgby_rs_cs_q3 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k.a),primarystring2_cs.id FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k.a |
| | GROUP BY primarystring2_cs.id) A; |
| | |
| | SELECT /*! calibrate_lj_dgby_cs_q4 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2_cs.id FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k_cs.a |
| | GROUP BY primarystring2_cs.id) A; |
+--------------------------------+------------------------------------------------------------------------------------------------------+
Usage:
sdb-report calibrate [flags]
Flags:
--data-path string The absolute path to the folder on the Master Aggregator that contains the calibration datasets
-h, --help Help for calibrate
--partition-ratio RATIO The ratio of CPU cores to database partitions (CPU cores:database partitions) where both values must be less than or equal to 16 (default 2:1)
-p, --password STRING The database user's password for connecting to SingleStoreDB. If a password is specified on the command line, it must not contain an unescaped '$' character as it will be replaced by the shell
--retain-database Retain the 'calibrate' database after the calibration process completes
--temp-dir ABSOLUTE_PATH The directory on the Master Aggregator in which to unpack the dataset (ADVANCED)
-u, --user string The database user for connecting to SingleStoreDB (default "root")
Global Flags:
--backup-cache FILE_PATH File path for the backup cache
--cache-file FILE_PATH File path for the Toolbox node cache
-c, --config FILE_PATH File path for the Toolbox configuration
--disable-colors Disable color output in console, which some terminal sessions/environments may have difficulty with
--disable-spinner Disable the progress spinner, which some terminal sessions/environments may have issues with
-j, --json Enable JSON output
--parallelism POSITIVE_INTEGER Maximum number of operations to run in parallel
--runtime-dir DIRECTORY_PATH Where to store Toolbox runtime data
--ssh-control-persist SECONDS Enable SSH ControlPersist and set it to the specified duration in seconds
--ssh-max-sessions POSITIVE_INTEGER Maximum number of SSH sessions to open per host, must be at least 3
--ssh-strict-host-key-checking Enable strict host key checking for SSH connections
--ssh-user-known-hosts-file FILE_PATH Path to the user known_hosts file for SSH connections. If not set, /dev/null will be used
--state-file FILE_PATH Toolbox state file path
-v, --verbosity count Increase logging verbosity: valid values are 1, 2, 3. Usage -v=count or --verbosity=count
-y, --yes Enable non-interactive mode and assume the user would like to move forward with the proposed actions by default
Remarks
This command is interactive unless you use either the --yes or --json flag to override interactive behavior.
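The description of the -p flag warns that an unescaped '$' in a password is replaced by the shell. The short sketch below shows that behavior with a hypothetical password, 'pa$word', and a variable named "word" that is set to an empty string purely for the demonstration.

```shell
# Hypothetical password 'pa$word', used only to show the quoting behavior.
word=''   # an empty variable named "word", so the expansion is visible

double_quoted=$(printf '%s' "pa$word")    # the shell expands $word: pa
single_quoted=$(printf '%s' 'pa$word')    # kept literally: pa$word
escaped=$(printf '%s' "pa\$word")         # backslash keeps the '$': pa$word

printf '%s\n' "$double_quoted" "$single_quoted" "$escaped"

# Hypothetical invocation with the password passed intact:
#   sdb-report calibrate -u usr -p 'pa$word'
```

Single quotes are usually the simpler choice here, since nothing inside them is interpreted by the shell.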