Outdated Version

You are viewing an older version of this section. View current production version.

Query Errors

ISSUE: Inconsistent Query Run Times

If you receive very different response times when you run the same query (or groups of queries on the same leaf) repeatedly, Transparent Huge Pages (THP) may be the cause.

Linux organizes RAM into pages that are ordinarily 4KB. With THP, Linux can instead use 2MB pages or larger. But THP transparently re-organizes memory used by a process inside the kernel, merging some small pages to huge pages, and splitting some huge pages to small pages. This background process holds locks on the memory manager, and can prevent your process from accessing memory.

You can check for THP by running cat /sys/kernel/mm/*transparent_hugepage/enabled. If the [brackets] are around [always], you should disable THP on all nodes in your cluster. Fully remedying the issue may require restarting MemSQL, or fully restarting all of the servers. Cluster operators should disable THP before launching in production.

Note: Having THP enabled may also be the cause of high system CPU (sometimes called red CPU).

How to Disable THP

MemSQL recommends disabling transparent huge pages (THP) at boot time on all nodes (master aggregator, child aggregators, and leaves) in the cluster. Not doing so may result in inconsistent query run times.

To disable THP, add the following lines to the end of rc.local before the exit line (if present), and reboot the host machines.

For Debian-based distributions, add the following to /etc/rc.d/rc.local:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
echo no > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag

For RedHat/CentOS distributions, add the following to /etc/rc.local:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo 0 > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
echo no > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
Info

The khugepaged/defrag option will be 1 or 0 on newer Linux versions (e.g. CentOS 7+), but yes or no on older versions (e.g. CentOS 6). You can check which your system uses by running cat /sys/kernel/mm/*transparent_hugepage/khugepaged/defrag and keep only the matching setting. For example, if you see 1 or 0 update the line with echo 0. If you see yes or no, update the line with echo no.

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Rowlock timeout errors can occur in an application because of contention with transactions or uncommitted transactions. Refer to the following sections to resolve lock wait timeout errors.

Issue

Updates and deletes in SingleStore are row locking operations. If a row is currently locked by query q1 running in transaction t1, a second query q2 in transaction t2 that operates on the same row will be blocked until q1 completes.

The “Lock wait timeout exceeded; try restarting transaction” error will occur when a query cannot proceed because it is blocked by a rowlock.

Deadlock between Transactions

Typically, a deadlock happens when two or more transactions are writing to the same rows, but in a different order. For example, consider two concurrently running queries q1 and q2 in different transactions, where both q1 and q2 want to write to rows r1 and r2. If the query q1 wants to write to rows r1 and then r2, but the query q2 wants to write to row r2 first and then r1, they will deadlock.

A single transaction that runs multiple statements should not deadlock itself, because queries are executed in the order they are defined.

Solutions

  • To resolve contention when two or more transactions are writing to the same row (in the same order), add indexes on the columns being updated. Adding indexes to the columns ensures that the row locks are taken in the same order that the write statements are specified in each transaction. Note: This solution will not resolve a deadlock (defined previously).

    For example, if the following query generates a lock wait timeout exceeded error:

    DELETE FROM bucket WHERE meta_uid_id = ?
    

    Add an index on meta_uid_id.

    Consider the following INSERT ... ON DUPLICATE KEY UPDATE statement:

    INSERT INTO allotments (DevID, DevName, Orders)
      SELECT * FROM deprAllot
      ON DUPLICATE KEY UPDATE
      DevID = VALUES(DevID),
      Orders =  VALUES (Orders);
    

    To reduce the chance of a lock wait timeout exceeded error, add individual indexes to the DevID and Orders columns.

  • You can decrease the lock_wait_timeout value to more quickly fail a query that is blocked by a lock.

  • To resolve a deadlock (defined above) you should change the flow of your application logic execute queries that write to the same rows in the same order.

  • You can also disable multistatement_transactions.

    Note: Disabling multistatement_transactions to prevent deadlocks is not recommended because it can lead to data loss or other serious issues.

    show variables like '%multistatement_transactions%';
    

    Shows whether multistatement transactions are on, and

    SET GLOBAL multistatement_transactions = off;
    

    turns it off.

    Disabling multistatement_transactions turns off cross-node transactions for write queries. Hence, write queries that span across multiple nodes, commit locally when the processing finishes on the node.

    To persist disabling multi-statement transactions through node restart, use either the update-config or memsql-update-config command on every aggregator node as shown below:

    MemSQL Tools

    memsql-admin update-config --key multistatement_transactions --value off --set-global --memsql-id <aggregator_id>
    

    MemSQL Ops

    memsql-ops memsql-update-config --key multistatement_transactions --value off --set-global <aggregator_id>
    

    By turning off multi-statement transactions, write queries that touch multiple nodes will commit locally when the processing finishes on the node. For example, if node 1 hits a duplicate key error, node 2 will still commit its transaction.

Issue

Open transactions hold the locks on rows affected by the transaction until they are committed or rolled back, and any other write query modifying the same rows has to wait for the open transaction to release the locks. If the query has to wait for more than the lock_wait_timeout (default 60 seconds), it fails. This happens most often when the open transaction is idle and unnecessarily holding the locks.

Solutions

  • To resolve the issue, identify the idle transaction and kill its connection.
  • To prevent the issue from recurring, ensure that all the transactions are committed or rolled back.

Resolving the current locking issue

If a query has failed because it waited long enough to exceed the lock_wait_timeout, identify the transaction that is causing the timeout, and kill its connection. Killing the connection rolls back the uncommitted writes of the open transaction. For example, a write operation is performed in the following transaction, but it is not committed.

START TRANSACTION;

UPDATE cust SET ORDERS = (ORDERS + 1) WHERE ID > 7680;

In another connection start a transaction, and run a query that tries to update the same set of rows as the transaction above.

START TRANSACTION;

UPDATE cust SET ORDER_DATE = DATE_ADD(ORDER_DATE, INTERVAL 1 DAY) WHERE ID > 7680;
****
ERROR 1205 (HY000): Leaf Error (192.168.3.152:3307): Lock wait timeout exceeded; try restarting transaction.  Lock owned by connection id 76, query `open idle transaction`

To get the list of all the running transactions, query the INFORMATION_SCHEMA.PROCESSLIST table. This management view table contains the processlist information for all nodes in the cluster, and you can join this table with the INFORMATION_SCHEMA.MV_NODES table to identify the nodes on which each process is running.

SELECT TYPE, IP_ADDR, PORT, mvp.ID, USER, HOST, COMMAND, TIME, TRANSACTION_STATE, ROW_LOCKS_HELD
FROM INFORMATION_SCHEMA.MV_PROCESSLIST mvp JOIN INFORMATION_SCHEMA.MV_NODES mvn
ON mvp.NODE_ID = mvn.ID
WHERE TRANSACTION_STATE = "open" AND COMMAND = "Sleep" AND TIME > 60
ORDER BY TYPE;
****
+------+---------------+------+-----+-------------+---------------------+---------+------+-------------------+----------------+
| TYPE | IP_ADDR       | PORT | ID  | USER        | HOST                | COMMAND | TIME | TRANSACTION_STATE | ROW_LOCKS_HELD |
+------+---------------+------+-----+-------------+---------------------+---------+------+-------------------+----------------+
| CA   | 192.168.1.146 | 3306 | 143 | root        | localhost:58322     | Sleep   |  385 | open              |              0 |
| LEAF | 192.168.3.152 | 3307 |  76 | distributed | 192.168.1.146:9555  | Sleep   |  385 | open              |           5168 |
| LEAF | 192.168.0.24  | 3307 | 125 | distributed | 192.168.1.146:7371  | Sleep   |  385 | open              |           5047 |
| LEAF | 192.168.0.24  | 3307 | 104 | distributed | 192.168.1.146:60874 | Sleep   |  385 | open              |           5001 |
| LEAF | 192.168.0.24  | 3307 | 115 | distributed | 192.168.1.146:63178 | Sleep   |  385 | open              |           5044 |
| LEAF | 192.168.3.242 | 3307 |  55 | distributed | 192.168.1.146:10118 | Sleep   |  385 | open              |           5129 |
| LEAF | 192.168.3.242 | 3307 |  77 | distributed | 192.168.1.146:21382 | Sleep   |  385 | open              |           5168 |
| LEAF | 192.168.3.242 | 3307 |  81 | distributed | 192.168.1.146:26758 | Sleep   |  385 | open              |           4998 |
| LEAF | 192.168.3.242 | 3307 | 122 | distributed | 192.168.1.146:55942 | Sleep   |  385 | open              |           5021 |
| LEAF | 192.168.0.24  | 3307 |  75 | distributed | 192.168.1.146:40650 | Sleep   |  385 | open              |           5015 |
| LEAF | 192.168.1.100 | 3307 | 131 | distributed | 192.168.1.146:23998 | Sleep   |  385 | open              |           5013 |
| LEAF | 192.168.3.152 | 3307 |  78 | distributed | 192.168.1.146:11603 | Sleep   |  385 | open              |           5124 |
| LEAF | 192.168.3.152 | 3307 |  83 | distributed | 192.168.1.146:21331 | Sleep   |  385 | open              |           5018 |
| LEAF | 192.168.3.152 | 3307 | 120 | distributed | 192.168.1.146:39763 | Sleep   |  385 | open              |           5092 |
| LEAF | 192.168.1.100 | 3307 |  79 | distributed | 192.168.1.146:1726  | Sleep   |  385 | open              |           5090 |
| LEAF | 192.168.1.100 | 3307 |  82 | distributed | 192.168.1.146:6078  | Sleep   |  385 | open              |           5065 |
| LEAF | 192.168.1.100 | 3307 | 114 | distributed | 192.168.1.146:21438 | Sleep   |  385 | open              |           5160 |
| MA   | 192.168.0.231 | 3306 | 490 | root        | localhost:32737     | Sleep   |  73  | open              |              0 |
+------+---------------+------+-----+-------------+---------------------+---------+------+-------------------+----------------+

In the result, each aggregator and each partition on each leaf has one idle open transaction which is holding row locks. The locks are only held on the leaves, because that is where the data is stored. Look for an aggregator process with the same run time as the leaf processes that are holding locks. In the result above, it is the process with connection ID 143, which is running on the child aggregator on host 192.168.1.146:3306. To resolve the current issue, connect to this aggregator and kill the connection with ID 143.

KILL CONNECTION 143;

Alternatively, you can kill the open connections that have been idle for more than lock_wait_timeout.

Note: Although the error message indicates that the lock is owned by the connection with ID 76, it is not the connection that needs to be killed. It is the connection for one of the distributed leaf processes, which holds the lock that our query was waiting on. Instead, we need to kill the connection for the aggregator process, which will roll back the entire transaction.

The original connection on the child aggregator that was updating ORDERS is now closed, and the transaction is rolled back. If you try to use that session again, it automatically re-establishes a new connection with a new ID.

SELECT COUNT(*) FROM cust;
****
ERROR 2013 (HY000): Lost connection to MySQL server during query
SELECT COUNT(*) FROM cust;
****
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    160
Current database: db1

+----------+
| COUNT(*) |
+----------+
|   390955 |
+----------+

Preventing the Issue from Recurring

To prevent the issue from recurring, inspect the application code and ensure that all the transactions are either committed or rolled back, including the transactions in exception handling. Additionally, all manually run transactions must be committed or rolled back as well.

A common cause of this error is if the transaction is executed as part of a stored procedure, but exception handling causes the final commit to get skipped. To prevent this issue, add rollbacks for the transaction in the exception handling section of the stored procedure. For more information, see Transactions in Stored Procedures.

The impact of this error can be mitigated by setting a session timeout for the client, in order to automatically kill the idle transactions. When a client ends a session, the database rolls back the uncommitted transactions and closes the connection on its end. If the session timeout value is set lower than the lock wait timeout (default 60 seconds), then idle connections will not cause lock wait timeouts to be breached.

Note: Setting a query timeout in Resource Governance will not affect idle transactions. The query timeout limit only applies to executing queries.

ERROR 1706 (HY000): Feature ‘Multi-table UPDATE/DELETE with a reference table as target table’ is not supported by MemSQL

Issue

Updating a reference table based on a join with a sharded table returns an error. This is because reference tables are only updatable on the master aggregator while joins against sharded tables must run on the leaves, so the query cannot run like normal updates.

Solutions

  • Use the same table types that use both the target as well as joined table as sharded tables or as reference tables.

  • Use UPSERT with a reference table that has a primary key. For example, if t1 is a reference table with a primary key, then the following query runs successfully:

    INSERT INTO t1 SELECT t1.* FROM t1 JOIN t2 ON t1.b = t2.b ON DUPLICATE KEY UPDATE t1.b.
    

    For more details, refer to the UPDATE topic.

Warning
  • Turning off multi-statement transactions makes per-partition components of a distributed transaction commit independently without waiting for other partitions to finish and potentially trigger rollback. This removes the common row lock deadlock (transaction A waiting for transaction B on partition 1, but B waiting for A on partition 2) at the cost of potentially letting failed queries commit partial results.
  • If the values inserted are dependent on multiple statements executing concurrently, then inconsistent data may get inserted.
  • In case of failure, it is not possible to rollback the entire multi-upsert inside one transaction.

ERROR 1706 (HY000): Leaf Error (127.0.0.1:3307): Feature ‘INSERT IGNORE … ON DUPLICATE KEY UPDATE’ is not supported by MemSQL

Issue

When you use the IGNORE modifier in a INSERT statement, MemSQL ignores records with duplicate key entries and only inserts records with unique key values. The ON DUPLICATE KEY UPDATE clause, instead of ignoring the insert, performs an update on the row with the duplicate key value. Using both of them in same query leads to a dilemma: whether to ignore the insert or update the record. To prevent any ambiguity in data, MemSQL returns an error, instead of choosing one operation over another.

Solution

Either use the IGNORE modifier or ON DUPLICATE KEY UPDATE clause in INSERT statements, but not both. For more information, see INSERT