Outdated Version

You are viewing an older version of this section. View current production version.

Understanding Orphaned Partitions min read


Orphaned partitions are partition databases present on leaf nodes that the master aggregator node doesn’t consider part of the cluster (they’re neither master nor replica partition databases). This means the master aggregator node will drop this database if it ever gets in the way of something it wants to do.

Why is it Important to Know About Orphan Partitions?

It is possible in some crash scenarios to have orphaned partitions that are the last valid copies of a partition. In this case run ATTACH PARTITION to attach the orphan back into the cluster right away. Do not run high level commands (REBALANCE PARTITIONS, RESTORE REDUNDANCY, DETACH LEAF, etc.). High level commands may drop the orphan.

Example:

SHOW CLUSTER STATUS;
| Host      | Port  | Database | Role      | State       | Position | Details |
+-----------+-------+----------+-----------+-------------+----------+---------+
| 127.0.0.1 | 10001 | x_test   | Reference | replicating | 0:57     |         |
| 127.0.0.1 | 10001 | x_test_0 | Master    | online      | 0:10     |         |
| 127.0.0.1 | 10001 | x_test_1 | Slave     | replicating | 0:10     |         |
| 127.0.0.1 | 10002 | x_test   | Reference | replicating | 0:57     |         |
| 127.0.0.1 | 10002 | x_test_0 | Slave     | replicating | 0:10     |         |
| 127.0.0.1 | 10002 | x_test_1 | Orphan    | online      | 0:10     |         |
SHOW PARTITIONS;
+---------+-----------+-------+--------+--------+
| Ordinal | Host      | Port  | Role   | Locked |
+---------+-----------+-------+--------+--------+
| 0       | 127.0.0.1 | 10001 | Master | 0      |
| 0       | 127.0.0.1 | 10002 | Slave  | 0      |
| 1       | 127.0.0.1 | 10001 | Slave  | 0      |
+---------+-----------+-------+--------+--------+

We are missing a master for partition 1, but we have an online orphan for partition 1. That is the master we need, so attach it:

ATTACH PARTITION x_test:1 on '127.0.0.1':10002;
SHOW CLUSTER STATUS;
+-----------+-------+----------+-----------+-------------+----------+---------+
| Host      | Port  | Database | Role      | State       | Position | Details |
+-----------+-------+----------+-----------+-------------+----------+---------+
| 127.0.0.1 | 10001 | x_test   | Reference | replicating | 0:57     |         |
| 127.0.0.1 | 10001 | x_test_0 | Master    | online      | 0:10     |         |
| 127.0.0.1 | 10001 | x_test_1 | Slave     | replicating | 0:10     |         |
| 127.0.0.1 | 10002 | x_test   | Reference | replicating | 0:57     |         |
| 127.0.0.1 | 10002 | x_test_0 | Slave     | replicating | 0:10     |         |
| 127.0.0.1 | 10002 | x_test_1 | Master    | online      | 0:10     |         |
+-----------+-------+----------+-----------+-------------+----------+---------+
SHOW PARTITIONS;
+---------+-----------+-------+--------+--------+
| Ordinal | Host      | Port  | Role   | Locked |
+---------+-----------+-------+--------+--------+
| 0       | 127.0.0.1 | 10001 | Master | 0      |
| 0       | 127.0.0.1 | 10002 | Slave  | 0      |
| 1       | 127.0.0.1 | 10001 | Slave  | 0      |
| 1       | 127.0.0.1 | 10002 | Master | 0      |
+---------+-----------+-------+--------+--------+

If your cluster has many partitions, you may want to count its partitions by role to help identify any orphaned partitions.

Why Do Orphan Partitions Exist?

The master aggregator avoids dropping partition databases that it doesn’t know the origins of just in case this data is needed. This is a safe guard against data loss due to a bug or some other strange configuration issue that caused the master aggregators view of the world to become incorrect and the orphan contains actual data (this is very rare, but it does happen). The easiest way to create orphans is to shutdown a leaf and then have REBALANCE PARTITIONS move partitions away from the pair with the offline leaf. When the leaf comes back online it will have orphans (in this case the orphans are actually just stale data).

To remove orphans you’re sure you don’t need, run:

EXPLAIN CLEAR ORPHAN DATABASES;

to see which partitions will be dropped, followed by:

CLEAR ORPHAN DATABASES;

Make sure to have run SHOW PARTITIONS and SHOW CLUSTER STATUS and have verified that all partitions are accounted for before doing this. CLEAR ORPHAN DATABASES can’t be undone.