TiDB sharding is a complex beast, and understanding how it handles data distribution across nodes is key to performance. The surprising truth is that TiDB doesn’t actually shard data in the traditional sense; instead, it manages data in "Regions" and relies on a component called the Placement Driver (PD) to orchestrate their placement and movement.

Let’s see this in action. Imagine you have a simple table:

CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    email VARCHAR(100)
);

When you insert data into this table, TiDB doesn’t just dump it all onto one TiKV node. Instead, it divides the data into Regions. A Region is the basic unit of data storage and management in TiKV. Each Region typically contains a contiguous range of keys for a specific table.

Here’s a simplified view of what might be happening under the hood as data is inserted:

  • Initial Region Creation: When the users table is created, PD might create an initial Region for it. As data is inserted, this Region will grow.
  • Region Splitting: Once a Region reaches a certain size (e.g., 96 MiB by default), TiKV automatically splits it into two smaller Regions. This is a crucial mechanism for distributing data and preventing any single Region from becoming a bottleneck. For example, if your users table has IDs 1-1000, they might initially be in one Region. As you insert more users, say up to ID 5000, the Region containing IDs 1-5000 might split into two: one for 1-2500 and another for 2501-5000.
  • Region Distribution: PD is responsible for ensuring these Regions are spread across your TiKV nodes. It monitors the load and storage on each TiKV instance and decides where new Regions should be created or existing ones should be moved.

The Placement Driver (PD) is the brain of this operation. It’s a distributed service that manages metadata for the entire TiDB cluster, including the locations of Regions. PD doesn’t store data itself; it just knows where the data is.

The PD Scheduler is a set of components within PD that actively manages Region placement. It has several sub-schedulers:

  • RegionSplitScheduler: This scheduler monitors Region sizes. When a Region exceeds the configured max-region-size (default 96 MiB), it triggers a split.
  • RegionBalanceScheduler: This is the workhorse for load balancing. It monitors the storage usage and read/write load across TiKV nodes. If one TiKV node has significantly more data or traffic than others, this scheduler will orchestrate the movement of Regions to more evenly distribute the load.
  • LeaderBalanceScheduler: Each Region has a leader, which handles all read and write requests for that Region. This scheduler ensures that Region leaders are also balanced across TiKV nodes, preventing any single node from being overwhelmed by leadership duties.

Let’s look at a configuration example for PD schedulers. You can adjust their behavior via PD’s API or configuration files. For instance, to adjust the max-region-size that triggers a split, you’d typically interact with PD’s configuration.

{
  "schedule-config": {
    "max-region-size": 128, // In MiB
    "split-threshold": "100%", // Trigger split when region reaches max-region-size
    "region-balance-skip-limit": "100MB", // Don't balance regions smaller than this
    "leader-balance-threshold": "1", // Balance leaders if node has 1 more leader than average
    // ... other scheduler configurations
  }
}

In this snippet, we’ve increased max-region-size to 128 MiB. This means Regions will grow larger before splitting, potentially reducing the number of Regions in the cluster but also increasing the chance of a single Region becoming a hot spot if it’s heavily accessed. The region-balance-skip-limit of 100MB means that Regions smaller than 100MB won’t be considered for balancing by the RegionBalanceScheduler, which can prevent excessive small Region movements.

The crucial insight is that TiDB’s "sharding" is an emergent property of Region management and PD’s scheduling. It’s not about explicitly defining shard keys or ranges in your application logic. Instead, you focus on your schema and data access patterns, and PD works to distribute the underlying Regions efficiently.

A common misconception is that PD directly controls TiKV nodes. In reality, PD is a control plane. TiKV nodes are the data plane. When PD decides a Region needs to be moved, it sends instructions to the relevant TiKV nodes, and the TiKV nodes themselves execute the data migration process. This involves copying data from the source TiKV to the target TiKV and then updating PD with the new Region location.

The most subtle aspect of PD scheduling is its reactive nature. Schedulers don’t predict future load; they respond to the current state reported by TiKV. This means that during sudden traffic spikes, there can be a lag before the schedulers can rebalance effectively, leading to temporary performance degradation on hot spots.

The next logical step after understanding Region management is to explore how TiDB handles hot spot mitigation and query routing based on Region leaders.

Want structured learning?

Take the full Sharding course →