Directory-based sharding is the way most distributed databases achieve fast, stateless routing to the correct shard.

Let’s imagine we have a database with users, and we want to shard it by user_id. We have 100 shards, and we want to distribute users evenly. A naive approach might be to hash user_id and modulo by 100. This works fine until we need to add or remove shards. If we add a shard, all the hashes change, and we have to move almost all the data. This is a massive, disruptive operation.

Directory-based sharding solves this by introducing a lookup table that maps a shard key (or a range of shard keys) to a specific shard. This lookup table is itself distributed and replicated for availability and performance.

Here’s how it works in practice with a hypothetical distributed key-value store.

System Setup:

  • Shards: 3 physical shards (Shard 0, Shard 1, Shard 2)
  • Shard Key: user_id
  • Routing Logic: A client (or a routing layer) needs to determine which shard holds the data for user_id = 12345.

The Lookup Table:

The lookup table might look something like this:

Shard Key Range (Min user_id) Max user_id Target Shard
0 33333 Shard 0
33334 66666 Shard 1
66667 99999 Shard 2

This table is crucial. When a client wants to read or write data for user_id = 12345, it consults this lookup table. It finds the row where 12345 falls between Min user_id and Max user_id, which in this case is the first row. The table then tells the client to direct the request to Shard 0.

Adding a Shard (Rebalancing):

The beauty of this is in rebalancing. Suppose we want to add Shard 3. We don’t need to re-hash everything. We can simply update the lookup table.

  1. Create Shard 3: Provision a new physical shard.

  2. Adjust Ranges: We might decide to split Shard 2’s range. The new table could look like this:

    Shard Key Range (Min user_id) Max user_id Target Shard
    0 33333 Shard 0
    33334 66666 Shard 1
    66667 80000 Shard 2
    80001 99999 Shard 3
  3. Data Migration: Now, and only now, we start moving data. We’d instruct Shard 2 to start migrating users with user_ids from 80001 to 99999 over to Shard 3. During this migration, the lookup table still points to Shard 2 for this range. Once the data is copied, and we’ve verified consistency, we can update the lookup table again to point the new range to Shard 3. The old Shard 2 can then be decommissioned or repurposed.

This tiered approach allows for flexible scaling without massive data reshuffling for every topology change. The lookup table itself is managed by a coordination service (like ZooKeeper or etcd) or is a part of the database’s metadata management.

Internal Mechanics:

The lookup table isn’t just a static file. It’s a dynamic, often replicated data structure. When a request arrives at a routing layer or a client library, it performs a lookup:

// Pseudocode for client-side routing
function route_request(user_id, operation) {
  lookup_table = get_current_lookup_table(); // Fetches from a metadata service
  for (entry in lookup_table) {
    if (user_id >= entry.min_id && user_id <= entry.max_id) {
      target_shard = entry.shard_address;
      break;
    }
  }
  send_to_shard(target_shard, user_id, operation);
}

The get_current_lookup_table() operation is critical. It needs to be fast and consistent. Many systems cache this table locally and subscribe to updates from the metadata service to avoid a network round trip for every query. Cache invalidation is key here; if the table changes, the cache must be refreshed promptly.

The ranges are typically contiguous and non-overlapping, covering the entire possible key space. The max_id is often set to the maximum possible value for the data type of the shard key, or a very large number, to ensure the last range covers all remaining keys.

The most surprising thing about directory-based sharding is how the ranges are often not determined by a simple hash function but by a flexible, configurable policy that allows for manual intervention and gradual rebalancing. This means the shard boundaries can be arbitrary, chosen to balance data volume, access patterns, or even administrative convenience, rather than being dictated by the deterministic output of a hash algorithm. For example, a popular user ID range might be assigned to a dedicated shard with more resources, even if it violates a purely numerical split.

The next thing you’ll grapple with is how to handle cross-shard transactions or queries that need to aggregate data from multiple shards.

Want structured learning?

Take the full Sharding course →