The most surprising thing about re-sharding is that it’s not about copying a dataset from one node to another, but about moving ownership of hash slots, with the keys following their slot.

Let’s look at a simple Redis cluster. We have three master nodes, each responsible for a range of hash slots.

127.0.0.1:7000 master
127.0.0.1:7001 master
127.0.0.1:7002 master

Let’s say node 7000 owns slots 0-5460, 7001 owns 5461-10922, and 7002 owns 10923-16383. When you SET mykey value, Redis calculates CRC16("mykey") % 16384. If that number falls into 7000’s range, the key goes there. Simple.
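To make the slot calculation concrete, here is a minimal Python sketch. It assumes the CRC16 variant named in the Redis Cluster specification (XMODEM: polynomial 0x1021, initial value 0) and ignores hash tags. The function names are illustrative, not part of any Redis client.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots (hash tags are ignored here)."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


def owner_of(slot: int) -> int:
    """Port of the node that owns the slot in the three-node layout above."""
    if slot <= 5460:
        return 7000
    if slot <= 10922:
        return 7001
    return 7002


slot = key_slot("mykey")
print(f"mykey -> slot {slot} -> node {owner_of(slot)}")
```

On a live cluster, CLUSTER KEYSLOT mykey returns the slot Redis itself computes, which is a quick way to check a sketch like this.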

Now, imagine you need to add a new node, 7003, to handle more load. You bring it up and join it to the cluster (for example with redis-cli --cluster add-node). The other nodes now know about it, but it is empty and doesn’t own any slots yet.

127.0.0.1:7000 master (0-5460)
127.0.0.1:7001 master (5461-10922)
127.0.0.1:7002 master (10923-16383)
127.0.0.1:7003 master (no slots)

To rebalance, we don’t just copy data. We tell Redis to transfer ownership of slots. We might tell 7000 to give up slots 0-2000 to 7003.
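As a rough illustration of what "transferring ownership" means, the sketch below models the slot map as inclusive ranges per node and hands slots 0-2000 from 7000 to 7003. The data structure and the transfer helper are hypothetical; Redis keeps its own internal slot table, and this only mirrors the bookkeeping.

```python
# Slot ownership as inclusive (first, last) ranges, keyed by node port.
slot_ranges = {
    7000: [(0, 5460)],
    7001: [(5461, 10922)],
    7002: [(10923, 16383)],
    7003: [],
}


def transfer(ranges, source, target, first, last):
    """Return a new map in which slots first..last belong to target.

    Assumes the whole range lies inside one range owned by source.
    """
    updated = {node: list(spans) for node, spans in ranges.items()}
    for index, (lo, hi) in enumerate(updated[source]):
        if lo <= first and last <= hi:
            pieces = []
            if lo < first:
                pieces.append((lo, first - 1))
            if last < hi:
                pieces.append((last + 1, hi))
            updated[source][index:index + 1] = pieces
            updated[target].append((first, last))
            return updated
    raise ValueError("source does not own the whole range")


def slot_count(spans):
    """Number of slots covered by a list of inclusive ranges."""
    return sum(hi - lo + 1 for lo, hi in spans)


after = transfer(slot_ranges, 7000, 7003, 0, 2000)
for node, spans in after.items():
    print(node, spans, slot_count(spans))
```

No data is touched here: only the mapping from slot to node changes, which is the point of the slot abstraction.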

This is where the magic happens. Redis doesn’t move the data in one step. Instead, each slot being transferred is put into a transitional state: "migrating" on 7000 and "importing" on 7003. Until the transfer is finalized, 7000 remains the owner of the slot.

On 7000:

  • If a client sends a command for a key in a migrating slot and the key is still on 7000, 7000 serves it as usual.
  • If the key is not there (it has already been moved, or it never existed), 7000 does not forward anything. It replies to the client with an ASK redirection that names 7003.
  • Crucially, ASK is a one-off redirection. The client retries that single command on 7003 but leaves its slot map unchanged, so its next request for the slot still goes to 7000 first.

On 7003:

  • 7003 accepts a command for an importing slot only if the client sends ASKING immediately before it.
  • If the client did send ASKING, 7003 executes the command and replies directly to the client.
  • If it didn’t, 7003 replies with a MOVED redirection pointing back to 7000, because 7000 is still the slot’s owner. The net effect is that new keys are created only on 7003, while existing keys keep being served by 7000 until they are moved.
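The decision each node makes while a slot is in transition can be sketched as a small in-memory model. This follows the redirection rules in the Redis Cluster specification (ASK from the node a slot is migrating from, MOVED from the importing node unless the client sent ASKING), but the Node class, handle_get, and the reply strings are simplifications invented for illustration. The slot is passed in explicitly instead of being hashed from the key.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    address: str
    keys: dict = field(default_factory=dict)
    owned: set = field(default_factory=set)        # slots this node owns
    migrating: dict = field(default_factory=dict)  # slot -> target address
    importing: dict = field(default_factory=dict)  # slot -> source address


def handle_get(node: Node, slot: int, key: str, asking: bool = False) -> str:
    """Return a simplified reply for GET key, where key hashes to slot."""
    if slot in node.owned:
        if key in node.keys:
            return f"VALUE {node.keys[key]}"
        if slot in node.migrating:
            # Key is absent and the slot is leaving: try the target once.
            return f"ASK {slot} {node.migrating[slot]}"
        return "NIL"
    if slot in node.importing:
        if asking:
            return f"VALUE {node.keys[key]}" if key in node.keys else "NIL"
        # Without ASKING, the source is still the official owner.
        return f"MOVED {slot} {node.importing[slot]}"
    raise ValueError("slot not modelled in this sketch")


source = Node("127.0.0.1:7000", keys={"a": "1"}, owned={42},
              migrating={42: "127.0.0.1:7003"})
target = Node("127.0.0.1:7003", keys={"b": "2"},
              importing={42: "127.0.0.1:7000"})

print(handle_get(source, 42, "a"))               # still on the source
print(handle_get(source, 42, "b"))               # already moved
print(handle_get(target, 42, "b"))               # no ASKING sent
print(handle_get(target, 42, "b", asking=True))  # ASKING sent first
```

Neither node ever talks to the other on the client's behalf. The client does the extra round trip, which keeps the nodes simple.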

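Meanwhile, the existing keys in those slots are moved from 7000 to 7003 with the MIGRATE command, usually in batches listed by CLUSTER GETKEYSINSLOT. This is not replication: 7003 never becomes a replica of 7000. Each MIGRATE call transfers its keys atomically and deletes them from 7000 once 7003 has confirmed them, so a key lives on exactly one of the two nodes at any time.

Once a slot has no keys left on 7000, the transfer is finalized with CLUSTER SETSLOT <slot> NODE <target_node_id>, which clears the migrating and importing states and assigns the slot to 7003. 7000 stops serving the slot and answers any further request for it with a MOVED redirection. Clients update their slot maps, and their requests then go straight to 7003.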

This "migrating" and "importing" handshake allows clients to continue reading and writing data without interruption. The actual data movement happens in small batches, typically driven by redis-cli --cluster reshard or a similar tool, while the cluster remains available. The key is that each node answers for the keys it actually holds and redirects clients for the rest, based on the slot’s state.

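The real power comes from using the CLUSTER command for this. For each slot, you run CLUSTER SETSLOT <slot> IMPORTING <source_node_id> on the receiving node and CLUSTER SETSLOT <slot> MIGRATING <target_node_id> on the source node, move the keys with CLUSTER GETKEYSINSLOT and MIGRATE, and finish with CLUSTER SETSLOT <slot> NODE <target_node_id>. CLUSTER ADDSLOTS and CLUSTER DELSLOTS only assign or unassign slots without touching data, so they are for bootstrapping a cluster, not for moving slots that hold keys. In practice, redis-cli --cluster reshard runs the whole sequence for you.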
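The per-slot sequence can be written down as a plan of commands. The sketch below only builds that plan as data and sends nothing. The function name, the "<id-...>" node IDs, and the assumption that the keys in the slot are already known (in a real run they would come from CLUSTER GETKEYSINSLOT on the source) are all illustrative. Sending the final SETSLOT NODE to the target before the source is one reasonable ordering, not the only one.

```python
from typing import List, Tuple

# (address of the node to send to, command words)
Command = Tuple[str, List[str]]


def reshard_slot_plan(slot: int, source_addr: str, source_id: str,
                      target_addr: str, target_id: str,
                      keys: List[str], timeout_ms: int = 5000) -> List[Command]:
    """Build the commands that move one slot from source to target."""
    host, port = target_addr.rsplit(":", 1)
    plan: List[Command] = [
        # 1. Target starts accepting ASKING-prefixed commands for the slot.
        (target_addr, ["CLUSTER", "SETSLOT", str(slot), "IMPORTING", source_id]),
        # 2. Source starts answering ASK for keys it no longer holds.
        (source_addr, ["CLUSTER", "SETSLOT", str(slot), "MIGRATING", target_id]),
    ]
    if keys:
        # 3. Move the keys; the empty string selects the multi-key KEYS form.
        plan.append((source_addr, ["MIGRATE", host, port, "", "0",
                                   str(timeout_ms), "KEYS", *keys]))
    # 4. Finalize ownership on both nodes.
    for addr in (target_addr, source_addr):
        plan.append((addr, ["CLUSTER", "SETSLOT", str(slot), "NODE", target_id]))
    return plan


plan = reshard_slot_plan(42, "127.0.0.1:7000", "<id-7000>",
                         "127.0.0.1:7003", "<id-7003>", ["a", "b"])
for address, words in plan:
    print(address, " ".join(words))
```

A real slot with many keys would repeat step 3 in batches until CLUSTER GETKEYSINSLOT returns nothing.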

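The crucial part is that your client library must be cluster-aware: it has to follow MOVED redirections (and update its slot map) and ASK redirections (and send ASKING before retrying). A client that doesn’t support cluster mode treats these redirections as plain errors, so re-sharding will break your application.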
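What "handling MOVED and ASK" amounts to on the client side can be sketched in a few lines. The snippet assumes the error text arrives as "MOVED <slot> <host>:<port>" or "ASK <slot> <host>:<port>" with the leading "-" of the wire protocol already stripped. Redirect, parse_redirect, and apply_redirect are hypothetical names, not the API of any particular client library.

```python
from typing import NamedTuple, Optional, Tuple


class Redirect(NamedTuple):
    kind: str  # "MOVED" or "ASK"
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Parse a redirection error, or return None for any other error."""
    parts = error.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))


def apply_redirect(slot_map: dict, redirect: Redirect) -> Tuple[tuple, bool]:
    """Return (address to retry on, whether to send ASKING first)."""
    address = (redirect.host, redirect.port)
    if redirect.kind == "MOVED":
        # Permanent: remember the new owner for all future requests.
        slot_map[redirect.slot] = address
        return address, False
    # ASK: retry this one command only and leave the slot map alone.
    return address, True


slot_map = {42: ("127.0.0.1", 7000)}
ask = parse_redirect("ASK 42 127.0.0.1:7003")
print(apply_redirect(slot_map, ask), slot_map[42])
moved = parse_redirect("MOVED 42 127.0.0.1:7003")
print(apply_redirect(slot_map, moved), slot_map[42])
```

The difference in the two branches is the whole contract: MOVED changes what the client believes about the cluster, ASK does not.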

The next thing you’ll likely wrestle with is how to manage slot distribution for optimal performance across nodes, not just availability.
