Consistent hashing is a surprisingly effective way to distribute data across a cluster without needing to rebalance everything when nodes join or leave.

Let’s see it in action. Imagine we have a simple key-value store, and we want to shard our data across three nodes: node-1, node-2, and node-3. We’ll use Python for a quick demonstration.

import hashlib
import bisect

class ConsistentHashRing:
    def __init__(self, nodes=None, replicas=100):
        self.replicas = replicas
        self.ring = dict()
        self.sorted_keys = []
        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(str(key).encode('utf-8')).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            key = self._hash(f"{node}-{i}")
            self.ring[key] = node
            bisect.insort(self.sorted_keys, key)

    def remove_node(self, node):
        for i in range(self.replicas):
            key = self._hash(f"{node}-{i}")
            del self.ring[key]
            self.sorted_keys.remove(key)

    def get_node(self, key):
        if not self.ring:
            return None
        hash_key = self._hash(key)
        # Find the first key in the ring that is >= hash_key
        index = bisect.bisect_left(self.sorted_keys, hash_key)
        if index == len(self.sorted_keys):
            index = 0 # Wrap around
        return self.ring[self.sorted_keys[index]]

# --- Demonstration ---
nodes = ['node-1', 'node-2', 'node-3']
hash_ring = ConsistentHashRing(nodes, replicas=100)

print("Initial distribution:")
data_keys = ['user:1001', 'product:500', 'order:abc', 'session:xyz', 'cache:123']
for key in data_keys:
    print(f"Key '{key}' maps to: {hash_ring.get_node(key)}")

print("\nAdding node-4:")
hash_ring.add_node('node-4')
for key in data_keys:
    print(f"Key '{key}' maps to: {hash_ring.get_node(key)}")

print("\nRemoving node-2:")
hash_ring.remove_node('node-2')
for key in data_keys:
    print(f"Key '{key}' maps to: {hash_ring.get_node(key)}")

The core problem consistent hashing solves is data movement. In traditional sharding (like modulo hashing: hash(key) % num_nodes), adding or removing a single node forces you to rehash and move roughly 50% of your data. This is a massive operation for large datasets. Consistent hashing minimizes this by only reassigning data from the node that was removed (or to the node that was added).

Internally, it works by mapping both nodes and data keys onto a conceptual ring (often represented by a sorted list of hash values). When you need to find which node a key belongs to, you hash the key and then find the next node clockwise on the ring. Adding a node means inserting its virtual replicas onto the ring; data that previously mapped to the node after the new node will now map to the new node. Removing a node simply means deleting its virtual replicas; data that mapped to the removed node will now map to the next node clockwise. The replicas parameter (often called virtual nodes) is crucial: it ensures that even with a small number of physical nodes, the load is distributed relatively evenly across the ring. More replicas mean better distribution but also more overhead.

The bisect module in Python is key here. bisect.bisect_left efficiently finds the insertion point for a hash value in our sorted list of node hash keys, giving us the index of the next node on the ring. This logarithmic time complexity for lookups is what makes the system performant.

The trick to minimizing data movement isn’t just the ring itself, but how you place the virtual nodes. Each physical node is represented by many virtual nodes scattered across the hash ring. When a physical node is added, its virtual nodes are inserted. When a physical node is removed, its virtual nodes are deleted. The data keys that were assigned to the removed physical node are now reassigned to the next available virtual node on the ring, which belongs to a different physical node. Because the virtual nodes are spread out, a single physical node’s departure only affects the keys that happened to hash between its virtual nodes and the next ones in sequence, which is a much smaller fraction of the total data than a simple modulo operation would imply.

The next challenge is handling hot spots, where a small subset of keys receives a disproportionate amount of traffic, even with a good hashing strategy.

Want structured learning?

Take the full Sharding course →