The most surprising thing about MongoDB sharding is that it doesn’t actually do the sharding itself; it just orchestrates it.
Let’s watch it in action. Imagine we have a users collection that’s getting too big for a single server. We want to shard it across two shards, shard1 and shard2.
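First, we need our config servers. These are the brains of the operation, holding the metadata about which data lives on which shard. Let’s say we have three config servers running on cfg1:27019, cfg2:27019, and cfg3:27019. Since MongoDB 3.4 the config servers must be deployed as a replica set, so each one is a mongod started with --configsvr and --replSet; the set name cfg-rs used below is only an example. With the config servers up, we start a mongos and point it at that replica set:
mongos --configdb cfg-rs/cfg1:27019,cfg2:27019,cfg3:27019 --bind_ip 0.0.0.0 --port 27018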
This mongos instance is our router. It’s the entry point for all application traffic. When an application queries for user data, it hits the mongos. The mongos uses the chunk metadata from the config servers (which it caches locally) to figure out which shard has the data and forwards the request there. Writes are routed the same way.
Now, our actual data lives on our shards. We’ll have at least two mongod instances acting as shards. Let’s say shard1 is s1:27017 and shard2 is s2:27017.
Each shard needs to know it’s part of a sharded cluster, so we start each mongod with the --shardsvr flag. Shards are not given the config server addresses on the command line (--configdb is a mongos option, not a mongod one); they are registered with the cluster later, through a mongos. On host s1:
mongod --shardsvr --replSet shard1-rs --bind_ip 0.0.0.0 --port 27017
And on host s2:
mongod --shardsvr --replSet shard2-rs --bind_ip 0.0.0.0 --port 27017
Notice the --replSet flag. Shards are deployed as replica sets for high availability, and since MongoDB 3.6 this is a requirement rather than a recommendation. So shard1-rs might have members s1a:27017, s1b:27017, and s1c:27017, and shard2-rs would have its own set of replicas.
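Each replica set has to be initiated once before it can act as a shard, and rs.initiate() takes a configuration document listing the members. Here is a minimal sketch of building that document in plain JavaScript. The helper name replicaSetConfig is hypothetical, and the host names are the example members mentioned above.

```javascript
// Build a replica set configuration document of the shape rs.initiate() expects:
// { _id: <set name>, members: [{ _id: <index>, host: "<host:port>" }, ...] }
// The helper name is made up for this illustration.
function replicaSetConfig(name, hosts) {
  return {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

// Example members for shard1-rs, taken from the text above.
const shard1Config = replicaSetConfig("shard1-rs", [
  "s1a:27017",
  "s1b:27017",
  "s1c:27017",
]);

console.log(JSON.stringify(shard1Config, null, 2));
```

In mongosh, connected to one of the members, you would pass a document like this to rs.initiate().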
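Once everything is running and each replica set has been initiated, we connect to one of the mongos routers, register the shards with the cluster, and tell MongoDB to shard our users collection:
// Connect to mongos (e.g., mongosh --port 27018)
sh.addShard("shard1-rs/s1:27017")  // register each shard replica set with the cluster
sh.addShard("shard2-rs/s2:27017")
sh.enableSharding("mydatabase")  // explicit call is required before MongoDB 6.0
sh.shardCollection("mydatabase.users", { "userId": 1 })  // ranged shard key on userId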
This is where the shard key comes in. We chose userId as the shard key, and MongoDB uses this field to distribute documents across the shards. Distribution is chunk-based: each chunk covers a contiguous range of shard key values. Initially, all documents are in one chunk. As data grows, MongoDB splits chunks. Because we declared the key as { "userId": 1 }, this collection uses ranged sharding, so a document’s chunk is determined by where its userId value falls. Had we declared it as { "userId": "hashed" }, MongoDB would instead hash each userId and assign the document to the chunk covering that hash value.
With our ranged key, when a write for userId 12345 comes in, the mongos sees that 12345 falls in a chunk residing on shard1 and sends the write there. A userId of 67890 falls in a different chunk, so it might go to shard2. With a hashed key the routing works the same way, except that the mongos compares the hash of the userId against the chunk ranges.
The config servers maintain the mapping of these chunks to shards. They know, for example, that the chunk containing userId values from 0 to 50000 is on shard1, and the chunk for 50001 to 100000 is on shard2.
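To make the chunk lookup concrete, here is a small sketch of routing on a ranged key. It is a toy model, not MongoDB’s internal code: the chunk table and the findShardForKey helper are hypothetical, the boundaries are rounded from the example above, and each chunk is treated as a half-open range (lower bound included, upper bound excluded).

```javascript
// Hypothetical chunk table, similar to what the config servers store.
// Each chunk covers [min, max): min is inclusive, max is exclusive.
const chunks = [
  { min: 0, max: 50000, shard: "shard1" },
  { min: 50000, max: 100000, shard: "shard2" },
];

// Return the shard owning the chunk that contains userId,
// or null if no chunk in this toy table covers the value.
function findShardForKey(chunkTable, userId) {
  const chunk = chunkTable.find((c) => userId >= c.min && userId < c.max);
  return chunk ? chunk.shard : null;
}

console.log(findShardForKey(chunks, 12345)); // shard1
console.log(findShardForKey(chunks, 67890)); // shard2
```

In a real cluster the chunks cover the entire key space, so every possible key value maps to some chunk. The null case exists only because this table is deliberately small.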
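The critical piece that people often overlook is how MongoDB balances these chunks. When one shard ends up holding disproportionately more data than the others, the balancer kicks in. It is a background process that, since MongoDB 3.4, runs on the primary of the config server replica set rather than inside the mongos. It analyzes the chunk distribution and migrates chunks from the fullest shards to the emptiest ones. Note that the balancer evens out how data is distributed; it does not react to query load, so a hot shard holding its fair share of data will not be relieved. You can check whether the balancer is enabled with sh.getBalancerState(), and whether it is currently running with sh.isBalancerRunning().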
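The idea behind balancing can be sketched in a few lines. This is a deliberately simplified model, not the real balancer’s algorithm: it counts chunks per shard and moves one chunk at a time from the fullest shard to the emptiest until the gap is within a threshold. The function name and the threshold are assumptions for illustration.

```javascript
// Toy balancer. chunkOwners[i] is the shard that currently owns chunk i.
// Moves one chunk at a time from the fullest shard to the emptiest one
// until their chunk counts differ by at most `threshold`.
function balance(chunkOwners, shards, threshold = 1) {
  const limit = Math.max(threshold, 1); // a gap of 1 cannot be improved by moving a chunk
  const owners = [...chunkOwners];
  const migrations = [];
  while (true) {
    // Count chunks per shard.
    const counts = Object.fromEntries(shards.map((s) => [s, 0]));
    for (const owner of owners) counts[owner] += 1;
    // Fullest shard first, emptiest shard last.
    const sorted = [...shards].sort((a, b) => counts[b] - counts[a]);
    const donor = sorted[0];
    const recipient = sorted[sorted.length - 1];
    if (counts[donor] - counts[recipient] <= limit) break;
    // Migrate the first chunk found on the donor.
    const chunkIndex = owners.indexOf(donor);
    owners[chunkIndex] = recipient;
    migrations.push({ chunk: chunkIndex, from: donor, to: recipient });
  }
  return { owners, migrations };
}

// Five chunks on shard1 and one on shard2.
const before = ["shard1", "shard1", "shard1", "shard1", "shard1", "shard2"];
const result = balance(before, ["shard1", "shard2"]);
console.log(result.migrations.length); // 2
console.log(result.owners.filter((s) => s === "shard1").length); // 3
```

A real migration also copies the chunk’s documents between shards and updates the metadata on the config servers, which is why migrations are throttled rather than run in a tight loop like this one.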
The choice of shard key is paramount. A poorly chosen shard key can lead to hot spots where one shard receives the vast majority of traffic, defeating the purpose of sharding. For instance, if you use a ranged shard key on a timestamp, the key increases monotonically, so every new document lands in the chunk covering the highest key values and a single shard absorbs all the inserts.
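The timestamp hot spot is easy to reproduce in a toy model. The sketch below routes 1000 increasing timestamps two ways: by range, and by a hash of the value. Everything here is an assumption made for illustration: the chunk boundary, the two-shard layout, and especially toyHash, which is a simple multiplicative hash and not the function MongoDB uses for hashed shard keys.

```javascript
// Two shards; the ranged chunk boundary (a Unix timestamp in seconds) is made up.
const rangedChunks = [
  { min: -Infinity, max: 1600000000, shard: "shard1" },
  { min: 1600000000, max: Infinity, shard: "shard2" },
];

// Ranged routing: pick the chunk whose [min, max) range contains the key.
function routeRanged(ts) {
  return rangedChunks.find((c) => ts >= c.min && ts < c.max).shard;
}

// Toy 32-bit multiplicative hash. NOT MongoDB's real hash function.
function toyHash(n) {
  return Math.imul(n, 2654435761) >>> 0;
}

// Hashed routing: split the 32-bit hash space in half, one half per shard.
function routeHashed(ts) {
  return toyHash(ts) < 2 ** 31 ? "shard1" : "shard2";
}

// Count how many keys each routing function sends to each shard.
function countWrites(route, keys) {
  const counts = { shard1: 0, shard2: 0 };
  for (const key of keys) counts[route(key)] += 1;
  return counts;
}

// 1000 inserts, one per second, all newer than the existing chunk boundary.
const timestamps = Array.from({ length: 1000 }, (_, i) => 1700000000 + i);

const rangedCounts = countWrites(routeRanged, timestamps);
const hashedCounts = countWrites(routeHashed, timestamps);
console.log("ranged:", rangedCounts); // every write lands on shard2
console.log("hashed:", hashedCounts); // writes are split across both shards
```

Hashing spreads the inserts, but it gives up range locality: a query for a time window can no longer be answered by the one shard holding that range.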
The next hurdle you’ll likely encounter is understanding how to perform shard-aware queries and operations, especially when dealing with compound shard keys or operations that might span multiple chunks.