Sharding a database means splitting it into smaller, more manageable pieces called shards. This is usually done to improve performance and scalability. But when it comes to backing up sharded databases, things get a bit more complicated.

Let’s walk through a common sharding strategy and how to back it up. Imagine you have a large e-commerce application with a sharded database. The database is sharded by customer_id. Each shard contains a subset of customer data, and the shard key (customer_id) determines which shard a particular customer’s data resides on.

Here’s a simplified view of how this might look:

  • Shard 1: customer_id 0-999
  • Shard 2: customer_id 1000-1999
  • Shard 3: customer_id 2000-2999
  • …and so on.

You also have a mongos instance, which acts as a router and query orchestrator. It directs client requests to the appropriate shard based on the shard key.
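To make the routing idea concrete, here is a minimal sketch of range-based routing for the three ranges listed above. It is a toy model, not how mongos is implemented; the shard names and the `shard_for` helper are illustrative assumptions.

```python
from bisect import bisect_right

# Lower bound of each shard's customer_id range, matching the list above.
# Shard names are illustrative, not real cluster identifiers.
RANGE_STARTS = [0, 1000, 2000]
SHARD_NAMES = ["shard1", "shard2", "shard3"]
RANGE_END = 3000  # exclusive upper bound of the last listed range


def shard_for(customer_id: int) -> str:
    """Return the shard whose range contains customer_id."""
    if not 0 <= customer_id < RANGE_END:
        raise ValueError(f"customer_id {customer_id} is outside the listed ranges")
    # bisect_right counts how many range starts are <= customer_id.
    index = bisect_right(RANGE_STARTS, customer_id) - 1
    return SHARD_NAMES[index]


print(shard_for(0))     # shard1
print(shard_for(1999))  # shard2
print(shard_for(2000))  # shard3
```

A real cluster keeps this mapping as chunk metadata on the config servers, and the ranges change as chunks split and migrate.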

The Challenge of Sharded Backups

The core challenge with backing up sharded databases is ensuring consistency. If you back up each shard independently and at slightly different times, you can end up with a backup whose parts are out of sync. For example, a customer’s order might be backed up from Shard 2 and their customer profile from Shard 5. If those two dumps represent different points in time, the restored data can be logically inconsistent, such as an order that depends on a profile change the profile backup never captured.
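The following sketch reproduces that failure mode with two in-memory dictionaries standing in for shards. The collection names, the customer id 42, and the `place_order_after_move` helper are all made up for illustration.

```python
import copy

# Two in-memory "shards"; the collection layout is illustrative only.
shard2 = {"orders": {}}
shard5 = {"profiles": {42: {"address": "old street"}}}


def place_order_after_move(customer_id: int) -> None:
    """Application write that touches both shards: update the address, then order."""
    shard5["profiles"][customer_id]["address"] = "new street"
    shard2["orders"][customer_id] = {"ship_to": "new street"}


# Uncoordinated backup: shard5 is copied before the write, shard2 after it.
backup_shard5 = copy.deepcopy(shard5)
place_order_after_move(42)
backup_shard2 = copy.deepcopy(shard2)

order = backup_shard2["orders"][42]
profile = backup_shard5["profiles"][42]
consistent = order["ship_to"] == profile["address"]
print(consistent)  # False: the order refers to an address the profile backup never saw
```

Neither copy is damaged on its own. The problem only appears when the two are restored together.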

A Consistent Backup Strategy: Snapshotting

The most robust way to back up a sharded database is to take a consistent snapshot across all shards simultaneously. This ensures that all data is captured at the exact same moment.

Let’s assume you’re using MongoDB for this example. MongoDB’s mongodump tool can dump the whole cluster when run against the mongos instance, but it does not make the dump consistent by itself. The --oplog flag, which captures a point-in-time dump of a single replica set, does not apply through mongos (mongos has no oplog of its own) and cannot be combined with --db. For a consistent dump through mongos, you stop the balancer and block writes while the dump runs.

Step-by-Step Backup Process

  1. Identify your mongos instance: This is your entry point for coordinating operations across the sharded cluster. Let’s say it’s running on mongos.example.com:27017.

  2. Pause writes, then run mongodump: Connect to mongos with mongosh and stop the balancer with sh.stopBalancer() so that no chunks migrate during the dump. Then block writes: recent MongoDB releases let you lock the whole cluster by running db.fsyncLock() through mongos, while on older versions you need to pause application writes yourself (check the documentation for your version). With writes paused, run the dump:

    mongodump --host mongos.example.com:27017 --username backupuser --password 'your_secure_password' --authenticationDatabase admin --db my_ecommerce_db --out /backup/my_ecommerce_db_$(date +%Y%m%d_%H%M%S)

    • --host mongos.example.com:27017: Specifies the mongos instance to connect to.
    • --username backupuser --password 'your_secure_password' --authenticationDatabase admin: Your authentication credentials. It’s best practice to use a dedicated backup user with read-only permissions on the database and the clusterMonitor role.
    • --db my_ecommerce_db: The specific database you want to back up.
    • --out /backup/my_ecommerce_db_$(date +%Y%m%d_%H%M%S): The directory where the backup will be stored. The $(date +%Y%m%d_%H%M%S) part creates a timestamped directory for each backup, which keeps successive backups from overwriting each other.
    • --oplog is deliberately absent: On a replica set member it records the oplog entries written while the dump runs so that they can be replayed at restore time. It does not apply through mongos, and mongodump rejects it together with --db. Here, pausing writes is what makes the dump consistent.

    When the dump finishes, release the lock with db.fsyncUnlock() and restart the balancer with sh.startBalancer().

  3. What mongodump does when run against mongos:

    • It connects to mongos like any other client.
    • It reads each collection through mongos, which fetches the documents from whichever shards hold them.
    • It writes one .bson file and one metadata file per collection, merged across shards. The output is not split by shard.
    • It does not record an oplog position or coordinate a common timestamp across shards. The dump is consistent only if writes are paused and the balancer is stopped while it runs.
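If you script the dump, it can help to assemble the command in one place and validate the flag combination before anything runs. The sketch below only builds an argument list and never executes it. `build_mongodump_args` and its parameters are hypothetical names, credentials are left out for brevity, and the one rule it enforces is that mongodump accepts --oplog only for full-instance dumps, not together with --db.

```python
from datetime import datetime
from typing import List, Optional


def build_mongodump_args(
    host: str,
    out_root: str,
    db: Optional[str] = None,
    oplog: bool = False,
    now: Optional[datetime] = None,
) -> List[str]:
    """Build (but do not run) a mongodump argument list with a timestamped --out."""
    if oplog and db is not None:
        # mongodump only accepts --oplog for full-instance dumps.
        raise ValueError("--oplog cannot be combined with --db")
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    name = db if db is not None else "full"
    args = ["mongodump", "--host", host, "--out", f"{out_root}/{name}_{stamp}"]
    if db is not None:
        args += ["--db", db]
    if oplog:
        args.append("--oplog")
    return args


args = build_mongodump_args(
    "mongos.example.com:27017",
    "/backup",
    db="my_ecommerce_db",
    now=datetime(2024, 1, 31, 2, 30, 0),  # fixed time so the output is reproducible
)
print(" ".join(args))
```

Passing the list to something like `subprocess.run(args, check=True)` would execute it. Building a list rather than a shell string avoids quoting problems with paths and passwords.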

Restoring a Sharded Backup

Restoring a dump taken through mongos also goes through mongos. There is no oplog to replay, because a dump taken through mongos does not contain one.

  1. Ensure your sharded cluster is set up: You need a running sharded cluster. It does not need the same number of shards as the original, but each collection should be sharded on the original shard key before you restore. Otherwise mongorestore creates it as an unsharded collection on the database’s primary shard.

  2. Restore the data through mongos: Point mongorestore at mongos, not at the individual shards. The dump is not split by shard, so restoring it to every shard’s primary would put a full copy of the data on each one and bypass the cluster’s routing metadata.

    mongorestore --host mongos.example.com:27017 --username restoreuser --password 'your_secure_password' --authenticationDatabase admin --db my_ecommerce_db /backup/my_ecommerce_db_YYYYMMDD_HHMMSS/my_ecommerce_db

    mongos routes each restored document to the shard that owns its customer_id range, so you run this once for the whole cluster.

  3. Skip oplog replay for a mongos dump: The --oplogReplay flag only applies to a dump taken with --oplog directly from a replica set member, and such a dump is restored to a replica set rather than through mongos. In that case you pass the dump directory, which has oplog.bson at its top level (the host and path below are placeholders for a per-replica-set dump):

    mongorestore --host shard1.example.com:27017 --username restoreuser --password 'your_secure_password' --authenticationDatabase admin --oplogReplay /backup/shard1_YYYYMMDD_HHMMSS

    • --oplogReplay: This flag tells mongorestore to apply the captured oplog entries after restoring the data, which brings the data to the moment the dump finished.
    • oplog.bson: This is the file containing the oplog entries that mongodump --oplog captured.
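To see why replaying captured oplog entries turns a dump taken during writes into a consistent one, consider this toy model of a single replica set. It is an illustration under simplifying assumptions, not MongoDB’s implementation: the data is one dictionary, every write is an idempotent "set", and all names are made up.

```python
# A tiny "collection" and the idempotent writes that arrive while it is dumped.
live = {"a": 1, "b": 1}
oplog = []  # entries recorded during the dump, in order


def write(key, value):
    """Apply a write to the live data and record it, as an oplog would."""
    live[key] = value
    oplog.append(("set", key, value))


# Fuzzy dump: "a" is copied, a write changes both keys, then "b" is copied.
dump = {}
dump["a"] = live["a"]   # copies the old value of "a"
write("a", 2)
write("b", 2)
dump["b"] = live["b"]   # copies the new value of "b", so the dump mixes two moments

# Replay: apply every recorded entry in order. Re-applying the write to "b"
# is harmless because the operations are idempotent.
restored = dict(dump)
for op, key, value in oplog:
    if op == "set":
        restored[key] = value

print(restored == live)  # True
```

The replay works because each entry states the resulting value rather than a delta, so applying an entry the dump already reflects changes nothing.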

The "One More Thing"

A common pitfall is assuming that a dump taken through mongos carries the cluster’s sharding layout with it. A --db dump contains only documents and index definitions. If you restore it into a cluster where the collections have not been sharded yet, mongorestore creates them as ordinary unsharded collections on the database’s primary shard. Shard each collection on the original key first, then restore through mongos, and mongos distributes the documents to the correct shards for you.

Next Steps

Once your sharded database is consistently backed up and you’ve practiced restores, your next logical step is to automate this process, perhaps using cron jobs or a dedicated backup solution, and to implement point-in-time recovery strategies using more frequent oplog backups.
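One piece a scheduled backup job usually needs is retention. As a sketch, assuming backups live in directories named with the my_ecommerce_db_YYYYMMDD_HHMMSS pattern used earlier, the hypothetical helper below picks out the names older than a cutoff. It only returns names and deletes nothing.

```python
from datetime import datetime, timedelta
from typing import List

PREFIX = "my_ecommerce_db_"
STAMP_FORMAT = "%Y%m%d_%H%M%S"


def expired_backups(names: List[str], now: datetime, keep_days: int) -> List[str]:
    """Return the backup directory names older than keep_days, oldest first."""
    cutoff = now - timedelta(days=keep_days)
    dated = []
    for name in names:
        if not name.startswith(PREFIX):
            continue  # not one of our backup directories
        try:
            taken = datetime.strptime(name[len(PREFIX):], STAMP_FORMAT)
        except ValueError:
            continue  # unexpected suffix; leave it alone
        if taken < cutoff:
            dated.append((taken, name))
    return [name for _, name in sorted(dated)]


names = [
    "my_ecommerce_db_20240110_020000",
    "my_ecommerce_db_20240101_020000",
    "my_ecommerce_db_20240130_020000",
    "lost+found",
]
print(expired_backups(names, now=datetime(2024, 1, 31), keep_days=14))
```

A cron entry could call a script built around this once a day after the dump completes. Whatever retention you choose, keep at least one backup that you have actually restored successfully.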
