S3 Cross-Region Replication (CRR) isn’t just about copying objects; it’s a managed, rule-driven mechanism that shapes how you plan data durability and availability across AWS regions.
Let’s see it in action. Imagine you have a bucket named my-source-bucket-us-east-1 in us-east-1 and you want to replicate its contents to my-destination-bucket-eu-west-1 in eu-west-1 for disaster recovery.
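First, both buckets must exist, versioning must be enabled on each of them, and IAM permissions must be set up correctly. Replication is carried out by an IAM role that S3 assumes on your behalf, so the role’s trust policy names s3.amazonaws.com as its principal. That role needs s3:GetReplicationConfiguration and s3:ListBucket on the source bucket; s3:GetObjectVersionForReplication, s3:GetObjectVersionAcl, and s3:GetObjectVersionTagging on the source objects; and s3:ReplicateObject, s3:ReplicateDelete, and s3:ReplicateTags on the destination objects. When both buckets belong to the same account, the role’s permissions are sufficient. When the destination bucket belongs to a different account, its owner must also attach a bucket policy that allows the source account’s replication role to write replicas into it.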
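Replication only works when versioning is enabled on both buckets, so a setup script might start with something like the following. This is a minimal sketch: enable_versioning is a hypothetical helper name, and it assumes the AWS CLI is installed and configured with credentials that are allowed to change both buckets.

```shell
# Hypothetical helper: turn on versioning for one bucket.
# $1 = bucket name, $2 = region the bucket lives in
enable_versioning() {
  aws s3api put-bucket-versioning \
    --bucket "$1" \
    --region "$2" \
    --versioning-configuration Status=Enabled
}
```

Calling `enable_versioning my-source-bucket-us-east-1 us-east-1` and `enable_versioning my-destination-bucket-eu-west-1 eu-west-1` would cover the two buckets in this walkthrough.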
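Here’s a simplified example of a destination bucket policy for the cross-account case. The principal is the replication role in the source account, not a service principal, and the actions are the replication-specific ones rather than s3:PutObject (a same-account setup does not strictly need this policy):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/S3ReplicationRole"
      },
      "Action": [
        "s3:ReplicateObject",
        "s3:ReplicateDelete",
        "s3:ReplicateTags"
      ],
      "Resource": "arn:aws:s3:::my-destination-bucket-eu-west-1/*"
    }
  ]
}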
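The replication role itself could be created along the following lines. This is a sketch under stated assumptions, not a definitive setup: the function names and the inline policy name are hypothetical, the AWS CLI is assumed to be configured with IAM permissions, and KMS permissions (needed when replicating SSE-KMS objects) are left out for brevity.

```shell
# Trust policy: lets the S3 service assume the role.
trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "s3.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Permissions policy: read versions from the source bucket ($1),
# write replicas into the destination bucket ($2).
permissions_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
      "Resource": "arn:aws:s3:::$1"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectVersionForReplication",
        "s3:GetObjectVersionAcl",
        "s3:GetObjectVersionTagging"
      ],
      "Resource": "arn:aws:s3:::$1/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ReplicateObject", "s3:ReplicateDelete", "s3:ReplicateTags"],
      "Resource": "arn:aws:s3:::$2/*"
    }
  ]
}
EOF
}

# $1 = role name, $2 = source bucket name, $3 = destination bucket name
create_replication_role() {
  aws iam create-role \
    --role-name "$1" \
    --assume-role-policy-document "$(trust_policy)"
  aws iam put-role-policy \
    --role-name "$1" \
    --policy-name "${1}-policy" \
    --policy-document "$(permissions_policy "$2" "$3")"
}
```

For this walkthrough the call would be `create_replication_role S3ReplicationRole my-source-bucket-us-east-1 my-destination-bucket-eu-west-1`. Keeping the policy documents in small generator functions makes it easy to inspect them before anything is sent to IAM.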
Then, you configure the replication rule on the source bucket, either in the AWS Management Console or with the AWS CLI. The rule specifies the destination bucket ARN and, optionally, a filter (a key prefix, object tags, or both) so that only a subset of objects is replicated. A rule by itself applies only to objects written after it is enabled; objects that already exist are copied by a separate, one-time S3 Batch Replication job, which the console offers to create when you save the rule.
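The core of CRR configuration is the replication rule. For example, the following AWS CLI command adds a replication configuration to my-source-bucket-us-east-1:
aws s3api put-bucket-replication --bucket my-source-bucket-us-east-1 --replication-configuration '{
  "Role": "arn:aws:iam::111122223333:role/S3ReplicationRole",
  "Rules": [
    {
      "ID": "ReplicateToEU",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": { "Prefix": "" },
      "DeleteMarkerReplication": {
        "Status": "Disabled"
      },
      "Destination": {
        "Bucket": "arn:aws:s3:::my-destination-bucket-eu-west-1",
        "Account": "111122223333",
        "EncryptionConfiguration": {
          "ReplicaKmsKeyID": "arn:aws:kms:eu-west-1:111122223333:key/REPLACE-WITH-KEY-ID"
        }
      },
      "SourceSelectionCriteria": {
        "SseKmsEncryptedObjects": {
          "Status": "Enabled"
        }
      }
    }
  ]
}'
In this CLI command:
Role: The ARN of the IAM role that S3 assumes to perform the replication. This role needs permissions to read from the source and write to the destination.
ID: A unique identifier for the replication rule.
Status: Set to Enabled to activate the rule.
Priority: Used when multiple rules apply to the same object; the rule with the higher number takes precedence.
Filter: Selects which objects the rule covers. An empty prefix matches every object; a non-empty prefix or a tag filter restricts the rule to a subset.
DeleteMarkerReplication: Required whenever Filter is present. Disabled means delete markers created in the source are not copied to the destination.
Destination: Specifies the ARN of the destination bucket and the ID of the AWS account that owns it. EncryptionConfiguration names the KMS key in the destination region used to encrypt replicas; the key ARN shown here is a placeholder.
SourceSelectionCriteria: Opts in to replicating objects encrypted with SSE-KMS, which are skipped by default. It is not a filter: unencrypted and SSE-S3 objects are still replicated. The valid values are Enabled and Disabled, and enabling it requires ReplicaKmsKeyID in the destination plus KMS permissions on the role.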
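After applying a configuration, it can be worth reading it back to confirm what S3 actually stored. A minimal sketch, assuming a configured AWS CLI; show_replication_rules is a hypothetical helper name, and the --query expression simply trims the response down to each rule's ID and status.

```shell
# Hypothetical helper: list the ID and status of every replication rule on a bucket.
# $1 = source bucket name
show_replication_rules() {
  aws s3api get-bucket-replication \
    --bucket "$1" \
    --query 'ReplicationConfiguration.Rules[].[ID,Status]' \
    --output text
}
```

Running `show_replication_rules my-source-bucket-us-east-1` should list one line per rule. Dropping the --query and --output options returns the full stored configuration as JSON instead.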
Once configured, S3 handles the replication asynchronously. When a new object version is written to the source bucket, S3 copies it to the destination bucket. The replica keeps the source object’s metadata, version ID, and access control lists, which is one reason versioning has to be enabled on both buckets.
CRR is particularly powerful because it operates at the S3 service level, meaning it’s not dependent on your applications or EC2 instances running. S3 itself ensures that objects are copied. This is crucial for disaster recovery, as it provides an independent copy of your data in a geographically separate location, protecting against region-wide outages.
A subtle but critical aspect of CRR is how it handles versioning and deletes. Versioning must be enabled on both the source and destination buckets, and CRR replicates every new object version written after the rule is enabled. A delete request without a version ID only adds a delete marker in the source bucket. With the current rule schema (a rule that has a Filter), that delete marker is replicated only when the rule sets DeleteMarkerReplication to Enabled; with it set to Disabled, the destination keeps the object, which protects the replica against accidental deletion. A delete that names a specific version ID is never replicated, so permanently removing a version in the source cannot remove data from the destination.
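If the destination should mirror ordinary deletes, the rule can opt in to delete marker replication. The sketch below generates such a configuration and applies it; the function names and the rule ID are hypothetical, the AWS CLI is assumed to be configured, and note that put-bucket-replication replaces the bucket's entire replication configuration rather than adding a rule to it.

```shell
# Emit a replication configuration whose single rule also replicates delete markers.
# $1 = role ARN, $2 = destination bucket ARN, $3 = key prefix ("" for all objects)
replication_config_with_delete_markers() {
  cat <<EOF
{
  "Role": "$1",
  "Rules": [
    {
      "ID": "ReplicateWithDeleteMarkers",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": { "Prefix": "$3" },
      "DeleteMarkerReplication": { "Status": "Enabled" },
      "Destination": { "Bucket": "$2" }
    }
  ]
}
EOF
}

# Apply the generated configuration to a source bucket.
# $1 = source bucket name; remaining arguments go to the generator above
apply_replication_config() {
  bucket="$1"
  shift
  aws s3api put-bucket-replication \
    --bucket "$bucket" \
    --replication-configuration "$(replication_config_with_delete_markers "$@")"
}
```

For example, `apply_replication_config my-source-bucket-us-east-1 arn:aws:iam::111122223333:role/S3ReplicationRole arn:aws:s3:::my-destination-bucket-eu-west-1 logs/` would replicate objects under the illustrative prefix logs/ together with their delete markers. Deletes that target a specific version ID are still not replicated.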
S3 also retries replication on its own when it fails because of transient network issues or temporary service unavailability, so you don’t need to build this retry logic yourself. The aws s3api put-bucket-replication command above only defines the rule; to check whether a specific object has been replicated, run aws s3api head-object --bucket my-source-bucket-us-east-1 --key object.txt and look at the ReplicationStatus field. On the source object it reads PENDING, COMPLETED, or FAILED; on the copy in the destination bucket it reads REPLICA.
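Because replication is asynchronous, a script that depends on the replica usually has to poll for it. The following is a rough sketch of that idea, assuming a configured AWS CLI; replication_status, wait_for_replication, and the POLL_SECONDS variable are hypothetical names, and the defaults (30 attempts, 10 seconds apart) are arbitrary.

```shell
# Print the replication status S3 reports for one source object.
# $1 = bucket name, $2 = object key
replication_status() {
  aws s3api head-object \
    --bucket "$1" \
    --key "$2" \
    --query ReplicationStatus \
    --output text
}

# Poll until the object is replicated.
# $1 = bucket name, $2 = object key, $3 = maximum attempts (default 30)
# Returns 0 on COMPLETED, 1 on FAILED, 2 if still pending after all attempts.
wait_for_replication() {
  attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    status=$(replication_status "$1" "$2")
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED) return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep "${POLL_SECONDS:-10}"
  done
  return 2
}
```

A call such as `wait_for_replication my-source-bucket-us-east-1 object.txt` blocks until the status settles or the attempts run out. Distinct return codes let a caller tell a failed replication apart from one that is merely slow.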
The next concept you’ll likely encounter is managing replication for a large number of existing objects or dealing with replication failures.