The most surprising thing about Route 53 blue-green DNS cutovers using weighted routing is that the "cutover" often happens gradually, not all at once, and that’s usually a good thing.

Imagine you’ve got your live production website running at my-app.example.com. You want to deploy a new version, but you’re nervous about downtime. A common strategy is a blue-green deployment: you spin up the new version (green) alongside the old one (blue), test it thoroughly, and then switch traffic. For DNS, Route 53’s weighted routing is your primary tool here.

Here’s how it works in practice. Let’s say you have a Route 53 record set for my-app.example.com pointing to your blue environment.

{
  "Comment": "Blue-green deployment for my-app.example.com",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-app.example.com",
        "Type": "A",
        "Weight": 100,
        "SetIdentifier": "blue",
        "AliasTarget": {
          "HostedZoneId": "Z1XXXXXXXXXXXXX",
          "DNSName": "blue.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
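
If you would rather submit a change batch like this from code than keep JSON files around, the following is a minimal sketch. It assumes Python and a boto3 Route 53 client, whose `change_resource_record_sets` call takes a hosted zone ID and a change batch. The helper names `weighted_alias_change` and `submit` are illustrative, and the hosted zone ID of your own zone is something you supply. The client is passed in as an argument, so nothing here contacts AWS until you hand it a real one.

```python
def weighted_alias_change(set_id, weight, elb_zone_id, elb_dns_name,
                          name="my-app.example.com"):
    """Build one UPSERT change for a weighted alias A record."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,
            "Weight": weight,
            "AliasTarget": {
                "HostedZoneId": elb_zone_id,  # the load balancer's zone ID
                "DNSName": elb_dns_name,
                "EvaluateTargetHealth": False,
            },
        },
    }


def submit(client, hosted_zone_id, changes, comment=""):
    """Send the changes as a single batch.

    `client` is assumed to be boto3.client("route53");
    `hosted_zone_id` is the ID of the zone that holds the record.
    """
    return client.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch={"Comment": comment, "Changes": changes},
    )
```

Keeping the record builder separate from the submit call means the same builder can be reused for every weight change later in the rollout.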

This record directs 100% of traffic to your blue environment. Now, you deploy your green environment, also behind a load balancer, say green.elb.amazonaws.com. You create a new record set for the same DNS name, my-app.example.com, but this time with a different SetIdentifier and a weight of 0.

{
  "Comment": "Blue-green deployment for my-app.example.com",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-app.example.com",
        "Type": "A",
        "Weight": 0,
        "SetIdentifier": "green",
        "AliasTarget": {
          "HostedZoneId": "Z2XXXXXXXXXXXXX",
          "DNSName": "green.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}

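At this point, you have two record sets for my-app.example.com: "blue" with a weight of 100 and "green" with a weight of 0. Weights are relative rather than percentages: each record receives its weight divided by the sum of all weights in the group, so 100 and 0 works out to 100% and 0%. No traffic is hitting green yet because its weight is zero (the exception is a group where every weight is 0, which Route 53 treats as equally weighted). Before shifting anything, test the green environment thoroughly on its own, for example through its load balancer’s hostname.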
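
To make the relationship between weights and traffic concrete, here is a small sketch of the arithmetic. The function name `traffic_share` is made up for illustration; the rule it encodes is that a record’s expected share is its weight divided by the sum of the weights in the group, with an all-zero group treated as equally weighted.

```python
def traffic_share(weights):
    """Return each record's expected share of DNS responses.

    `weights` maps a SetIdentifier to its Weight (an integer from 0 to 255).
    """
    total = sum(weights.values())
    if total == 0:
        # Route 53 treats a group where every weight is 0 as equally weighted.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


print(traffic_share({"blue": 100, "green": 0}))  # {'blue': 1.0, 'green': 0.0}
print(traffic_share({"blue": 90, "green": 10}))  # {'blue': 0.9, 'green': 0.1}
print(traffic_share({"blue": 3, "green": 1}))    # {'blue': 0.75, 'green': 0.25}
```

The last call shows that weights do not need to add up to 100; only their ratio matters.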

The "cutover" is the act of shifting traffic. You do this by modifying the weights. You’ll decrease the weight of the "blue" record and increase the weight of the "green" record. A common initial step is to shift a small percentage, say 10%, to the green environment.

{
  "Comment": "Blue-green deployment for my-app.example.com - shifting traffic",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-app.example.com",
        "Type": "A",
        "Weight": 90,
        "SetIdentifier": "blue",
        "AliasTarget": {
          "HostedZoneId": "Z1XXXXXXXXXXXXX",
          "DNSName": "blue.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-app.example.com",
        "Type": "A",
        "Weight": 10,
        "SetIdentifier": "green",
        "AliasTarget": {
          "HostedZoneId": "Z2XXXXXXXXXXXXX",
          "DNSName": "green.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}

Route 53 typically propagates this change to its authoritative name servers within about a minute. From then on, they answer queries with the blue record roughly 90% of the time and the green record roughly 10% of the time, and DNS resolvers worldwide pass those answers on to clients. This is where the gradual "cutover" happens. You can monitor your green environment’s performance, error rates, and logs. If everything looks good, you incrementally shift more traffic. You might go to 50/50, then 10/90, and finally, 0/100.
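
The staged shift described above can be sketched as a small loop. This is an illustration under assumptions, not a production tool: `apply_weights` and `is_healthy` are hypothetical callbacks you would supply (the first submits the two UPSERTs, the second waits for an observation period and checks green’s metrics), and the stage list simply mirrors the 90/10, 50/50, 10/90, 0/100 progression.

```python
STAGES = [(90, 10), (50, 50), (10, 90), (0, 100)]  # (blue, green) weights


def run_rollout(apply_weights, is_healthy, stages=STAGES):
    """Walk through the stages, stopping at the first unhealthy one.

    apply_weights(blue, green) is a hypothetical callback that submits the
    two UPSERTs; is_healthy() is a hypothetical check of green's metrics.
    Returns the (blue, green) weights in effect when the function finishes.
    """
    current = (100, 0)
    for blue, green in stages:
        apply_weights(blue, green)
        current = (blue, green)
        if not is_healthy():
            apply_weights(100, 0)  # send everything back to blue
            return (100, 0)
    return current
```

For a dry run, pass `lambda b, g: print(b, g)` as `apply_weights` and `lambda: True` as `is_healthy` to see the order of the steps.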

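How quickly clients and intermediate DNS servers pick up the new weights depends on the TTL (Time To Live) of the answers they have cached. Alias records are a special case: Route 53 does not let you set a TTL on them and uses the TTL of the alias target instead, which is 60 seconds for an Elastic Load Balancing load balancer. If you use non-alias weighted records (plain A records or CNAMEs), you pick the TTL yourself. A lower TTL (like 60 seconds) means faster propagation but more DNS queries; a higher TTL (like 300 seconds, or 5 minutes) means slower propagation but fewer queries. You need to balance this.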
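
A quick back-of-the-envelope calculation shows the trade-off. The sketch assumes resolvers honor the TTL exactly, which not all of them do, and the function name `max_lookups_per_hour` is made up for illustration.

```python
def max_lookups_per_hour(ttl_seconds):
    """Upper bound on how often one caching resolver re-queries one name."""
    return 3600 // ttl_seconds


for ttl in (60, 300):
    print(f"TTL {ttl}s: up to {max_lookups_per_hour(ttl)} lookups/hour per resolver, "
          f"new weights picked up within {ttl}s")
# TTL 60s  -> up to 60 lookups/hour per resolver
# TTL 300s -> up to 12 lookups/hour per resolver
```

Going from 300 to 60 seconds makes a weight change visible five times sooner at the cost of up to five times as many queries from each resolver.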

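The "cutover" is complete when the blue environment has a weight of 0 and the green environment has a weight of 100. Keep the blue environment running for a while after this change: cached answers can still send clients to it until they expire, and it remains your rollback target until green has proven itself.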

{
  "Comment": "Blue-green deployment for my-app.example.com - complete",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-app.example.com",
        "Type": "A",
        "Weight": 0,
        "SetIdentifier": "blue",
        "AliasTarget": {
          "HostedZoneId": "Z1XXXXXXXXXXXXX",
          "DNSName": "blue.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-app.example.com",
        "Type": "A",
        "Weight": 100,
        "SetIdentifier": "green",
        "AliasTarget": {
          "HostedZoneId": "Z2XXXXXXXXXXXXX",
          "DNSName": "green.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}

After you’ve confirmed the green environment is stable and handling all traffic, you can then delete the "blue" record set entirely.

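The most powerful aspect of this method is how quickly you can roll back. If you notice issues with the green environment after shifting even 10% of traffic, you simply revert the weights to the previous state (e.g., blue at 100, green at 0). Rollback is not instantaneous, because resolvers that already cached a green answer keep using it until its TTL expires. With a short TTL, though, you can effectively "pull the plug" on the green environment’s traffic within roughly one TTL period plus the time Route 53 needs to propagate the change.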

The subtle detail most people miss is that Route 53 weighted routing doesn’t guarantee a perfect 90/10 split at any given moment. It’s a probabilistic distribution applied per DNS query, not per user request. A resolver caches whichever answer it received for the TTL duration, and every client behind that resolver gets the same answer until the cache entry expires. So, even with a 90/10 split, a user might consistently hit the blue environment for a while, and a resolver that serves many users can skew the split noticeably. The actual traffic distribution is an aggregation across many clients and their resolvers.
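
A toy simulation makes this visible. It is a deliberately simplified model, not a description of how any real resolver behaves: it assumes each resolver asks once per TTL window, receives one weighted answer, and serves that answer to all of its clients until the window ends. The function name `simulate` and the resolver count are arbitrary.

```python
import random


def simulate(resolvers=10_000, blue_weight=90, green_weight=10, seed=1):
    """Give each resolver one cached answer for the current TTL window."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return rng.choices(
        ["blue", "green"],
        weights=[blue_weight, green_weight],
        k=resolvers,
    )


answers = simulate()
green_share = answers.count("green") / len(answers)
print(f"aggregate green share: {green_share:.3f}")  # near 0.10, rarely exactly
print("first five resolvers:", answers[:5])  # each one is pinned to a single side
```

Across thousands of resolvers the aggregate lands close to 10%, yet every individual resolver sees either all blue or all green for the whole window, which is why a single user’s experience can look nothing like the configured split.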

The next hurdle is often managing the actual application deployment and health checks across both environments simultaneously.
