Pulumi state backends aren’t just places to store your infrastructure’s current configuration; they’re the linchpin of collaboration and disaster recovery for your Pulumi deployments.

Let’s see Pulumi in action, managing a simple S3 bucket. First, we’ll create a Pulumi.yaml to define our project and point it at a remote state backend.

name: my-s3-project
runtime: python
description: A minimal Pulumi program to manage an S3 bucket.
backend:
  url: gs://my-pulumi-state-bucket/my-s3-project # Example for GCS

Now, a Python program (__main__.py) to create the bucket:

import pulumi
import pulumi_aws as aws

# Create an AWS S3 bucket
bucket = aws.s3.Bucket("my-app-bucket")

# Export the bucket name
pulumi.export("bucket_name", bucket.id)

When you run pulumi up, Pulumi reads and writes state through the configured backend: an S3 bucket, a GCS bucket, an Azure Blob container, or Pulumi Cloud, Pulumi’s managed service. The backend holds a JSON checkpoint for each stack (self-managed backends keep it under a .pulumi/ prefix in the bucket; this article calls it state.json for short). That checkpoint is a JSON representation of your infrastructure’s current state: all the resources Pulumi knows about, their properties, and their dependencies.

The core problem Pulumi state backends solve is maintaining a single source of truth for your infrastructure. Without a shared backend, multiple developers running pulumi up simultaneously would have no way of knowing what the others have deployed, leading to conflicts, lost updates, and an inconsistent infrastructure state. The backend also coordinates locking during pulumi up, so only one update modifies a stack’s state at a time.
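To make the locking idea concrete, here is a minimal sketch of lock-file mutual exclusion on a local directory. It illustrates the concept only and is not Pulumi’s implementation; the names stack_lock, StackLockedError and try_concurrent_update are hypothetical.

```python
import json
import os
import tempfile
from contextlib import contextmanager


class StackLockedError(RuntimeError):
    """Raised when another update already holds the stack lock."""


@contextmanager
def stack_lock(lock_dir: str, stack: str, owner: str):
    """Hold an exclusive lock file for `stack` while the body runs."""
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"{stack}.lock.json")
    try:
        # O_EXCL makes creation fail if the file already exists, so only
        # one caller can win the race to create the lock.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise StackLockedError(f"stack {stack!r} is locked") from None
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"owner": owner}, f)
        yield path
    finally:
        # Release the lock even if the update fails part-way through.
        os.remove(path)


def try_concurrent_update() -> str:
    """Simulate a second update starting while the first holds the lock."""
    with tempfile.TemporaryDirectory() as lock_dir:
        with stack_lock(lock_dir, "dev", "alice"):
            try:
                with stack_lock(lock_dir, "dev", "bob"):
                    return "both updates ran"
            except StackLockedError:
                return "second update rejected"
```

The second caller fails fast instead of waiting, which matches the experience of starting an update while a teammate’s update is still in progress.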

Here’s a breakdown of the common backend types:

  • S3 (Amazon Simple Storage Service):

    • Setup: You create an S3 bucket (e.g., my-pulumi-state-bucket).
    • Configuration: In Pulumi.yaml:
      backend:
        url: s3://my-pulumi-state-bucket
      
    • How it works: Pulumi reads and writes the stack’s state file in this bucket. To guard against concurrent updates it writes its own lock file under the .pulumi/ prefix for the duration of an operation; it does not rely on S3 Object Lock.
  • GCS (Google Cloud Storage):

    • Setup: You create a GCS bucket (e.g., my-pulumi-state-bucket).
    • Configuration: In Pulumi.yaml:
      backend:
        url: gs://my-pulumi-state-bucket
      
    • How it works: Similar to S3, Pulumi uses GCS as a simple object store for the state file. Concurrent updates are blocked by a lock file that Pulumi itself writes to the bucket, not by GCS object versioning.
  • Azure Blob Storage:

    • Setup: You create a storage account and a container within it (e.g., pulumi-state-container).
    • Configuration: In Pulumi.yaml:
      backend:
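        url: azblob://pulumi-state-container?storage_account=<your-storage-account-name>
      
    • How it works: Pulumi stores the state file as a blob in the container. The URL names the container; the storage account comes from the storage_account query parameter or the AZURE_STORAGE_ACCOUNT environment variable. Concurrent updates are blocked by a lock file that Pulumi writes alongside the state.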
  • Pulumi Cloud (Managed Service):

    • Setup: Create a Pulumi Cloud account and run pulumi login; there is no storage bucket to provision.
    • Configuration: This is the default if no backend is specified.
      # No backend section needed, or explicitly:
      backend:
        url: https://api.pulumi.com
      
      (For state on the local filesystem, use a file:// URL instead, for example file://~, which keeps state under ~/.pulumi. The local backend is often used during initial development before committing to a remote backend.)
    • How it works: Pulumi’s SaaS offering provides a highly available, secure, and collaborative state backend with additional features like secrets management and deployment history.

When you run pulumi login or configure your backend, Pulumi authenticates against whichever service hosts the state. For S3, it uses your AWS credentials; for GCS, your GCP credentials; for Azure Blob, your Azure credentials; for Pulumi Cloud, a Pulumi access token. The cloud credentials need appropriate permissions to read and write objects in the designated storage location.
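As a rough pre-flight check, the sketch below maps a backend URL scheme to environment variables that commonly carry the matching credentials. It is a heuristic under stated assumptions: the function missing_credentials and the CREDENTIAL_HINTS table are hypothetical, and the cloud SDKs also accept other sources (shared config files, instance roles, cached logins) that this check cannot see.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: for each URL scheme, alternative groups of
# environment variables. One fully populated group is treated as enough.
CREDENTIAL_HINTS = {
    "s3": [["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"], ["AWS_PROFILE"]],
    "gs": [["GOOGLE_APPLICATION_CREDENTIALS"]],
    "azblob": [
        ["AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY"],
        ["AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_SAS_TOKEN"],
    ],
    "https": [["PULUMI_ACCESS_TOKEN"]],
}


def missing_credentials(backend_url: str, env: dict) -> list:
    """Return variables from the first hint group that are unset.

    An empty list means either one group is fully set or the scheme
    (for example file://) needs no credentials in this simple model.
    """
    scheme = urlparse(backend_url).scheme
    groups = CREDENTIAL_HINTS.get(scheme, [])
    if not groups:
        return []
    if any(all(env.get(name) for name in group) for group in groups):
        return []
    return [name for name in groups[0] if not env.get(name)]
```

Calling missing_credentials("s3://my-pulumi-state-bucket", os.environ) before pulumi login gives an early hint about what is unset; an empty result does not prove the credentials are valid or sufficiently privileged.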

The state itself is a JSON document. Exported with pulumi stack export, it looks something like this (simplified, with long values elided as "..."):

{
  "version": 3,
  "deployment": {
    "manifest": {
      "time": "...",
      "magic": "...",
      "version": "v3.0.0"
    },
    "secrets_providers": {
      "type": "passphrase",
      "state": {
        "salt": "..."
      }
    },
    "resources": [
      {
        "urn": "urn:pulumi:dev::my-s3-project::aws:s3/bucket:Bucket::my-app-bucket",
        "custom": true,
        "id": "my-app-bucket-a1b2c3d4",
        "type": "aws:s3/bucket:Bucket",
        "inputs": {},
        "outputs": {
          "bucket": "my-app-bucket-a1b2c3d4",
          "bucketDomainName": "my-app-bucket-a1b2c3d4.s3.amazonaws.com"
        },
        "parent": "urn:pulumi:dev::my-s3-project::pulumi:pulumi:Stack::my-s3-project-dev",
        "provider": "...",
        "propertyDependencies": {}
      }
    ]
  }
}

The resources array is the critical part. Each object represents a managed resource. The urn (Uniform Resource Name) is a unique identifier for that resource within Pulumi. The outputs are the attributes of the resource as known by Pulumi after its creation or update.
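A short sketch of reading that array: the helper below lists each resource’s name and type from a state document and splits the URN into its parts. The function names are hypothetical, and the lookup order is an assumption: it tries the "deployment" wrapper used by pulumi stack export, then a "checkpoint"/"latest" wrapper, then a top-level "resources" array.

```python
import json


def find_resources(doc: dict) -> list:
    """Locate the resources array in a few plausible state layouts."""
    candidates = (
        doc.get("deployment"),
        (doc.get("checkpoint") or {}).get("latest"),
        doc,
    )
    for container in candidates:
        if isinstance(container, dict) and "resources" in container:
            return container["resources"]
    return []


def parse_urn(urn: str) -> dict:
    """Split urn:pulumi:<stack>::<project>::<type>::<name> into fields."""
    prefix = "urn:pulumi:"
    if not urn.startswith(prefix):
        raise ValueError(f"not a Pulumi URN: {urn!r}")
    stack, project, type_, name = urn[len(prefix):].split("::", 3)
    return {"stack": stack, "project": project, "type": type_, "name": name}


def summarize(doc: dict) -> list:
    """Return (name, type) pairs for every resource in the state."""
    return [(parse_urn(r["urn"])["name"], r["type"]) for r in find_resources(doc)]


# Sample document modeled on the listing above.
SAMPLE = json.loads("""
{
  "version": 3,
  "deployment": {
    "resources": [
      {
        "urn": "urn:pulumi:dev::my-s3-project::aws:s3/bucket:Bucket::my-app-bucket",
        "type": "aws:s3/bucket:Bucket",
        "outputs": {"bucket": "my-app-bucket-a1b2c3d4"}
      }
    ]
  }
}
""")

if __name__ == "__main__":
    for name, type_ in summarize(SAMPLE):
        print(f"{name}: {type_}")
```

For a real stack, the same functions can be pointed at the JSON produced by pulumi stack export instead of SAMPLE.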

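The state file is the "source of truth" for Pulumi. If a stack’s state is deleted from the backend, Pulumi has no record of the resources it created: a fresh pulumi up will try to create everything again, which typically produces duplicates or name conflicts while the original resources keep running unmanaged. Conversely, if you manually delete a Pulumi-managed resource in your cloud provider, the state still lists it. Pulumi only notices this drift when it refreshes the state against the provider, via pulumi refresh or pulumi up --refresh; the next update then recreates the missing resource.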
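Drift is the gap between what the state records and what actually exists in the cloud. The toy model below compares two sets of physical resource IDs to show the two directions that gap can take. It is a conceptual sketch, not how Pulumi reconciles state; DriftReport and compare_state_to_cloud are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    missing_in_cloud: list    # recorded in state, but gone from the provider
    untracked_in_cloud: list  # present in the provider, unknown to the state


def compare_state_to_cloud(state_ids, live_ids) -> DriftReport:
    """Compare the IDs recorded in state with the IDs that really exist."""
    state, live = set(state_ids), set(live_ids)
    return DriftReport(
        missing_in_cloud=sorted(state - live),
        untracked_in_cloud=sorted(live - state),
    )


if __name__ == "__main__":
    report = compare_state_to_cloud(
        state_ids=["my-app-bucket-a1b2c3d4", "logs-bucket-0f9e8d7c"],
        live_ids=["my-app-bucket-a1b2c3d4", "manual-bucket"],
    )
    print(report)
```

A resource deleted by hand lands in missing_in_cloud; a deleted state file is the extreme case where every live resource ends up in untracked_in_cloud.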

A common point of confusion is how much of state.json is sensitive and how it is protected. Pulumi does not encrypt the state file as a whole. It encrypts individual secret values inside the file using the stack’s secrets provider: a key managed by Pulumi Cloud by default when you use Pulumi Cloud, a key derived from your stack’s passphrase by default for self-managed backends (file://, S3, GCS, Azure Blob), or a cloud KMS key if you configure one. Everything else, including resource names, types, and non-secret properties, remains readable JSON. Server-side encryption on the bucket (like S3 with SSE-KMS) is a separate layer that protects the stored object at rest; it is configured on the bucket rather than by Pulumi, so access to the bucket should still be restricted.
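One way to see per-value encryption is to walk a state document and report where encrypted values sit. The sketch below assumes that Pulumi marks an encrypted value as an object carrying a fixed signature key/value pair next to a "ciphertext" field; the two constants reflect that assumption and should be checked against your own pulumi stack export output. The function find_secret_paths is a hypothetical helper.

```python
# Assumed marker for an encrypted value in Pulumi state; verify locally.
SECRET_SIG_KEY = "4dabf18193072939515e22adb298388d"
SECRET_SIG_VALUE = "1b47061264138c4ac30d75fd1eb44270"


def find_secret_paths(value, path: str = "") -> list:
    """Return dotted paths of every value wrapped in a secret envelope."""
    if isinstance(value, dict):
        if value.get(SECRET_SIG_KEY) == SECRET_SIG_VALUE:
            return [path]
        found = []
        for key, child in value.items():
            found += find_secret_paths(child, f"{path}.{key}" if path else key)
        return found
    if isinstance(value, list):
        found = []
        for index, child in enumerate(value):
            found += find_secret_paths(child, f"{path}[{index}]")
        return found
    return []


if __name__ == "__main__":
    outputs = {
        "bucket": "my-app-bucket-a1b2c3d4",  # plain text, readable by anyone
        "password": {
            SECRET_SIG_KEY: SECRET_SIG_VALUE,
            "ciphertext": "v1:...",
        },
    }
    print(find_secret_paths(outputs))
```

Only the paths returned here are ciphertext; every other property in the document can be read by anyone with access to the state.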

The next step after mastering state backends is understanding how Pulumi handles secrets management across these different backends.
