Netflix Conductor is a system designed to orchestrate complex, distributed workflows. It’s not just about running tasks in sequence; it’s about managing state, handling failures, and ensuring that long-running, multi-step processes complete reliably. Think of it as a conductor for an orchestra, where each instrument is a microservice, and the conductor ensures they play together harmoniously to produce a symphony of business logic.

Let’s look at Conductor in action. Imagine an e-commerce order processing system. When a customer places an order, it’s not a single event. It triggers a cascade of actions: inventory check, payment processing, shipping label generation, notification to the customer, and so on. Conductor can manage this entire sequence, with each step defined as a "task" within a "workflow." Coordinating a multi-service transaction this way is often described as the saga pattern.

Here’s a simplified example of a Conductor workflow definition (written in JSON, but it can also be defined via its UI or SDKs):

{
  "name": "ecommerceOrderProcessing",
  "version": 1,
  "tasks": [
    {
      "name": "checkInventory",
      "taskReferenceName": "checkInventory_1",
      "type": "SIMPLE",
      "inputParameters": {
        "productId": "${workflow.input.productId}",
        "quantity": "${workflow.input.quantity}"
      },
      "optional": false
    },
    {
      "name": "processPayment",
      "taskReferenceName": "processPayment_1",
      "type": "SIMPLE",
      "inputParameters": {
        "orderId": "${workflow.input.orderId}",
        "paymentDetails": "${workflow.input.paymentDetails}"
      },
      "startDelay": 5,
      "optional": false,
      "dependsOn": [
        "checkInventory_1"
      ]
    },
    {
      "name": "generateShippingLabel",
      "taskReferenceName": "generateShippingLabel_1",
      "type": "SIMPLE",
      "inputParameters": {
        "orderId": "${workflow.input.orderId}",
        "shippingAddress": "${workflow.input.shippingAddress}"
      },
      "optional": false,
      "dependsOn": [
        "processPayment_1"
      ]
    },
    {
      "name": "sendOrderConfirmation",
      "taskReferenceName": "sendOrderConfirmation_1",
      "type": "SIMPLE",
      "inputParameters": {
        "customerId": "${workflow.input.customerId}",
        "orderId": "${workflow.input.orderId}"
      },
      "optional": false,
      "dependsOn": [
        "generateShippingLabel_1"
      ]
    }
  ],
  "schemaVersion": 2
}

In this definition:

  • name and version identify this specific workflow.
  • tasks is an array of individual steps.
  • taskReferenceName is a unique identifier for each task instance within the workflow.
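  • type: Conductor supports various task types, including:
    • SIMPLE: A regular task that your worker code executes.
    • FORK_JOIN and JOIN: Run several branches of tasks in parallel, then wait for them to finish.
    • DECISION: For conditional branching (newer releases offer SWITCH for the same purpose).
    • SUB_WORKFLOW: To embed one workflow within another.
    • HUMAN: For tasks requiring human intervention.
    • EVENT: To publish an event to an external system; the related WAIT task pauses until an external signal arrives.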
  • inputParameters: Defines the data passed to the task. The ${workflow.input.fieldName} syntax pulls a value from the workflow input at run time; the output of an earlier task is referenced the same way through its reference name, for example ${checkInventory_1.output.fieldName}.
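  • dependsOn: In this example, lists the preceding tasks that must complete successfully before this task can start. Treat it as illustrative: Conductor's documented workflow schema orders tasks by their position in the tasks array and uses FORK_JOIN and JOIN for parallel branches, so check whether the version you run recognizes a dependsOn field at all.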
  • startDelay: A delay, in seconds, before the task becomes available to workers.
  • optional: If true, the workflow can continue even if this task fails.
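
To make the inputParameters substitution concrete, here is a minimal Python sketch of how ${...} placeholders could be resolved against the workflow input and earlier task outputs. It is a toy model, not Conductor's own resolver: resolve and lookup are hypothetical helpers, and the inStock output field is invented for illustration.

```python
import re

# Matches ${path.to.value} placeholders as used in inputParameters.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(path, context):
    """Walk a dotted path such as 'workflow.input.orderId' through nested dicts."""
    value = context
    for key in path.split("."):
        value = value[key]
    return value


def resolve(params, context):
    """Return a copy of params with every ${...} placeholder replaced."""
    resolved = {}
    for name, template in params.items():
        if not isinstance(template, str):
            resolved[name] = template
            continue
        whole = _PLACEHOLDER.fullmatch(template)
        if whole:
            # A value that is exactly one placeholder keeps its original type.
            resolved[name] = lookup(whole.group(1), context)
        else:
            # Placeholders embedded in longer text are interpolated as strings.
            resolved[name] = _PLACEHOLDER.sub(
                lambda m: str(lookup(m.group(1), context)), template
            )
    return resolved


# Hypothetical run-time context: the workflow input plus one finished task.
context = {
    "workflow": {"input": {"orderId": "A-17", "quantity": 2}},
    "checkInventory_1": {"output": {"inStock": True}},
}
params = {
    "orderId": "${workflow.input.orderId}",
    "quantity": "${workflow.input.quantity}",
    "inStock": "${checkInventory_1.output.inStock}",
    "note": "order ${workflow.input.orderId}",
}
resolved = resolve(params, context)
print(resolved)
```

The context dictionary is keyed by "workflow" and by each task's taskReferenceName, which is why reference names have to be unique within a workflow.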

When a workflow is triggered, Conductor’s server keeps track of its state. Your worker applications poll Conductor for tasks assigned to them. Once a worker completes a task, it reports back to Conductor with the result, which then updates the workflow’s state and potentially triggers the next task.
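
The poll-and-report cycle can be sketched in Python as follows. Treat the details as assumptions: the endpoint paths (GET /tasks/poll/{taskType} and POST /tasks under an /api base) and the result fields reflect the Conductor REST API as commonly documented, so check them against your server version. run_once and http_json are hypothetical helpers, and the dry run uses a stubbed transport rather than a live server.

```python
import json
import urllib.request


def http_json(method, url, body=None):
    """Minimal HTTP helper built on the standard library."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        text = resp.read().decode()
    if not text:
        return None  # e.g. nothing to poll
    try:
        return json.loads(text)
    except ValueError:
        return text  # some endpoints answer with plain text


def run_once(base_url, task_type, handler, worker_id="worker-1", http=http_json):
    """Poll for one task, run the handler, and report the result.

    Returns the reported status, or None when no task was available.
    """
    task = http("GET", f"{base_url}/tasks/poll/{task_type}?workerid={worker_id}")
    if not task:
        return None
    try:
        output, status = handler(task.get("inputData", {})), "COMPLETED"
    except Exception as exc:  # report the failure so the server can react
        output, status = {"error": str(exc)}, "FAILED"
    http("POST", f"{base_url}/tasks", {
        "workflowInstanceId": task["workflowInstanceId"],
        "taskId": task["taskId"],
        "status": status,
        "outputData": output,
    })
    return status


# Dry run against a stubbed transport instead of a live server.
calls = []


def fake_http(method, url, body=None):
    calls.append((method, url, body))
    if method == "GET":
        return {"taskId": "t1", "workflowInstanceId": "w1",
                "inputData": {"productId": "p-9", "quantity": 2}}
    return "t1"


status = run_once("http://localhost:8080/api", "checkInventory",
                  lambda inp: {"inStock": inp["quantity"] <= 5}, http=fake_http)
print(status, calls[-1][2]["outputData"])
```

A real worker would call run_once in a loop with a short sleep between empty polls; the handler stays a plain function, which keeps the business logic independent of the transport.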

The core problem Conductor solves is managing the complexity and unreliability of distributed systems. Microservices are great, but coordinating them across a network, handling timeouts, retries, and ensuring eventual consistency is hard. Conductor provides a centralized, observable, and resilient way to do this. It allows you to define your business logic as a visual or declarative flow, abstracting away much of the underlying coordination logic.

Internally, Conductor persists workflow state in a pluggable backing store; depending on version and configuration this can be Redis, Cassandra, MySQL, or PostgreSQL, usually with a separate index such as Elasticsearch for searching executions. It has a REST API for triggering workflows, polling for tasks, and reporting task results. Worker services are external applications that implement the actual business logic for each task type. Conductor can also integrate with messaging systems (like Kafka or SQS) for eventing.
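
As a rough illustration of triggering a workflow over the REST API, the sketch below builds the start request without sending it. The path (POST /workflow/{name}?version=N under an /api base) and the localhost address are assumptions based on the commonly documented API, and build_start_request is a hypothetical helper.

```python
import json
import urllib.request


def build_start_request(base_url, name, version, workflow_input):
    """Build (but do not send) the HTTP request that starts a workflow."""
    return urllib.request.Request(
        url=f"{base_url}/workflow/{name}?version={version}",
        data=json.dumps(workflow_input).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_start_request(
    "http://localhost:8080/api",  # assumed local server address
    "ecommerceOrderProcessing",
    1,
    {"orderId": "A-17", "productId": "p-9", "quantity": 2},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request).read().decode()
```

The JSON body becomes the workflow input, which is what the ${workflow.input.*} expressions in the definition read from.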

The levers you control are primarily in the workflow definition:

  • Task dependencies: How tasks relate to each other.
  • Task types: Choosing the right mechanism for execution (simple, parallel, decision, etc.).
  • Input/Output mapping: How data flows between tasks and the workflow.
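  • Error handling: Retry counts, timeouts, and failure handling. Retries and timeouts are configured on the task definitions a workflow refers to, and a workflow can name a failureWorkflow to run when it fails.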
  • Parameters: Configuring specific values for tasks.
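
To give the error-handling lever some shape, here is a hedged Python sketch of a task definition with retry and timeout settings, plus a helper that computes the resulting wait times. The field names follow Conductor's task definition schema as best recalled (retryCount, retryLogic, retryDelaySeconds, timeoutSeconds, timeoutPolicy); the doubling schedule in retry_delays is a generic model of exponential backoff, not necessarily the server's exact formula.

```python
# Hypothetical task definition for the payment step; values are examples only.
process_payment_def = {
    "name": "processPayment",
    "retryCount": 3,
    "retryLogic": "EXPONENTIAL_BACKOFF",  # or "FIXED"
    "retryDelaySeconds": 10,
    "timeoutSeconds": 120,
    "timeoutPolicy": "RETRY",
}


def retry_delays(retry_count, retry_delay_seconds, retry_logic="FIXED"):
    """Return the wait, in seconds, before each retry attempt."""
    delays = []
    for attempt in range(retry_count):
        if retry_logic == "EXPONENTIAL_BACKOFF":
            # Generic doubling model: base, 2x base, 4x base, ...
            delays.append(retry_delay_seconds * 2 ** attempt)
        else:
            delays.append(retry_delay_seconds)
    return delays


schedule = retry_delays(
    process_payment_def["retryCount"],
    process_payment_def["retryDelaySeconds"],
    process_payment_def["retryLogic"],
)
print(schedule)
```

Working out the schedule up front is a quick way to check that the total retry window fits inside whatever the calling system is prepared to wait.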

One powerful feature is its ability to handle long-running, stateful operations. Unlike a simple HTTP call that might time out, a Conductor workflow can be paused indefinitely, waiting for an external event or human input, and then resumed without losing its place. This is achieved by storing the entire workflow state in the database, so even if the Conductor server restarts, it can pick up exactly where it left off.
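
The idea of surviving a restart by persisting state can be illustrated with a small Python sketch. It uses SQLite purely as a stand-in for Conductor's backing store; the workflow_state table, the state layout, and the helper names are all invented for this example.

```python
import json
import os
import sqlite3
import tempfile


def open_store(path):
    """Open (or create) the toy state store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workflow_state (id TEXT PRIMARY KEY, state TEXT)"
    )
    return conn


def save_state(conn, workflow_id, state):
    """Persist the full workflow state as JSON."""
    conn.execute(
        "INSERT OR REPLACE INTO workflow_state (id, state) VALUES (?, ?)",
        (workflow_id, json.dumps(state)),
    )
    conn.commit()


def load_state(conn, workflow_id):
    """Load a workflow's state, or None if it is unknown."""
    row = conn.execute(
        "SELECT state FROM workflow_state WHERE id = ?", (workflow_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "conductor_toy.db")

    conn = open_store(db_path)
    save_state(conn, "w1", {
        "completed": ["checkInventory_1"],
        "waiting_on": "processPayment_1",
    })
    conn.close()  # simulate the server process going away

    conn = open_store(db_path)  # "restart": a fresh connection, same file
    resumed = load_state(conn, "w1")
    conn.close()

print(resumed)
```

Because every state change is written before the server acts on it, the second connection sees exactly what the first one left behind, which is the property that lets a paused workflow wait for days.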

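The dependsOn entries in the example describe a directed acyclic graph (DAG), here a simple linear chain. In Conductor itself the graph comes from the order of the tasks array together with operators such as FORK_JOIN, JOIN, and DECISION rather than from an explicit dependsOn field, so verify that field against your version. Either way, the mechanics are the same: when a task completes successfully, Conductor works out which tasks now have all their prerequisites met and schedules them.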
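
The readiness check can be modeled in a few lines of Python. This is a toy scheduler over the dependsOn lists from the example definition, not Conductor's own decision logic, and ready_tasks is a hypothetical helper.

```python
def ready_tasks(tasks, completed):
    """Return reference names whose dependencies have all completed."""
    ready = []
    for task in tasks:
        ref = task["taskReferenceName"]
        if ref in completed:
            continue
        if all(dep in completed for dep in task.get("dependsOn", [])):
            ready.append(ref)
    return ready


# Dependency structure copied from the example workflow definition.
tasks = [
    {"taskReferenceName": "checkInventory_1"},
    {"taskReferenceName": "processPayment_1", "dependsOn": ["checkInventory_1"]},
    {"taskReferenceName": "generateShippingLabel_1", "dependsOn": ["processPayment_1"]},
    {"taskReferenceName": "sendOrderConfirmation_1", "dependsOn": ["generateShippingLabel_1"]},
]

completed = set()
order = []
while len(completed) < len(tasks):
    batch = ready_tasks(tasks, completed)
    if not batch:
        # Nothing is runnable but work remains: the graph has a cycle.
        raise ValueError("dependency cycle detected")
    order.extend(batch)
    completed.update(batch)  # pretend every ready task succeeds

print(order)
```

For this linear chain each batch contains a single task, so the order matches the order of the tasks array; with branching dependencies a batch could hold several tasks that are free to run in parallel.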

The next major challenge you’ll encounter is managing complex error handling and retry strategies across multiple tasks, especially when dealing with external dependencies that have their own failure modes.
