Ray Task Graphs: Parallelism Patterns Explained (2026)

Ray’s task graph is a directed acyclic graph (DAG) in which nodes represent tasks and edges represent data dependencies. In real programs it is often larger and more deeply nested than that definition suggests.

Let’s see it in action. Imagine we have a simple workflow: fetch some data, process it, and then aggregate the results.

import ray
import time

ray.init()

@ray.remote
def fetch_data(source_id):
    print(f"Fetching data from {source_id}...")
    time.sleep(1) # Simulate network latency
    return {"data": f"data_from_{source_id}", "source": source_id}

@ray.remote
def process_data(data_chunk):
    print(f"Processing data chunk: {data_chunk['source']}...")
    time.sleep(0.5) # Simulate CPU work
    return {"processed": data_chunk["data"].upper(), "source": data_chunk["source"]}

@ray.remote
def aggregate_results(*processed_chunks):
    # Chunks arrive as separate top-level arguments so Ray resolves each
    # ObjectRef to its value before this task runs; ObjectRefs nested inside
    # a list would be passed through unresolved.
    print("Aggregating results...")
    time.sleep(0.2)
    aggregated = {}
    for chunk in processed_chunks:
        aggregated[chunk["source"]] = chunk["processed"]
    return aggregated

# Define the workflow as a task graph
source_ids = ["A", "B", "C"]
data_fetches = [fetch_data.remote(sid) for sid in source_ids]
processed_data = [process_data.remote(fetch_result) for fetch_result in data_fetches]
final_result = aggregate_results.remote(*processed_data)

# Retrieve the final result
result = ray.get(final_result)
print(f"Final aggregated result: {result}")

ray.shutdown()

When you run this, Ray does not execute the tasks one after another. Each .remote() call returns immediately, and the dependencies between the calls form a DAG. fetch_data.remote("A"), fetch_data.remote("B"), and fetch_data.remote("C") can all start immediately because they have no dependencies. Once fetch_data.remote("A") finishes, its result becomes available as the argument to the matching process_data.remote() call, and Ray schedules that task as soon as its input (fetch_result) is ready, without waiting for B or C. Running independent tasks in parallel while honoring these dependencies is what the task graph orchestrates.

The core problem Ray’s task graph solves is managing complex, potentially distributed, data-dependent computations efficiently. Instead of you manually tracking which piece of data comes from which function call and when it is ready, Ray builds the dependency graph implicitly from your .remote() calls and the ObjectRefs you pass between them. Each .remote() call returns an ObjectRef, which is a future: a handle to a result that may not exist yet. When you pass an ObjectRef as an argument to another remote function, you create an edge in the task graph. (ray.put() also returns an ObjectRef that can be passed the same way, and ray.get() blocks until a result is available.) Ray’s scheduler uses these dependencies to start each task once its inputs are ready, so independent tasks run in parallel as far as the cluster’s resources allow.

Conceptually, the graph is made of tasks and ObjectRefs: a task represents a call to a remote function, and an ObjectRef represents its future result. Ray does not hand you a single graph object to inspect or manipulate; the graph is implicit in the submitted tasks and the ObjectRefs they consume and produce. The scheduler tracks these dependencies. When all dependencies for a given task are met (i.e., all of its input ObjectRefs have been resolved and their data is available), the scheduler can dispatch that task to an available worker. This allows fine-grained parallelism, where even small, independent computations can run concurrently.
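The dispatch rule described above (run a task once every input is resolved) can be sketched with the standard library alone. This is a toy model, not Ray’s scheduler: run_graph, tasks, and deps are hypothetical names, a thread pool stands in for workers, and graphlib.TopologicalSorter stands in for dependency tracking.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

def run_graph(tasks, deps, max_workers=4):
    """Run `tasks` (name -> callable taking a dict of dependency results),
    respecting `deps` (name -> set of prerequisite task names)."""
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    results, in_flight = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Dispatch every task whose dependencies are all resolved.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, ())}
                in_flight[pool.submit(tasks[name], inputs)] = name
            # Block until at least one running task finishes.
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for fut in done:
                name = in_flight.pop(fut)
                results[name] = fut.result()
                sorter.done(name)  # may unblock downstream tasks
    return results

# The fetch -> process -> aggregate shape from the example, with two sources.
tasks = {
    "fetch_A": lambda _: "a",
    "fetch_B": lambda _: "b",
    "process_A": lambda d: d["fetch_A"].upper(),
    "process_B": lambda d: d["fetch_B"].upper(),
    "aggregate": lambda d: sorted(d.values()),
}
deps = {
    "process_A": {"fetch_A"},
    "process_B": {"fetch_B"},
    "aggregate": {"process_A", "process_B"},
}
results = run_graph(tasks, deps)
print(results["aggregate"])
```

Both fetch tasks are ready on the first pass, each process task becomes ready as soon as its own fetch finishes, and aggregate waits for both.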

The power comes from composition. You can nest these task graphs: a remote function can itself launch other remote functions, creating subgraphs. The child tasks are scheduled like any others, so the nested structure behaves as one larger graph that grows as the program runs. A ray.get() call is a point where the caller blocks until the requested results are ready, but every task submitted before that call still runs as concurrently as its dependencies and the available resources allow.

A subtle but powerful aspect of Ray’s task graph is its ability to handle dynamic task creation. Unlike static job scheduling, where the entire execution plan is known upfront, Ray can generate new tasks and dependencies on the fly based on the results of previously executed tasks. This makes it ideal for adaptive algorithms, reinforcement learning, or any workflow where the next steps are not fully determined until intermediate results are available. The scheduler continuously inspects the graph and dispatches tasks as soon as their dependencies are satisfied, making it highly responsive to changing computation needs.

The next frontier is understanding how Ray’s object store interacts with the task graph to minimize data movement between tasks.