Ray Custom Resources: Schedule on GPUs and Accelerators (2026)

Ray’s custom resources let you schedule tasks and actors onto arbitrary hardware, not just CPUs and GPUs, across your Ray cluster.

Let’s see it in action. Imagine you have a specialized hardware accelerator, say a "TPU-v3" accelerator, that you want Ray to manage. You’d first define this resource on the nodes that have it.

import ray

# Assuming this node has 2 TPU-v3 accelerators, declare them when starting Ray.
# On a multi-node cluster, declare them per machine instead, e.g.
#   ray start --resources='{"TPU-v3": 2}'
ray.init(num_cpus=4, resources={"TPU-v3": 2})

Now, when you want to run a Ray task or actor that requires these accelerators, you specify it in the resources argument.

@ray.remote(resources={"TPU-v3": 1})
def train_on_tpu():
    # This function will only run on a node that has at least 1 "TPU-v3" resource available.
    print("Training on TPU-v3!")
    return "Training complete"

# Launch the task
result = train_on_tpu.remote()
print(ray.get(result))

This works for actors too:

@ray.remote(resources={"TPU-v3": 2})
class TPUModel:
    def __init__(self):
        print("TPUModel initialized, using 2 TPU-v3 resources.")

    def predict(self, data):
        print("Predicting using TPU-v3...")
        return "Prediction done"

# Instantiate the actor
model = TPUModel.remote()
# Actor methods are invoked with .remote(), which returns an object reference
prediction_result = ray.get(model.predict.remote("sample_data"))
print(prediction_result)

Ray’s scheduler, when given a task or actor that requests custom resources, will look for nodes in the cluster that advertise those resources and have the requested quantity available. If a node has 4 "TPU-v3" resources and a task requests 2, that node can still fulfill other requests for 1 or 2 "TPU-v3" resources until its available count drops to zero. This is managed by Ray’s internal actor and task scheduling logic, which tracks resource availability dynamically.

The core problem Ray custom resources solve is heterogeneous hardware environments. In deep learning, you might have a mix of GPUs (like NVIDIA A100s and V100s), TPUs, FPGAs, or even specialized ASICs. Without a unified way to declare and request these, you’d have to manually partition your cluster or use complex external schedulers. Ray’s approach lets you treat them as first-class citizens alongside CPUs and standard GPUs. You declare the total custom resources of a node when it joins the cluster: through ray.init() when Ray starts locally, or through the --resources flag of ray start on each machine of a multi-node cluster. Tasks and actors then request a portion of those declared resources via @ray.remote or .options().

When you define a custom resource like "TPU-v3": 2 during ray.init(), you are telling Ray that this particular machine possesses two units of a resource named "TPU-v3". These units are abstract. Ray doesn’t know what a "TPU-v3" is intrinsically; it only knows it’s a countable quantity that can be allocated, and it does not check that the hardware exists or stop a task from touching devices it did not request. When a task or actor is launched with resources={"TPU-v3": 1}, the scheduler finds a node with at least one "TPU-v3" available, decrements that node’s available count by one, and schedules the task or actor there. A task releases its units when it finishes; an actor holds them for its whole lifetime and releases them when it exits. This abstraction allows Ray to manage any type of hardware uniformly, as long as you can assign a name and a quantity to it.
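The bookkeeping described above can be pictured with a toy model in plain Python. This is only a sketch: ToyNode and schedule are hypothetical names invented for illustration, and Ray’s real scheduler is distributed and considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy stand-in for one cluster node; not Ray's actual data structure."""
    name: str
    total: dict
    available: dict = field(init=False)

    def __post_init__(self):
        # Every declared unit starts out free.
        self.available = dict(self.total)

    def can_fit(self, request):
        # A request fits only if every named resource has enough free units.
        return all(self.available.get(k, 0) >= v for k, v in request.items())

    def acquire(self, request):
        for k, v in request.items():
            self.available[k] -= v

    def release(self, request):
        for k, v in request.items():
            self.available[k] += v


def schedule(nodes, request):
    """Return the first node that fits the request, or None if it must wait."""
    for node in nodes:
        if node.can_fit(request):
            node.acquire(request)
            return node
    return None


nodes = [
    ToyNode("cpu-only", {"CPU": 8}),
    ToyNode("tpu-host", {"CPU": 4, "TPU-v3": 2}),
]

first = schedule(nodes, {"TPU-v3": 1})   # lands on tpu-host, 1 unit left
second = schedule(nodes, {"TPU-v3": 1})  # lands on tpu-host, 0 units left
third = schedule(nodes, {"TPU-v3": 1})   # nothing fits, so the request waits

first.release({"TPU-v3": 1})             # a task finishes and returns its unit
fourth = schedule(nodes, {"TPU-v3": 1})  # the freed unit is handed out again

print(first.name, second.name, third, fourth.name)
```

Note that the model treats "TPU-v3" as nothing more than a dictionary key with a count, which is the point of the abstraction: the name carries no meaning to the scheduler.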

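If you’re using NVIDIA GPUs, Ray automatically detects them and exposes them as the built-in GPU resource. You request them with the dedicated num_gpus argument, for example @ray.remote(num_gpus=1), rather than through the resources dictionary, which rejects the reserved "CPU" and "GPU" keys. The built-in GPU count says nothing about the model, though. When a workload needs a specific type of GPU (e.g., NVIDIA-V100), you can tag the nodes that have it with a custom resource and request that tag alongside num_gpus; Ray also offers an accelerator_type option for common GPU models. This is particularly useful if you have a mix of GPU types and want to ensure a workload runs on the more powerful ones, or if you have non-GPU accelerators.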

The most surprising thing about custom resources is that their names are entirely arbitrary strings. You could define a resource called "MySpecialChip" or "SuperFastFPGA" and Ray would manage it exactly the same way it manages "TPU-v3". The key is consistency: the name used in ray.init() to declare the resource must match, character for character, the name used in the resources argument of @ray.remote or .options(). A request for a name that no node declares is never scheduled; the task simply stays pending.

This mechanism is also how you’d manage non-GPU accelerators. If you have a cluster with FPGAs, you could initialize a node with ray.init(resources={"FPGA": 4}) and then schedule tasks requiring them using my_task.options(resources={"FPGA": 1}).remote(). Underneath, Ray’s resource tracking maintains a cluster-wide view of what each connected node has declared and what is currently free.

The next concept to explore is how Ray handles resource contention and fractional resource allocation.