The tolerations and affinity rules you deploy with Skaffold aren’t about pinning a pod to one specific node. Affinity steers pods toward suitable nodes, and tolerations allow pods to be scheduled onto tainted nodes that would otherwise repel them.
Let’s see this in action. Imagine you have a Kubernetes cluster with different types of nodes: some are general-purpose, others are specialized (e.g., GPU nodes, or nodes with specific hardware). You want your development pods to land on the general-purpose nodes, not the specialized ones, unless absolutely necessary.
Here’s a snippet from a skaffold.yaml:
    apiVersion: skaffold/v2beta29
    kind: Config
    deploy:
      kubectl:
        manifests:
          - k8s/*.yaml
    profiles:
      - name: dev
        deploy:
          kubectl:
            # The dev profile deploys only the manifest that carries the
            # dev scheduling rules.
            manifests:
              - k8s/dev-deployment.yaml

Skaffold itself doesn't directly set affinity or tolerations in its config. You define them in the pod spec of your Kubernetes manifests, and Skaffold applies those manifests. A profile can select which manifests to deploy (profile patches modify the Skaffold config itself, not the Kubernetes manifests). A common pattern is to keep one manifest per environment: if you had k8s/dev-deployment.yaml and k8s/prod-deployment.yaml, and only k8s/dev-deployment.yaml carried the affinity and tolerations, activating the dev profile would deploy that manifest. If you need to vary scheduling rules without duplicating manifests, you'd typically use a templating tool like Helm or Kustomize and have Skaffold invoke it. For raw kubectl, you rely on the manifests themselves.

So let's pivot to the Kubernetes manifest side, as that's where the actual scheduling rules live. Consider your Deployment manifest (e.g., k8s/dev-deployment.yaml). This version requires Linux nodes, prefers the dev-pool node group, and tolerates the dedicated taint:

    # ...
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: kubernetes.io/os
                        operator: In
                        values:
                          - linux
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 100
                  preference:
                    matchExpressions:
                      - key: nodegroup
                        operator: In
                        values:
                          - dev-pool  # Prefer nodes in the 'dev-pool' node group
          tolerations:
            - key: "dedicated"
              operator: "Exists"
              effect: "NoSchedule"  # Tolerate nodes with the 'dedicated' taint
    # ...

A stricter variant makes the node group a hard requirement, adds a soft preference on a second label, and tolerates the special=gpu taint so that tainted GPU nodes are not ruled out. Note that a toleration only permits scheduling onto a tainted node; it does not attract the pod to it.

    # ...
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: nodegroup
                        operator: In
                        values:
                          - dev
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 1
                  preference:
                    matchExpressions:
                      - key: another-label
                        operator: In
                        values:
                          - some-value
          tolerations:
            - key: "special"
              operator: "Equal"
              value: "gpu"
              effect: "NoSchedule"
    # ...

The mental model here is that Kubernetes’ scheduler is the ultimate arbiter of where pods land. Skaffold’s role is to apply the configuration that tells the scheduler what to do.
What problem does this solve?
In a multi-tenant or mixed-workload Kubernetes cluster, you often have nodes with different capabilities or costs. You might have expensive GPU nodes or nodes reserved for critical system services. You don’t want your transient development pods consuming these resources or being scheduled onto them by default. Affinity and tolerations allow you to guide the scheduler to place pods on the right nodes.
- Node Affinity: This is like a "preferred seating" or "must sit here" rule for pods.
  - requiredDuringSchedulingIgnoredDuringExecution: The pod will not be scheduled if no node matches these rules. This is a hard requirement.
  - preferredDuringSchedulingIgnoredDuringExecution: The scheduler will try to place the pod on nodes matching these rules, but if it can’t, it will still schedule it elsewhere (as long as other constraints allow). This is a soft preference.
- Tolerations: These are applied to pods and tell the scheduler, "It’s okay if this node has a taint that would normally repel me." Taints are applied to nodes to repel pods.
  - key, operator, value, effect: These define the taint you’re willing to tolerate. The effect names what the taint does to pods that don’t tolerate it: NoSchedule means the pod won’t be scheduled there; PreferNoSchedule is a soft preference; NoExecute also evicts pods already running there.
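To make the matching rules concrete, here is a minimal Python sketch of how a toleration might be compared against a taint. It is a simplified model for illustration, not the scheduler's code: the Taint and Toleration classes and the tolerates and untolerated helpers are hypothetical names, and the rules assumed are that an empty effect matches any effect, an empty key matches any key, Exists ignores the value, and Equal compares values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""        # assumption: empty key (with "Exists") matches any taint key
    operator: str = "Equal"
    value: str = ""
    effect: str = ""     # assumption: empty effect matches any effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this toleration matches the given taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    # "Equal" additionally requires the values to match.
    return toleration.value == taint.value


def untolerated(taints, tolerations):
    """Taints that no toleration on the pod matches."""
    return [t for t in taints if not any(tolerates(tol, t) for tol in tolerations)]


gpu_taint = Taint(key="special", value="gpu", effect="NoSchedule")
dedicated_taint = Taint(key="dedicated", value="infra", effect="NoSchedule")

pod_tolerations = [
    Toleration(key="special", operator="Equal", value="gpu", effect="NoSchedule"),
]

# Only the 'dedicated' taint is left over, so a node carrying it would repel this pod.
print(untolerated([gpu_taint, dedicated_taint], pod_tolerations))
```

A pod with the Exists toleration from the second manifest style (key dedicated, no value) would match dedicated_taint regardless of its value, which is why Exists is the usual choice when the taint value is not important.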
How it works internally (Kubernetes Scheduler):
When a pod is created, it enters the Pending state. The Kubernetes scheduler watches for pending pods. For each pod, it:
- Filters: It looks at all nodes in the cluster and filters out any that cannot run the pod. This includes nodes that don’t meet nodeSelector or requiredDuringScheduling nodeAffinity rules, and nodes with NoSchedule or NoExecute taints that the pod doesn’t tolerate.
- Scores: For the remaining nodes, it assigns a score based on preferredDuringScheduling nodeAffinity rules and other scoring plugins (like resource utilization). Nodes that better match preferences get higher scores.
- Selects: It picks the node with the highest score. If multiple nodes have the same highest score, it chooses one arbitrarily.
- Binds: It "binds" the pod to the selected node, meaning the pod is now scheduled.
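The four steps above can be sketched as a toy Python model. This is a deliberate simplification, not the real scheduler: the node and pod dictionaries, the has_labels, blocking_taints, and schedule helpers, and the sample node names are all hypothetical, taints are matched by key only, and the only scoring input is preferred affinity weight (the real scheduler combines many scoring plugins).

```python
import random


def has_labels(node, required):
    """Simplified hard requirement: every required label must match exactly."""
    return all(node["labels"].get(k) == v for k, v in required.items())


def blocking_taints(node, tolerated_keys):
    """NoSchedule/NoExecute taints the pod does not tolerate (matched by key only)."""
    return [
        t for t in node["taints"]
        if t["effect"] in ("NoSchedule", "NoExecute") and t["key"] not in tolerated_keys
    ]


def schedule(pod, nodes, rng=random):
    # 1. Filter: drop nodes that fail a hard rule.
    feasible = [
        n for n in nodes
        if has_labels(n, pod["required_labels"])
        and not blocking_taints(n, pod["tolerated_keys"])
    ]
    if not feasible:
        return None  # the pod stays Pending

    # 2. Score: add the weight of every preference the node satisfies.
    def score(node):
        return sum(
            weight
            for weight, key, value in pod["preferences"]
            if node["labels"].get(key) == value
        )

    # 3. Select: highest score wins; ties are broken arbitrarily.
    best = max(score(n) for n in feasible)
    winner = rng.choice([n for n in feasible if score(n) == best])

    # 4. Bind: the real scheduler records the decision on the pod; here we return the name.
    return winner["name"]


nodes = [
    {"name": "general-1", "labels": {"kubernetes.io/os": "linux"}, "taints": []},
    {"name": "dev-1",
     "labels": {"kubernetes.io/os": "linux", "nodegroup": "dev-pool"},
     "taints": []},
    {"name": "gpu-1",
     "labels": {"kubernetes.io/os": "linux", "nodegroup": "dev-pool"},
     "taints": [{"key": "special", "value": "gpu", "effect": "NoSchedule"}]},
]
pod = {
    "required_labels": {"kubernetes.io/os": "linux"},
    "preferences": [(100, "nodegroup", "dev-pool")],
    "tolerated_keys": {"dedicated"},
}
print(schedule(pod, nodes))  # dev-1
```

In this sample, gpu-1 satisfies the preference but never reaches the scoring step because its special taint is not tolerated, so dev-1 wins over general-1 on preference weight alone.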
The Exact Levers You Control (in your Kubernetes manifests):
nodeSelector (in the pod spec): A simple map of key-value pairs. The pod will only be scheduled on nodes that have all of these labels.

    spec:
      nodeSelector:
        disktype: ssd

nodeAffinity (in the pod spec): More powerful. Can be required or preferred, and supports more complex match expressions (e.g., In, NotIn, Exists, DoesNotExist).

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
                      - arm64
          preferredDuringSchedulingIgnoredDuringExecution:
            # matchFields matches on metadata.name and accepts exactly one
            # value per expression, so each node gets its own term.
            - weight: 1
              preference:
                matchFields:
                  - key: metadata.name
                    operator: In
                    values:
                      - node-1
            - weight: 1
              preference:
                matchFields:
                  - key: metadata.name
                    operator: In
                    values:
                      - node-2

tolerations (in the pod spec): Lets pods run on nodes with matching taints.

    spec:
      tolerations:
        - key: "key1"
          operator: "Equal"
          value: "value1"
          effect: "NoSchedule"
        - key: "key2"
          operator: "Exists"
          effect: "NoExecute"
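As an illustration of how the match expression operators behave, the following Python sketch evaluates nodeSelectorTerms against a node's labels. It is a hedged model, not Kubernetes code: matches_expression and matches_node_selector are hypothetical helpers, only the four operators named above are covered, and it assumes that terms are ORed together, expressions within a term are ANDed, NotIn also matches when the label is absent, and an empty term matches nothing.

```python
def matches_expression(labels, expr):
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        # Assumption: a node without the label also satisfies NotIn.
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {op}")


def matches_node_selector(labels, node_selector_terms):
    """Terms are ORed together; expressions inside one term are ANDed."""
    return any(
        bool(term.get("matchExpressions"))  # an empty term matches nothing
        and all(matches_expression(labels, e) for e in term["matchExpressions"])
        for term in node_selector_terms
    )


terms = [
    {"matchExpressions": [
        {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64", "arm64"]},
    ]},
]
print(matches_node_selector({"kubernetes.io/arch": "arm64"}, terms))  # True
print(matches_node_selector({"kubernetes.io/arch": "s390x"}, terms))  # False
```

Because terms are ORed, adding a second entry to nodeSelectorTerms widens the set of acceptable nodes, while adding a second expression inside the same term narrows it.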
The one thing most people don’t know:
While requiredDuringSchedulingIgnoredDuringExecution node affinity is a hard requirement for scheduling, preferredDuringSchedulingIgnoredDuringExecution rules are only used for scoring, which happens after nodes with untolerated NoSchedule or NoExecute taints have been filtered out. This means a preferred rule can tip the choice between two nodes that the pod is already allowed to land on, but it cannot pull the pod onto a node whose NoSchedule taint the pod doesn’t tolerate.
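The next concept you’ll likely encounter is pod anti-affinity, which is the inverse of pod affinity. Instead of matching node labels, it keeps a pod away from nodes or zones that already run certain other pods, which helps distribute replicas across nodes or availability zones to improve resilience.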