RKE2, Rancher’s next-generation Kubernetes distribution, is designed for production environments, prioritizing security and stability.

Here’s a look at RKE2 in action, specifically focusing on a production setup managed by Rancher.

Imagine you have a set of bare-metal servers or cloud VMs ready to become your Kubernetes cluster. Instead of manually configuring kubeadm or other complex setups, Rancher simplifies this by providing a UI-driven approach to deploying and managing RKE2 clusters.

Let’s say you’ve logged into your Rancher instance. You navigate to "Cluster Management," then "Create." You select "RKE2/K3s" as your cluster type. Rancher then presents you with a series of configuration options.

You’ll define your cluster’s name, the Kubernetes version you want to use (e.g., v1.27.5+rke2r1), and the network provider (e.g., Canal, which is a combination of Flannel and Calico for CNI).

For a production setup, you’ll want to configure etcd. RKE2 embeds etcd directly, and Rancher allows you to specify the etcd snapshot configuration. This typically means a cron schedule (e.g., 0 */12 * * * for every 12 hours) and a retention count (e.g., 6 to keep the six most recent snapshots). These snapshots are crucial for disaster recovery.

You’ll also configure agent and server node pools. For servers (which run the control plane and etcd), you’ll specify the number of nodes, normally an odd number such as three so that etcd keeps quorum if one node fails. For agents (which run your workloads), you’ll define their count and potentially their machine pools for scaling.
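
Behind the UI, Rancher records these choices as a Cluster resource in its provisioning API. The following is a minimal sketch of what such a manifest might look like, assuming a node-driver cluster on EC2. The cluster name, pool names, pool sizes, and machineConfigRef targets are hypothetical, required details such as the cloud credential are omitted, and the fields available depend on your Rancher version.

```yaml
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: prod-rke2              # hypothetical cluster name
  namespace: fleet-default     # namespace Rancher normally uses for provisioned clusters
spec:
  kubernetesVersion: v1.27.5+rke2r1
  rkeConfig:
    machineGlobalConfig:
      cni: canal               # network provider chosen in the UI
    etcd:
      snapshotScheduleCron: "0 */12 * * *"   # every 12 hours
      snapshotRetention: 6                   # keep the six most recent snapshots
    machinePools:
      - name: servers          # control plane and etcd
        quantity: 3
        etcdRole: true
        controlPlaneRole: true
        machineConfigRef:
          kind: Amazonec2Config
          name: prod-servers   # hypothetical machine config
      - name: agents           # workload nodes
        quantity: 5
        workerRole: true
        machineConfigRef:
          kind: Amazonec2Config
          name: prod-agents    # hypothetical machine config
```

Keeping a manifest like this in version control is one way to make the cluster definition reviewable, although the UI remains the simpler path for a first deployment.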

Under "Advanced Options," you can fine-tune RKE2’s behavior. For instance, you might set tls-san to add IP addresses or DNS names as Subject Alternative Names on the server’s TLS certificate, so that clients can reach the cluster through them. You could also use kube-apiserver-arg to pass specific arguments to the Kubernetes API server, such as --audit-log-path=/var/log/kube-audit/audit.log to choose where audit logs are written.
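
As a rough illustration, these two options might end up in a server’s RKE2 configuration as shown below. This is a sketch rather than a complete file: the audit-log-maxage line is an extra assumption added for illustration, and because RKE2 runs the API server as a static pod, a log path outside RKE2’s own data directory may also need to be mounted into that pod, which is not shown here.

```yaml
# Fragment of an RKE2 server configuration (sketch)
tls-san:
  - "192.168.1.100"          # extra IP added to the server certificate
  - "rancher.mydomain.com"   # extra DNS name added to the server certificate
kube-apiserver-arg:
  # Arguments are written as key=value, without the leading dashes
  - "audit-log-path=/var/log/kube-audit/audit.log"
  - "audit-log-maxage=30"    # assumption: keep audit logs for 30 days
```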

Rancher then generates a configuration file (typically /etc/rancher/rke2/config.yaml) on the nodes you designate as servers. This file contains all your chosen settings. For example:

write-kubeconfig-mode: "0644"
token: "MYSECURETOKEN"
tls-san:
  - "192.168.1.100"
  - "rancher.mydomain.com"
# CIS hardening profile; the accepted profile name depends on the RKE2 version
profile: "cis-1.23"
etcd-snapshot-schedule-cron: "0 */12 * * *"
etcd-snapshot-retention: 6

Once you click "Create" in Rancher, it orchestrates the installation of RKE2 on your specified nodes. It downloads the RKE2 binaries, configures the systemd services for the agent and server components, and bootstraps the Kubernetes control plane. Rancher monitors this process, and if successful, you’ll see your new RKE2 cluster appear in the Rancher UI, ready for deploying applications.

The problem RKE2 solves is the complexity and potential for misconfiguration when setting up hardened Kubernetes clusters for production. It provides an opinionated, secure-by-default distribution that reduces the attack surface and simplifies operations. Internally, RKE2 ships as a single binary that runs in either server or agent mode, is configured through config.yaml, and is managed by systemd. Having one binary and one configuration file, combined with hardened defaults, leaves fewer moving parts to misconfigure than a hand-assembled Kubernetes installation.

One aspect often overlooked is how RKE2 manages its node registration token. This token is a shared secret that a joining node presents to the RKE2 server during the bootstrapping process. In its full form it also embeds a hash of the cluster CA certificate, which lets the joining node verify that it is talking to the intended server. If the token a node presents does not match the one the servers expect, that node won’t be able to join the cluster; if the token is compromised, anyone who can reach the server’s registration port can attempt to join a node. Rancher distributes the token for the clusters it provisions, though understanding the underlying format is key for advanced troubleshooting.
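
To make this concrete, here is a minimal sketch of the configuration a manually joined agent node might use. The server address reuses the example IP from earlier, and the token value is a placeholder showing the general shape of the full token format, not a real credential. On a Rancher-provisioned cluster you would not normally write this file by hand.

```yaml
# /etc/rancher/rke2/config.yaml on a joining agent node (sketch)
server: https://192.168.1.100:9345   # 9345 is the RKE2 registration port
# Placeholder token. The full format is a K10 prefix, a hash of the cluster
# CA certificate, then the credentials. The real value can be read from
# /var/lib/rancher/rke2/server/node-token on an existing server node.
token: "K10<ca-cert-hash>::<username>:<password>"
```

When a join fails, comparing the token on the failing node with the value stored on a server is usually a quick first check.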

The next step in managing a production RKE2 cluster is often setting up robust monitoring and logging solutions, such as Prometheus and Grafana, or integrating with external SIEM systems.
