The Rancher API gateway rejects authentication tokens whose validity period has expired, which prevents clients from authenticating.
Common Causes and Fixes for Rancher Token Expiry
- Expired rancher.auth.token-max-ttl Setting:
  - Diagnosis: Check the Rancher server configuration for the maximum token Time-To-Live (TTL). This is typically set via Helm values or environment variables.

        helm get values -n cattle-system <rancher-release-name> -o yaml | grep -i tokenMaxTtl

    Or, if running directly:

        kubectl describe pods -n cattle-system <rancher-pod-name> | grep TOKEN_MAX_TTL

  - Fix: Increase the token-max-ttl value. For example, to set it to 24 hours (86400 seconds):

        # In your rancher-values.yaml or directly in helm upgrade command
        rancher:
          auth:
            tokenMaxTtl: 86400

    Then apply the changes:

        helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f rancher-values.yaml

  - Why it works: This directly extends the maximum allowable lifespan for any generated authentication token on the Rancher server.
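The TTL examples in this section are written in seconds (24 hours as 86400), so a small conversion helper can prevent arithmetic slips when preparing a values file. The following is a minimal shell sketch; `hours_to_seconds` is a hypothetical name and is not part of Rancher or Helm.

```shell
#!/bin/sh
# hours_to_seconds: hypothetical helper (not a Rancher or Helm command).
# Converts a whole number of hours into seconds for a TTL value.
hours_to_seconds() {
    hours=$1
    # Reject anything that is not a non-negative integer.
    case "$hours" in
        ''|*[!0-9]*)
            echo "usage: hours_to_seconds <hours>" >&2
            return 1
            ;;
    esac
    echo $(( hours * 3600 ))
}

hours_to_seconds 24   # prints 86400
hours_to_seconds 48   # prints 172800
```

The printed number can then be copied into the values file in place of a hand-computed figure.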
- Expired rancher.auth.session-cookie-max-age-seconds Setting:
  - Diagnosis: Similar to token-max-ttl, check the session cookie’s maximum age. This affects how long the browser remembers a successful login.

        helm get values -n cattle-system <rancher-release-name> -o yaml | grep -i sessionCookieMaxAgeSeconds

  - Fix: Increase session-cookie-max-age-seconds. For instance, to set it to 24 hours (86400 seconds):

        # In your rancher-values.yaml or directly in helm upgrade command
        rancher:
          auth:
            sessionCookieMaxAgeSeconds: 86400

    Then apply the changes:

        helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f rancher-values.yaml

  - Why it works: This ensures that the HTTP cookie used by the browser to maintain the authenticated session remains valid for a longer period, preventing premature logouts.
- Expired rancher.auth.kubeconfig-kubeconfig-max-ttl Setting:
  - Diagnosis: This setting controls the expiration of tokens embedded within generated kubeconfig files for cluster access.

        helm get values -n cattle-system <rancher-release-name> -o yaml | grep -i kubeconfigKubeconfigMaxTtl

  - Fix: Increase kubeconfig-kubeconfig-max-ttl. For example, to set it to 48 hours (172800 seconds):

        # In your rancher-values.yaml or directly in helm upgrade command
        rancher:
          auth:
            kubeconfigKubeconfigMaxTtl: 172800

    Then apply the changes:

        helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f rancher-values.yaml

  - Why it works: Generated kubeconfig files will contain tokens that are valid for a longer duration, allowing users to maintain cluster access without frequent re-authentication.
- System Clock Skew:
  - Diagnosis: Verify that the system clocks on your Rancher server nodes and the client machines are synchronized. Significant clock drift can cause tokens to appear expired prematurely.

        # On Rancher server node (UTC, so time zones do not distort the comparison)
        date -u
        # On client machine
        date -u

    Compare the outputs. A difference of more than a few minutes is a problem.
  - Fix: Configure your servers and clients to use NTP (Network Time Protocol) to synchronize their clocks. Ensure NTP is enabled and running on all relevant systems.

        # Example for Ubuntu/Debian
        sudo apt update
        sudo apt install ntp
        sudo systemctl enable ntp
        sudo systemctl start ntp

        # Example for CentOS/RHEL
        sudo yum install ntp
        sudo systemctl enable ntpd
        sudo systemctl start ntpd

  - Why it works: Synchronized clocks ensure that token validity periods are interpreted consistently across all systems involved in authentication.
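Comparing two clock readings by eye is error-prone, so the skew check can be sketched as a small script that works on Unix epoch seconds. This is only an illustration: the function names are hypothetical, the 120-second default limit is an assumption rather than a Rancher setting, and the timestamps are fixed sample values that would normally come from running `date -u +%s` on each machine.

```shell
#!/bin/sh
# clock_skew_seconds: hypothetical helper that prints the absolute
# difference between two Unix epoch timestamps, in seconds.
clock_skew_seconds() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# skew_within_limit: succeeds when the skew is at or below the limit.
# The 120-second default is an assumption made for this sketch.
skew_within_limit() {
    limit=${3:-120}
    [ "$(clock_skew_seconds "$1" "$2")" -le "$limit" ]
}

# Sample values; in practice, collect each one with `date -u +%s`
# on the Rancher server node and on the client machine.
server_epoch=1700000300
client_epoch=1700000000

if skew_within_limit "$server_epoch" "$client_epoch"; then
    echo "clock skew OK"
else
    echo "clock skew too large: $(clock_skew_seconds "$server_epoch" "$client_epoch")s"
fi
# prints: clock skew too large: 300s
```

A third argument to `skew_within_limit` overrides the limit, which is useful if your environment tolerates more or less drift.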
- Rancher Server Pod Restart:
  - Diagnosis: If Rancher was recently restarted or upgraded without proper configuration persistence, default or old token-max-ttl values might have been re-applied. Check the Rancher pod logs for any signs of unexpected restarts or configuration loading.

        kubectl logs -n cattle-system <rancher-pod-name>

  - Fix: Re-apply the desired token-max-ttl and related settings using helm upgrade or by updating the deployment manifest. Ensure your Helm values.yaml file is checked into version control.

        helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f path/to/your/rancher-values.yaml

  - Why it works: This ensures that the persistent configuration, including the extended token expiry settings, is correctly loaded and applied to the running Rancher API server.
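Before re-applying settings, it can help to confirm that the values file you are about to pass to helm upgrade still contains the expected key. The sketch below is a hypothetical pre-flight check, not a Helm feature: `values_has_key` is an invented name, the key and file name are taken from the examples in this section, and the match is a plain text search that does not parse YAML or check nesting.

```shell
#!/bin/sh
# values_has_key: hypothetical pre-flight helper, not a Helm feature.
# Succeeds when the file contains a line that defines the given key.
# This is a text match only; it does not parse YAML or verify nesting.
values_has_key() {
    file=$1
    key=$2
    grep -Eq "^[[:space:]]*${key}:" "$file"
}

# Example: report whether the expected key is present.
# The default file name follows the examples in this section.
values_file=${1:-rancher-values.yaml}
if [ -f "$values_file" ] && values_has_key "$values_file" tokenMaxTtl; then
    echo "tokenMaxTtl found in $values_file"
else
    echo "tokenMaxTtl missing from $values_file" >&2
fi
```

Running a check like this in CI against the version-controlled values file is one way to catch a setting that was dropped before an upgrade re-applies defaults.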
- External Authentication Provider (e.g., AD/LDAP, SAML) Token Expiry:
  - Diagnosis: If Rancher is configured to use an external authentication provider, the expiry of tokens issued by that provider can also cause issues. Check the configuration and logs of your external identity provider.
  - Fix: Adjust the session timeout or token lifetime settings within your external authentication provider’s configuration. For example, in Active Directory, this might involve adjusting Kerberos ticket lifetimes or relevant GPOs. For SAML, check the IdP’s session duration settings.
  - Why it works: This ensures that the authentication tokens generated by your external IdP remain valid for a duration that aligns with Rancher’s expectations and user needs.
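The expiry condition behind these causes can be illustrated with a short script: a token counts as expired once the current time reaches its creation time plus its TTL. This is a minimal sketch that assumes all times are Unix epoch seconds. The function name and sample values are hypothetical, and it does not claim to reflect how Rancher implements validation internally.

```shell
#!/bin/sh
# token_expired: hypothetical helper, not part of Rancher.
# Arguments: creation time, TTL in seconds, current time (epoch seconds).
# Succeeds (exit status 0) when the token is past its validity period.
token_expired() {
    created=$1
    ttl=$2
    now=$3
    [ "$now" -ge $(( created + ttl )) ]
}

# Sample values: a token with a 24-hour TTL, checked 25 hours later.
created=1700000000
ttl=86400
now=1700090000

if token_expired "$created" "$ttl" "$now"; then
    echo "token expired $(( now - (created + ttl) )) seconds ago"
else
    echo "token valid for another $(( created + ttl - now )) seconds"
fi
# prints: token expired 3600 seconds ago
```

Passing a skewed value for the current time shows how clock drift alone can flip the result for a token that is otherwise still valid.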
Once token expiry is resolved, the next failure you may encounter is a connection refused error, which occurs when the Rancher API server itself is unhealthy or inaccessible.