The Rancher API gateway is rejecting authentication tokens because they have expired, which prevents clients from authenticating.
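
Before changing any setting, it can help to confirm how close a token is to its expiry. The following is a minimal sketch, not Rancher tooling: it assumes GNU date and an ISO 8601 UTC expiry timestamp that you have already obtained for the token (for example from the token's details in the Rancher UI). The function names seconds_until and token_status are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch: report how long a token has left, given its expiry timestamp.
# Assumption: GNU date is available (the -d option parses the timestamp).
# Assumption: the timestamp is ISO 8601 in UTC, e.g. 2030-01-01T00:00:00Z.

# seconds_until EXPIRES_AT [NOW_EPOCH]
# Prints the seconds remaining; negative means the timestamp has passed.
seconds_until() {
  su_expires_epoch="$(date -u -d "$1" +%s)" || return 1
  su_now="${2:-$(date -u +%s)}"
  echo $(( su_expires_epoch - su_now ))
}

# token_status EXPIRES_AT [NOW_EPOCH]
# Prints "expired" or "valid for <N>s".
token_status() {
  ts_remaining="$(seconds_until "$@")" || return 1
  if [ "$ts_remaining" -le 0 ]; then
    echo "expired"
  else
    echo "valid for ${ts_remaining}s"
  fi
}
```

Calling token_status with only the timestamp compares it against the current time; the optional second argument exists so the check can be reproduced against a fixed point in time.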

Common Causes and Fixes for Rancher Token Expiry

  1. rancher.auth.token-max-ttl Set Too Low:

    • Diagnosis: Check the Rancher server configuration for the maximum token Time-To-Live. This is typically set via Helm values or environment variables.
      helm get values -n cattle-system <rancher-release-name> -o yaml | grep -i -e token-max-ttl -e tokenMaxTtl
      
      Or, to inspect the environment variables of a running pod:
      kubectl describe pods -n cattle-system <rancher-pod-name> | grep TOKEN_MAX_TTL
      
    • Fix: Increase the token-max-ttl value. For example, to set it to 24 hours (86400 seconds):
      # In your rancher-values.yaml or directly in helm upgrade command
      rancher:
        auth:
          tokenMaxTtl: 86400
      
      Then apply the changes:
      helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f rancher-values.yaml
      
    • Why it works: This directly extends the maximum allowable lifespan for any generated authentication token on the Rancher server.
  2. rancher.auth.session-cookie-max-age-seconds Set Too Low:

    • Diagnosis: Similar to token-max-ttl, check the session cookie’s maximum age. This affects how long the browser remembers a successful login.
      helm get values -n cattle-system <rancher-release-name> -o yaml | grep -i -e session-cookie-max-age-seconds -e sessionCookieMaxAgeSeconds
      
    • Fix: Increase session-cookie-max-age-seconds. For instance, to set it to 24 hours (86400 seconds):
      # In your rancher-values.yaml or directly in helm upgrade command
      rancher:
        auth:
          sessionCookieMaxAgeSeconds: 86400
      
      Then apply the changes:
      helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f rancher-values.yaml
      
    • Why it works: This ensures that the HTTP cookie used by the browser to maintain the authenticated session remains valid for a longer period, preventing premature logouts.
  3. rancher.auth.kubeconfig-kubeconfig-max-ttl Set Too Low:

    • Diagnosis: This setting controls the expiration of tokens embedded within generated kubeconfig files for cluster access.
      helm get values -n cattle-system <rancher-release-name> -o yaml | grep -i -e kubeconfig-kubeconfig-max-ttl -e kubeconfigKubeconfigMaxTtl
      
    • Fix: Increase kubeconfig-kubeconfig-max-ttl. For example, to set it to 48 hours (172800 seconds):
      # In your rancher-values.yaml or directly in helm upgrade command
      rancher:
        auth:
          kubeconfigKubeconfigMaxTtl: 172800
      
      Then apply the changes:
      helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f rancher-values.yaml
      
    • Why it works: Generated kubeconfig files will contain tokens that are valid for a longer duration, allowing users to maintain cluster access without frequent re-authentication.
  4. System Clock Skew:

    • Diagnosis: Verify that the system clocks on your Rancher server nodes and the client machines are synchronized. Significant clock drift can cause tokens to appear expired prematurely.
      # On Rancher server node (UTC, so time zone differences do not mislead)
      date -u
      
      # On client machine
      date -u
      
      Compare the outputs. A difference of more than a few minutes is enough to make valid tokens look expired.
    • Fix: Configure your servers and clients to use NTP (Network Time Protocol) to synchronize their clocks. Ensure NTP is enabled and running on all relevant systems.
      # Example for Ubuntu/Debian
      sudo apt update
      sudo apt install ntp
      sudo systemctl enable ntp
      sudo systemctl start ntp
      
      # Example for CentOS/RHEL (version 8 and later ship chrony, an NTP implementation, instead of the ntp package)
      sudo yum install chrony
      sudo systemctl enable chronyd
      sudo systemctl start chronyd
      
    • Why it works: Synchronized clocks ensure that token validity periods are interpreted consistently across all systems involved in authentication.
  5. Rancher Server Pod Restart:

    • Diagnosis: If Rancher was recently restarted or upgraded and the customized values were not passed to the upgrade, default or older token-max-ttl values may have been re-applied. Check the Rancher pod logs for signs of unexpected restarts or configuration reloads.
      kubectl logs -n cattle-system <rancher-pod-name>
      
    • Fix: Re-apply the desired token-max-ttl and related settings using helm upgrade or by updating the deployment manifest. Keep your Helm values.yaml file in version control so that every upgrade uses the same settings.
      helm upgrade -n cattle-system <rancher-release-name> rancher/rancher -f path/to/your/rancher-values.yaml
      
    • Why it works: This ensures that the persistent configuration, including the extended token expiry settings, is correctly loaded and applied to the running Rancher API server.
  6. External Authentication Provider (e.g., AD/LDAP, SAML) Token Expiry:

    • Diagnosis: If Rancher is configured to use an external authentication provider, the expiry of tokens issued by that provider can also cause issues. Check the configuration and logs of your external identity provider.
    • Fix: Adjust the session timeout or token lifetime settings within your external authentication provider’s configuration. For example, in Active Directory, this might involve adjusting Kerberos ticket lifetimes or relevant GPOs. For SAML, check the IdP’s session duration settings.
    • Why it works: This ensures that the authentication tokens generated by your external IdP remain valid for a duration that aligns with Rancher’s expectations and user needs.
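
The clock comparison in cause 4 can be turned into a number rather than judged by eye. This is a minimal sketch under stated assumptions: both timestamps are epoch seconds (for example from date -u +%s on each machine), the 120-second default threshold is an arbitrary illustration of "a few minutes" rather than a Rancher limit, and abs_skew and skew_verdict are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: quantify clock skew between two machines.
# Inputs are epoch seconds, e.g. the output of `date -u +%s` on each host.

# abs_skew EPOCH_A EPOCH_B
# Prints the absolute difference in seconds.
abs_skew() {
  as_diff=$(( $1 - $2 ))
  if [ "$as_diff" -lt 0 ]; then
    as_diff=$(( 0 - as_diff ))
  fi
  echo "$as_diff"
}

# skew_verdict EPOCH_A EPOCH_B [THRESHOLD_SECONDS]
# Threshold defaults to 120 seconds (an assumed value, adjust as needed).
skew_verdict() {
  sv_limit="${3:-120}"
  sv_skew="$(abs_skew "$1" "$2")"
  if [ "$sv_skew" -gt "$sv_limit" ]; then
    echo "skew ${sv_skew}s exceeds ${sv_limit}s"
  else
    echo "skew ${sv_skew}s is within ${sv_limit}s"
  fi
}

# Example usage (rancher-node-1 is a hypothetical host; requires SSH access):
#   skew_verdict "$(date -u +%s)" "$(ssh rancher-node-1 date -u +%s)"
```

Using epoch seconds sidesteps time zone and locale formatting, so the two readings can be subtracted directly. The SSH round trip adds a small delay to the remote reading, which is negligible next to a skew measured in minutes.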

If clients still cannot connect after you resolve token expiry, look for connection refused errors, which indicate that the Rancher API server itself is unhealthy or unreachable.
