A comprehensive guide for new developers to understand and contribute to Rancher
- Introduction
- High-Level Architecture
- Project Structure
- Entry Points
- Core Packages
- API Layer
- Authentication & Authorization
- Kubernetes Integration
- Multi-Cluster Architecture
- Controllers
- Data Flow
- Key Concepts
Rancher is a complete container management platform that provides a centralized control plane for managing multiple Kubernetes clusters. The codebase is written in Go and follows Kubernetes-native patterns, using Custom Resource Definitions (CRDs) as its primary state storage mechanism and controllers (reconcilers) for business logic.
The system consists of two main components: the Management Server (this repository's main binary) and Cluster Agents that run inside managed clusters. The management server hosts the Rancher UI/API, runs controllers over Rancher CRDs, and coordinates operations across all managed clusters.
┌─────────────────────────────────────────────────────────────────────────────┐
│ RANCHER MANAGEMENT SERVER │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ HTTP Handler Stack │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Auth │ │ RBAC │ │ Routing │ │ Steve/ │ │ │
│ │ │ Middleware │──│ Handler │──│ Middleware │──│ Norman API │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Controllers │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Auth │ │ Cluster │ │ Project │ │ RBAC │ ... │ │
│ │ │ Ctrl │ │ Ctrl │ │ Ctrl │ │ Ctrl │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Kubernetes API (Local Cluster) │ │
│ │ CRDs + Native Resources │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
▲ ▲
│ Tunnel │ Tunnel
│ (remotedialer) │ (remotedialer)
▼ ▼
┌─────────────────────────┐ ┌─────────────────────────┐
│ DOWNSTREAM CLUSTER A │ │ DOWNSTREAM CLUSTER B │
│ │ │ │
│ ┌───────────────────┐ │ │ ┌───────────────────┐ │
│ │ Cluster Agent │ │ │ │ Cluster Agent │ │
│ └───────────────────┘ │ │ └───────────────────┘ │
│ │ │ │
│ ┌───────────────────┐ │ │ ┌───────────────────┐ │
│ │ K8s API Server │ │ │ │ K8s API Server │ │
│ └───────────────────┘ │ │ └───────────────────┘ │
└─────────────────────────┘ └─────────────────────────┘
The management server maintains persistent connections (tunnels) to each downstream cluster's agent. These tunnels enable the server to proxy Kubernetes API requests and execute operations on behalf of users.
The Rancher repository is organized into distinct directories, each serving a specific purpose:
rancher/
├── main.go # Entry point for the Rancher server binary
├── cmd/
│ └── agent/ # Entry point for the cluster agent binary
├── pkg/
│ ├── rancher/ # Core server construction and startup
│ ├── multiclustermanager/ # HTTP routing and multi-cluster coordination
│ ├── api/ # API implementations (Norman v3 + Steve v1)
│ ├── apis/ # CRD Go types (management, provisioning, etc.)
│ ├── schemas/ # API schema definitions
│ ├── auth/ # Authentication and authorization
│ ├── controllers/ # All reconciliation controllers
│ ├── clustermanager/ # Per-cluster client management
│ ├── k8sproxy/ # Kubernetes API proxy handler
│ ├── tunnelserver/ # Agent tunnel termination
│ ├── wrangler/ # Controller context and factories
│ ├── types/ # Configuration contexts
│ ├── generated/ # Generated clients and informers
│ ├── features/ # Feature flag framework
│ ├── settings/ # Cluster-stored configuration
│ └── ... # Additional supporting packages
├── chart/ # Helm chart for Rancher deployment
├── scripts/ # Build and development scripts
└── tests/ # Integration and e2e tests
The server entry point at main.go follows a straightforward flow. The main() function begins by registering special commands for password reset and ensuring default admin users. It then sets up the CLI application with various flags for configuration such as ports, logging, audit settings, and feature flags.
The actual startup happens in the run() function starting at line 212:
main.go:run()
│
├── k8s.GetConfig() # Get Kubernetes client config
│ (embedded vs external cluster)
│
├── rancher.New() # Construct the Rancher server
│ │
│ ├── wrangler.NewPrimaryContext() # Build controller factories
│ ├── crds.EnsureRequired() # Ensure CRDs exist
│ ├── features.InitializeFeatures() # Initialize feature flags
│ ├── auth.NewServer() # Create auth server
│ └── Build HTTP handler stack # Assemble middleware chain
│
└── server.ListenAndServe() # Start HTTPS server
The rancher.New() function in pkg/rancher/rancher.go performs the heavy lifting of assembling all components. It creates a Rancher struct that contains the authentication middleware, HTTP handler, and wrangler context.
The agent binary runs inside each downstream cluster. Its primary responsibility is establishing and maintaining a tunnel connection back to the Rancher server. When started, the agent:
- Fetches cluster parameters and server URL
- Verifies TLS/CA connectivity to the Rancher server
- Opens a persistent tunnel using `remotedialer`
- Authenticates via tunnel headers (`X-API-Tunnel-Token`, `X-API-Tunnel-Params`)
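To make the flow concrete, here is a minimal sketch of the client side, assuming the `ClientConnect` entry point of the `rancher/remotedialer` library; the real agent in `cmd/agent` adds retries, parameter encoding, and TLS verification:

```go
package tunnel

import (
	"context"
	"net/http"

	"github.com/rancher/remotedialer"
)

// connectTunnel dials the Rancher server and keeps the tunnel open until
// the context is cancelled. Header names match the ones listed above.
func connectTunnel(ctx context.Context, wsURL, token, params string) error {
	headers := http.Header{
		"X-API-Tunnel-Token":  {token},  // authenticates this agent
		"X-API-Tunnel-Params": {params}, // encoded cluster identity
	}
	// The authorizer decides which server-initiated dials are allowed;
	// a real agent would restrict this to the local API server.
	allow := func(proto, address string) bool { return true }
	return remotedialer.ClientConnect(ctx, wsURL, headers, nil, allow, nil)
}
```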
This package contains the core server construction logic. The Rancher struct defined in pkg/rancher/rancher.go holds:
- `Auth`: The authentication middleware chain
- `Handler`: The main HTTP handler stack
- `Wrangler`: The controller context with clients and factories
- `Steve`: The Steve API server instance
The New() function constructs this struct by:
- Setting up the REST config and validating connectivity
- Running encryption config migrations
- Creating the wrangler context with shared controller factories
- Ensuring all required CRDs are installed
- Initializing features and auth server
- Building the HTTP handler middleware chain
The wrangler package provides the Context struct that aggregates all typed Kubernetes clients, shared controller factories, and informers. This context is passed throughout the codebase to provide access to cluster resources.
This package defines the HTTP routing for most external endpoints. The router() function in pkg/multiclustermanager/routes.go builds several routers that are chained together:
┌─────────────────┐
│ unauthed │ Unauthenticated endpoints
│ router │ /v3/connect, /v3/settings/cacerts, etc.
└────────┬────────┘
│ NotFoundHandler
▼
┌─────────────────┐
│ saauthed │ Service account authenticated
│ router │ /k8s/clusters/{clusterID}
└────────┬────────┘
│ NotFoundHandler
▼
┌─────────────────┐
│ authed │ User authenticated endpoints
│ router │ /v3/*, /meta/*, etc.
└────────┬────────┘
│ NotFoundHandler
▼
┌─────────────────┐
│ metricsAuthed │ Metrics endpoint
│ router │ /metrics
└────────┬────────┘
│ NotFoundHandler
▼
┌─────────────────┐
│ next handler │ Steve/inner handlers
└─────────────────┘
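The chain itself is plain router composition: each router delegates unmatched requests to the next via its `NotFoundHandler`. A minimal sketch of the fall-through pattern, assuming gorilla/mux (handler names are placeholders, not the real Rancher handlers):

```go
package routes

import (
	"net/http"

	"github.com/gorilla/mux"
)

// buildChain shows the fall-through pattern: any request the first router
// does not match is delegated to the next router via NotFoundHandler.
func buildChain(connect, mgmtAPI, next http.Handler) http.Handler {
	unauthed := mux.NewRouter()
	authed := mux.NewRouter()

	unauthed.Handle("/v3/connect", connect)
	authed.PathPrefix("/v3/").Handler(mgmtAPI)

	unauthed.NotFoundHandler = authed // fall through to authed routes
	authed.NotFoundHandler = next     // finally, Steve/inner handlers
	return unauthed
}
```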
The types package contains configuration contexts like ScaledContext and ManagementContext that aggregate typed clients and access control handlers. These contexts are passed to controllers and API handlers.
The settings package provides cluster-stored configuration settings backed by CRDs. Settings can be read and modified at runtime.
The features package implements the feature flag framework. Features can be enabled or disabled at startup or dynamically at runtime.
Rancher exposes two API systems that evolved over time:
Norman is the legacy management API, served at /v3. It uses a schema-driven approach where API resources are defined in pkg/schemas/management.cattle.io/v3/schema.go.
Norman schemas define:
- Resource types and their fields
- Collection and resource methods (GET, POST, PUT, DELETE)
- Actions that can be performed on resources
- Input/output types for actions
The Norman API server is created in pkg/api/norman/server/server.go which registers all schemas and sets up stores for CRUD operations.
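The registration follows Norman's schema-builder style. An illustrative fragment modeled on pkg/schemas/management.cattle.io/v3/schema.go (not its exact contents; the real file registers many more types and customizes their fields):

```go
package schema

import (
	"github.com/rancher/norman/types"
	v3 "github.com/rancher/rancher/pkg/apis/management.cattle.io/v3"
	"github.com/rancher/rancher/pkg/schemas/factory"
)

// Version identifies the API group and the URL path it is served under.
var Version = types.APIVersion{
	Version: "v3",
	Group:   "management.cattle.io",
	Path:    "/v3",
}

// Schemas imports Go types into the schema registry; each import makes the
// type available through the /v3 API.
var Schemas = factory.Schemas(&Version).
	Init(func(schemas *types.Schemas) *types.Schemas {
		return schemas.MustImport(&Version, v3.Cluster{})
	})
```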
Steve is the newer Kubernetes-style API, served at /v1. It directly exposes Kubernetes resources with minimal transformation, providing a more native experience, and is built on the external rancher/steve library.
Unauthenticated Endpoints:
- `GET /` - Management API root (non-browser)
- `GET /v3/settings/cacerts` - CA certificates
- `GET /v3/settings/first-login` - First login status
- `GET /v3/settings/ui-*` - UI settings
- `POST /v3/connect` - Agent tunnel connection
- `GET /v3/import/{token}_{clusterId}.yaml` - Cluster import manifest
- `GET /rancherversion` - Rancher version info
- `/v1-{prefix}-release/*` - Channel server
- `/v1-saml/*` - SAML authentication flows
- `/v1-public/*` - Public auth endpoints (login, providers)
- `/v3-public/*` - Legacy public auth endpoints (deprecated)
Service Account Authenticated:
- `/k8s/clusters/{clusterID}/*` - Kubernetes API proxy to downstream clusters
User Authenticated:
- `/v3/*` - Full management API surface
- `/v3/identit*` - Identity endpoints
- `/v3/token*` - Token management
- `/meta/aks.*`, `/meta/gke.*`, etc. - Cloud provider metadata
- `/meta/proxy` - HTTP proxy for external services
- `/v3/tokenreview` - Token review webhook
- `/v1/logout` - Logout endpoint
Metrics:
- `/metrics` - Prometheus metrics (protected by TokenReview)
Authentication is centralized in pkg/auth. The Server struct in pkg/auth/server.go provides:
- `Authenticator`: Middleware that extracts and validates tokens from requests
- `Management`: Middleware that injects auth API routes
┌─────────────────────────────────────┐
│ Incoming Request │
└────────────────┬────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ requests.Authenticator │
│ Extract token from: │
│ - Authorization header (Bearer) │
│ - R_SESS cookie │
└────────────────┬────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Token Validation │
│ - Check token exists in CRD │
│ - Verify not expired │
│ - Validate signature │
└────────────────┬────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ User Info Attached │
│ - User ID │
│ - Groups │
│ - Provider info │
└─────────────────────────────────────┘
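A simplified sketch of just the extraction step; the real Authenticator in pkg/auth/requests also performs the validation shown above:

```go
package auth

import (
	"net/http"
	"strings"
)

// tokenFromRequest pulls the raw token out of a request, checking the
// Authorization header first and the session cookie second.
func tokenFromRequest(req *http.Request) string {
	// Preferred form: "Authorization: Bearer <token>".
	if h := req.Header.Get("Authorization"); strings.HasPrefix(h, "Bearer ") {
		return strings.TrimPrefix(h, "Bearer ")
	}
	// Fall back to the R_SESS session cookie set by the UI.
	if c, err := req.Cookie("R_SESS"); err == nil {
		return c.Value
	}
	return ""
}
```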
Rancher supports multiple authentication providers, each implemented in pkg/auth/providers:
| Provider | Type | Login Input |
|---|---|---|
| Local | Username/Password | BasicLogin |
| GitHub | OAuth | GithubLogin (code) |
| GitHub App | OAuth | GithubLogin (code) |
| Active Directory | LDAP | BasicLogin |
| Azure AD | OAuth/OIDC | AzureADLogin (code) |
| OpenLDAP | LDAP | BasicLogin |
| FreeIPA | LDAP | BasicLogin |
| Ping Identity | SAML | SamlLoginInput |
| ADFS | SAML | SamlLoginInput |
| KeyCloak (SAML) | SAML | SamlLoginInput |
| OKTA | SAML | SamlLoginInput |
| Shibboleth | SAML | SamlLoginInput |
| Google OAuth | OAuth | GoogleOauthLogin (code) |
| Generic OIDC | OIDC | OIDCLogin (code) |
| KeyCloak OIDC | OIDC | OIDCLogin (code) |
| AWS Cognito | OIDC | OIDCLogin (code) |
Tokens are managed in pkg/auth/tokens. The Token CRD stores authentication tokens with the following key fields:
- `token`: The hashed token value
- `userID`: Reference to the user
- `authProvider`: Which provider authenticated the user
- `ttlMillis`: Token time-to-live
- `isDerived`: Whether this is a derived (API) token
- `clusterName`: Optional cluster scope
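An abbreviated sketch of the corresponding Go type, trimmed to the fields above (JSON tags are approximations of the real type in pkg/apis/management.cattle.io/v3):

```go
package v3

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Token is stored as a CRD; one object exists per issued token.
type Token struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Token        string `json:"token"`        // hashed token value
	UserID       string `json:"userId"`       // reference to the user
	AuthProvider string `json:"authProvider"` // provider that authenticated the user
	TTLMillis    int64  `json:"ttl"`          // time-to-live in milliseconds
	IsDerived    bool   `json:"isDerived"`    // derived (API) token
	ClusterName  string `json:"clusterName"`  // optional cluster scope
}
```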
Rancher extends Kubernetes RBAC with its own role model:
- GlobalRoles: Apply to the local cluster and/or all downstream clusters
- RoleTemplates: Templates for creating Kubernetes Roles/ClusterRoles
- ClusterRoleTemplateBindings: Bind users to roles within a cluster
- ProjectRoleTemplateBindings: Bind users to roles within a project
The RBAC middleware in pkg/rbac enforces permissions based on these bindings.
At startup, k8s.GetConfig determines how to connect to Kubernetes:
- Embedded Mode: Rancher runs its own API server (Docker installation)
- External Mode: Rancher connects to an existing cluster (Helm installation)
- Auto Mode: Automatically determines based on environment
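For intuition, the external path boils down to standard client-go kubeconfig loading; a simplified sketch (the real k8s.GetConfig also covers the embedded and auto modes):

```go
package kubeconfig

import (
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// externalConfig mirrors External Mode: build a REST config from an
// existing kubeconfig file, as when Rancher is installed via Helm.
func externalConfig(kubeconfigPath string) (*rest.Config, error) {
	return clientcmd.BuildConfigFromFlags("", kubeconfigPath)
}
```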
Custom Resource Definitions are defined in pkg/apis. Each API group has its own subdirectory:
pkg/apis/
├── management.cattle.io/v3/ # Core management resources
│ ├── authn_types.go # Token, User, Group, AuthConfig
│ ├── authz_types.go # Project, GlobalRole, RoleTemplate
│ ├── cluster_types.go # Cluster, ClusterRegistrationToken
│ └── ...
├── provisioning.cattle.io/v1/ # RKE2/K3s provisioning
│ └── cluster_types.go # Provisioning Cluster, MachinePools
├── project.cattle.io/v3/ # Project-scoped resources
├── catalog.cattle.io/v1/ # Helm catalog resources
└── rke.cattle.io/v1/ # RKE2 configuration
CRDs are installed at startup in pkg/crds and pkg/crds/dashboard.
The wrangler context created by wrangler.NewPrimaryContext() provides:
- Typed Kubernetes clients (core, apps, RBAC, etc.)
- Typed Rancher clients (management, provisioning, etc.)
- Shared informer factories
- Controller factories
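A short illustrative use of the context; field and method names approximate pkg/wrangler and the generated clients in pkg/generated:

```go
package example

import (
	"k8s.io/apimachinery/pkg/labels"

	"github.com/rancher/rancher/pkg/wrangler"
)

// countClusters lists management clusters through the typed client's
// informer-backed cache rather than hitting the API server directly.
func countClusters(w *wrangler.Context) (int, error) {
	clusters, err := w.Mgmt.Cluster().Cache().List(labels.Everything())
	if err != nil {
		return 0, err
	}
	return len(clusters), nil
}
```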
Rancher manages several types of clusters:
- Local Cluster: The cluster where Rancher runs
- Imported Clusters: Existing clusters registered with Rancher
- Provisioned Clusters: Clusters created by Rancher (RKE2/K3s)
- Hosted Clusters: EKS, AKS, GKE managed by cloud providers
┌──────────────────────────────────────────────────────────────┐
│ RANCHER SERVER │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Tunnel Server │ │
│ │ pkg/tunnelserver │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Authorizer │ │ Authorizer │ │ Authorizer │ │ │
│ │ │ (cluster-a) │ │ (cluster-b) │ │ (cluster-c) │ │ │
│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │
│ │ │ │ │ │ │
│ │ │ remotedialer │ │ │ │
│ │ │ connections │ │ │ │
│ └─────────┼─────────────────┼─────────────────┼────────────┘ │
│ │ │ │ │
└────────────┼─────────────────┼─────────────────┼──────────────┘
│ │ │
▼ ▼ ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ Cluster Agent │ │ Cluster Agent │ │ Cluster Agent │
│ (cluster-a) │ │ (cluster-b) │ │ (cluster-c) │
└────────────────┘ └────────────────┘ └────────────────┘
Agents authenticate using tunnel tokens and transmit cluster identity via headers. The server uses these tunnels to:
- Proxy Kubernetes API requests
- Execute controller logic remotely
- Deploy workloads and configurations
The proxy in pkg/k8sproxy handles requests to /k8s/clusters/{clusterID}/*. It:
- Extracts the cluster ID from the URL
- Looks up the appropriate tunnel connection
- Forwards the request through the tunnel
- Returns the response to the client
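Conceptually, the handler reduces to a reverse proxy whose transport dials through the tunnel. A simplified sketch, not the actual pkg/k8sproxy code (`TunnelDialer` and `dialerFor` are assumed helpers standing in for the tunnel server's dialer lookup):

```go
package proxy

import (
	"context"
	"net"
	"net/http"
	"net/http/httputil"
	"strings"

	"github.com/gorilla/mux"
)

// TunnelDialer is assumed to route connections through a cluster's tunnel.
type TunnelDialer func(ctx context.Context, network, addr string) (net.Conn, error)

// proxyHandler forwards /k8s/clusters/{clusterID}/... to the downstream
// API server by dialing through that cluster's tunnel. It assumes the
// request was routed by a mux route that captures {clusterID}.
func proxyHandler(dialerFor func(clusterID string) TunnelDialer) http.Handler {
	return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		clusterID := mux.Vars(req)["clusterID"]
		proxy := &httputil.ReverseProxy{
			Director: func(r *http.Request) {
				r.URL.Scheme = "https"
				r.URL.Host = "kubernetes.default" // resolved inside the tunnel
				r.URL.Path = strings.TrimPrefix(r.URL.Path, "/k8s/clusters/"+clusterID)
			},
			Transport: &http.Transport{DialContext: dialerFor(clusterID)},
		}
		proxy.ServeHTTP(rw, req)
	})
}
```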
Controllers (reconcilers) implement the business logic for Rancher resources. They are organized in pkg/controllers:
pkg/controllers/
├── auditlog/ # Audit logging
├── capr/ # CAPI/RKE2 provisioning
├── dashboard/ # Dashboard/UI integration
├── dashboardapi/ # Dashboard API resources
├── management/ # Core management controllers
│ └── auth/ # Authentication controllers
├── managementagent/ # Agent-side management
├── managementapi/ # API-related controllers
├── managementlegacy/ # Legacy management controllers
├── managementuser/ # User-scoped controllers
├── nodedriver/ # Node driver controllers
├── provisioningv2/ # RKE2/K3s provisioning
│ └── cluster/ # Cluster provisioning
└── status/ # Status aggregation
Controllers follow the standard Kubernetes controller pattern:
```go
// Handler interface pattern
type Handler interface {
	OnChange(key string, obj *v3.SomeResource) (*v3.SomeResource, error)
}

// Registration
controller.Register(ctx, "controller-name", factory, handler.OnChange)
```

Key controllers include (a registration sketch follows the list below):
- Auth Controllers: User, token, and provider management
- Cluster Controllers: Cluster lifecycle and status
- Project Controllers: Project creation and RBAC
- Provisioning Controllers: Machine pool and RKE2 setup
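In wrangler terms, registering such a handler looks roughly like this; handler and package names are illustrative, and the generated client methods approximate pkg/generated:

```go
package example

import (
	"context"

	v3 "github.com/rancher/rancher/pkg/apis/management.cattle.io/v3"
	"github.com/rancher/rancher/pkg/wrangler"
)

// Register wires an OnChange handler into the shared controller factory.
func Register(ctx context.Context, w *wrangler.Context) {
	w.Mgmt.Cluster().OnChange(ctx, "example-cluster-handler",
		func(key string, cluster *v3.Cluster) (*v3.Cluster, error) {
			if cluster == nil { // object was deleted
				return nil, nil
			}
			// Reconcile toward the desired state, then return the object.
			return cluster, nil
		})
}
```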
1. User submits credentials to /v1-public/login
│
▼
2. PublicAPI handler (pkg/auth/providers/publicapi/login.go)
- Unmarshal request based on provider type
- Call provider-specific authentication
│
▼
3. Provider authentication (pkg/auth/providers/*)
- Validate credentials with external system
- Return user principal and group principals
│
▼
4. User management (pkg/user)
- Ensure user exists in local database
- Update user attributes
│
▼
5. Token creation (pkg/auth/tokens)
- Generate new token
- Store token as CRD
│
▼
6. Response
- Return token to client (bearer or cookie)
1. User creates provisioning.cattle.io/v1.Cluster
│
▼
2. Provisioning controller (pkg/controllers/provisioningv2/cluster)
- Validate spec
- Create management.cattle.io/v3.Cluster
│
▼
3. Cluster controller
- Create ClusterRegistrationToken
- Generate agent manifests
│
▼
4. Machine pool controllers
- Create machine templates
- Provision infrastructure
│
▼
5. Agent deployment
- Agent connects via tunnel
- Cluster becomes Ready
┌──────────────┐
│ Request │
└──────┬───────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ SetXAPICattleAuthHeader │
│ Set X-API-Cattle-Auth header for UI │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ WebSocket Handler │
│ Handle WebSocket upgrades │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ Cluster Proxy │
│ Route /k8s/clusters/* to downstream │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ MultiClusterManager Middleware │
│ Route to management API or pass through │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ Auth Server Management │
│ Handle auth-specific routes │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ Authentication Filter │
│ Require authenticated user for protected paths │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ Steve API Server │
│ Handle /v1 Kubernetes-style API │
└──────────────────────────────────────────────────────────┘
Principals represent identities in Rancher. They are formatted as {provider}_{type}://{identifier}:
- `local://u-{id}` - Local user
- `github_user://{id}` - GitHub user
- `activedirectory_user://CN=...` - AD user
- `local://g-{id}` - Local group
Projects group namespaces within a cluster. They provide:
- Multi-tenancy within a cluster
- Shared resource quotas
- Project-scoped RBAC
A project's name format is {clusterName}:{projectName}.
Fleet workspaces organize clusters for GitOps deployments. Each workspace can have its own GitRepos and Bundles.
Resources use conditions to communicate status. Common condition types:
- `Ready` - Resource is fully operational
- `Provisioned` - Underlying resources created
- `AgentDeployed` - Cluster agent is running
- `Updated` - Desired state matches actual state
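A tiny sketch of how code typically consumes these conditions; the `Condition` type here is illustrative, not the exact CRD type:

```go
package example

// Condition mirrors the shape of the condition entries described above.
type Condition struct {
	Type   string // e.g. "Ready", "Provisioned"
	Status string // "True", "False", or "Unknown"
}

// isReady reports whether the Ready condition is set to True.
func isReady(conds []Condition) bool {
	for _, c := range conds {
		if c.Type == "Ready" && c.Status == "True" {
			return true
		}
	}
	return false
}
```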
```sh
# Full build
make

# Just the rancher binary
go build -o bin/rancher .

# Agent binary
go build -o bin/agent ./cmd/agent
```

```sh
# With existing kubeconfig
./bin/rancher --kubeconfig=$KUBECONFIG

# Development mode
CATTLE_DEV_MODE=1 ./bin/rancher
```

| Variable | Description |
|---|---|
| `KUBECONFIG` | Path to kubeconfig file |
| `CATTLE_DEV_MODE` | Enable development mode |
| `CATTLE_FEATURES` | Feature flag overrides |
| `AUDIT_LEVEL` | Audit logging level (0-3) |
- Define your handler in `pkg/controllers/<area>/`
- Register it in the appropriate `Register()` function
- The handler will be called when watched resources change
For Norman API:
- Add types to `pkg/apis/management.cattle.io/v3/`
- Add schema registration in `pkg/schemas/management.cattle.io/v3/schema.go`
- Implement custom handlers if needed
For Steve API:
- Resources are automatically exposed based on CRDs
- Custom logic via schema stores and handlers