DRCluster is a cluster-scoped custom resource that represents a managed cluster
participating in disaster recovery (DR) operations. It defines the DR characteristics
of a managed cluster, including its region, S3 profile for metadata storage, network
CIDRs for fencing operations, and the desired fencing state. The DRCluster controller
validates cluster connectivity, manages cluster fencing/unfencing operations, deploys
DR operator components, and handles maintenance modes during failover operations.
- Group: ramendr.openshift.io
- Version: v1alpha1
- Kind: DRCluster
- Scope: Cluster
```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
  name: <cluster-name>
spec:
  # DRCluster specification
status:
  # DRCluster observed state
```

`spec.region`

- Type: string
- Description: Identifies the DR group for the managed cluster. All managed clusters in the same region are considered part of a synchronous replication group for Metro DR scenarios.
- Immutable: Yes
- Validation: Cannot be changed after creation

Example:

```yaml
spec:
  region: "us-east-1"
```

`spec.s3ProfileName`

- Type: string
- Description: Name of the S3 profile (defined in the Ramen operator configuration) used to store and restore persistent volume (PV) related cluster state during recovery or relocate actions. This S3 profile must be available for successful workload migration to this cluster. For applications active on this cluster, their PV-related state is stored to the S3 profiles of all other DRClusters in the same DRPolicy.
- Immutable: Yes
- Validation: Must reference a valid S3 profile in the Ramen configuration

Example:

```yaml
spec:
  s3ProfileName: "s3-profile-east"
```

`spec.cidrs`

- Type: []string
- Description: List of CIDR strings representing the network ranges used, or potentially used, by nodes in this managed cluster. These CIDRs are used for cluster fencing operations in sync/Metro DR scenarios to block network access during failover.
- Validation: Each CIDR must be in valid format (e.g., "192.168.1.0/24")

Example:

```yaml
spec:
  cidrs:
    - "192.168.1.0/24"
    - "10.0.0.0/16"
```

`spec.clusterFence`

- Type: ClusterFenceState (enum)
- Description: Determines the desired fencing state of the cluster
- Valid Values:
  - Unfenced: Cluster is not fenced and is operational
  - Fenced: Cluster should be fenced (network access blocked)
  - ManuallyFenced: Cluster has been manually fenced by an administrator
  - ManuallyUnfenced: Cluster has been manually unfenced by an administrator

Example:

```yaml
spec:
  clusterFence: Unfenced
```

`status.phase`

- Type: DRClusterPhase
- Description: Current lifecycle phase of the DRCluster
- Possible Values:
  - Available: DRCluster is validated and available for use
  - Starting: Initial reconciliation in progress
  - Fencing: Fencing operation in progress
  - Fenced: Cluster has been successfully fenced
  - Unfencing: Unfencing operation in progress
  - Unfenced: Cluster has been successfully unfenced
`status.conditions`

- Type: []metav1.Condition
- Description: Standard Kubernetes conditions reflecting the current state

Condition Types:

- Validated: Indicates whether the DRCluster has been validated
  - Reasons:
    - Succeeded: Cluster successfully validated
    - Initializing: Validation in progress
    - ConfigMapGetFailed: Failed to get configuration
    - DrClustersDeployFailed: Failed to deploy DR components
    - s3ConnectionFailed: S3 connection validation failed
    - s3ListFailed: S3 list operation failed
- Fenced: Indicates the fencing state of the cluster
  - Reasons:
    - Fencing: Fencing operation in progress
    - Fenced: Successfully fenced
    - Unfencing: Unfencing operation in progress
    - Unfenced: Successfully unfenced
    - FenceError: Fencing operation failed
    - UnfenceError: Unfencing operation failed
- Clean: Indicates whether NetworkFence resources exist for this cluster
  - Reasons:
    - Clean: No fencing CRs present
    - Fencing/Unfencing/Cleaning: Operations in progress
    - CleanError: Cleanup operation failed

`status.maintenanceModes`

- Type: []ClusterMaintenanceMode
- Description: List of active maintenance modes on the cluster, typically used during regional DR failover operations

ClusterMaintenanceMode Fields:

- storageProvisioner (string): Type of storage provisioner
- targetID (string): Storage or replication instance identifier
- state (MModeState): Current state of the maintenance mode
- conditions ([]metav1.Condition): Conditions from the MaintenanceMode resource
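Taken together, a validated and unfenced DRCluster typically reports a status along these lines (a sketch; the condition message text is illustrative, not exact controller output):

```yaml
status:
  phase: Available
  conditions:
    - type: Validated
      status: "True"
      reason: Succeeded
      message: "Cluster validated" # illustrative message text
    - type: Fenced
      status: "False"
      reason: Unfenced
    - type: Clean
      status: "True"
      reason: Clean
```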
The DRCluster controller uses annotations for storage-specific configuration:
- drcluster.ramendr.openshift.io/storage-secret-name: Name of the storage secret
- drcluster.ramendr.openshift.io/storage-secret-namespace: Namespace of the storage secret
- drcluster.ramendr.openshift.io/storage-clusterid: Storage cluster identifier
- drcluster.ramendr.openshift.io/storage-driver: Storage driver name (e.g., CSI driver)
The controller automatically adds the following labels:
- cluster.open-cluster-management.io/backup: Set to the appropriate value for OCM backup integration
- drclusters.ramendr.openshift.io/ramen: Ensures proper cleanup on deletion
```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
  name: cluster1
spec:
  region: "east"
  s3ProfileName: "s3-profile-east-1"
  cidrs:
    - "192.168.1.0/24"
```

```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
  name: cluster2
  annotations:
    drcluster.ramendr.openshift.io/storage-driver: "openshift-storage.rbd.csi.ceph.com"
    drcluster.ramendr.openshift.io/storage-secret-name: "rook-csi-rbd-provisioner"
    drcluster.ramendr.openshift.io/storage-secret-namespace: "openshift-storage"
    drcluster.ramendr.openshift.io/storage-clusterid: "openshift-storage"
spec:
  region: "west"
  s3ProfileName: "s3-profile-west-1"
  cidrs:
    - "10.0.0.0/16"
    - "10.1.0.0/16"
  clusterFence: Unfenced
```

To fence a cluster during a disaster scenario:
```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
  name: cluster1
spec:
  region: "east"
  s3ProfileName: "s3-profile-east-1"
  cidrs:
    - "192.168.1.0/24"
  clusterFence: Fenced # Change to Fenced
```

To unfence a previously fenced cluster:
```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
  name: cluster1
spec:
  region: "east"
  s3ProfileName: "s3-profile-east-1"
  cidrs:
    - "192.168.1.0/24"
  clusterFence: Unfenced # Change to Unfenced
```

When a cluster needs to be fenced (e.g., during failover):
- Admin sets `spec.clusterFence` to `Fenced`
- Controller identifies peer cluster(s) in the same region (via DRPolicy)
- Controller creates a `NetworkFence` ManifestWork on the peer cluster
- The NetworkFence resource blocks network traffic from the fenced cluster's CIDRs
- Status transitions: `Available` → `Fencing` → `Fenced`
- Conditions are updated to reflect the fencing state
To restore a fenced cluster:

- Admin sets `spec.clusterFence` to `Unfenced`
- Controller updates the `NetworkFence` ManifestWork with the unfenced state
- Network traffic is restored
- Status transitions: `Fenced` → `Unfencing` → `Unfenced`
- NetworkFence resources are cleaned up
- Conditions are updated to reflect the clean state
For clusters fenced through external mechanisms:
- ManuallyFenced: Use when the cluster has been fenced outside of Ramen's control
- ManuallyUnfenced: Use when manually unfencing an externally fenced cluster
These states allow DRCluster to track fencing state without attempting automated fencing operations.
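As a sketch, an administrator who has fenced cluster1 through an external mechanism could record that fact so Ramen tracks the state without acting on it:

```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
  name: cluster1
spec:
  region: "east"
  s3ProfileName: "s3-profile-east-1"
  cidrs:
    - "192.168.1.0/24"
  # Recorded for tracking only; the controller attempts no automated fencing
  clusterFence: ManuallyFenced
```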
The DRCluster controller performs the following validations:
- S3 Profile Validation
  - Verifies the S3 profile exists in the Ramen configuration
  - Tests connectivity to the S3 store
  - Validates a list operation on the S3 bucket
- CIDR Format Validation
  - Ensures all CIDRs are in a valid format
  - Uses standard Go net.ParseCIDR validation
- Region Immutability
  - Prevents changes to the region after creation
- S3ProfileName Immutability
  - Prevents changes to the S3 profile after creation
- Deployment Validation
  - Verifies DR operator components are deployed via ManifestWork
  - Checks the ManifestWork applied status
DRClusters are referenced in DRPolicy resources to define disaster recovery relationships:
```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPolicy
metadata:
  name: dr-policy-east-west
spec:
  drClusters:
    - cluster1 # References DRCluster
    - cluster2
  schedulingInterval: "5m"
```

The controller automatically creates DRClusterConfig resources on managed clusters containing:

- Cluster ID
- Replication schedules from associated DRPolicies
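A DRClusterConfig carrying those two items might look roughly as follows; the field names and values below are an illustrative sketch, not the authoritative schema:

```yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRClusterConfig
metadata:
  name: cluster1
spec:
  clusterID: "7f1d2c3e-example-cluster-id" # illustrative cluster ID
  replicationSchedules:                    # gathered from DRPolicies referencing this cluster
    - "5m"
```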
During fencing operations, the controller creates NetworkFence resources on peer clusters:

```yaml
apiVersion: csiaddons.openshift.io/v1alpha1
kind: NetworkFence
metadata:
  name: network-fence-cluster1
spec:
  driver: "openshift-storage.rbd.csi.ceph.com"
  fenceState: Fenced
  cidrs:
    - "192.168.1.0/24"
  secret:
    name: rook-csi-rbd-provisioner
    namespace: openshift-storage
  parameters:
    clusterID: "openshift-storage"
```

During regional DR failover operations, DRCluster manages maintenance modes for storage systems.
When a DRPC (DRPlacementControl) performs failover to this cluster:
- Controller detects failover to this cluster
- Analyzes VRGs (VolumeReplicationGroups) for required storage identifiers
- Creates MaintenanceMode ManifestWorks for each storage provisioner
- Updates `status.maintenanceModes` with activation details
Status includes information about active maintenance modes:
```yaml
status:
  maintenanceModes:
    - storageProvisioner: "openshift-storage.rbd.csi.ceph.com"
      targetID: "replication-id-123"
      state: Activated
      conditions:
        - type: Available
          status: "True"
          reason: Activated
```

After failover completes:
- Controller detects no active failovers requiring maintenance mode
- Prunes inactive MaintenanceMode ManifestWorks
- Cleans up associated ManagedClusterViews
- Updates status to remove deactivated modes
When `DeploymentAutomationEnabled` is configured, the controller automatically:
- Creates namespace for DR cluster operator
- Deploys OLM OperatorGroup
- Creates Subscription for ramen-dr-cluster-operator
- Deploys VolSync to the managed cluster
- Creates/updates DRCluster operator ConfigMap
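The OLM objects involved are ordinary OperatorGroup and Subscription resources; a hand-written equivalent of what the automation creates might look like this (the namespace, channel, and catalog source names below are illustrative assumptions):

```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: ramen-dr-cluster           # illustrative name
  namespace: ramen-dr-cluster-ns   # illustrative namespace
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ramen-dr-cluster-operator
  namespace: ramen-dr-cluster-ns   # illustrative namespace
spec:
  name: ramen-dr-cluster-operator
  channel: alpha                   # illustrative channel
  source: example-catalog          # illustrative catalog source
  sourceNamespace: openshift-marketplace
```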
- Region Design
  - Use meaningful region names that reflect geographic or availability zones
  - Group clusters that share storage replication in the same region
- S3 Profile Configuration
  - Ensure S3 profiles are configured before creating DRClusters
  - Test S3 connectivity independently before cluster creation
  - Use separate S3 buckets or prefixes for different clusters
- CIDR Management
  - Include all current and planned node network CIDRs
  - Update CIDRs before adding new node networks
  - Ensure CIDRs don't overlap between clusters in different regions
- Fencing Operations
  - Test fencing in non-production environments first
  - Ensure the peer cluster is healthy before fencing operations
  - Monitor NetworkFence status on peer clusters
  - Verify application failover before unfencing
- Monitoring
  - Watch the `Validated` condition for deployment issues
  - Monitor the `phase` field for operational state
  - Check `maintenanceModes` during failover operations
  - Review conditions for error details
Symptoms: Validated condition is False
Common Causes:
- S3 profile misconfiguration
- S3 connectivity issues
- Invalid CIDR format
- ManifestWork deployment failures
Resolution:
- Check condition reason in status
- Verify S3 profile configuration in Ramen ConfigMap
- Test S3 connectivity from hub cluster
- Validate CIDR formats
- Check ManifestWork status on managed cluster
Symptoms: Phase remains in Fencing or Unfencing
Common Causes:
- Peer cluster unreachable
- NetworkFence CRD not installed on peer cluster
- Storage driver not responding
- Invalid storage annotations
Resolution:
- Verify peer cluster is healthy
- Check NetworkFence ManifestWork status
- Verify storage annotations on DRCluster
- Check NetworkFence status on peer cluster
- Review CSI driver logs on peer cluster
Symptoms: MaintenanceModes not activating during failover
Common Causes:
- Storage identifiers not available in VRG
- MaintenanceMode ManifestWork not applied
- ManagedClusterView failures
Resolution:
- Check VRG status on source cluster
- Verify ManifestWork for MaintenanceMode
- Check ManagedClusterView for errors
- Review DRPC status for failover state
The DRCluster controller requires the following permissions:
On Hub Cluster:
```yaml
- apiGroups: ["ramendr.openshift.io"]
  resources: ["drclusters", "drclusters/status", "drclusters/finalizers"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["ramendr.openshift.io"]
  resources: ["drplacementcontrols", "drpolicies"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["work.open-cluster-management.io"]
  resources: ["manifestworks"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["view.open-cluster-management.io"]
  resources: ["managedclusterviews"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["cluster.open-cluster-management.io"]
  resources: ["managedclusters"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["secrets", "configmaps"]
  verbs: ["list", "watch"]
```

Related documentation:

- DRPolicy CRD Documentation
- DRPlacementControl CRD Documentation
- Ramen Operator Configuration
- Cluster Fencing Guide
- Regional DR Failover
- Kubernetes: v1.21+
- Open Cluster Management: v0.9+
- CSI Addons: v0.5+ (for fencing operations)
| Version | Changes |
|---|---|
| v1alpha1 | Initial API version |
Note: This is an alpha API and may change in future releases. Fields marked as immutable cannot be changed after resource creation and will be rejected by validation webhooks.