This is a comprehensive PoC for using actual KubeVirt VirtualMachine resources (rather than libvirt VMs) as worker nodes for the HyperShift Agent platform, including full discovery-ISO boot support.
- Architecture Overview
- Prerequisites
- Phase 1: Management Cluster Setup
- Phase 2: OpenShift Virtualization
- Phase 3: KubeVirtBMC Deployment
- Phase 4: HyperShift and Agent Platform
- Phase 5: KubeVirt Worker VMs
- Phase 6: ISO Boot Configuration
- Phase 7: BareMetalHost Integration
- Phase 8: NodePool Creation
- Verification and Testing
- Troubleshooting
- Clean Up
┌─────────────────────────────────────────────────────────────────┐
│ Management Cluster (OCP) │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ OpenShift Virtualization (KubeVirt) │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ │ │
│ │ │ VirtualMachine │ │ VirtualMachine │ │ │
│ │ │ worker-0 │ │ worker-1 │ │ │
│ │ │ │ │ │ │ │
│ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │ │ │
│ │ │ │ Discovery ISO│ │ │ │ Discovery ISO│ │ │ │
│ │ │ │ (CD-ROM) │ │ │ │ (CD-ROM) │ │ │ │
│ │ │ │ bootOrder: 1 │ │ │ │ bootOrder: 1 │ │ │ │
│ │ │ └──────────────┘ │ │ └──────────────┘ │ │ │
│ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │ │ │
│ │ │ │ OS Disk │ │ │ │ OS Disk │ │ │ │
│ │ │ │ (120GB) │ │ │ │ (120GB) │ │ │ │
│ │ │ │ bootOrder: 2 │ │ │ │ bootOrder: 2 │ │ │ │
│ │ │ └──────────────┘ │ │ └──────────────┘ │ │ │
│ │ └──────────────────┘ └──────────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ KubeVirtBMC (Virtual BMC Emulation) │ │
│ │ │ │
│ │ ┌────────────────┐ ┌────────────────┐ │ │
│ │ │ VirtualMachine │ │ VirtualMachine │ │ │
│ │ │ BMC (Redfish) │ │ BMC (Redfish) │ │ │
│ │ │ worker-0 │ │ worker-1 │ │ │
│ │ └────────────────┘ └────────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Metal3 + Assisted Service + HyperShift │ │
│ │ │ │
│ │ ┌────────────────┐ ┌────────────────┐ │ │
│ │ │ BareMetalHost │ │ BareMetalHost │ │ │
│ │ │ worker-0 │ │ worker-1 │ │ │
│ │ │ (points to BMC)│ │ (points to BMC)│ │ │
│ │ └────────────────┘ └────────────────┘ │ │
│ │ │ │
│ │ ┌────────────────┐ ┌────────────────┐ │ │
│ │ │ Agent │ │ Agent │ │ │
│ │ │ worker-0 │ │ worker-1 │ │ │
│ │ └────────────────┘ └────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────┐ │ │
│ │ │ NodePool (Agent Platform) │ │ │
│ │ │ - agentLabelSelector │ │ │
│ │ │ - replicas: 2 │ │ │
│ │ └──────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Hosted Control Plane (HCP) │ │
│ │ - etcd, kube-apiserver, etc. │ │
│ └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
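The design hinges on two spec fragments, shown in full in Phases 6 and 7: each VirtualMachine boots the discovery ISO (CD-ROM, bootOrder 1) ahead of its blank OS disk (bootOrder 2), and each BareMetalHost's BMC address points at the Redfish endpoint that KubeVirtBMC exposes for that VM. A condensed, illustrative sketch (not a complete manifest; values are placeholders):
# VirtualMachine (Phase 6) - ISO first, disk second
disks:
- name: discovery-iso
  bootOrder: 1
  cdrom: {bus: sata, readonly: true}
- name: os-disk
  bootOrder: 2
  disk: {bus: virtio}
# BareMetalHost (Phase 7) - BMC address resolves to the Redfish Service KubeVirtBMC creates for the VM
bmc:
  address: redfish://<bmc-service-ip>:<port>/redfish/v1/Systems/1
  credentialsName: hosted-worker-0-bmc-secret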
- Hypervisor/Host:
- 128GB RAM minimum (64GB for management cluster, 64GB for worker VMs)
- 16+ CPU cores
- 1TB disk space
- Nested virtualization enabled
# Check nested virtualization
cat /sys/module/kvm_intel/parameters/nested # Should show 'Y'
# or for AMD
cat /sys/module/kvm_amd/parameters/nested
# Enable if disabled
echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm.conf
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel
# Install kcli
curl -s https://raw.githubusercontent.com/karmab/kcli/main/install.sh | bash
# Verify installation
kcli version
- OpenShift Pull Secret: Get from https://console.redhat.com/openshift/install/pull-secret
- SSH Public Key: For accessing VMs
# Save pull secret
cat > openshift_pull.json << 'EOF'
{your-pull-secret-json-here}
EOF
# Generate SSH key if needed
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""
Create mgmt-cluster.yaml:
plan: mgmt-cluster
force: true
version: stable
tag: "4.17"
cluster: "mgmt-cluster"
domain: hypershiftbm.lab
api_ip: 192.168.125.10
ingress_ip: 192.168.125.11
dualstack: false
disk_size: 200
extra_disks: [200]
memory: 64000 # 64GB for management cluster
numcpus: 16
ctlplanes: 3
workers: 0
metal3: true # CRITICAL: Enable Metal3
network: ipv4
metallb_pool: ipv4-virtual-network
metallb_ranges:
- 192.168.125.150-192.168.125.190
metallb_autoassign: true
apps:
- lvms-operator
- metallb-operator
# Deploy
kcli create cluster openshift --pf mgmt-cluster.yaml
# This will take approximately 45 minutes
# Monitor progress
kcli list cluster
# Export kubeconfig
export KUBECONFIG=~/.kcli/clusters/mgmt-cluster/auth/kubeconfig
# Verify cluster
oc get nodes
oc get co   # Wait for all operators to be Available
# Create namespace
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
name: openshift-cnv
EOF
# Create OperatorGroup
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: openshift-cnv-group
namespace: openshift-cnv
spec:
targetNamespaces:
- openshift-cnv
EOF
# Subscribe to OpenShift Virtualization
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: kubevirt-hyperconverged
namespace: openshift-cnv
spec:
channel: stable
name: kubevirt-hyperconverged
source: redhat-operators
sourceNamespace: openshift-marketplace
installPlanApproval: Automatic
EOF
# Wait for operator installation
echo "Waiting for OpenShift Virtualization operator..."
oc wait --for=jsonpath='{.status.state}'=AtLatestKnown -n openshift-cnv subscription/kubevirt-hyperconverged --timeout=600s
# Check CSV (ClusterServiceVersion)
oc get csv -n openshift-cnv
cat <<EOF | oc apply -f -
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
name: kubevirt-hyperconverged
namespace: openshift-cnv
spec:
featureGates:
enableCommonBootImageImport: true
deployKubeSecondaryDNS: false
EOF
# Wait for HyperConverged to be ready (this may take 5-10 minutes)
echo "Waiting for HyperConverged deployment..."
oc wait --for=condition=Available -n openshift-cnv hyperconverged/kubevirt-hyperconverged --timeout=900s
# Verify KubeVirt is running
oc get kubevirt -n openshift-cnv
oc get pods -n openshift-cnv
# Check that virt-api is running
oc get pods -n openshift-cnv | grep virt-api
# Check that virt-controller is running
oc get pods -n openshift-cnv | grep virt-controller
# Check that virt-handler is running on all nodes
oc get pods -n openshift-cnv | grep virt-handler
# Test virtctl (optional)
curl -L -o /tmp/virtctl https://github.com/kubevirt/kubevirt/releases/download/v1.1.0/virtctl-v1.1.0-linux-amd64
chmod +x /tmp/virtctl
sudo mv /tmp/virtctl /usr/local/bin/virtctl
# Clone the repository
git clone https://github.com/starbops/kubevirtbmc.git
cd kubevirtbmc
# Deploy CRDs
oc apply -f config/crd/bases/virtualmachinebmc.bmc.tinkerbell.org_virtualmachinebmcs.yaml
# Deploy RBAC
oc apply -f config/rbac/role.yaml
oc apply -f config/rbac/role_binding.yaml
oc apply -f config/rbac/service_account.yaml
# Deploy manager
oc apply -f config/manager/manager.yaml
# Verify deployment
oc get pods -n kubevirtbmc-system
oc wait --for=condition=Ready -n kubevirtbmc-system pod -l control-plane=controller-manager --timeout=300s
Alternative: Deploy using Kustomize
cd kubevirtbmc
oc apply -k config/default
# Verify
oc get all -n kubevirtbmc-system
# Check CRD is installed
oc get crd virtualmachinebmcs.virtualmachinebmc.bmc.tinkerbell.org
# Check operator logs
oc logs -n kubevirtbmc-system deployment/kubevirtbmc-controller-manager -f
# Allow Metal3 to watch all namespaces (required for BMH outside openshift-machine-api)
oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true}}'
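# (Optional) Quick check that the patch took effect before waiting on the pod - a minimal sketch:
oc get provisioning provisioning-configuration -o jsonpath='{.spec.watchAllNamespaces}{"\n"}'   # expect: true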
# Wait for metal3 pod to restart
echo "Waiting for metal3 pod to restart..."
sleep 10
# Wait for metal3 to be ready
until oc wait -n openshift-machine-api \
$(oc get pods -n openshift-machine-api -l baremetal.openshift.io/cluster-baremetal-operator=metal3-state -o name) \
--for=condition=Ready --timeout=10s >/dev/null 2>&1; do
echo "Waiting for metal3 pod..."
sleep 5
done
echo "Metal3 is ready!"Option A: Using tasty
# Install tasty
curl -s -L https://github.com/karmab/tasty/releases/download/v0.4.0/tasty-linux-amd64 > /tmp/tasty
chmod +x /tmp/tasty
sudo mv /tmp/tasty /usr/local/bin/tasty
# Install operators
tasty install assisted-service-operator hive-operator
# Wait for operators
oc wait --for=jsonpath='{.status.state}'=AtLatestKnown -n multicluster-engine subscription/assisted-service-operator --timeout=600s
oc wait --for=jsonpath='{.status.state}'=AtLatestKnown -n multicluster-engine subscription/hive-operator --timeout=600s
Option B: Manual Installation via OperatorHub
# Install MultiCluster Engine (includes Assisted Service and Hive)
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
name: multicluster-engine
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: multicluster-engine-og
namespace: multicluster-engine
spec:
targetNamespaces:
- multicluster-engine
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: multicluster-engine
namespace: multicluster-engine
spec:
channel: stable-2.7
name: multicluster-engine
source: redhat-operators
sourceNamespace: openshift-marketplace
installPlanApproval: Automatic
EOF
# Wait for installation
oc wait --for=jsonpath='{.status.state}'=AtLatestKnown -n multicluster-engine subscription/multicluster-engine --timeout=600s
export DB_VOLUME_SIZE="10Gi"
export FS_VOLUME_SIZE="10Gi"
export OCP_VERSION="4.17.0"
export OCP_MAJMIN=${OCP_VERSION%.*}
export ARCH="x86_64"
export OCP_RELEASE_VERSION=$(curl -s https://mirror.openshift.com/pub/openshift-v4/${ARCH}/clients/ocp/${OCP_VERSION}/release.txt | awk '/machine-os / { print $2 }')
export ISO_URL="https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/${OCP_MAJMIN}/${OCP_VERSION}/rhcos-${OCP_VERSION}-${ARCH}-live.${ARCH}.iso"
export ROOT_FS_URL="https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/${OCP_MAJMIN}/${OCP_VERSION}/rhcos-${OCP_VERSION}-${ARCH}-live-rootfs.${ARCH}.img"
envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
name: agent
namespace: multicluster-engine
spec:
databaseStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: ${DB_VOLUME_SIZE}
filesystemStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: ${FS_VOLUME_SIZE}
osImages:
- openshiftVersion: "${OCP_VERSION}"
version: "${OCP_RELEASE_VERSION}"
url: "${ISO_URL}"
rootFSUrl: "${ROOT_FS_URL}"
cpuArchitecture: "${ARCH}"
EOF
# Wait for AgentServiceConfig to be ready
oc wait --for=condition=DeploymentsHealthy -n multicluster-engine agentserviceconfig/agent --timeout=600s
# Verify assisted-service is running
oc get pods -n multicluster-engine | grep assisted
# Get HyperShift CLI
export HYPERSHIFT_RELEASE=4.17
podman cp $(podman create --name hypershift --rm --pull always \
quay.io/hypershift/hypershift-operator:${HYPERSHIFT_RELEASE}):/usr/bin/hypershift /tmp/hypershift
podman rm -f hypershift 2>/dev/null || true
sudo install -m 0755 -o root -g root /tmp/hypershift /usr/local/bin/hypershift
# Verify CLI
hypershift version
# Install HyperShift operator
hypershift install \
--hypershift-image quay.io/hypershift/hypershift-operator:${HYPERSHIFT_RELEASE} \
--enable-defaulting-webhook=false
# Verify installation
oc get pods -n hypershift
oc wait --for=condition=Ready -n hypershift pod -l app=operator --timeout=300s
export WORKER_NAMESPACE="hosted-workers"
oc create namespace ${WORKER_NAMESPACE}
Check existing storage classes:
oc get storageclass
# If using LVMS from management cluster setup
export STORAGE_CLASS="lvms-vg1"
# Or use hostpath provisioner for testing
# Note: For production, use a proper storage class
Save this as kubevirt-worker-template.yaml:
# This is a template - we'll substitute values for each worker
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
name: ${WORKER_NAME}
namespace: ${WORKER_NAMESPACE}
labels:
app: hosted-cluster-worker
worker-role: ${WORKER_ROLE}
worker-zone: ${WORKER_ZONE}
spec:
running: false # Metal3 will control this via BMC
template:
metadata:
labels:
kubevirt.io/vm: ${WORKER_NAME}
app: hosted-cluster-worker
spec:
domain:
cpu:
cores: 4
sockets: 1
threads: 1
devices:
disks:
# Discovery ISO - boot first
- name: discovery-iso
bootOrder: 1
cdrom:
bus: sata
readonly: true
# OS disk - boot second (after ISO installation)
- name: os-disk
bootOrder: 2
disk:
bus: virtio
# Cloud-init for network configuration
- name: cloudinitdisk
disk:
bus: virtio
interfaces:
- name: default
bridge: {}
macAddress: "${WORKER_MAC}"
networkInterfaceMultiqueue: true
firmware:
bootloader:
bios:
useSerial: true
machine:
type: q35
resources:
requests:
memory: 16Gi
networks:
- name: default
pod: {}
volumes:
# Discovery ISO (will be created later from InfraEnv)
- name: discovery-iso
persistentVolumeClaim:
claimName: agent-discovery-iso
# OS disk
- name: os-disk
dataVolume:
name: ${WORKER_NAME}-os
# Cloud-init for static IP configuration
- name: cloudinitdisk
cloudInitNoCloud:
networkData: |
version: 2
ethernets:
eth0:
match:
macaddress: "${WORKER_MAC}"
addresses:
- ${WORKER_IP}/24
gateway4: 192.168.125.1
nameservers:
addresses:
- 8.8.8.8
- 1.1.1.1
userData: |
#cloud-config
ssh_authorized_keys:
- ${SSH_PUB_KEY}
---
# DataVolume for OS disk (empty, will be populated by installer)
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
name: ${WORKER_NAME}-os
namespace: ${WORKER_NAMESPACE}
spec:
source:
blank: {}
pvc:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 120Gi
storageClassName: ${STORAGE_CLASS}
export CLUSTERS_NAMESPACE="clusters"
export HOSTED_CLUSTER_NAME="agent-cluster"
export HOSTED_CONTROL_PLANE_NAMESPACE="${CLUSTERS_NAMESPACE}-${HOSTED_CLUSTER_NAME}"
export BASEDOMAIN="hypershiftbm.lab"
export PULL_SECRET_FILE=$PWD/openshift_pull.json
export OCP_RELEASE="4.17.0"
# Create namespace
oc create ns ${HOSTED_CONTROL_PLANE_NAMESPACE}
# Create hosted cluster
hypershift create cluster agent \
--name=${HOSTED_CLUSTER_NAME} \
--pull-secret=${PULL_SECRET_FILE} \
--agent-namespace=${HOSTED_CONTROL_PLANE_NAMESPACE} \
--base-domain=${BASEDOMAIN} \
--api-server-address=api.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN} \
--release-image=quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE}-x86_64 \
--ssh-key=$HOME/.ssh/id_rsa.pub
# Wait for control plane pods
echo "Waiting for Hosted Control Plane pods..."
oc wait --for=condition=Ready -n ${HOSTED_CONTROL_PLANE_NAMESPACE} pod -l app=kube-apiserver --timeout=900s
# Get the LoadBalancer IPs from services
export API_IP=$(oc get svc -n ${HOSTED_CONTROL_PLANE_NAMESPACE} kube-apiserver -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export APPS_IP=$(oc get svc -n ${HOSTED_CONTROL_PLANE_NAMESPACE} router-default -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Add DNS entries (for testing, use /etc/hosts on your workstation)
echo "Add these entries to your /etc/hosts or DNS server:"
echo "${API_IP} api.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}"
echo "${API_IP} api-int.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}"
echo "${APPS_IP} *.apps.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}"
# On Linux/Mac workstation:
sudo tee -a /etc/hosts <<EOF
${API_IP} api.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}
${API_IP} api-int.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}
${APPS_IP} console-openshift-console.apps.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}
${APPS_IP} oauth-openshift.apps.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}
EOF
export SSH_PUB_KEY=$(cat $HOME/.ssh/id_rsa.pub)
envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: ${HOSTED_CLUSTER_NAME}
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
spec:
pullSecretRef:
name: pull-secret
sshAuthorizedKey: ${SSH_PUB_KEY}
nmStateConfigLabelSelector:
matchLabels: {}
EOF
# Wait for ISO to be generated
echo "Waiting for discovery ISO to be generated..."
oc wait --for=condition=ImageCreated -n ${HOSTED_CONTROL_PLANE_NAMESPACE} infraenv/${HOSTED_CLUSTER_NAME} --timeout=600s
# Get ISO URL
export ISO_URL=$(oc -n ${HOSTED_CONTROL_PLANE_NAMESPACE} get infraenv ${HOSTED_CLUSTER_NAME} -o jsonpath='{.status.isoDownloadURL}')
echo "Discovery ISO URL: ${ISO_URL}"# Create DataVolume to import discovery ISO
cat <<EOF | oc apply -f -
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
name: agent-discovery-iso
namespace: ${WORKER_NAMESPACE}
spec:
source:
http:
url: ${ISO_URL}
pvc:
accessModes:
- ReadOnlyMany # Can be shared by multiple VMs
resources:
requests:
storage: 2Gi
storageClassName: ${STORAGE_CLASS}
EOF
# Wait for ISO import to complete (may take 5-10 minutes)
echo "Importing discovery ISO (this may take a few minutes)..."
oc wait --for=condition=Ready -n ${WORKER_NAMESPACE} dv/agent-discovery-iso --timeout=900s
# Verify PVC was created
oc get pvc -n ${WORKER_NAMESPACE} agent-discovery-iso
Now create the actual worker VMs using the template:
export STORAGE_CLASS="lvms-vg1" # Adjust to your storage class
export SSH_PUB_KEY=$(cat $HOME/.ssh/id_rsa.pub)
# Worker 0
export WORKER_NAME="hosted-worker-0"
export WORKER_MAC="52:54:00:aa:bb:01"
export WORKER_IP="192.168.125.201"
export WORKER_ROLE="database"
export WORKER_ZONE="zone-a"
envsubst < kubevirt-worker-template.yaml | oc apply -f -
# Worker 1
export WORKER_NAME="hosted-worker-1"
export WORKER_MAC="52:54:00:aa:bb:02"
export WORKER_IP="192.168.125.202"
export WORKER_ROLE="compute"
export WORKER_ZONE="zone-b"
envsubst < kubevirt-worker-template.yaml | oc apply -f -
# Worker 2
export WORKER_NAME="hosted-worker-2"
export WORKER_MAC="52:54:00:aa:bb:03"
export WORKER_IP="192.168.125.203"
export WORKER_ROLE="compute"
export WORKER_ZONE="zone-c"
envsubst < kubevirt-worker-template.yaml | oc apply -f -
# Wait for DataVolumes to be ready
echo "Waiting for worker OS disks to be provisioned..."
oc wait --for=condition=Ready -n ${WORKER_NAMESPACE} dv/hosted-worker-0-os --timeout=300s
oc wait --for=condition=Ready -n ${WORKER_NAMESPACE} dv/hosted-worker-1-os --timeout=300s
oc wait --for=condition=Ready -n ${WORKER_NAMESPACE} dv/hosted-worker-2-os --timeout=300s
# Verify VMs are created (but not running)
oc get vm -n ${WORKER_NAMESPACE}
oc get vmi -n ${WORKER_NAMESPACE}   # Should be empty (VMs not started yet)
For each worker VM, create a BMC:
# Worker 0
cat <<EOF | oc apply -f -
apiVersion: virtualmachinebmc.bmc.tinkerbell.org/v1alpha1
kind: VirtualMachineBMC
metadata:
name: hosted-worker-0-bmc
namespace: ${WORKER_NAMESPACE}
spec:
virtualMachineName: hosted-worker-0
virtualMachineNamespace: ${WORKER_NAMESPACE}
protocol: redfish
credentials:
username: admin
password: password
EOF
# Worker 1
cat <<EOF | oc apply -f -
apiVersion: virtualmachinebmc.bmc.tinkerbell.org/v1alpha1
kind: VirtualMachineBMC
metadata:
name: hosted-worker-1-bmc
namespace: ${WORKER_NAMESPACE}
spec:
virtualMachineName: hosted-worker-1
virtualMachineNamespace: ${WORKER_NAMESPACE}
protocol: redfish
credentials:
username: admin
password: password
EOF
# Worker 2
cat <<EOF | oc apply -f -
apiVersion: virtualmachinebmc.bmc.tinkerbell.org/v1alpha1
kind: VirtualMachineBMC
metadata:
name: hosted-worker-2-bmc
namespace: ${WORKER_NAMESPACE}
spec:
virtualMachineName: hosted-worker-2
virtualMachineNamespace: ${WORKER_NAMESPACE}
protocol: redfish
credentials:
username: admin
password: password
EOF
# Verify BMC services are created
oc get svc -n ${WORKER_NAMESPACE} -l virtualmachinebmc.bmc.tinkerbell.org/name
# Function to get BMC endpoint
get_bmc_endpoint() {
local worker_name=$1
local svc_name=$(oc get svc -n ${WORKER_NAMESPACE} \
-l virtualmachinebmc.bmc.tinkerbell.org/name=${worker_name}-bmc \
-o jsonpath='{.items[0].metadata.name}')
local bmc_ip=$(oc get svc -n ${WORKER_NAMESPACE} ${svc_name} \
-o jsonpath='{.spec.clusterIP}')
local bmc_port=$(oc get svc -n ${WORKER_NAMESPACE} ${svc_name} \
-o jsonpath='{.spec.ports[0].port}')
echo "redfish://${bmc_ip}:${bmc_port}/redfish/v1/Systems/1"
}
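# Note (assumption): the '/redfish/v1/Systems/1' suffix is illustrative - if the emulated BMC exposes
# a different system ID, list the members it actually serves, for example:
#   curl -k -u admin:password http://<bmc-ip>:<bmc-port>/redfish/v1/Systems | jq '.Members'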
# Get endpoints for all workers
export BMC_WORKER_0=$(get_bmc_endpoint hosted-worker-0)
export BMC_WORKER_1=$(get_bmc_endpoint hosted-worker-1)
export BMC_WORKER_2=$(get_bmc_endpoint hosted-worker-2)
echo "Worker 0 BMC: ${BMC_WORKER_0}"
echo "Worker 1 BMC: ${BMC_WORKER_1}"
echo "Worker 2 BMC: ${BMC_WORKER_2}"
# Test a BMC endpoint (optional) - swap the redfish:// scheme for http:// when calling it with curl
curl -k -u admin:password "${BMC_WORKER_0/redfish/http}" | jq
export BMC_USERNAME=$(echo -n "admin" | base64 -w0)
export BMC_PASSWORD=$(echo -n "password" | base64 -w0)
# Worker 0
export WORKER_NAME="hosted-worker-0"
export WORKER_MAC="52:54:00:aa:bb:01"
export BMC_ENDPOINT="${BMC_WORKER_0}"
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
name: ${WORKER_NAME}-bmc-secret
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
type: Opaque
data:
username: ${BMC_USERNAME}
password: ${BMC_PASSWORD}
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ${WORKER_NAME}
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
labels:
infraenvs.agent-install.openshift.io: ${HOSTED_CLUSTER_NAME}
worker-role: database
worker-zone: zone-a
annotations:
inspect.metal3.io: disabled
bmac.agent-install.openshift.io/hostname: ${WORKER_NAME}.${BASEDOMAIN}
spec:
automatedCleaningMode: disabled
online: true
bootMACAddress: "${WORKER_MAC}"
bmc:
address: ${BMC_ENDPOINT}
credentialsName: ${WORKER_NAME}-bmc-secret
disableCertificateVerification: true
EOF
# Worker 1
export WORKER_NAME="hosted-worker-1"
export WORKER_MAC="52:54:00:aa:bb:02"
export BMC_ENDPOINT="${BMC_WORKER_1}"
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
name: ${WORKER_NAME}-bmc-secret
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
type: Opaque
data:
username: ${BMC_USERNAME}
password: ${BMC_PASSWORD}
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ${WORKER_NAME}
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
labels:
infraenvs.agent-install.openshift.io: ${HOSTED_CLUSTER_NAME}
worker-role: compute
worker-zone: zone-b
annotations:
inspect.metal3.io: disabled
bmac.agent-install.openshift.io/hostname: ${WORKER_NAME}.${BASEDOMAIN}
spec:
automatedCleaningMode: disabled
online: true
bootMACAddress: "${WORKER_MAC}"
bmc:
address: ${BMC_ENDPOINT}
credentialsName: ${WORKER_NAME}-bmc-secret
disableCertificateVerification: true
EOF
# Worker 2
export WORKER_NAME="hosted-worker-2"
export WORKER_MAC="52:54:00:aa:bb:03"
export BMC_ENDPOINT="${BMC_WORKER_2}"
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
name: ${WORKER_NAME}-bmc-secret
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
type: Opaque
data:
username: ${BMC_USERNAME}
password: ${BMC_PASSWORD}
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ${WORKER_NAME}
namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
labels:
infraenvs.agent-install.openshift.io: ${HOSTED_CLUSTER_NAME}
worker-role: compute
worker-zone: zone-c
annotations:
inspect.metal3.io: disabled
bmac.agent-install.openshift.io/hostname: ${WORKER_NAME}.${BASEDOMAIN}
spec:
automatedCleaningMode: disabled
online: true
bootMACAddress: "${WORKER_MAC}"
bmc:
address: ${BMC_ENDPOINT}
credentialsName: ${WORKER_NAME}-bmc-secret
disableCertificateVerification: true
EOF
# Watch BareMetalHosts
watch -n 5 "oc get bmh -n ${HOSTED_CONTROL_PLANE_NAMESPACE}"
# Expected progression:
# - registering → provisioning → provisioned
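# (Optional) A minimal wait loop, assuming each BMH eventually reports 'provisioned' in .status.provisioning.state:
until [ -z "$(oc get bmh -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{range .items[*]}{.status.provisioning.state}{"\n"}{end}' | grep -v '^provisioned$')" ]; do
  echo "Waiting for all BareMetalHosts to reach 'provisioned'..."
  sleep 15
done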
# In another terminal, watch KubeVirt VMs
watch -n 5 "oc get vm -n ${WORKER_NAMESPACE}"
# VMs should start (running: true) when Metal3 powers them on
# Watch for Agents to appear
watch -n 5 "oc get agents -n ${HOSTED_CONTROL_PLANE_NAMESPACE}"
# Check Agent details with BMH mapping
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{range .items[*]}BMH: {@.metadata.labels.agent-install\.openshift\.io/bmh} Agent: {@.metadata.name} State: {@.status.debugInfo.state} Approved: {@.spec.approved}{"\n"}{end}'
cat <<EOF | oc apply -f -
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: ${HOSTED_CLUSTER_NAME}-workers
namespace: ${CLUSTERS_NAMESPACE}
spec:
clusterName: ${HOSTED_CLUSTER_NAME}
replicas: 3
management:
autoRepair: false
upgradeType: InPlace
platform:
type: Agent
agent:
agentLabelSelector:
matchLabels: {} # Select any available agent
release:
image: quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE}-x86_64
EOF
Alternative: Create multiple NodePools targeting specific workers:
# NodePool for database workers (zone-a)
cat <<EOF | oc apply -f -
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: ${HOSTED_CLUSTER_NAME}-db
namespace: ${CLUSTERS_NAMESPACE}
spec:
clusterName: ${HOSTED_CLUSTER_NAME}
replicas: 1
management:
autoRepair: false
upgradeType: InPlace
platform:
type: Agent
agent:
agentLabelSelector:
matchLabels:
worker-role: database
worker-zone: zone-a
release:
image: quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE}-x86_64
EOF
# NodePool for compute workers (zones b and c)
cat <<EOF | oc apply -f -
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: ${HOSTED_CLUSTER_NAME}-compute
namespace: ${CLUSTERS_NAMESPACE}
spec:
clusterName: ${HOSTED_CLUSTER_NAME}
replicas: 2
management:
autoRepair: false
upgradeType: InPlace
platform:
type: Agent
agent:
agentLabelSelector:
matchLabels:
worker-role: compute
release:
image: quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE}-x86_64
EOF
# Watch NodePool status
watch -n 5 "oc get nodepool -n ${CLUSTERS_NAMESPACE}"
# Watch Agents binding to Machines
watch -n 5 "oc get agents -n ${HOSTED_CONTROL_PLANE_NAMESPACE}"
# Check Machine creation
watch -n 5 "oc get machines -n ${HOSTED_CONTROL_PLANE_NAMESPACE}"
# Expected Agent states progression:
# insufficient → known-unbound → binding → installing → installing-in-progress → added-to-existing-cluster
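# (Optional) A minimal wait loop for the final state; adjust EXPECTED to your NodePool replica count:
EXPECTED=3
until [ "$(oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{range .items[*]}{.status.debugInfo.state}{"\n"}{end}' | grep -c 'added-to-existing-cluster')" -ge "${EXPECTED}" ]; do
  echo "Waiting for ${EXPECTED} agents to reach added-to-existing-cluster..."
  sleep 30
done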
# Check detailed agent state
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{range .items[*]}Agent: {@.metadata.name} BMH: {@.metadata.labels.agent-install\.openshift\.io/bmh} State: {@.status.debugInfo.state} Progress: {@.status.progress.progressInfo}{"\n"}{end}'
# Generate kubeconfig for hosted cluster
hypershift create kubeconfig --name=${HOSTED_CLUSTER_NAME} > ${HOSTED_CLUSTER_NAME}-kubeconfig
# Use hosted cluster kubeconfig
export KUBECONFIG=${HOSTED_CLUSTER_NAME}-kubeconfig
# Wait for nodes to appear
watch -n 5 "oc get nodes"# Check nodes
oc get nodes -o wide
# Verify node labels from BareMetalHost
oc get nodes --show-labels | grep worker-
# Check that static IPs are assigned
oc get nodes -o custom-columns=NAME:.metadata.name,IP:.status.addresses[0].address
# Verify nodes have correct zone labels
oc get nodes -L worker-zone,worker-role
# Create test deployment
oc create deployment nginx --image=nginxinc/nginx-unprivileged:latest --replicas=3
# Wait for pods
oc wait --for=condition=Ready pod -l app=nginx --timeout=300s
# Check pod distribution across workers
oc get pods -o wide -l app=nginx
# Verify pods are using the worker nodes
oc get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName -l app=nginx
# Clean up
oc delete deployment nginx
# Check VirtualMachine boot order
oc get vm -n ${WORKER_NAMESPACE} hosted-worker-0 -o yaml | grep -A 10 bootOrder
# Check that VMs booted from ISO then disk
# After installation, VMs should be booting from disk (bootOrder: 2)
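# (Optional) A sketch for detaching the discovery ISO once installation completes, assuming the
# template above (the cdrom is disks[0] and the ISO PVC is volumes[0]); restart the VM afterwards:
oc patch vm hosted-worker-0 -n ${WORKER_NAMESPACE} --type json -p '[
  {"op": "remove", "path": "/spec/template/spec/domain/devices/disks/0"},
  {"op": "remove", "path": "/spec/template/spec/volumes/0"}
]'
virtctl restart hosted-worker-0 -n ${WORKER_NAMESPACE}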
# Check VM console logs (optional)
virtctl console hosted-worker-0 -n ${WORKER_NAMESPACE}
# Press Ctrl+] to exit
# Check DataVolume status
oc describe dv agent-discovery-iso -n ${WORKER_NAMESPACE}
# Check CDI importer pod logs
oc logs -n ${WORKER_NAMESPACE} $(oc get pods -n ${WORKER_NAMESPACE} -l app=containerized-data-importer -o name | head -1)
# Common issue: ISO URL not accessible from pod
# Solution: Verify ISO URL is accessible
curl -I ${ISO_URL}
# If URL is not accessible, manually download and upload
curl -L ${ISO_URL} -o /tmp/discovery.iso
virtctl image-upload dv agent-discovery-iso \
--image-path=/tmp/discovery.iso \
--size=2Gi \
--storage-class=${STORAGE_CLASS} \
-n ${WORKER_NAMESPACE}
# Check KubeVirtBMC logs
oc logs -n kubevirtbmc-system deployment/kubevirtbmc-controller-manager -f
# Check VirtualMachineBMC status
oc describe virtualmachinebmc -n ${WORKER_NAMESPACE}
# Manually test BMC endpoint
export BMC_IP=$(oc get svc -n ${WORKER_NAMESPACE} -l virtualmachinebmc.bmc.tinkerbell.org/name=hosted-worker-0-bmc -o jsonpath='{.items[0].spec.clusterIP}')
export BMC_PORT=$(oc get svc -n ${WORKER_NAMESPACE} -l virtualmachinebmc.bmc.tinkerbell.org/name=hosted-worker-0-bmc -o jsonpath='{.items[0].spec.ports[0].port}')
# Test Redfish API
curl -k -u admin:password http://${BMC_IP}:${BMC_PORT}/redfish/v1/Systems/1 | jq
# Test power on
curl -k -u admin:password -X POST \
-H "Content-Type: application/json" \
-d '{"ResetType":"On"}' \
http://${BMC_IP}:${BMC_PORT}/redfish/v1/Systems/1/Actions/ComputerSystem.Reset
# Check if VM started
oc get vm -n ${WORKER_NAMESPACE}
oc get vmi -n ${WORKER_NAMESPACE}
# Check InfraEnv status
oc describe infraenv ${HOSTED_CLUSTER_NAME} -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
# Check BareMetalHost status
oc describe bmh -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
# Check Metal3 logs
oc logs -n openshift-machine-api deployment/metal3
# Check if VMs are actually running
oc get vmi -n ${WORKER_NAMESPACE}
# Access VM console to see boot process
virtctl console hosted-worker-0 -n ${WORKER_NAMESPACE}
# Check VM is booting from ISO
# You should see the discovery agent starting
# Check Agent validation errors
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o yaml | grep -A 20 validationsInfo
# Common issues:
# - Insufficient memory (need 16GB)
# - Insufficient CPU (need 4 cores)
# - No installation disk
# Verify VM resources
oc get vm -n ${WORKER_NAMESPACE} hosted-worker-0 -o jsonpath='{.spec.template.spec.domain.resources}'
# Check disk availability
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.inventory.disks}{"\n"}{end}'
# Check Agent labels match NodePool selector
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} --show-labels
# Check NodePool selector
oc get nodepool -n ${CLUSTERS_NAMESPACE} ${HOSTED_CLUSTER_NAME}-workers -o yaml | grep -A 5 agentLabelSelector
# Manually label Agents if needed
oc label agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} <agent-name> worker-role=database
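# If the worker-role/worker-zone labels do not appear on the Agents automatically, a minimal helper
# (a sketch assuming this PoC's labels) to copy them from each Agent's BareMetalHost, matched via the
# agent-install.openshift.io/bmh label:
for agent in $(oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o name); do
  bmh=$(oc get ${agent} -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{.metadata.labels.agent-install\.openshift\.io/bmh}')
  [ -z "${bmh}" ] && continue
  role=$(oc get bmh ${bmh} -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{.metadata.labels.worker-role}')
  zone=$(oc get bmh ${bmh} -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{.metadata.labels.worker-zone}')
  oc label ${agent} -n ${HOSTED_CONTROL_PLANE_NAMESPACE} worker-role=${role} worker-zone=${zone} --overwrite
done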
# Check if Agents are approved
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o custom-columns=NAME:.metadata.name,APPROVED:.spec.approved,STATE:.status.debugInfo.state
# Approve Agents if needed
oc patch agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} <agent-name> -p '{"spec":{"approved":true}}' --type merge
# Check Agent installation progress
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.progress.progressInfo}{"\n"}{end}'
# Check assisted-service logs
oc logs -n multicluster-engine deployment/assisted-service -f
# Check installer pod logs (if agent reached installing state)
oc get pods -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -l app=assisted-installer
# Access VM console to see installation
virtctl console hosted-worker-0 -n ${WORKER_NAMESPACE}
# Check VM boot order
oc get vm -n ${WORKER_NAMESPACE} hosted-worker-0 -o yaml | grep -A 20 bootOrder
# Verify OS disk has content
oc get pvc -n ${WORKER_NAMESPACE} | grep os
# Check VM is running
oc get vmi -n ${WORKER_NAMESPACE}
# Access console
virtctl console hosted-worker-0 -n ${WORKER_NAMESPACE}
# If the VM is stuck booting the ISO, eject/detach the CD-ROM
# This is a KubeVirt limitation - the VM spec may need to be updated to remove the ISO after
# installation (see the detach sketch in the boot-order verification steps earlier)
# Check cloud-init in VM spec
oc get vm -n ${WORKER_NAMESPACE} hosted-worker-0 -o yaml | grep -A 30 cloudInitNoCloud
# Check inside VM (via console)
virtctl console hosted-worker-0 -n ${WORKER_NAMESPACE}
# Once logged in:
ip addr show
nmcli device show eth0   # RHCOS uses NetworkManager (no ifcfg files)
networkctl status        # systemd-networkd, if in use
# Management cluster resources
oc get hostedcluster -n ${CLUSTERS_NAMESPACE}
oc get nodepool -n ${CLUSTERS_NAMESPACE}
oc get bmh -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
oc get infraenv -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
oc get machines -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
# KubeVirt resources
oc get vm -n ${WORKER_NAMESPACE}
oc get vmi -n ${WORKER_NAMESPACE}
oc get dv -n ${WORKER_NAMESPACE}
oc get pvc -n ${WORKER_NAMESPACE}
oc get virtualmachinebmc -n ${WORKER_NAMESPACE}
# Logs
oc logs -n hypershift deployment/operator
oc logs -n ${HOSTED_CONTROL_PLANE_NAMESPACE} deployment/capi-provider
oc logs -n openshift-machine-api deployment/metal3
oc logs -n multicluster-engine deployment/assisted-service
oc logs -n kubevirtbmc-system deployment/kubevirtbmc-controller-manager
# Delete NodePool
oc delete nodepool -n ${CLUSTERS_NAMESPACE} ${HOSTED_CLUSTER_NAME}-workers
# Delete HostedCluster
hypershift destroy cluster agent \
--name=${HOSTED_CLUSTER_NAME} \
--namespace=${CLUSTERS_NAMESPACE}
# Delete BareMetalHosts
oc delete bmh -n ${HOSTED_CONTROL_PLANE_NAMESPACE} --all
# Delete KubeVirt VMs
oc delete vm -n ${WORKER_NAMESPACE} --all
# Delete DataVolumes
oc delete dv -n ${WORKER_NAMESPACE} --all
# Delete VirtualMachineBMC
oc delete virtualmachinebmc -n ${WORKER_NAMESPACE} --all
# Delete namespaces
oc delete namespace ${WORKER_NAMESPACE}
oc delete namespace ${HOSTED_CONTROL_PLANE_NAMESPACE}
# Delete management cluster (if needed)
kcli delete cluster mgmt-cluster
This PoC demonstrates:
- ✅ True KubeVirt Integration - VirtualMachine CRs, not libvirt VMs
- ✅ ISO Boot Support - Discovery ISO mounted as CD-ROM with proper boot order
- ✅ Virtual BMC Control - KubeVirtBMC provides Redfish API for Metal3
- ✅ Static IP Configuration - Cloud-init for network pre-configuration
- ✅ Label-Based Selection - Target specific VMs via Agent labels
- ✅ Kubernetes-Native - Everything managed via Kubernetes APIs
- VM Created → running: false, ISO mounted as CD-ROM (bootOrder: 1)
- Metal3/BMC Powers On → VM starts, boots from ISO
- Discovery Agent Runs → Registers as Agent in Kubernetes
- NodePool Selects Agent → Based on labels
- Installation Begins → OS written to disk (bootOrder: 2)
- VM Reboots → Boots from disk (installed OS)
- Node Joins Cluster → Worker node ready
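A convenience check (a sketch assuming this PoC's names and namespaces; run it against the management cluster kubeconfig) that prints where each worker currently sits in the flow above:
for w in hosted-worker-0 hosted-worker-1 hosted-worker-2; do
  vm=$(oc get vm ${w} -n ${WORKER_NAMESPACE} -o jsonpath='{.status.printableStatus}' 2>/dev/null)
  bmh=$(oc get bmh ${w} -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -o jsonpath='{.status.provisioning.state}' 2>/dev/null)
  agent=$(oc get agent -n ${HOSTED_CONTROL_PLANE_NAMESPACE} -l agent-install.openshift.io/bmh=${w} -o jsonpath='{.items[0].status.debugInfo.state}' 2>/dev/null)
  echo "${w}: VM=${vm:-n/a} BMH=${bmh:-n/a} Agent=${agent:-n/a}"
done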
| Aspect | Libvirt VMs | KubeVirt VMs |
|---|---|---|
| Management | virsh, kcli | kubectl/oc |
| Declarative | Limited | Full GitOps |
| RBAC | Host-level | Kubernetes RBAC |
| Storage | Host filesystem | PVCs, CSI |
| Networking | Libvirt networks | Pod networking, Multus |
| Live Migration | Manual | KubeVirt native |
| Integration | External to K8s | Native K8s resources |
This approach provides a production-ready pattern for using KubeVirt VMs as Agent platform workers in HyperShift!