@fmount
Last active February 26, 2026 10:49
OpenStack Probe Configuration Management

This document describes a unified approach to managing Kubernetes probes (liveness, readiness, and startup) across RHOSO service operators, based on an existing lib-common module [0].

Overview

The probe configuration system addresses two key aspects:

  1. User customization: How services can define overrides for probe configurations
  2. Operator consumption: How operators can consume these overrides through a consistent, type-safe interface

This approach consolidates probe management logic into a reusable library module, reducing code duplication and improving maintainability across all service operators.

Design Decisions

Overriding Probes

Initial Consideration: Custom Resource Approach

Initially, I considered following the topology pattern by introducing a dedicated Custom Resource (CR) to serve as an "AdvancedConfig" CRD. This CR would have grouped together customizable parameters from PodSpec and other sources, and the main CR would then have referenced it.

Chosen Approach: Inline Override Extension

However, since probe configurations consist primarily of primitive types (integers, strings), I opted to extend the existing override struct instead [1]. This decision offers several advantages:

  • No new CRs required: Avoids introducing additional Custom Resource Definitions
  • Consistency: Reuses the familiar override interface pattern
  • Minimal footprint: Adds only a few primitive fields per probe, so the impact on CRD size stays small
  • Simplicity: Reduces cognitive overhead for users already familiar with the override pattern

Implementation Requirements

The implementation [2] satisfies three core requirements:

1. Default Behavior

When no override is specified, service operators apply their predefined defaults. For example, Cinder [3] defines a DefaultProbeConf with the following baseline values:

// DefaultProbeConf - Default values applied to Cinder StatefulSets when no
// overrides are provided
var DefaultProbeConf = probes.OverrideSpec{
	LivenessProbes: &probes.ProbeConf{
		Path:                "/healthcheck",
		InitialDelaySeconds: 10,
		PeriodSeconds:       10,
		TimeoutSeconds:      10,
	},
	ReadinessProbes: &probes.ProbeConf{
		Path:                "/healthcheck",
		InitialDelaySeconds: 10,
		PeriodSeconds:       10,
		TimeoutSeconds:      10,
	},
	StartupProbes: &probes.ProbeConf{
		TimeoutSeconds:      5,
		FailureThreshold:    12,
		PeriodSeconds:       5,
		InitialDelaySeconds: 5,
	},
}

These defaults ensure that services have production-ready probe configurations out of the box.

2. Granular Field-Level Overrides

Users can override individual fields while preserving default values for all other fields. This granular control enables customization without verbose configuration:

cinderAPI:
  override:
    probes:
      livenessProbes:
        initialDelaySeconds: 20

In this example, only initialDelaySeconds is modified to 20; all other probe parameters (path, periodSeconds, timeoutSeconds, etc.) retain their default values. This merge behavior prevents users from having to redefine entire probe configurations just to tweak a single parameter.
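
The merge behavior described above can be sketched as follows. This is a minimal illustration, not the lib-common implementation: ProbeConf is a local stand-in whose field types are assumed, and mergeProbeConf is a hypothetical helper that treats a zero value as "not set".

```go
package main

import "fmt"

// ProbeConf is a local stand-in for the lib-common struct. The field names
// come from the defaults shown above; the field types are assumptions.
type ProbeConf struct {
	Path                string
	InitialDelaySeconds int32
	PeriodSeconds       int32
	TimeoutSeconds      int32
	FailureThreshold    int32
}

// mergeProbeConf (hypothetical helper) returns a copy of defaults with every
// non-zero field of override applied on top. A nil override returns the
// defaults unchanged.
func mergeProbeConf(defaults ProbeConf, override *ProbeConf) ProbeConf {
	merged := defaults
	if override == nil {
		return merged
	}
	if override.Path != "" {
		merged.Path = override.Path
	}
	if override.InitialDelaySeconds != 0 {
		merged.InitialDelaySeconds = override.InitialDelaySeconds
	}
	if override.PeriodSeconds != 0 {
		merged.PeriodSeconds = override.PeriodSeconds
	}
	if override.TimeoutSeconds != 0 {
		merged.TimeoutSeconds = override.TimeoutSeconds
	}
	if override.FailureThreshold != 0 {
		merged.FailureThreshold = override.FailureThreshold
	}
	return merged
}

func main() {
	defaults := ProbeConf{
		Path:                "/healthcheck",
		InitialDelaySeconds: 10,
		PeriodSeconds:       10,
		TimeoutSeconds:      10,
	}
	// Equivalent to the YAML override above: only initialDelaySeconds is set.
	override := &ProbeConf{InitialDelaySeconds: 20}
	fmt.Printf("%+v\n", mergeProbeConf(defaults, override))
}
```

One consequence of treating zero as "not set" in this sketch is that a user cannot explicitly request a value of 0; pointer-typed fields would be needed to tell the two cases apart.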

3. Validation via Webhooks

Admission webhooks validate override values to prevent invalid configurations from being applied. This keeps probe settings within acceptable bounds and catches configuration errors early in the deployment lifecycle.
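
As a rough illustration of the kind of check a webhook could run, the sketch below rejects negative values and reports every offending field. The function name, the bounds, and the plain error return type are all assumptions; the actual ValidateProbes() signature and rules are not shown in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// ProbeConf is a local stand-in for the lib-common struct (numeric fields
// only); the field types are assumptions.
type ProbeConf struct {
	InitialDelaySeconds int32
	PeriodSeconds       int32
	TimeoutSeconds      int32
	FailureThreshold    int32
}

// validateProbeConf is a hypothetical check, not the lib-common
// ValidateProbes. It rejects negative values and reports every offending
// field, prefixed with the probe name so the message points at the right
// part of the spec.
func validateProbeConf(probe string, p *ProbeConf) error {
	if p == nil {
		return nil // no override given, nothing to validate
	}
	fields := []struct {
		name  string
		value int32
	}{
		{"initialDelaySeconds", p.InitialDelaySeconds},
		{"periodSeconds", p.PeriodSeconds},
		{"timeoutSeconds", p.TimeoutSeconds},
		{"failureThreshold", p.FailureThreshold},
	}
	var errs []error
	for _, f := range fields {
		if f.value < 0 {
			errs = append(errs, fmt.Errorf("%s.%s: must not be negative, got %d", probe, f.name, f.value))
		}
	}
	return errors.Join(errs...) // nil when no field was rejected
}

func main() {
	bad := &ProbeConf{InitialDelaySeconds: 20, TimeoutSeconds: -1}
	if err := validateProbeConf("livenessProbes", bad); err != nil {
		fmt.Println("rejected:", err)
	}
}
```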

Architecture

lib-common Foundation

The lib-common/modules/common/probes module provides the foundational components for probe management:

Core Components

  • ProbeConf: Configuration struct defining all probe parameters including path, timeouts, periods, and failure thresholds

  • OverrideSpec: Struct that holds configurations for all three probe types (liveness, readiness, startup)

  • ProbeOverrides (interface): Standard contract that any service specification can implement, ensuring consistency across operators

  • ProbeSet: A tuple containing configured probe objects (liveness, readiness, startup) ready for injection into StatefulSet or Deployment manifests

  • CreateProbeSet(): Factory function that intelligently merges user-provided overrides with operator-defined defaults, implementing the field-level override logic

  • ValidateProbes(): Validation function with webhook-compatible error reporting, ensuring probe configurations meet Kubernetes requirements and operator-specific constraints

Service Operator Integration

Example: Cinder Operator

Service operators like Cinder can leverage the probe module and reduce probe generation to a single call:

// The result is named probeSet so that it does not shadow the probes package.
probeSet, err := probes.CreateProbeSet(
    int32(cinder.CinderPublicPort),
    &scheme,
    instance.Spec.Override,        // User customizations
    cinder.DefaultProbeConf,       // Operator defaults
)

This single function call:

  1. Takes the user's override specification
  2. Merges it with the operator's defaults
  3. Returns a complete ProbeSet ready for StatefulSet creation
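
To make the last step concrete, here is a dependency-free sketch of how already-merged configurations could be combined with a port and scheme into a three-probe tuple. Probe, ProbeSet, and newProbeSet are stand-ins invented for this illustration; the real module presumably produces Kubernetes probe objects, and its internals are not shown in this document.

```go
package main

import "fmt"

// ProbeConf is a local stand-in for the lib-common struct; the field types
// are assumptions.
type ProbeConf struct {
	Path                string
	InitialDelaySeconds int32
	PeriodSeconds       int32
	TimeoutSeconds      int32
	FailureThreshold    int32
}

// Probe is a simplified stand-in for a configured Kubernetes probe: the
// endpoint to call plus the timing parameters.
type Probe struct {
	Scheme string
	Port   int32
	Conf   ProbeConf
}

// ProbeSet groups the three probes so they can be passed around together.
type ProbeSet struct {
	Liveness  Probe
	Readiness Probe
	Startup   Probe
}

// newProbeSet (hypothetical) attaches the shared port and scheme to each
// already-merged configuration.
func newProbeSet(port int32, scheme string, liveness, readiness, startup ProbeConf) ProbeSet {
	build := func(c ProbeConf) Probe {
		return Probe{Scheme: scheme, Port: port, Conf: c}
	}
	return ProbeSet{
		Liveness:  build(liveness),
		Readiness: build(readiness),
		Startup:   build(startup),
	}
}

func main() {
	http := ProbeConf{Path: "/healthcheck", InitialDelaySeconds: 20, PeriodSeconds: 10, TimeoutSeconds: 10}
	startup := ProbeConf{InitialDelaySeconds: 5, PeriodSeconds: 5, TimeoutSeconds: 5, FailureThreshold: 12}
	// The port value is illustrative only.
	set := newProbeSet(8776, "HTTP", http, http, startup)
	fmt.Printf("%+v\n", set.Liveness)
}
```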

Benefits:

  • Simplified logic: StatefulSet creation code becomes cleaner and more focused
  • Common interface: All operators use the same probe configuration pattern
  • Increased maintainability: Probe handling code lives in one place instead of being duplicated in every operator

API Level Integration

Each service component defines (or reuses) an override specification struct that embeds the common OverrideSpec:

// SchedulerOverrideSpec to override the generated manifest of several child resources.
type SchedulerOverrideSpec struct {
	// Override probes and other common fields in the StatefulSet
	probes.OverrideSpec `json:"probes,omitempty"`
}

The OverrideSpec is defined in lib-common and provides a consistent schema across all operators:

// OverrideSpec to override StatefulSet fields
type OverrideSpec struct {
	// Override configuration for the StatefulSet like Probes and other tunable
	// fields
	LivenessProbes  *ProbeConf `json:"livenessProbes,omitempty"`
	ReadinessProbes *ProbeConf `json:"readinessProbes,omitempty"`
	StartupProbes   *ProbeConf `json:"startupProbes,omitempty"`
}

Key Properties:

  • Automatic interface implementation: Any API struct that embeds OverrideSpec automatically implements the ProbeOverrides interface
  • Generic function compatibility: The CreateProbeSet() function accepts any type implementing ProbeOverrides, enabling type-safe generic usage without dependency on specific instance types
  • Uniform schema: All operators expose the same probe configuration fields to users, creating a consistent user experience
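
The first two properties rely on Go method promotion: methods defined on an embedded struct become part of the outer struct's method set. The sketch below shows the mechanism with local stand-in types. The getter name on ProbeOverrides and the livenessDelay helper are assumptions made for illustration; the real interface methods are not listed in this document.

```go
package main

import "fmt"

// ProbeConf and OverrideSpec are trimmed local stand-ins for the lib-common
// types shown above.
type ProbeConf struct {
	InitialDelaySeconds int32
}

type OverrideSpec struct {
	LivenessProbes *ProbeConf `json:"livenessProbes,omitempty"`
}

// ProbeOverrides is a stand-in interface; the method name is an assumption.
type ProbeOverrides interface {
	GetLivenessProbes() *ProbeConf
}

// GetLivenessProbes is defined once, on the shared struct.
func (o OverrideSpec) GetLivenessProbes() *ProbeConf { return o.LivenessProbes }

// SchedulerOverrideSpec embeds OverrideSpec, so GetLivenessProbes is promoted
// and the type satisfies ProbeOverrides without any extra code. The JSON tag
// only affects serialization, not method promotion.
type SchedulerOverrideSpec struct {
	OverrideSpec `json:"probes,omitempty"`
}

// Compile-time check that embedding is enough to satisfy the interface.
var _ ProbeOverrides = SchedulerOverrideSpec{}

// livenessDelay (hypothetical) accepts any type implementing ProbeOverrides
// and falls back to the given default when no override is set.
func livenessDelay[T ProbeOverrides](o T, fallback int32) int32 {
	if p := o.GetLivenessProbes(); p != nil && p.InitialDelaySeconds != 0 {
		return p.InitialDelaySeconds
	}
	return fallback
}

func main() {
	spec := SchedulerOverrideSpec{
		OverrideSpec: OverrideSpec{LivenessProbes: &ProbeConf{InitialDelaySeconds: 20}},
	}
	fmt.Println(livenessDelay(spec, 10))
	fmt.Println(livenessDelay(SchedulerOverrideSpec{}, 10))
}
```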

Benefits

This unified probe configuration approach delivers:

  1. Consistency: Identical probe configuration experience across all service operators
  2. Maintainability: Centralized probe logic in lib-common reduces maintenance effort
  3. Flexibility: Granular field-level overrides provide control without verbose configuration
  4. Simplicity: Reusing the existing override pattern means users do not have to learn a new CR
  5. Efficiency: Minimal CRD footprint and reduced code duplication

CRD Size Comparison

| CRD | Before (no probes) | After (with probes) | Growth | Lines | Probe % |
|---|---|---|---|---|---|
| cinders.yaml | 144K | 164K | +20K | 2,691 | +16% |
| cinderapis.yaml | 112K | 116K | +4K | 1,718 | +4% |
| cinderschedulers.yaml | 96K | 100K | +4K | 1,530 | +4% |
| cindervolumes.yaml | 96K | 100K | +4K | 1,531 | +4% |
| cinderbackups.yaml | 96K | 100K | +4K | 1,530 | +4% |
| TOTAL | 544K | 580K | +36K | - | +7% |

References

[0] Existing lib-common probes module

[1] Probe common lib-common module

[2] Cinder example

[3] Glance example

@gibizer

gibizer commented Feb 19, 2026

We might want to consider the effect of the APITimeout configurable on the probe values.

@fmount
Author

fmount commented Feb 26, 2026

> We might want to consider the effect of the APITimeout configurable on the probe values.

It makes sense, and I'd defer to each operator that exposes the APITimeout parameter to account for it in its own logic (e.g. at the webhook layer), accepting, rejecting, or adjusting the probe overrides accordingly.
This way we can keep the interface clean while still allowing additional logic on a per-service basis.
