
Use Case: OKD on Bare Metal

Provision a production-grade OKD (OpenShift Kubernetes Distribution) cluster from physical hardware using Harmony. This use case covers the full lifecycle: hardware discovery, bootstrap, control plane, workers, and post-install validation.

What you'll have at the end

A highly available OKD cluster with:

  • 3 control plane nodes
  • 2+ worker nodes
  • Network bonding configured on nodes and switches
  • Load balancer routing API and ingress traffic
  • DNS and DHCP services for the cluster
  • Post-install health validation

Target hardware model

This setup assumes a typical lab environment:

┌─────────────────────────────────────────────────────────┐
│  Network 192.168.x.0/24 (flat, DHCP + PXE capable)      │
│                                                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │  cp0     │  │  cp1     │  │  cp2     │  (control)    │
│  └──────────┘  └──────────┘  └──────────┘               │
│  ┌──────────┐  ┌──────────┐                             │
│  │  wk0     │  │  wk1     │  ...         (workers)      │
│  └──────────┘  └──────────┘                             │
│  ┌──────────┐                                           │
│  │ bootstrap│  (temporary, can be repurposed)           │
│  └──────────┘                                           │
│                                                         │
│  ┌──────────┐  ┌──────────┐                             │
│  │ firewall │  │  switch  │  (OPNsense + Brocade)       │
│  └──────────┘  └──────────┘                             │
└─────────────────────────────────────────────────────────┘

Required infrastructure

Harmony models this as an HAClusterTopology, which requires these capabilities:

Capability      Implementation
Router          OPNsense firewall
Load Balancer   OPNsense HAProxy
Firewall        OPNsense
DHCP Server     OPNsense
TFTP Server     OPNsense
HTTP Server     OPNsense
DNS Server      OPNsense
Node Exporter   Prometheus node_exporter on OPNsense
Switch Client   Brocade SNMP

See examples/okd_installation/ for a reference topology implementation.

The Provisioning Pipeline

Harmony orchestrates OKD installation in ordered stages:

Stage 1: Inventory Discovery (OKDSetup01InventoryScore)

Harmony boots all nodes via PXE into a CentOS Stream live environment, runs an inventory agent on each, and collects:

  • MAC addresses and NIC details
  • IP addresses assigned by DHCP
  • Hardware profile (CPU, RAM, storage)

This is the "discovery-first" approach: no pre-configuration required on nodes.
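As a sketch of what a per-node discovery record might look like once the inventory agent reports back (the field names here are illustrative, not Harmony's actual schema):

```shell
# Hypothetical sketch of a per-node inventory record emitted during
# discovery. Field names are illustrative, not Harmony's actual schema.
inventory_record() {
  # $1 = MAC address, $2 = DHCP-assigned IP, $3 = CPU count, $4 = RAM in MiB
  printf '{"mac":"%s","ip":"%s","cpus":%s,"ram_mib":%s}\n' "$1" "$2" "$3" "$4"
}

inventory_record "aa:bb:cc:dd:ee:ff" "192.168.1.50" 16 65536
# → {"mac":"aa:bb:cc:dd:ee:ff","ip":"192.168.1.50","cpus":16,"ram_mib":65536}
```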

Stage 2: Bootstrap Node (OKDSetup02BootstrapScore)

The user selects one discovered node to serve as the bootstrap node. Harmony:

  • Renders per-MAC iPXE boot configuration with the OKD 4.19 SCOS live assets and the bootstrap Ignition config
  • Reboots the bootstrap node via SSH
  • Waits for the bootstrap process to complete (API server becomes available)
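Per-MAC boot configuration is keyed off the node's MAC address discovered in Stage 1. As an illustration only, here is how a per-MAC config name is commonly derived (the "01-" prefix is the pxelinux-style convention; Harmony's actual file naming may differ):

```shell
# Hypothetical helper: derive a per-MAC boot-config name from a MAC
# address. The "01-" prefix is the pxelinux-style convention; Harmony's
# actual naming scheme may differ.
mac_to_boot_name() {
  printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr ':' '-')"
}

mac_to_boot_name "AA:BB:CC:DD:EE:FF"
# → 01-aa-bb-cc-dd-ee-ff
```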

Stage 3: Control Plane (OKDSetup03ControlPlaneScore)

With bootstrap complete, Harmony provisions the control plane nodes:

  • Renders per-MAC iPXE for each control plane node
  • Reboots each node via SSH and waits for it to join the cluster
  • Applies network bond configuration via NMState MachineConfig where relevant

Stage 4: Network Bonding (OKDSetupPersistNetworkBondScore)

Configures LACP bonds on the nodes and matching port-channels on the switch stack for high availability.
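For reference, an LACP bond of this shape expressed in NMState YAML might look like the following (interface names and options are illustrative; the exact MachineConfig wrapping that Harmony generates may differ):

```yaml
interfaces:
  - name: bond0
    type: bond
    state: up
    link-aggregation:
      mode: 802.3ad        # LACP, matching the switch port-channel
      port:
        - eno1             # example member NICs
        - eno2
    ipv4:
      enabled: true
      dhcp: true
```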

Stage 5: Worker Nodes (OKDSetup04WorkersScore)

Provisions worker nodes similarly to control plane, joining them to the cluster.

Stage 6: Sanity Check (OKDSetup05SanityCheckScore)

Validates:

  • API server is reachable
  • Ingress controller is operational
  • Cluster operators are healthy
  • SDN (software-defined networking) is functional
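The operator-health check can also be done by hand with oc. As a hedged illustration, the helper below counts operators reporting DEGRADED=True; the sample text stands in for live `oc get clusteroperators` output, and the column position should be verified against your cluster:

```shell
# Hypothetical helper, not part of Harmony: count cluster operators that
# report DEGRADED=True. Column 5 is DEGRADED in typical 4.x output of
# `oc get clusteroperators`; verify against your cluster before relying on it.
count_degraded() {
  awk 'NR > 1 && $5 == "True" { n++ } END { print n + 0 }'
}

# Sample text standing in for live `oc get clusteroperators` output:
sample='NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED
authentication   4.19.0    True        False         False
ingress          4.19.0    True        False         True'

printf '%s\n' "$sample" | count_degraded
# → 1
```

On a real cluster you would pipe `oc get clusteroperators` straight into the helper; a healthy installation prints 0.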

Stage 7: Installation Report (OKDSetup06InstallationReportScore)

Produces a machine-readable JSON report and human-readable summary of the installation.

Network notes

During discovery: Ports must be in access mode (no LACP). DHCP succeeds; iPXE loads CentOS Stream live with Kickstart and starts the inventory endpoint.

During provisioning: After SCOS is on disk and Ignition/MachineConfig can be applied, bonds are set persistently. This avoids the PXE/DHCP recovery race condition that occurs if bonding is configured too early.

PXE limitation: The generic discovery path cannot use bonded networks for PXE boot because the DHCP recovery process conflicts with bond formation.

Configuration knobs

When using OKDInstallationPipeline, configure these domains:

Parameter        Example            Description
public_domain    apps.example.com   Wildcard domain for application ingress
internal_domain  cluster.local      Internal cluster DNS domain
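As an illustration of how the public wildcard domain is used (the names below are examples only): an application route is exposed as a hostname under public_domain.

```shell
# Illustrative only: how an application route hostname is composed under
# the public wildcard domain. The route name below is an example.
PUBLIC_DOMAIN="apps.example.com"

route_host() {
  printf '%s.%s\n' "$1" "$PUBLIC_DOMAIN"
}

route_host "console-openshift-console"
# → console-openshift-console.apps.example.com
```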

Running the example

See examples/okd_installation/ for a complete reference. The topology must be configured with your infrastructure details:

# Configure the example with your hardware/network specifics
# See examples/okd_installation/src/topology.rs

cargo run -p example-okd_installation

This example requires:

  • Physical hardware configured as described above
  • OPNsense firewall with SSH access
  • Brocade switch with SNMP access
  • All nodes connected to the same Layer 2 network

Post-install

After the cluster is bootstrapped, ~/.kube/config is updated with the cluster credentials. Verify:

kubectl get nodes
kubectl get pods -n openshift-monitoring
oc get routes -n openshift-console

Next steps

  • Enable monitoring with PrometheusAlertScore or OpenshiftClusterAlertScore
  • Configure TLS certificates with CertManagerHelmScore
  • Add storage with Rook Ceph
  • Scale workers with OKDSetup04WorkersScore

Further reading