Some checks failed
Run Check Script / check (pull_request) Failing after 12s
160 lines
6.7 KiB
Markdown
160 lines
6.7 KiB
Markdown
# Use Case: OKD on Bare Metal
|
|
|
|
Provision a production-grade OKD (OpenShift Kubernetes Distribution) cluster from physical hardware using Harmony. This use case covers the full lifecycle: hardware discovery, bootstrap, control plane, workers, and post-install validation.
|
|
|
|
## What you'll have at the end
|
|
|
|
A highly-available OKD cluster with:
|
|
- 3 control plane nodes
|
|
- 2+ worker nodes
|
|
- Network bonding configured on nodes and switches
|
|
- Load balancer routing API and ingress traffic
|
|
- DNS and DHCP services for the cluster
|
|
- Post-install health validation
|
|
|
|
## Target hardware model
|
|
|
|
This setup assumes a typical lab environment:
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────┐
|
|
│ Network 192.168.x.0/24 (flat, DHCP + PXE capable) │
|
|
│ │
|
|
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
|
│ │ cp0 │ │ cp1 │ │ cp2 │ (control) │
|
|
│ └──────────┘ └──────────┘ └──────────┘ │
|
|
│ ┌──────────┐ ┌──────────┐ │
|
|
│ │ wk0 │ │ wk1 │ ... (workers) │
|
|
│ └──────────┘ └──────────┘ │
|
|
│ ┌──────────┐ │
|
|
│ │ bootstrap│ (temporary, can be repurposed) │
|
|
│ └──────────┘ │
|
|
│ │
|
|
│ ┌──────────┐ ┌──────────┐ │
|
|
│ │ firewall │ │ switch │ (OPNsense + Brocade) │
|
|
│ └──────────┘ └──────────┘ │
|
|
└─────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
## Required infrastructure
|
|
|
|
Harmony models this as an `HAClusterTopology`, which requires these capabilities:
|
|
|
|
| Capability | Implementation |
|
|
|------------|---------------|
|
|
| **Router** | OPNsense firewall |
|
|
| **Load Balancer** | OPNsense HAProxy |
|
|
| **Firewall** | OPNsense |
|
|
| **DHCP Server** | OPNsense |
|
|
| **TFTP Server** | OPNsense |
|
|
| **HTTP Server** | OPNsense |
|
|
| **DNS Server** | OPNsense |
|
|
| **Node Exporter** | Prometheus node_exporter on OPNsense |
|
|
| **Switch Client** | Brocade SNMP |
|
|
|
|
See `examples/okd_installation/` for a reference topology implementation.
|
|
|
|
## The Provisioning Pipeline
|
|
|
|
Harmony orchestrates OKD installation in ordered stages:
|
|
|
|
### Stage 1: Inventory Discovery (`OKDSetup01InventoryScore`)
|
|
|
|
Harmony boots all nodes via PXE into a CentOS Stream live environment, runs an inventory agent on each, and collects:
|
|
- MAC addresses and NIC details
|
|
- IP addresses assigned by DHCP
|
|
- Hardware profile (CPU, RAM, storage)
|
|
|
|
This is the "discovery-first" approach: no pre-configuration required on nodes.
|
|
|
|
### Stage 2: Bootstrap Node (`OKDSetup02BootstrapScore`)
|
|
|
|
The user selects one discovered node to serve as the bootstrap node. Harmony:
|
|
- Renders per-MAC iPXE boot configuration with OKD 4.19 SCOS live assets + ignition
|
|
- Reboots the bootstrap node via SSH
|
|
- Waits for the bootstrap process to complete (API server becomes available)
|
|
|
|
### Stage 3: Control Plane (`OKDSetup03ControlPlaneScore`)
|
|
|
|
With bootstrap complete, Harmony provisions the control plane nodes:
|
|
- Renders per-MAC iPXE for each control plane node
|
|
- Reboots via SSH and waits for node to join the cluster
|
|
- Applies network bond configuration via NMState MachineConfig where relevant
|
|
|
|
### Stage 4: Network Bonding (`OKDSetupPersistNetworkBondScore`)
|
|
|
|
Configures LACP bonds on nodes and corresponding port-channels on the switch stack for high-availability.
|
|
|
|
### Stage 5: Worker Nodes (`OKDSetup04WorkersScore`)
|
|
|
|
Provisions worker nodes similarly to control plane, joining them to the cluster.
|
|
|
|
### Stage 6: Sanity Check (`OKDSetup05SanityCheckScore`)
|
|
|
|
Validates:
|
|
- API server is reachable
|
|
- Ingress controller is operational
|
|
- Cluster operators are healthy
|
|
- SDN (software-defined networking) is functional
|
|
|
|
### Stage 7: Installation Report (`OKDSetup06InstallationReportScore`)
|
|
|
|
Produces a machine-readable JSON report and human-readable summary of the installation.
|
|
|
|
## Network notes
|
|
|
|
**During discovery:** Ports must be in access mode (no LACP). DHCP succeeds; iPXE loads CentOS Stream live with Kickstart and starts the inventory endpoint.
|
|
|
|
**During provisioning:** After SCOS is on disk and Ignition/MachineConfig can be applied, bonds are set persistently. This avoids the PXE/DHCP recovery race condition that occurs if bonding is configured too early.
|
|
|
|
**PXE limitation:** The generic discovery path cannot use bonded networks for PXE boot because the DHCP recovery process conflicts with bond formation.
|
|
|
|
## Configuration knobs
|
|
|
|
When using `OKDInstallationPipeline`, configure these domains:
|
|
|
|
| Parameter | Example | Description |
|
|
|-----------|---------|-------------|
|
|
| `public_domain` | `apps.example.com` | Wildcard domain for application ingress |
|
|
| `internal_domain` | `cluster.local` | Internal cluster DNS domain |
|
|
|
|
## Running the example
|
|
|
|
See `examples/okd_installation/` for a complete reference. The topology must be configured with your infrastructure details:
|
|
|
|
```bash
|
|
# Configure the example with your hardware/network specifics
|
|
# See examples/okd_installation/src/topology.rs
|
|
|
|
cargo run -p example-okd_installation
|
|
```
|
|
|
|
This example requires:
|
|
- Physical hardware configured as described above
|
|
- OPNsense firewall with SSH access
|
|
- Brocade switch with SNMP access
|
|
- All nodes connected to the same Layer 2 network
|
|
|
|
## Post-install
|
|
|
|
After the cluster is bootstrapped, `~/.kube/config` is updated with the cluster credentials. Verify:
|
|
|
|
```bash
|
|
kubectl get nodes
|
|
kubectl get pods -n openshift-monitoring
|
|
oc get routes -n openshift-console
|
|
```
|
|
|
|
## Next steps
|
|
|
|
- Enable monitoring with `PrometheusAlertScore` or `OpenshiftClusterAlertScore`
|
|
- Configure TLS certificates with `CertManagerHelmScore`
|
|
- Add storage with Rook Ceph
|
|
- Scale workers with `OKDSetup04WorkersScore`
|
|
|
|
## Further reading
|
|
|
|
- [OKD Installation Module](../../harmony/src/modules/okd/installation.rs) — source of truth for pipeline stages
|
|
- [HAClusterTopology](../../harmony/src/domain/topology/ha_cluster.rs) — infrastructure capability model
|
|
- [Scores Catalog](../catalogs/scores.md) — all available Scores including OKD-specific ones
|