Epic: Fully Orchestrate OKD Installation with Automated Node Discovery and Network Provisioning #113
Labels
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: NationTech/harmony#113
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
This epic covers the work required to create a fully automated, end-to-end orchestration flow for deploying an OKD 4.19 cluster using Harmony. The primary goal is to eliminate manual, error-prone steps such as hardware inventory collection and PXE configuration. The process will leverage a two-phase approach: an initial discovery phase for unknown hardware and a provisioning phase that configures the node and network for its specific role in the cluster.
This implementation will solve the critical challenge of PXE booting nodes on switch ports that are destined to be part of an LACP bond.
Background & Problem Statement
The current process for setting up a new OKD cluster involves several manual and time-consuming tasks:
This epic aims to solve these problems by making the entire process a declarative, orchestrated workflow within Harmony.
Proposed Architecture
The solution is a two-phase process orchestrated by a new
OKDInstallationScore
.Phase 1: Automated Node Discovery (Access Mode)
default.ipxe
configuration from the PXE server.inventory.ks
) file.%post
script downloads and starts a purpose-built, statically-compiled Rust binary: theharmony-inventory-agent
. This agent collects detailed hardware information and exposes it via a JSON HTTP endpoint.Phase 2: Role-Based OKD Provisioning (LACP Mode)
bootstrap
,master-0
,worker-0
) to the discovered node in the Harmony inventory.{MAC}.ipxe
) that points to the SCOS (Stream CoreOS) kernel and an Ignition file tailored for the assigned role.This architecture treats the network configuration as code and ensures that the host and network are always in a compatible state at every stage of the boot process.
To-Do List
1. Initial Setup & Prerequisites
boot.iso
for the inventory environment.vmlinuz
andinitrd.img
from the ISO and place them on the Caddy HTTP server under/os/centos-stream-9/
.rhcos-live-kernel-x86_64
,rhcos-live-initramfs.x86_64.img
, etc.) and place them on the Caddy server under/os/scos-4.19/
.2. Discovery Phase Implementation
harmony-inventory-agent
.axum
oractix-web
) to the agent to serve the collected inventory as a JSON object on/
.Cargo.toml
for a staticx86_64-unknown-linux-musl
build.default.ipxe.tpl
).inventory.ks.tpl
) that enables SSH and downloads/runs theharmony-inventory-agent
binary.3. Network Automation
Switch
trait in Harmony to abstract switch automation tasks (get_port_status
,set_ports_access_mode
,set_ports_lacp_mode
). (I think it already exists somewhere in the domain data types)Switch
trait for the specific switch hardware in use (e.g.,OPNSenseSwitch
,ArubaSwitch
,CiscoSwitch
).SwitchAutomationScore
in Harmony that uses this trait to perform the required port mode changes.4. Provisioning Phase Implementation
scos-boot.ipxe.tpl
).bond=
andrd.bond=
kernel arguments for this template based on the inventoried NICs.bootstrap
,master
,worker
).NMState
orMachineConfig
file definitions to the Ignition templates to configure the network bond persistently. (see official docs)5. Harmony Core Orchestration
OKDInstallationScore
hierarchy:OKDSetup01Inventory
: Manages the discovery phase. Scans the network for new agents.OKDSetup02Bootstrap
: Manages the transition and installation of the bootstrap node.OKDSetup03ControlPlane
: Manages the installation of the control plane nodes.OKDSetup04Workers
: Manages the installation of the worker nodes.SwitchAutomationScore
into the mainOKDInstallationScore
to trigger network changes at the correct time.