Add Phase 7 (OPNsense & Bare-Metal Network Automation) tracking current progress on OPNsense Scores, codegen, and Brocade integration. Details the UpdateHostScore requirement and HostNetworkConfigurationScore rework needed for LAGG LACP 802.3ad. Add Phase 8 (HA OKD Production Deployment) describing the target architecture with LAGG/CARP/multi-WAN/BINAT and validation checklist. Update current state section to reflect opnsense-codegen branch progress.
2.3 KiB
2.3 KiB
Phase 8: HA OKD Production Deployment
Goal
Deploy a production HAClusterTopology OKD cluster in UPI mode with full LAGG LACP 802.3ad, CARP VIP, multi-WAN, and BINAT for customer traffic — entirely automated through Harmony Scores.
Status: Not Started
Prerequisites
- Phase 7 (OPNsense & Bare-Metal) substantially complete
- Brocade branch merged and adapted
- UpdateHostScore implemented and tested
Deployment Stack
Network Layer (OPNsense)
- LAGG interfaces (802.3ad LACP) for all cluster hosts — redundant links via LaggScore
- CARP VIPs for high availability — failover IPs via VipScore
- Multi-WAN configuration — multiple uplinks with gateway groups
- BINAT for customer-facing IPs — 1:1 NAT via BinatScore
- Firewall rules per-customer with proper source/dest filtering via FirewallRuleScore
- Outbound NAT for cluster egress via OutboundNatScore
Switch Layer (Brocade)
- VLAN per network segment (management, cluster, customer, storage)
- Port-channels (LACP) matching OPNsense LAGG interfaces
- Interface speed configuration for 10G/40G links
Host Layer
- PXE boot via UpdateHostScore (MAC → DHCP → TFTP → iPXE → SCOS)
- Network bonds (LACP) via reworked HostNetworkConfigurationScore
- NMState for persistent bond configuration on OpenShift nodes
Cluster Layer
- OKD UPI installation via existing OKDSetup01-04 Scores
- HAProxy load balancer for API and ingress via LoadBalancerScore
- DNS via OKDDnsScore
- Monitoring via NodeExporterScore + Prometheus stack
New Scores Needed
- UpdateHostScore — Update MAC in DHCP, configure PXE boot, prepare host network for LAGG LACP
- MultiWanScore — Configure OPNsense gateway groups for multi-WAN failover
- CustomerBinatScore (optional) — Higher-level Score combining BinatScore + FirewallRuleScore + DnatScore per customer
Validation Checklist
- All hosts PXE boot successfully after MAC update
- LAGG/LACP active on all host links (verify via
teamdctlornmcli) - CARP VIPs fail over within expected time window
- BINAT customers reachable from external networks
- Multi-WAN failover tested (pull one uplink, verify traffic shifts)
- Full OKD installation completes end-to-end
- Cluster API accessible via CARP VIP
- Customer workloads routable via BINAT