Add Phase 7 (OPNsense & Bare-Metal Network Automation) tracking current progress on OPNsense Scores, codegen, and Brocade integration. Details the UpdateHostScore requirement and HostNetworkConfigurationScore rework needed for LAGG LACP 802.3ad. Add Phase 8 (HA OKD Production Deployment) describing the target architecture with LAGG/CARP/multi-WAN/BINAT and validation checklist. Update current state section to reflect opnsense-codegen branch progress.
57 lines
2.3 KiB
Markdown
57 lines
2.3 KiB
Markdown
# Phase 8: HA OKD Production Deployment
|
|
|
|
## Goal
|
|
|
|
Deploy a production HAClusterTopology OKD cluster in UPI mode with full LAGG LACP 802.3ad, CARP VIP, multi-WAN, and BINAT for customer traffic — entirely automated through Harmony Scores.
|
|
|
|
## Status: Not Started
|
|
|
|
## Prerequisites
|
|
|
|
- Phase 7 (OPNsense & Bare-Metal) substantially complete
|
|
- Brocade branch merged and adapted
|
|
- UpdateHostScore implemented and tested
|
|
|
|
## Deployment Stack
|
|
|
|
### Network Layer (OPNsense)
|
|
- **LAGG interfaces** (802.3ad LACP) for all cluster hosts — redundant links via LaggScore
|
|
- **CARP VIPs** for high availability — failover IPs via VipScore
|
|
- **Multi-WAN** configuration — multiple uplinks with gateway groups
|
|
- **BINAT** for customer-facing IPs — 1:1 NAT via BinatScore
|
|
- **Firewall rules** per-customer with proper source/dest filtering via FirewallRuleScore
|
|
- **Outbound NAT** for cluster egress via OutboundNatScore
|
|
|
|
### Switch Layer (Brocade)
|
|
- **VLAN** per network segment (management, cluster, customer, storage)
|
|
- **Port-channels** (LACP) matching OPNsense LAGG interfaces
|
|
- **Interface speed** configuration for 10G/40G links
|
|
|
|
### Host Layer
|
|
- **PXE boot** via UpdateHostScore (MAC → DHCP → TFTP → iPXE → SCOS)
|
|
- **Network bonds** (LACP) via reworked HostNetworkConfigurationScore
|
|
- **NMState** for persistent bond configuration on OpenShift nodes
|
|
|
|
### Cluster Layer
|
|
- OKD UPI installation via existing OKDSetup01-04 Scores
|
|
- HAProxy load balancer for API and ingress via LoadBalancerScore
|
|
- DNS via OKDDnsScore
|
|
- Monitoring via NodeExporterScore + Prometheus stack
|
|
|
|
## New Scores Needed
|
|
|
|
1. **UpdateHostScore** — Update MAC in DHCP, configure PXE boot, prepare host network for LAGG LACP
|
|
2. **MultiWanScore** — Configure OPNsense gateway groups for multi-WAN failover
|
|
3. **CustomerBinatScore** (optional) — Higher-level Score combining BinatScore + FirewallRuleScore + DnatScore per customer
|
|
|
|
## Validation Checklist
|
|
|
|
- [ ] All hosts PXE boot successfully after MAC update
|
|
- [ ] LAGG/LACP active on all host links (verify via `teamdctl` or `nmcli`)
|
|
- [ ] CARP VIPs fail over within expected time window
|
|
- [ ] BINAT customers reachable from external networks
|
|
- [ ] Multi-WAN failover tested (pull one uplink, verify traffic shifts)
|
|
- [ ] Full OKD installation completes end-to-end
|
|
- [ ] Cluster API accessible via CARP VIP
|
|
- [ ] Customer workloads routable via BINAT
|