All checks were successful
Run Check Script / check (pull_request) Successful in 2m25s
The IoT vocabulary was anchoring the codebase to one customer's
domain. The reconciler pattern is generic — operator in k8s, NATS
KV as desired-state bus, agents reconciling podman / OKD / KVM /
anything that can register. "Fleet" captures that neutrally; IoT
stays acknowledged in docs as the first customer use case.
Done now, while nothing is deployed. After a partner fleet lands,
changing the CRD group alone is a multi-quarter migration.
Scope (nothing left over):
Paths + crates
- iot/ → fleet/
- iot/iot-operator-v0 → fleet/harmony-fleet-operator
- iot/iot-agent-v0 → fleet/harmony-fleet-agent
- harmony/src/modules/iot → harmony/src/modules/fleet
- ROADMAP/iot_platform → ROADMAP/fleet_platform
- examples/iot_{vm_setup, load_test, nats_install} → examples/fleet_*
- -v0 suffix dropped on the operator + agent crates (semver in
Cargo.toml already tracks version)
Rust identifiers
- enum IotScore (podman score payload) → ReconcileScore
- struct IotDeviceSetupScore/Config → FleetDeviceSetupScore/Config
- InterpretName::IotDeviceSetup → InterpretName::FleetDeviceSetup
- HarmonyIotPool → HarmonyFleetPool (libvirt pool)
- HARMONY_IOT_POOL_NAME (default "harmony-iot") → HARMONY_FLEET_POOL_NAME ("harmony-fleet")
- IotSshKeypair → FleetSshKeypair
- ensure_iot_ssh_keypair / ensure_harmony_iot_pool /
check_iot_smoke_preflight_for_arch → fleet-prefixed variants
Wire / config surfaces
- CRD group `iot.nationtech.io` → `fleet.nationtech.io`
- Finalizer `iot.nationtech.io/finalizer` → `fleet.nationtech.io/finalizer`
- Shortnames iotdep/iotdevice → fleetdep/fleetdev
- Env var IOT_AGENT_CONFIG → FLEET_AGENT_CONFIG
- Env var IOT_VM_ADMIN_PASSWORD → FLEET_VM_ADMIN_PASSWORD
- Binary /usr/local/bin/iot-agent → /usr/local/bin/fleet-agent
- Systemd user `iot-agent` → `fleet-agent`
- VM admin user `iot-admin` → `fleet-admin`
Defaults
- Namespaces iot-system/iot-demo/iot-load → fleet-system/fleet-demo/fleet-load
- Helm release iot-nats → fleet-nats
- Helm release iot-operator-v0 → harmony-fleet-operator
- Container image localhost/iot-operator-v0:latest →
localhost/harmony-fleet-operator:latest
- On-disk cache $HARMONY_DATA_DIR/iot/ → $HARMONY_DATA_DIR/fleet/
(cloud-images, ssh keypairs, libvirt pool)
What stayed
- harmony-reconciler-contracts — already neutrally named
- Wire types (DeviceInfo, DeploymentState, HeartbeatPayload,
DeploymentName) — already neutral
- KV buckets (device-info, device-state, device-heartbeat,
desired-state) — already neutral
- CRD kind names (Deployment, Device) — already neutral
- NatsBasicScore / NatsHelmChartScore / HelmChart / etc. —
framework-scope, unchanged
Verification
- cargo check --workspace --all-targets: clean
- All harmony lib tests (114), fleet-operator (6), fleet-agent
(7), harmony-reconciler-contracts (13): green
- End-to-end load-test (20 devices / 3 CRs / 20s under
fleet/scripts/load-test.sh): PASS. Image built as
localhost/harmony-fleet-operator:latest, chart installed as
release harmony-fleet-operator in namespace fleet-system,
all CR aggregates correct.
Zero stragglers: grep across the tree for \biot\b / IOT_ /
\bIot[A-Z] returns empty (excluding docs explicitly talking about
IoT as the first customer's domain).
117 lines
6.8 KiB
Markdown
117 lines
6.8 KiB
Markdown
# Phase 12: Code Review Items (April 2026)
|
|
|
|
Items identified during the `feat/opnsense-codegen` PR review that require further design or cross-cutting work.
|
|
|
|
## Completed in this PR
|
|
|
|
- **1.1** Remove panic in `haproxy_service_to_harmony` — returns `None` with `warn!()` instead of panicking on invalid bind address
|
|
- **1.2** Use `MacAddress` type from `harmony_types` in KVM module — replaced `String` MAC fields in `VmInterface`, `NetworkRef`, `DhcpHost`, and `set_interface_link`
|
|
- **1.3** Compare both firewalls in `FirewallPairTopology::list_static_mappings` — warns on mismatch between primary and backup
|
|
- **1.4** Remove no-op default for `LoadBalancer::ensure_wan_access` — now a required trait method
|
|
- **2.1** Remove `wan_firewall_ports` from `LoadBalancerScore` — callers handle WAN access separately
|
|
- **2.2** Add timeout to OKD bootstrap wait — 90min default, configurable via `HARMONY_OKD_BOOTSTRAP_TIMEOUT_MINUTES`
|
|
|
|
## Tasks (deferred)
|
|
|
|
### 12.1 Phased topology: LinuxHostTopology → KvmHostTopology
|
|
|
|
**Priority**: HIGH
|
|
**Status**: Not started
|
|
**Related**: Phase 6 (KVM E2E tests)
|
|
|
|
The `examples/opnsense_vm_integration/setup-libvirt.sh` shell script should be a Score using a phased topology approach. A `LinuxHostTopology` would be "promoted" to a `KvmHostTopology` after KVM packages are installed and libvirtd is running.
|
|
|
|
Key design challenges:
|
|
- Type-safe phase transitions (how does a topology gain new capabilities at runtime?)
|
|
- Package installation as a Score (distro-agnostic or trait-based)
|
|
- Service management (systemd enable/start) as a Score primitive
|
|
|
|
This is a major architectural feature that enables the full bare-metal-to-VM pipeline without shell scripts.
|
|
|
|
### 12.2 KvmHost validated type with compile-time macro
|
|
|
|
**Priority**: MEDIUM
|
|
**Status**: Not started
|
|
**Related**: 12.1
|
|
|
|
`KvmConnectionUri::RemoteSsh { host: String, username: String }` should become a validated `KvmHost` type with:
|
|
- A `kvm_host!("root@hypervisor1")` macro for compile-time validation
|
|
- Proper SSH URI parsing and validation
|
|
- Integration with the phased topology (12.1)
|
|
|
|
### 12.3 Unified directory module
|
|
|
|
**Priority**: LOW
|
|
**Status**: Not started
|
|
**Related**: Phase 9 (SSO + Config Hardening)
|
|
|
|
Currently three different directory patterns exist:
|
|
- `HARMONY_DATA_DIR` in `harmony/src/domain/config/mod.rs` (lazy_static, `BaseDirs`)
|
|
- `harmony_config` uses `ProjectDirs::from("io", "NationTech", "Harmony")`
|
|
- `harmony_secret` uses `BaseDirs::data_dir().join("harmony")`
|
|
- `openbao/setup.rs` has its own `keys_dir()` function
|
|
|
|
Unify into a single `harmony_dirs` module providing: `data_dir()`, `cache_dir()`, `secrets_dir()`, `keys_dir(namespace)`.
|
|
|
|
### 12.4 OpenBao unseal key storage — bootstrap secret management
|
|
|
|
**Priority**: MEDIUM
|
|
**Status**: Research needed
|
|
**Related**: Phase 9 (SSO + Config Hardening), task 9.8 (auto-unseal)
|
|
|
|
The chicken-and-egg problem: OpenBao needs to be initialized before it can be used as a secret store, but its unseal keys need to be stored somewhere. Current approach stores them as a local JSON file with 0600 permissions.
|
|
|
|
Industry solutions to evaluate:
|
|
- Upstream OpenBao/Vault storing downstream seal keys (transit auto-unseal)
|
|
- HSM-backed auto-unseal (cloud KMS or on-prem HSM)
|
|
- TPM-based local encryption
|
|
- Shamir-split recovery with multiple administrators
|
|
- TOTP-based vault (mentioned in review)
|
|
|
|
No perfect solution exists. This requires threat modeling specific to the decentralized micro-datacenter use case.
|
|
|
|
### 12.5 Use `vaultrs` crate for type-safe OpenBao provisioning
|
|
|
|
**Priority**: MEDIUM
|
|
**Status**: Not started
|
|
**Related**: Phase 9
|
|
|
|
Replace `kubectl exec bao ...` shell commands in `openbao/setup.rs` with typed `vaultrs` API calls. The `vaultrs` 0.7.4 crate (already a dependency in `harmony_secret`) provides full coverage:
|
|
|
|
| Current shell command | vaultrs equivalent |
|
|
|---|---|
|
|
| `bao operator init` | `vaultrs::sys::start_initialization()` |
|
|
| `bao operator unseal` | `vaultrs::sys::unseal()` |
|
|
| `bao secrets enable kv-v2` | `vaultrs::sys::mount::enable()` |
|
|
| `bao auth enable userpass` | `vaultrs::sys::auth::enable()` |
|
|
| `bao policy write` | `vaultrs::sys::policy::set()` |
|
|
| `bao write auth/userpass/users/...` | `vaultrs::auth::userpass::user::set()` |
|
|
| `bao auth enable jwt` | `vaultrs::sys::auth::enable()` |
|
|
| JWT config + role | `vaultrs::auth::oidc::config::set()` + `role::set()` |
|
|
|
|
**Prerequisite**: Requires port-forward or ingress to OpenBao (currently uses `kubectl exec` into the pod). Consider adding a `K8sPortForward` utility to `harmony-k8s`.
|
|
|
|
### 12.6 Topology proliferation — opinionated topologies leaking into narrow use cases
|
|
|
|
**Priority**: MEDIUM
|
|
**Status**: Not started
|
|
**Related**: 12.1 (phased topology), `feat/install-reconcile-operator-score`
|
|
|
|
`K8sAnywhereTopology` and `HAClusterTopology` have accumulated opinions — cert-manager install, tenant manager setup, helm probes, TLS passthrough, SSO wiring — that make them unfit for narrow, ad-hoc Score execution. Calling `ensure_ready()` on `K8sAnywhereTopology` to apply a single CRD installs a full product stack as a side effect; that's the opposite of what "make me ready" should mean.
|
|
|
|
Concrete example: `fleet/harmony-fleet-operator/src/install.rs` needed a topology that satisfies `K8sclient` for a single `K8sResourceScore::<CustomResourceDefinition>` apply. `K8sAnywhereTopology` was wrong (too heavy); `HAClusterTopology` was wrong (bare-metal). Work-around: a 30-line inline `InstallTopology` that wraps a pre-built `K8sClient` and has a noop `ensure_ready`. That file flags the architectural smell in its doc comment and points back to this entry.
|
|
|
|
If every narrow Score ends up vendoring its own ad-hoc topology, we get exactly the proliferation this entry is meant to prevent.
|
|
|
|
**Design direction (to be refined, not prescribed):**
|
|
|
|
- A **minimal ad-hoc topology** in harmony — `K8sBareTopology` or similar — that carries a `K8sClient` and implements `K8sclient` + noop `ensure_ready`. One screen of code. Consumed by any Score that just needs to apply a typed resource against an existing cluster.
|
|
- Existing opinionated topologies (`K8sAnywhereTopology`) stay, but grow a clear doctrine: `ensure_ready` is for *their* product setup, callers who don't need that product use the bare topology.
|
|
- Longer-term: unbundle the product-setup logic from `K8sAnywhereTopology::ensure_ready` into discrete Scores the product compositions explicitly run — so the distinction between "I'm installing a cluster" and "I'm using a cluster" is a composition choice, not a topology choice.
|
|
|
|
**What "good" looks like:**
|
|
|
|
- Adding a new ad-hoc Score against k8s doesn't require inventing a new topology.
|
|
- `K8sAnywhereTopology` stops being the default reach and starts being a deliberate product choice.
|
|
- Test: can we delete the inline `InstallTopology` in `fleet/harmony-fleet-operator/src/install.rs` by replacing it with a one-liner `K8sBareTopology::from_env()`? That's the smoke test for "we fixed the proliferation."
|