Caller must pass `UserPassCredentials` to `FleetNatsScore::user_pass` — no more `e2e-admin`/`e2e-device` defaults shipped in the library. The deploy binary reads `HARMONY_FLEET_*` env vars (default namespace `harmony-fleet-system`) and fails fast when NATS creds aren't set. Also: `style/dist/` gitignored, `manual_mint/mint.py` moved next to `nats/callout/` with README + secrets gitignore (the real RSA key that was sitting untracked has been removed), `architecture_review.md` moved to `docs/adr/drafts/024-`, three low-value ROADMAP docs deleted. Updates pre-merge checklist (§1.6, §1.8, §3.1, §5).
887 lines
40 KiB
Markdown
# Fleet platform — architecture review

Working document for the architectural redesign of the fleet platform
before v0.1 ships to production. Started 2026-05-07.

This is a research + design document, not a plan to execute. The
output of this work is an ADR (or set of ADRs) that locks the new
shape; the v0.2 roadmap will reference whichever option we pick.

## Why now

- Three days from production. No customers depend on the API yet
  → API/UX/DX is still cheap to change. After ship, every breaking
  change costs us a week of customer-coordination overhead.
- The `harmony/modules/fleet/` placement is wrong — already flagged
  in code review. The reasons it ended up there are subtle
  (cross-module imports of `K8sAnywhereTopology`, `HelmChartScore`,
  `K8sResourceScore`, `harmony_secret`, `Topology` capability
  traits). Those need to be written down before the file move,
  not after.
- The plumbing — NATS + Zitadel + auth callout + operator + agent
  — is sound. Highly secure, scalable by design, low resource
  footprint. The redesign is about **moving code** and **better
  data structures**, not rebuilding mechanisms.
- The frame from JG's *Pour l'amour des compilateurs* talk:
  cardinality-matched types, "make impossible states impossible",
  expressive types as the deterministic feedback loop that scales
  with LLM-era code generation throughput. Apply that frame here.

## Working plan

1. **Inventory.** Map every public type, trait, score, module, and
   crate that participates in the fleet domain. Markdown-bullet
   shape; no diagrams.
2. **Read the room.** Pull principles from JG's talk, its
   references, and harmony's existing ADRs (002 hexagonal, 003
   infrastructure abstractions, 015 higher-order topologies, 016
   harmony agent + global mesh, 017 NATS interconnection, 018
   template hydration). Note where the existing fleet design
   already follows them and where it doesn't.
3. **Identify the design problems.** Not bugs — *shape* problems.
   Cardinality mismatches, leaky boundaries, "is this resolved
   yet" branches, location/dependency loops.
4. **Sketch alternatives.** Three to five. At least one
   conventional cleanup, at least one out-of-the-box that
   reframes the domain. Compare on the same axes (cardinality,
   placement, ergonomics, extensibility).
5. **Pick (or recommend) one.** Land as ADR.

This document covers steps 1–4. The pick happens in conversation
with JG before the ADR.

---
## §1 — Current state inventory

### §1.1 — Where the code lives

The fleet domain spans **three concerns** that today live in
**three locations**:

- **Framework-side scoring** (what runs on the operator's
  workstation when they `cargo run` the install) → lives in
  `harmony/src/modules/fleet/`. This is the wrong home; it's the
  thing this review is about moving.
  - `mod.rs` — re-exports
  - `assets.rs` — Ubuntu/Debian cloud image fetchers, libvirt SSH
    keypair management
  - `libvirt_pool.rs` — libvirt storage pool bring-up
  - `setup_score.rs` (1053 LOC, the monster) — `FleetDeviceSetupScore`,
    `FleetDeviceSetupConfig`, `FleetDeviceAuth`
    (TomlShared|ZitadelJwt|ZitadelEnroll), `AdminAuth`, `HostsEntry`,
    `merge_hosts_file`
  - `vm_score.rs` — `ProvisionVmScore` (libvirt VM bring-up)
  - `preflight.rs` — `check_fleet_smoke_preflight*` (host system
    checks)
  - `server.rs` — `FleetServerScore`, `FleetServerInterpret`
    (composed bring-up of Zitadel + NATS + callout + operator)
  - `operator/`
    - `mod.rs`, `score.rs` — `FleetOperatorScore`,
      `FleetOperatorInterpret` (operator helm install)
    - `chart.rs` (453 LOC) — chart rendering (`ChartOptions`,
      `OperatorCredentials`, `build_chart`, `operator_secret`,
      `build_operator_deployment`, `build_cluster_role`)
    - `crd.rs` — `Deployment` CRD type (`DeploymentSpec`,
      `Rollout`, `RolloutStrategy`, `DeploymentStatus`,
      `DeploymentAggregate`, `AggregateLastError`); `Device` CRD type
      (`DeviceSpec`)
- **Cross-boundary wire types** (the "contract" agent and operator
  both have to agree on) → lives in `harmony-reconciler-contracts/`.
  - `fleet.rs` — `DeviceInfo`, `DeploymentState`, `HeartbeatPayload`,
    `DeploymentName`, `InvalidDeploymentName`
  - `kv.rs` — bucket name constants + key-builder functions
  - `status.rs` — `Phase`, `InventorySnapshot`
  - re-exports `harmony_types::id::Id`
- **Runtime binaries** (what runs in the cluster + on devices) →
  lives in `fleet/`.
  - `harmony-fleet-operator/` — the operator pod. `controller.rs`,
    `device_reconciler.rs`, `fleet_aggregator.rs` (833 LOC),
    `install.rs`, `main.rs`. Pulls `Deployment`/`Device` CRDs from
    `harmony::modules::fleet::operator::crd` (cross-crate import
    that should give us pause).
  - `harmony-fleet-agent/` — the on-device daemon. `config.rs`,
    `reconciler.rs`, `fleet_publisher.rs`, `main.rs`.
  - `harmony-fleet-auth/` — JWT-bearer / NATS-credentials helpers
    used by both the operator AND the agent. `config.rs`,
    `credentials.rs` (553 LOC). Sits between contracts and the
    runtime crates.
### §1.2 — Public types, sorted by domain meaning (not location)

#### Identity & devices

- `harmony_types::id::Id` — opaque, sortable, collision-safe
  identifier. Used as device id, deployment id, …
- `DeploymentName` (newtype with validation, `harmony-reconciler-contracts`)
- `DeviceInfo` — heartbeat payload that materializes into a
  `Device` CR
- `DeviceSpec` — kube CRD, holds an optional `InventorySnapshot`
- `InventorySnapshot` — hardware/OS facts published once at
  registration

#### Deployment desired-state

- `DeploymentSpec` — kube CRD: `target_selector: LabelSelector`,
  `score: ReconcileScore`, `rollout: Rollout`
- `ReconcileScore` (in `harmony::modules::podman`, re-exported
  from `harmony::modules::fleet::operator::crd`) — externally-tagged
  enum, today only `PodmanV0(PodmanV0Score)`
- `PodmanV0Score`, `PodmanService`, `EnvVar`, `VolumeMount`,
  `RestartPolicy`
- `Rollout`, `RolloutStrategy::Immediate`

#### Deployment observed-state

- `DeploymentState` — what the agent publishes per device per
  deployment after reconcile
- `DeploymentStatus` (kube CRD) — operator-side rollup of all
  device states for one Deployment CR
- `DeploymentAggregate` — counts (matched, succeeded, failed,
  pending) + `last_error: Option<AggregateLastError>`
- `Phase` — `Pending | Running | Failed`

#### Authentication / identity provider

- `FleetDeviceAuth` — sum type with `TomlShared | ZitadelJwt |
  ZitadelEnroll`. **The `ZitadelEnroll` arm carries
  unresolved-state — admin credentials that must be turned into a
  device JSON key at execute time. Mixes resolved and unresolved
  states in one type, which is the cardinality bug we keep hitting.**
- `AdminAuth` — `Sso { client_id } | Token(String)` (used inside
  `ZitadelEnroll`)
- `CredentialsSection` — TOML-on-disk shape (in
  `harmony-fleet-auth`, parallel to `FleetDeviceAuth`)
- `CredentialSource` — runtime credential factory
- `NatsCredential` — what async-nats actually consumes
- `MachineKeyFile`, `CachedToken`

#### Setup procedures (Scores)

- `FleetDeviceSetupScore` (`FleetDeviceSetupConfig`) — the workhorse:
  installs podman, drops the agent binary, drops the credentials
  TOML, drops the keyfile, brings up the systemd unit.
- `FleetServerScore` — orchestrates Zitadel install + identity
  setup + NATS install + callout install + operator install. Wraps
  five other scores.
- `FleetOperatorScore` — operator helm chart render + install + the
  credentials Secret apply.
- `ProvisionVmScore` — libvirt VM bring-up. Used by VM rehearsals.
- (External, not in fleet/) `ZitadelScore`, `ZitadelSetupScore`,
  `NatsK8sScore`, `NatsAuthCalloutScore` — all consumed by the
  composed install.

#### Operator-internal types

- `FleetState`, `SharedFleetState`, `DeploymentKey`, `DevicePair`,
  `CachedDeployment`, `Context`, `Error` (the controller's local
  error type), `selector_matches`, `apply_state`, `drop_state`,
  `compute_aggregate`

#### Agent-internal types

- `AgentConfig`, `AgentSection`, `NatsSection`, `CredentialsSection`
- `FleetPublisher`, `Reconciler`

#### Fleet plumbing for development

- `FleetSshKeypair`, the cloud-image consts, `HarmonyFleetPool`,
  `merge_hosts_file`, `HostsEntry`, `check_fleet_smoke_preflight*`

#### NATS subjects + KV buckets (the wire seam)

- `BUCKET_DESIRED_STATE` = `"desired-state"`
- `BUCKET_DEVICE_INFO` = `"device-info"`
- `BUCKET_DEVICE_STATE` = `"device-state"`
- `BUCKET_DEVICE_HEARTBEAT` = `"device-heartbeat"`
- Key builders: `desired_state_key(device_id, deployment_name)`,
  `device_info_key(device_id)`, `device_state_key(device_id,
  deployment_name)`, `device_heartbeat_key(device_id)`
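A minimal sketch of how the key builders listed above could look. The bucket constants and function names come from the inventory; the dot-separated `<device_id>.<deployment_name>` key layout and the plain `&str` parameters are assumptions for illustration, not the actual contents of `kv.rs`.

```rust
// Bucket names as listed in the inventory.
pub const BUCKET_DESIRED_STATE: &str = "desired-state";
pub const BUCKET_DEVICE_INFO: &str = "device-info";
pub const BUCKET_DEVICE_STATE: &str = "device-state";
pub const BUCKET_DEVICE_HEARTBEAT: &str = "device-heartbeat";

// ASSUMPTION: per-device-per-deployment keys are "<device_id>.<deployment_name>".
// The real key layout may differ.
pub fn desired_state_key(device_id: &str, deployment_name: &str) -> String {
    format!("{device_id}.{deployment_name}")
}

pub fn device_state_key(device_id: &str, deployment_name: &str) -> String {
    format!("{device_id}.{deployment_name}")
}

// ASSUMPTION: per-device keys are just the device id.
pub fn device_info_key(device_id: &str) -> String {
    device_id.to_owned()
}

pub fn device_heartbeat_key(device_id: &str) -> String {
    device_id.to_owned()
}
```

Keeping the builders next to the bucket constants means the agent and the operator cannot drift apart on key layout, which is the point of calling this the wire seam.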
### §1.3 — Concept clusters

When you squint at the inventory, the domain falls into **five
clusters**:

1. **Identity** — who is this device, who is this deployment, who
   is the operator, what auth do they have.
2. **Desired state** — what should be running where.
3. **Observed state** — what is actually running where.
4. **Setup** — bringing all this into existence on a fresh
   cluster + fresh device.
5. **Plumbing** — the NATS/kube/Zitadel mechanisms that make 1–4
   work.

The current code does not cleanly separate these. Examples:

- `setup_score.rs` mixes **Setup** (drop binary, run systemd) with
  **Identity** (`FleetDeviceAuth`). 1053 LOC.
- `FleetDeviceAuth` mixes resolved-Identity (`ZitadelJwt` —
  here's a key) with Setup-time-Identity-resolution-intent
  (`ZitadelEnroll` — here's how to mint a key).
- The chart-render helpers (`build_operator_deployment`, etc.) are
  `pub` from `harmony::modules::fleet::operator::chart` so the
  composed-install scores can pluck the secret out before helm
  install. Plumbing leaking through Setup.
- `harmony::modules::fleet::operator::crd::DeploymentSpec` is the
  CRD definition AND it's the type the operator daemon imports to
  reconcile. Cross-crate import from a runtime crate
  (`harmony-fleet-operator`) into a framework crate (`harmony`).
  This is the placement bug.
### §1.4 — The shape problem in one diagram (text)

```
                                framework/operator workstation
                                  │
harmony::modules::fleet         ──┤ Scores: FleetServerScore, FleetDeviceSetupScore,
                                  │         FleetOperatorScore, ProvisionVmScore
                                  │ CRD types: Deployment, Device, DeploymentSpec, ...
                                  │ Chart rendering helpers (operator/chart.rs)
                                  │
harmony-reconciler-contracts    ── wire types: DeviceInfo, DeploymentState,
        │                                      HeartbeatPayload, KV constants
        │   ▲                                      ▲
        │   │                                      │
        │   │ imports                       imports│
        │   │                                      │
fleet/harmony-fleet-agent              fleet/harmony-fleet-operator
        ▲                                          ▲
        │                                          │
        │ ALSO imports                  ALSO imports│
        │ from harmony::modules::       from harmony::modules::
        │ podman (PodmanV0Score)        fleet::operator::crd
```

Two problematic edges:

1. `harmony-fleet-operator` imports `harmony::modules::fleet::operator::crd::Deployment`. The runtime daemon depends on the framework crate just for CRD type definitions.
2. `harmony-fleet-agent` imports `harmony::modules::podman::{PodmanV0Score, PodmanTopology, ReconcileScore}`. The agent depends on the framework crate's *podman module* for the score it deserializes off the wire.

Both edges should run *through* `harmony-reconciler-contracts`, not around it. That's the placement bug surfaced.

---
## §2 — Theory review

### §2.1 — From the talk

Pulling the load-bearing principles, ranked by relevance to this
redesign:

1. **Cardinality matters.** Types should match the cardinality of
   the real-world concept. `&str` for "primary color" admits
   infinite invalid inputs; `enum { Red, Yellow, Blue }` admits
   exactly three. Friction is proportional to mismatch.
2. **Make impossible states impossible.** Don't comment the
   constraint, code it. Push runtime errors to the design phase.
3. **Representations matter.** Same data, different shapes ↔
   different operations are cheap. Roman numerals ↔ addition; Arabic
   ↔ multiplication. "An API is a computational representation of
   real-world concepts."
4. **The compiler is a deterministic feedback channel.** In an era
   when LLMs generate code at 5–10K LOC/day, the only sensor that
   keeps up runs in milliseconds and is deterministic. Lean on it.
5. **Strong types reduce code volume + test boilerplate + token
   waste + review burden + CI time + production incidents** — and
   *increase* refactoring confidence and velocity-over-time. The
   bet is asymmetric.
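The cardinality point from item 1 can be shown in a few lines. This is a small sketch built on the talk's primary-color example; the function names and the hex values are made up for illustration.

```rust
/// Closed set: exactly three values, matching the real-world concept.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrimaryColor {
    Red,
    Yellow,
    Blue,
}

/// Stringly-typed version: the input space is infinite, so the function
/// has to return an Option and every caller has to handle `None`.
pub fn hex_stringly(color: &str) -> Option<&'static str> {
    match color {
        "red" => Some("#ff0000"),
        "yellow" => Some("#ffff00"),
        "blue" => Some("#0000ff"),
        _ => None, // "rde", "", "teal", ... all end up here at runtime
    }
}

/// Cardinality-matched version: the match is exhaustive, there is no
/// error path, and adding a variant makes the compiler flag this function.
pub fn hex(color: PrimaryColor) -> &'static str {
    match color {
        PrimaryColor::Red => "#ff0000",
        PrimaryColor::Yellow => "#ffff00",
        PrimaryColor::Blue => "#0000ff",
    }
}
```

The second function has no failure mode to test or document, which is the friction reduction the talk describes.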
### §2.2 — From the references

Grouping by what they imply for *this* redesign:

#### Will Crichton — *Type-Driven API Design* + *Rust API Type Patterns*

- **Typestate.** Encode "phase of an operation" in the type
  parameter. A `ProgressBar<Bounded>` exposes `.with_eta()`; a
  `ProgressBar<Unbounded>` doesn't. The contradictory call doesn't
  compile.
- Direct application: **`FleetDeviceAuth` mixes phases.** The
  `ZitadelEnroll` arm is unresolved, the `ZitadelJwt` arm is
  resolved, the `TomlShared` arm doesn't even need resolution. A
  typestate would model these as distinct types; only one of them
  has `agent.write_to_disk()`.
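A rough sketch of what the typestate idea could look like when applied to device auth. Everything here is hypothetical: `DeviceAuth`, the `Unresolved`/`Resolved` markers, `render_toml` (standing in for the write-to-disk step), and the `mint` closure that stands in for the real Zitadel call. It only shows the mechanism, not the shape the redesign will pick.

```rust
use std::marker::PhantomData;

// Marker types for the two phases. Both names are hypothetical.
pub struct Unresolved;
pub struct Resolved;

pub struct DeviceAuth<Phase> {
    issuer_url: String,
    // Admin token while Unresolved, device key once Resolved.
    secret: String,
    _phase: PhantomData<Phase>,
}

impl DeviceAuth<Unresolved> {
    pub fn enroll(issuer_url: &str, admin_token: &str) -> Self {
        DeviceAuth {
            issuer_url: issuer_url.to_owned(),
            secret: admin_token.to_owned(),
            _phase: PhantomData,
        }
    }

    /// Consumes the unresolved value. `mint` is a stand-in for the
    /// identity-provider call that turns admin credentials into a device key.
    pub fn resolve(self, mint: impl FnOnce(&str) -> String) -> DeviceAuth<Resolved> {
        DeviceAuth {
            issuer_url: self.issuer_url,
            secret: mint(&self.secret),
            _phase: PhantomData,
        }
    }
}

impl DeviceAuth<Resolved> {
    /// Only the resolved phase can be rendered for the agent.
    /// Calling this on `DeviceAuth<Unresolved>` does not compile.
    pub fn render_toml(&self) -> String {
        format!(
            "issuer_url = \"{}\"\nkey = \"{}\"\n",
            self.issuer_url, self.secret
        )
    }
}
```

With this shape, "the TOML for an unresolved auth" is not a runtime question: the method simply does not exist on that type.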
#### Richard Feldman — *Making Impossible States Impossible*

- Slogan-as-tool. Look at every `Option<T>` and ask *"can two of
  these be inconsistent at once?"* If yes, that's an impossible
  state — refactor.
- Direct application: `FleetDeviceSetupConfig` has `auth:
  FleetDeviceAuth` AND `agent_binary_path: PathBuf`. Today nothing
  prevents `auth = TomlShared` (no Zitadel) with
  `agent_binary_path` pointing at the wrong-arch binary. We could
  encode the agent binary's target arch as a typestate parameter
  and refuse to deploy to a device with a known-different arch
  inventory.

#### Sandy Maguire — *Protos Are Wrong*

- Protocol buffers throw away information real type systems
  preserve. Sum types, exhaustiveness, parametric polymorphism,
  Maybe/Result — protos can't express any of them precisely. The
  "loose contract" sells you weak invariants.
- Direct application: `harmony-reconciler-contracts` is JSON-shaped
  at the wire (matched on `type` tag for `ReconcileScore`).
  We're already paying the proto-class tax: any new variant
  requires both ends to know about it; the wire format doesn't
  enforce a schema; old agents see new variants as parse errors.
  This is an honest constraint — wire formats need to be permissive
  by design — but it argues for keeping the **wire types small and
  obviously evolvable** while letting in-memory types be
  cardinality-matched.
#### Sean Goedecke — *Invalid States*

- The skeptic's case: making impossible states impossible *can be
  over-applied*. Sometimes a `String` is the right cardinality
  even when an enum exists, because the enum binds you to a
  closed world.
- Direct application: **Don't make `device_id` a closed enum.**
  The newtype + RFC1123 validation we just added is the right
  cardinality match: it's a string-like, but only valid strings.
  Over-modeling would have us build `enum DeviceId {
  Pi(PiSerial), Vm(VmName), …}` — closed world, breaks first time
  a customer plugs in an x86 box.
- Useful guardrail: **type-driven** ≠ **type-everything**. The
  question to ask each time is "what's the cardinality of this
  concept in reality" — not "can I model this".
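As an illustration of the "string-like, but only valid strings" middle ground, here is a sketch of a validated newtype. The `DeviceId` and `InvalidDeviceId` names are hypothetical, and the rule used (a Kubernetes-style RFC 1123 label: 1 to 63 characters, lowercase ASCII alphanumerics or `-`, alphanumeric at both ends) is an assumption about what "RFC1123 validation" means here, not a copy of the existing code.

```rust
/// Open-world identifier: any valid label is accepted, so a new device
/// family needs no code change. Invalid strings cannot be constructed.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DeviceId(String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InvalidDeviceId(pub String);

impl DeviceId {
    /// ASSUMPTION: validity means "RFC 1123 label" as Kubernetes uses it.
    pub fn parse(raw: &str) -> Result<Self, InvalidDeviceId> {
        let bytes = raw.as_bytes();
        let is_alnum = |b: u8| b.is_ascii_lowercase() || b.is_ascii_digit();
        let valid = !bytes.is_empty()
            && bytes.len() <= 63
            && is_alnum(bytes[0])
            && is_alnum(bytes[bytes.len() - 1])
            && bytes.iter().all(|&b| is_alnum(b) || b == b'-');
        if valid {
            Ok(DeviceId(raw.to_owned()))
        } else {
            Err(InvalidDeviceId(raw.to_owned()))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Because the inner `String` is private, `parse` is the only way in, so every `DeviceId` that exists has already passed validation.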
#### Martin Fowler — *Harness Engineering* (April 2026)

- Computational sensors (compilers, type checkers, linters) over
  inferential ones (tests, code review). Compiler runs on every
  change; tests don't.
- Direct application: prefer compiler-checked invariants over
  doc-comment invariants. If the docs say "this Score's `auth`
  field must be resolved at the call site of `execute()`", the
  compiler should enforce it.
### §2.3 — From harmony's own ADRs

Reading the existing ADRs *as design language already in use* —
what vocabulary should the new fleet shape stay consistent with?

#### ADR-002 (hexagonal architecture)

- "Domain isolated from adapters." Domain types own the
  vocabulary; adapters (k8s client, NATS, helm) translate at the
  edge.
- **Implication for fleet:** the *domain* is identity + desired
  state + observed state. The *adapters* are NATS-KV, kube-CRD,
  helm-chart, ansible-over-SSH. The current
  `harmony::modules::fleet` mixes both. Pulling adapters out is the
  refactor.

#### ADR-003 (infrastructure abstractions)

- "Abstractions at domain level, not provider level. `DnsServer`
  not `OPNsenseDns`."
- **Implication for fleet:** capability traits like
  `DeviceRegistry`, `DesiredStatePublisher`, `ObservedStateConsumer`
  — each a standard infrastructure need that NATS-KV happens to
  fulfill today, that another transport (gRPC streaming, MQTT,
  Redis streams) could fulfill tomorrow.
#### ADR-015 (higher-order topologies)

- Higher-order topologies (`FailoverTopology<T>`,
  `DecentralizedTopology<T>`) compose via blanket trait impls.
  `T: PostgreSQL` ⇒ `FailoverTopology<T>: PostgreSQL`. Zero
  boilerplate.
- **Implication for fleet:** `FleetTopology<T>` could compose with
  a base `K8sTopology<T>` rather than being a parallel concept.
  "A fleet is a thing that is *both* a kube cluster *and* a
  device registry."
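The `T: PostgreSQL` ⇒ `FailoverTopology<T>: PostgreSQL` rule can be sketched as below. The trait and wrapper names come from the ADR summary above; the `deploy_postgres` method, the `SingleCluster` type, and the struct fields are invented for the example and are not harmony's actual definitions.

```rust
/// Capability trait. The method is hypothetical: it returns the targets
/// a PostgreSQL deployment would land on.
pub trait PostgreSQL {
    fn deploy_postgres(&self) -> Vec<String>;
}

/// A base topology that has the capability directly (hypothetical).
pub struct SingleCluster {
    pub name: String,
}

impl PostgreSQL for SingleCluster {
    fn deploy_postgres(&self) -> Vec<String> {
        vec![format!("postgres@{}", self.name)]
    }
}

/// Higher-order topology: wraps two base topologies of the same kind.
pub struct FailoverTopology<T> {
    pub primary: T,
    pub replica: T,
}

/// The generic impl: whenever the inner topology can do PostgreSQL,
/// the failover pair can too, with no per-topology boilerplate.
impl<T: PostgreSQL> PostgreSQL for FailoverTopology<T> {
    fn deploy_postgres(&self) -> Vec<String> {
        let mut targets = self.primary.deploy_postgres();
        targets.extend(self.replica.deploy_postgres());
        targets
    }
}
```

A `FleetTopology<T>` would presumably follow the same pattern: forward the kube capabilities of `T` and add the device-registry capability on top.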
#### ADR-016 (Harmony Agent + Global Mesh)

- Agents are processes that observe + reconcile per a desired
  state published into a NATS mesh. Mesh is the reliable hop;
  agents are stateless processors at the edge.
- **Implication for fleet:** the IoT fleet is a *specialization*
  of the agent + mesh ADR — devices are agents, the operator is
  a coordinator. The fleet domain types should fit ADR-016's
  vocabulary, not invent a parallel one.

#### ADR-017 (NATS clusters interconnection)

- Trust topology: per-cluster account isolation, gateway-mediated
  cross-cluster traffic. Per-device permissions are a
  specialization of per-account.
- **Implication for fleet:** the auth callout's per-device permission
  templates should compose with the cluster-interconnection
  account model — currently they're treated as orthogonal, which
  is fine until we actually cross fleets.

#### ADR-018 (template hydration)

- Hydrating templates at the edge of the framework, not in the
  middle. Same pattern as our generated chart YAML: render once,
  apply via typed code.
- **Implication for fleet:** chart-rendering helpers
  (`build_operator_deployment` et al.) are template-hydration
  edges. They *should* be hidden from domain code. Today they're
  `pub` — visible to consumers like `fleet_staging_install` who
  reach in and grab `operator_secret(opts)`. That's adapter
  leakage.
### §2.4 — Synthesis: principles for the redesign

A short list, ordered. Each line is something the new shape
should satisfy:

1. **Domain types in `harmony-reconciler-contracts` (or a sibling
   crate)**, with no dependency on `harmony` framework types.
2. **Resolved types only at the API surface.** Pre-resolution
   intent is a separate type, used only by the resolver.
3. **Capabilities as traits**, not concrete types. `DeviceRegistry`,
   `DesiredStatePublisher`, etc. The NATS-backed impl is one of
   several allowed.
4. **Closed cardinality where reality is closed; open where reality
   is open.** Goedecke's check, not Feldman's.
5. **Higher-order topology, not parallel topology.** A fleet is a
   `FleetTopology<T>` over a base K8s topology, not a separate
   capability hierarchy.
6. **Adapters hidden behind capabilities.** Helm chart rendering,
   k8s resource apply, NATS subjects — none of these surface from
   the fleet's public API.
7. **No YAML in framework code paths.** Existing principle from
   v0_1; keep.
8. **Keep wire types minimal + permissive.** Not because they're
   the canonical model, but because they're the
   evolvability seam (Maguire's protos critique applies in
   reverse — *embrace* the loose contract on the wire, *reject* it
   in-memory).
---

## §3 — Design problems with the current shape

Concrete issues the redesign needs to fix. Not "bugs" — *shape*
problems. Each numbered so we can refer back when comparing
alternatives.

- **P1. `harmony/modules/fleet/` is in the wrong crate.** It pulls
  framework dependencies (`HelmChartScore`, `K8sResourceScore`,
  `K8sAnywhereTopology`, `harmony_secret`, etc.) and the runtime
  daemons import *from it*. This makes the operator/agent depend
  transitively on every harmony module — including the OPNsense
  XML codegen, OKD bootstrap stuff, etc. Compile times suffer; the
  release surface is wrong (you can't
  `cargo install harmony-fleet-operator` without all of harmony).
- **P2. `FleetDeviceAuth` mixes resolved + unresolved states.**
  `ZitadelEnroll` is pre-resolution intent; `ZitadelJwt` is
  post-resolution credential. A single match arm has to handle
  both. The "render TOML for both" hack we wrote works but is a
  symptom — the TOML for an unresolved auth should be undefined,
  not "same as resolved".
- **P3. `setup_score.rs` is a 1053-LOC monolith.** Eight responsibilities
  in one file: ssh-vs-local connection, ansible orchestration,
  systemd unit text, hosts-file merging, podman package install,
  fleet-agent user provisioning, keyfile writing, agent restart.
  Readability is poor; testability is per-orchestration, not
  per-step.
- **P4. CRD types live in the framework crate.** `Deployment` and
  `Device` CRDs are defined in
  `harmony::modules::fleet::operator::crd`. The runtime operator
  crate (`harmony-fleet-operator`) imports them from there. This
  is the most visible symptom of P1.
- **P5. `ReconcileScore` polymorphism is anemic.** Today there's
  exactly one variant, `PodmanV0`. The wire format is set up for
  evolution but no second variant exists, and the cross-crate
  import from `harmony::modules::podman` makes adding one
  expensive (re-export dance).
- **P6. Adapter leakage from chart rendering.**
  `build_operator_deployment`, `operator_secret`, `build_chart`
  are `pub`. Consumers in `examples/` reach in to compose helm
  releases by hand. Domain code should not see "what does the
  operator's helm chart look like".
- **P7. Composed scores wrap composed scores wrap composed scores.**
  `FleetServerScore` wraps {ZitadelScore, ZitadelSetupScore,
  NatsK8sScore, NatsAuthCalloutScore, FleetOperatorScore}. Each
  of those does its own k8s resource apply + helm install.
  Failure modes are deep: a problem in one score's interpret
  surfaces wrapped through five layers of "context()". Hard to
  debug; hard to reason about ordering.
- **P8. Topology assumptions are everywhere.** Every `Score`
  bound is a hand-rolled union of capability traits — `T:
  Topology + HelmCommand + K8sclient + TlsRouter + 'static`. Add
  a new capability and every callsite has to be updated.
  Higher-order topology composition (ADR-015) would let us name "a
  thing that is a fleet-capable cluster" once.
- **P9. `Id` is overloaded.** Same type for device IDs, machine
  user IDs, deployment IDs, topology names. Newtype-ing each
  would catch arg-order swaps at compile time.
- **P10. Configuration is a staircase.** Operator workstation has
  `ZitadelClientConfig` cache file. Operator pod has
  env-var-from-Secret. Agent has TOML on disk. Three different shapes for
  fundamentally the same data (issuer URL, audience, key
  material). Maguire's protos critique applies internally — we're
  using *several* loose-contract serializations of the same
  domain object.
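To make P9 concrete, here is a small sketch of per-role newtypes over a shared identifier. The `Id` struct below is only a stand-in for `harmony_types::id::Id` (whose real shape is not shown in this document), and `DeviceId`, `DeploymentId`, and `assign` are hypothetical names.

```rust
/// Stand-in for `harmony_types::id::Id`; the real type is opaque.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Id(String);

impl Id {
    pub fn new(raw: &str) -> Self {
        Id(raw.to_owned())
    }
}

/// One wrapper per role. Same representation, distinct types.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DeviceId(pub Id);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DeploymentId(pub Id);

/// With a bare `Id` for both parameters, swapping the arguments compiles
/// and silently does the wrong thing. With newtypes it is a type error.
pub fn assign(device: &DeviceId, deployment: &DeploymentId) -> String {
    format!("{} -> {}", deployment.0.0, device.0.0)
}

// assign(&deployment, &device); // does not compile: mismatched types
```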
---

## §4 — Design alternatives

Five sketches. The first three are increasingly principled
cleanups; the last two are deliberately weird, included to force
us to recognize where the *core* of the domain actually is.

For each: one paragraph of premise, the resulting top-level types,
how it answers each of P1–P10 (✓ / ✗ / partial), and the
honest pros + cons.
### Alternative A — Move + thin façade (the conservative cleanup)

**Premise:** the existing types are mostly right; the location is
wrong and the façade leaks. Move `harmony/modules/fleet/` to
`fleet/harmony-fleet/`. Re-export only what's intended public.
Don't redesign types.

**Top-level types:** unchanged. `FleetDeviceSetupScore`,
`FleetServerScore`, `FleetOperatorScore`, `FleetDeviceAuth`,
`AdminAuth`, `Deployment` CRD, `Device` CRD. Same shapes, new
location.

**P1 ✓** (location fix is the goal). **P2 ✗** (auth still mixes
resolved/unresolved). **P3 ✗** (monolith preserved). **P4 ✓**
(CRDs co-located with operator). **P5 ✗**. **P6 partial** (we
can `pub(crate)` the chart helpers but the underlying coupling
remains). **P7 ✗**. **P8 ✗**. **P9 ✗**. **P10 ✗**.

**Pros:** small, safe, mechanical. Two days of work. No
customer-visible breakage. Unblocks P4 cleanup naturally.

**Cons:** doesn't actually fix the shape. We'd be back here in
six weeks. JG's review already said this isn't enough. Not the
right answer for v0.1 timing — *would* be the right answer if
we'd already shipped to two customers and couldn't break their
code.
### Alternative B — Resolved-only at boundaries + capability traits (the principled cleanup)

**Premise:** Crichton's typestate + ADR-003's domain capabilities
applied to the existing shape. Split resolved vs. unresolved
auth into separate types. Define capability traits for the
adapters. Move into the right crate. **No wholesale rewrite.**

**Top-level types:**

- New crate `harmony-fleet/` (sibling to `harmony-fleet-operator`,
  -agent, -auth). Domain types live here.
- `FleetIdentity`, `FleetDevice`, `FleetDeployment` — domain
  records. Plain data.
- `DeviceCredential` — *resolved* only (a JSON keyfile + issuer
  URL + audience). Replaces `FleetDeviceAuth::ZitadelJwt`.
- `EnrollmentIntent` — pre-resolution. Carries `AdminAuth` and
  what to mint. Method `resolve(&self) -> Result<DeviceCredential>`.
- `Score`s become small + single-responsibility:
  - `EnrollDeviceScore` — runs `EnrollmentIntent::resolve` then
    publishes to NATS.
  - `InstallAgentScore` — drops binary + config + systemd unit.
    Takes a `DeviceCredential`. Doesn't know about Zitadel.
  - `InstallOperatorScore` — helm chart + Secret. Doesn't know
    about devices.
  - `BringUpFleetScore` — composes the above. Single layer of
    composition, not five.
- Capability traits:
  - `DeviceRegistry` — list/get/upsert/delete a `FleetDevice`.
    Implementations: `NatsKvDeviceRegistry`,
    (later) `RedisStreamsDeviceRegistry`.
  - `DesiredStatePublisher`, `ObservedStateConsumer` — same
    shape.
  - `IdentityProvider` — mint a device credential, issue an
    admin token. Today: Zitadel. Tomorrow: something else.

**P1 ✓ P2 ✓ P3 ✓** (split into 4–5 small Scores). **P4 ✓ P5 ✓**
(resolve in the runtime crate, contracts stay neutral).
**P6 ✓** (chart helpers `pub(crate)`, surfaced via `IdentityProvider`
+ `DeploymentReleaseManager` traits). **P7 ✓** (one composer,
not five). **P8 partial** (capability traits defined but bound
unions still get long). **P9 ✓** with newtypes. **P10 partial**
(still three on-disk shapes for credentials, but unified by
trait).

**Pros:** highest-leverage incremental redesign. Buys us most of
the principles without rebuilding plumbing. Customer-visible
breakage is contained to public API renames + import path
moves — no behavior change. Three days is realistic.

**Cons:** we still have a `Score`-shaped mental model where the
*unit of execution* is "a Score". If the right primitive turns
out to be smaller (an effect, an event, a capability call), this
choice wastes some leverage.
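To give the capability-trait idea from Alternative B some shape, here is a sketch of `DeviceRegistry` with an in-memory implementation. The trait name and its list/get/upsert/delete operations come from the list above. Everything else is assumed: the `FleetDevice` fields, the synchronous and infallible signatures (a NATS-backed version would almost certainly be async and return `Result`), and the `InMemoryDeviceRegistry` and `count_with_label` helpers.

```rust
use std::collections::BTreeMap;

/// Plain-data domain record. Fields are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDevice {
    pub id: String,
    pub labels: BTreeMap<String, String>,
}

/// The capability. Domain code depends on this, never on NATS directly.
pub trait DeviceRegistry {
    fn list(&self) -> Vec<FleetDevice>;
    fn get(&self, id: &str) -> Option<FleetDevice>;
    fn upsert(&mut self, device: FleetDevice);
    /// Returns true if a device was actually removed.
    fn delete(&mut self, id: &str) -> bool;
}

/// Test double standing in for something like `NatsKvDeviceRegistry`.
#[derive(Default)]
pub struct InMemoryDeviceRegistry {
    devices: BTreeMap<String, FleetDevice>,
}

impl DeviceRegistry for InMemoryDeviceRegistry {
    fn list(&self) -> Vec<FleetDevice> {
        self.devices.values().cloned().collect()
    }
    fn get(&self, id: &str) -> Option<FleetDevice> {
        self.devices.get(id).cloned()
    }
    fn upsert(&mut self, device: FleetDevice) {
        self.devices.insert(device.id.clone(), device);
    }
    fn delete(&mut self, id: &str) -> bool {
        self.devices.remove(id).is_some()
    }
}

/// Example of domain logic written against the capability only.
pub fn count_with_label<R: DeviceRegistry>(registry: &R, key: &str, value: &str) -> usize {
    registry
        .list()
        .iter()
        .filter(|d| d.labels.get(key).map(String::as_str) == Some(value))
        .count()
}
```

A side benefit of this shape is that selector-matching logic can be unit-tested against the in-memory registry without a NATS server.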
### Alternative C — The dataflow reframe (events in, state out)

**Premise:** the fleet platform is, in essence, a **stream
processor**. Events flow in (heartbeats, intent CR creates,
agent reconcile reports). State materializes out (Device CRs,
DeploymentAggregate counters, KV desired-state writes). Today
we model it imperatively as a series of `Score`s; the dataflow
shape is fighting that.

**Top-level types:**

- `FleetEvent` — sum type. `DeviceHeartbeat | DeviceFirstSeen |
  DeploymentDesired | DeploymentObserved | DeploymentDeleted | …`
- `FleetStateSnapshot` — what the operator currently knows. Pure
  data, derivable.
- `Reducer` — `(state, event) → state`. Pure function. Tests
  trivially.
- `Effect` — sum type of side-effects the reducer wants done:
  `WriteKv(bucket, key, value) | UpsertCr(cr) | EmitMetric(...)`.
  Reducer returns `(new_state, Vec<Effect>)`.
- `EffectRunner` — adapter that performs effects. The only thing
  that touches NATS / kube. One implementation per environment.
- The operator pod's main loop: `for event in stream { (state,
  effects) = reduce(state, event); runner.run_all(effects) }`.
  ~50 lines.

**P1 ✓ P2 ✓ P3 ✓ P4 ✓ P5 ✓ P6 ✓ P7 ✓ P8 ✓** (capabilities
collapse into the `EffectRunner` trait). **P9 ✓ P10 partial**.

**Pros:** dramatically simpler operator code. Reducer is pure →
property-test-friendly. The dataflow is the platform. Aligns
with how Kafka / Materialize / Flink-class systems are
structured. Easy to add a new event type — the compiler shows
you every reducer arm to update.

**Cons:** large rewrite of the operator. Three days is
unrealistic. The current `fleet_aggregator.rs` (833 LOC) already
roughly does this but in a less disciplined shape — maybe the
incremental version of this is "make `apply_state` a real
reducer and split `compute_aggregate` into pure pieces". That's
more like Alternative B with extra discipline. The full
effect-typed version is a nice end-state but not a sprint goal.

**Cite:** Materialize's dataflow paper; Kent Beck's *Augmented
Coding* on factoring; Gergely Orosz on event-sourcing; the talk's
"good Lego bricks" framing applies — *events* are the bricks.
### Alternative D — The fleet as a **kube control plane**, period (deliberately weird)

**Premise:** strip the design to one observation. **A fleet is a
Kubernetes cluster whose Nodes happen to be devices, not
servers.** Stop modelling Devices and Deployments separately
from kube primitives. Use Kubernetes itself as the data model.
The operator is one CRD reconciler. NATS is just the transport
between the API server (in the cluster) and the device-side
kubelet-equivalent.

**Top-level types:**

- `Device` is a Node CR. Already exists; we stop wrapping it.
- `Deployment` is a `DaemonSet` (one pod per matching node) or a
  `Deployment` (count: N targeted nodes). We stop inventing a
  CRD; we use the standard one.
- `DeviceInfo` is the Node's `.status` (capacity, allocatable,
  conditions). We stop publishing parallel data; we update
  Node status from the agent's NATS messages.
- The agent on the device is a custom kubelet that speaks NATS to
  the operator instead of HTTPS to the API server.
- The auth callout still exists; it gates NATS access.
- No `harmony-fleet-operator`-specific CRDs. No `Deployment` /
  `Device` CRs of our own.

**P1 ✓ P2 ✓ P3 ✓ P4 N/A** (no CRDs of our own to misplace).
**P5 ✓ P6 ✓ P7 ✓ P8 ✓ P9 ✓ P10 ✓**.

**Pros:** the simplest *conceptual* answer. We stop fighting kube
and inventing parallel concepts. Customers already understand
DaemonSets, Node selectors, and `kubectl get nodes`. The agent
becomes a known kind of thing (a kubelet variant) with shoulders
to stand on (k3s-iot, kine, virtual-kubelet projects already
prove this works).

**Cons:** *a lot* of plumbing changes. Devices need to register
as Nodes (which means either a real kubelet on each Pi, or a
virtual-kubelet façade). The agent's reconcile loop becomes
"watch a CR via NATS, render manifests, run pods" — bigger than
"watch a KV value, run podman". JetStream KV becomes redundant
with the kube API server. **Probably the right end-state for
v2.0, wrong for v0.1.** Worth noting, though, because comparing
A/B/C to D pulls out which of our current invented concepts are
load-bearing (very few — `DeviceInfo` is mostly just `Node.status`;
`DeploymentAggregate` is mostly just kube's
`.status.observedGeneration` / `.status.conditions`).

**Cite:** virtual-kubelet, k3s-iot, KubeEdge, OpenYurt. They've
walked this path; the lessons are public.

### Alternative E — Algebra of fleets (deliberately weird, mathematical)

**Premise:** model the platform as a small algebra. A fleet is a
**set of devices** + an **assignment function** (selector → set
of deployments). Operations on fleets are set-theoretic +
function composition. Treat the API as a query language over
this algebra.

**Top-level types:**

- `Fleet` ::= `Set<Device>`. With operations: union, intersection,
  filter-by-selector, partition.
- `Selector` ::= a pure predicate `Device → bool`. Built from
  primitives `label("k") = "v"`, `arch = aarch64`, …, combined
  with `&`, `|`, `!`.
- `Assignment` ::= `Selector → Set<Deployment>`. Pure function.
- `World` ::= `(Fleet, Assignment)`. Pure data. The operator's job
  is to make reality match the World.
- `Diff(World, Reality) → Vec<Action>`. Pure function. Closed
  form — given the algebra, you can prove what actions are
  *necessary* and *sufficient*.

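To make the algebra concrete, here is a rough Rust sketch of `Selector`, `World`, and `Diff`. Everything beyond those names is an assumption made for illustration: the `Device` fields, representing `Assignment` as a list of `(Selector, deployments)` pairs, modelling `Reality` as a `Placement` map from device id to running deployments, and the `Start` / `Stop` actions.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone)]
pub struct Device {
    pub id: String,
    pub arch: String,
    pub labels: BTreeMap<String, String>,
}

/// Selector as a small expression tree, evaluated as a pure predicate.
#[derive(Debug, Clone)]
pub enum Selector {
    Label(String, String),
    Arch(String),
    And(Box<Selector>, Box<Selector>),
    Or(Box<Selector>, Box<Selector>),
    Not(Box<Selector>),
}

impl Selector {
    pub fn matches(&self, d: &Device) -> bool {
        match self {
            Selector::Label(k, v) => d.labels.get(k) == Some(v),
            Selector::Arch(a) => &d.arch == a,
            Selector::And(l, r) => l.matches(d) && r.matches(d),
            Selector::Or(l, r) => l.matches(d) || r.matches(d),
            Selector::Not(s) => !s.matches(d),
        }
    }
}

/// Device id -> names of deployments on that device.
pub type Placement = BTreeMap<String, BTreeSet<String>>;

/// `World ::= (Fleet, Assignment)`, with the assignment as selector/deployment pairs.
pub struct World {
    pub fleet: Vec<Device>,
    pub assignment: Vec<(Selector, BTreeSet<String>)>,
}

impl World {
    /// Evaluate the assignment over the fleet: what should run where.
    pub fn desired(&self) -> Placement {
        let mut out = Placement::new();
        for d in &self.fleet {
            let set = out.entry(d.id.clone()).or_default();
            for (sel, deployments) in &self.assignment {
                if sel.matches(d) {
                    set.extend(deployments.iter().cloned());
                }
            }
        }
        out
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Start { device: String, deployment: String },
    Stop { device: String, deployment: String },
}

/// Pure diff: set difference in both directions between desired and observed.
pub fn diff(world: &World, reality: &Placement) -> Vec<Action> {
    let desired = world.desired();
    let empty: BTreeSet<String> = BTreeSet::new();
    let mut actions = Vec::new();
    for (device, want) in &desired {
        let have = reality.get(device).unwrap_or(&empty);
        for dep in want.difference(have) {
            actions.push(Action::Start { device: device.clone(), deployment: dep.clone() });
        }
    }
    for (device, have) in reality {
        let want = desired.get(device).unwrap_or(&empty);
        for dep in have.difference(want) {
            actions.push(Action::Stop { device: device.clone(), deployment: dep.clone() });
        }
    }
    actions
}
```

The sketch covers only the happy path. It has no notion of a device that is unreachable or half-applied, which is the gap the cons below point at.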
**P1–P10 ✓** (in principle). **Code volume probably 30% of
current.**

**Pros:** clarity. Properties become provable: "no device gets
an unassigned deployment", "removing a label removes the
assignment", "two operators can edit independently and the merge
is well-defined" (because functions compose). The "make
impossible states impossible" principle, applied to the *fleet
shape itself*, not to individual types.

**Cons:** **almost certainly an over-fit.** The real platform has
dirty edges (devices that fail, network partitions, half-applied
state) that don't sit naturally in a pure algebra. Most teams
that go down this road end up bolting "real-world" escape hatches
back on, ending up with the original design plus extra category
theory. **Useful as a north star** for the cardinality choices,
**not as the platform's actual shape.**

**Cite:** Hillel Wayne *Using Formal Methods at Work*; Conal
Elliott on functional reactive programming; the classic "set
theory for systems people" talks.

### Comparison matrix

| | A. Move | B. Capabilities | C. Dataflow | D. Kube-native | E. Algebra |
|---|---|---|---|---|---|
| Fixes P1 (location) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fixes P2 (auth states) | ✗ | ✓ | ✓ | ✓ | ✓ |
| Fixes P3 (monolith) | ✗ | ✓ | ✓ | ✓ | ✓ |
| Fixes P4 (CRD placement) | ✓ | ✓ | ✓ | N/A | N/A |
| Fixes P5 (anemic enum) | ✗ | ✓ | ✓ | N/A | partial |
| Fixes P6 (adapter leak) | partial | ✓ | ✓ | ✓ | ✓ |
| Fixes P7 (deep wrap) | ✗ | ✓ | ✓ | ✓ | ✓ |
| Fixes P8 (trait union) | ✗ | partial | ✓ | ✓ | ✓ |
| Fixes P9 (Id overload) | ✗ | ✓ | ✓ | ✓ | ✓ |
| Fixes P10 (config staircase) | ✗ | partial | partial | ✓ | partial |
| Fits 3-day window | ✓ | ✓ (tight) | ✗ | ✗ | ✗ |
| Customer-visible breakage | low | medium | medium | very high | high |
| Risk to demo schedule | very low | low | medium | very high | high |
| Long-term ceiling | low | high | high | very high | very high |

---

## §5 — Recommendation (preliminary)

Read the matrix as: **B is the right answer for now**, with
**explicit awareness of D as the v2.0 destination**.

- A is too little. We'd be back here.
- C and E are right in shape but wrong in timing — we don't have a
  week to rebuild the operator's reconcile loop, and the platform
  isn't in production yet, so there's no urgent "we have to
  refactor anyway" pressure.
- D is conceptually the cleanest, but a v0.1 production push
  is the wrong moment to start running custom kubelets.
- B captures most of the leverage of C/D within the 3-day window,
  with a clean migration path to either of them later (the
  capability traits are the seam — swap the implementation, not the
  callers).

**One concrete shape** to pursue under Alternative B (worth
sketching as the strawman ADR):

- New crate `harmony-fleet/` (the domain crate). Depends on
  `harmony-reconciler-contracts` only.
  - Domain records: `FleetDevice`, `FleetDeployment`, `FleetState`.
  - Capability traits: `DeviceRegistry`, `DesiredStatePublisher`,
    `ObservedStateConsumer`, `IdentityProvider`,
    `AgentLifecycle`.
- `harmony-fleet-adapters-nats/` — `NatsDeviceRegistry`,
  `NatsDesiredStatePublisher`, etc. NATS-specific.
- `harmony-fleet-adapters-zitadel/` — `ZitadelIdentityProvider`.
- `harmony-fleet-adapters-kube/` — `KubeFleetReflector` (writes
  `Device` and `Deployment` CRs as a *reflection* of the domain
  state, not as the source of truth).
- `harmony-fleet-operator/` — daemon. Wires adapters together.
- `harmony-fleet-agent/` — daemon. Wires adapters together.
- `harmony-fleet-cli/` — tomorrow's `harmony-fleet` plugin.
- `harmony/modules/fleet/` is **deleted**. The framework `harmony`
  crate gets a thin `harmony::modules::fleet` *re-export only*
  module that points at `harmony-fleet`. After v0.2 is shipped,
  the re-export module goes away too.

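A rough sketch of what "the capability traits are the seam" could look like in the domain crate. The trait and record names come from the list above; every method signature, field, the `roll_out` function, and the in-memory adapters are assumptions made for illustration. Real adapters would most likely be async and return a proper error type.

```rust
use std::collections::BTreeMap;

// Domain records (fields are hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct FleetDevice {
    pub id: String,
    pub labels: BTreeMap<String, String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct FleetDeployment {
    pub name: String,
    pub image: String,
    pub target_label: (String, String),
}

// Capability traits: the domain names what it needs, not how it is done.
pub trait DeviceRegistry {
    fn devices(&self) -> Vec<FleetDevice>;
}

pub trait DesiredStatePublisher {
    fn publish(&mut self, device_id: &str, deployment: &FleetDeployment) -> Result<(), String>;
}

/// Domain logic: depends only on the traits, never on NATS or kube.
/// Returns how many devices received the desired state.
pub fn roll_out(
    registry: &impl DeviceRegistry,
    publisher: &mut impl DesiredStatePublisher,
    deployment: &FleetDeployment,
) -> Result<usize, String> {
    let (key, value) = &deployment.target_label;
    let mut published = 0;
    for device in registry.devices() {
        if device.labels.get(key) == Some(value) {
            publisher.publish(&device.id, deployment)?;
            published += 1;
        }
    }
    Ok(published)
}

// In-memory adapters standing in for `NatsDeviceRegistry` and
// `NatsDesiredStatePublisher` in tests.
#[derive(Default)]
pub struct InMemoryRegistry {
    pub items: Vec<FleetDevice>,
}

impl DeviceRegistry for InMemoryRegistry {
    fn devices(&self) -> Vec<FleetDevice> {
        self.items.clone()
    }
}

#[derive(Default)]
pub struct InMemoryPublisher {
    pub written: BTreeMap<String, FleetDeployment>,
}

impl DesiredStatePublisher for InMemoryPublisher {
    fn publish(&mut self, device_id: &str, deployment: &FleetDeployment) -> Result<(), String> {
        self.written.insert(device_id.to_string(), deployment.clone());
        Ok(())
    }
}
```

With this shape, the operator and agent daemons would only choose which adapter to construct, and `roll_out` would stay unchanged when an adapter is swapped.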
CRDs (`Deployment`, `Device`) move to
`harmony-fleet-adapters-kube/` because they're a kube-specific
projection of the domain, not the domain itself. The agent
imports `harmony-fleet`'s domain types, not the CRDs.

The setup-side scores stay in `harmony` (because they need the
framework's `HelmCommand`, `K8sclient`, etc.) but they consume
`harmony-fleet`'s domain types. The fleet's *domain* doesn't
depend on the framework; the framework's *deploy procedures*
depend on the fleet's domain. Direction of dependency is the
inverse of today.

## §6 — Open questions before we lock this

These are real questions; pulling them out so JG's review has
something concrete to react to:

- **Q1.** Is `IdentityProvider` the right capability name, or is
  it more honest to name it after what we actually need
  (`DeviceCredentialMinter`, `OperatorTokenProvider`)? The talk
  argues against generic names — if reality has two distinct
  concerns, two traits.
- **Q2.** Should `Device` CRD live in adapters-kube, or should it
  not exist at all (replaced by reading kube-API node info, per
  alternative D)? The middle ground (own CRD that mirrors kube
  Node) is what we have today, and it's the worst of both.
- **Q3.** The agent's wire-format for `ReconcileScore` —
  externally tagged enum, today only `PodmanV0`. Move it to
  `harmony-reconciler-contracts` (canonical wire seam) and let
  *both* the agent and the operator import only that crate. This
  removes the `harmony::modules::podman` cross-crate dependency.
  Worth doing in any of A/B/C.
- **Q4.** Does the v0.1 prod push wait for this redesign, or does
  it ship on the current shape with the redesign happening in
  v0.2? Tradeoff: shipping now means committing to *some* public
  API; shipping after means slipping the customer date.
  Recommendation: **ship the redesign first, slip 3 days**, on
  the grounds that public API churn after a customer is on it
  costs more than a 3-day delay before they're on it.
- **Q5.** Where do the *runtime tools* (the `harmony-fleet` CLI
  plugin, future frontend) sit in the dependency graph? If they
  depend on `harmony-fleet`'s domain crate only, we can build
  them without pulling in helm / kube / ansible at compile time.
  This is what we want for the device-side enrollment binary too
  (already feature-gated; the redesign should make the gate
  unnecessary).

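For Q1, the two-trait option might look roughly like the sketch below. The trait names are the ones floated in the question; the method signatures, the `DeviceCredential` fields, `OperatorToken`, `enroll`, and the `StaticMinter` fake are hypothetical and only there to show that an enrollment path would need one of the two capabilities, not both.

```rust
/// Credential handed to a device at enrollment (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct DeviceCredential {
    pub device_id: String,
    pub secret: String,
}

/// Token the operator uses for its own calls (hypothetical newtype).
#[derive(Debug, Clone, PartialEq)]
pub struct OperatorToken(pub String);

/// Concern 1: mint a credential for a specific device.
pub trait DeviceCredentialMinter {
    fn mint(&self, device_id: &str) -> Result<DeviceCredential, String>;
}

/// Concern 2: obtain the operator's own token.
pub trait OperatorTokenProvider {
    fn operator_token(&self) -> Result<OperatorToken, String>;
}

/// Enrollment depends on the minter only; it cannot reach the operator
/// token even by accident, because that capability is not in scope.
pub fn enroll(minter: &impl DeviceCredentialMinter, device_id: &str) -> Result<DeviceCredential, String> {
    if device_id.is_empty() {
        return Err("device id must not be empty".to_string());
    }
    minter.mint(device_id)
}

/// Fake minter for tests; a real one would sit in an adapter crate.
pub struct StaticMinter;

impl DeviceCredentialMinter for StaticMinter {
    fn mint(&self, device_id: &str) -> Result<DeviceCredential, String> {
        Ok(DeviceCredential {
            device_id: device_id.to_string(),
            secret: format!("fake-secret-for-{}", device_id),
        })
    }
}
```

A single adapter type could still implement both traits. The split constrains what each caller can ask for, not how many structs exist.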
---

## §7 — Next steps

1. Sit with this document. Walk away from it for an hour.
2. Round-table on §3 — do P1–P10 capture *the* problems, or are
   we missing one?
3. Round-table on §4 — does the comparison matrix feel honest,
   or is it tilted?
4. Pick one alternative as the working hypothesis.
5. Spike: take one slice through the chosen alternative
   (suggested: `EnrollmentIntent::resolve` + `DeviceCredential` +
   the `IdentityProvider` trait — the smallest end-to-end shape
   that touches every layer). Commit it on a branch. Eyeball:
   does the resulting code feel better?
6. Either: commit to the alternative as ADR-023, or back out
   and try another.

This document gets updated as we go. It is NOT meant to be
locked at first draft.