
# Architecture Decision Record: Fleet Platform — Capability Decomposition
Initial Author: Jean-Gabriel Gill-Couture (with research by Claude)
Initial Date: 2026-05-20
Last Updated Date: 2026-05-20
## Status
**Draft — under review.** Captures the proposed shape for review;
not yet locked. If accepted, supersedes the as-built layout of
`harmony/src/modules/fleet/` documented in ADR-023's first
revision.
## Context
The fleet platform shipped under `feat/iot-walking-skeleton`
spans three concerns that today share two locations:
1. **Domain logic** — what a `FleetDevice` is, what a
`FleetDeployment` looks like, what the reconciler-contracts
wire types mean.
2. **Adapters** — concrete NATS, Zitadel, Kubernetes, Helm
integrations that drive the domain.
3. **Deploy procedures** — how to bring up the operator, agent,
NATS, Zitadel as Scores against a Topology.
Today these live in `harmony/src/modules/fleet/` (mixed), the
`harmony-reconciler-contracts` crate (wire types only), the
`harmony-fleet-deploy` crate (Scores for deploy), and the
`harmony-fleet-operator`/`harmony-fleet-agent` binaries
(runtime). The boundary between domain and adapter is not
type-level: `harmony/src/modules/fleet/setup_score.rs` for
example reaches into Zitadel, NATS, Kube, and Helm directly.
Anyone wanting to swap NATS for a different transport would
touch every fleet file.
ADR-023 already addressed the *deploy*-side of this (deploy
Scores live in `*-deploy` crates, not in `harmony` core). This
ADR proposes the *domain*-side decomposition: pull a thin
fleet-domain crate above the existing reconciler-contracts, push
provider-specific code into adapter crates, and re-direct the
deploy crate to consume the domain rather than the framework
primitives directly.
## Decision (proposed)
Five crates, layered by dependency direction:
```
harmony-reconciler-contracts   (existing — wire types only)
harmony-fleet-domain           (new — domain records + capability traits)
harmony-fleet-adapters-*       (new — one crate per provider: nats, zitadel, kube)
        ▲
harmony-fleet-deploy           (existing — bring-up Scores)
harmony-fleet-operator         (existing — daemon)
harmony-fleet-agent            (existing — daemon)
```
### `harmony-fleet-domain`
The domain crate. Depends only on `harmony-reconciler-contracts`
and `harmony_types`. Holds:
- **Domain records**: `FleetDevice`, `FleetDeployment`,
`FleetState`, `EnrollmentIntent`, `DeviceCredential`.
- **Capability traits**: `DeviceRegistry`,
`DesiredStatePublisher`, `ObservedStateConsumer`,
`IdentityProvider`, `AgentLifecycle`. These are the seam
between domain logic and provider-specific implementations.
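As a rough illustration of what "domain records plus capability traits" could look like, here is a minimal std-only sketch. The record and trait names come from this ADR; every field, method signature, and the `FleetError` type are assumptions made for illustration and are not the proposed API.

```rust
use std::fmt;

// Hypothetical shapes: fields and method signatures are illustrative only.

/// Domain record: a device known to the fleet.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDevice {
    pub id: String,
    pub labels: Vec<(String, String)>,
}

/// Domain record: what should run on which devices.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDeployment {
    pub name: String,
    pub target_device_ids: Vec<String>,
}

/// Provider-agnostic error (assumed), so adapters do not leak client error types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FleetError {
    NotFound(String),
    Backend(String),
}

impl fmt::Display for FleetError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FleetError::NotFound(id) => write!(f, "device not found: {}", id),
            FleetError::Backend(msg) => write!(f, "backend error: {}", msg),
        }
    }
}

impl std::error::Error for FleetError {}

/// Capability: enroll and look up devices.
pub trait DeviceRegistry {
    fn register(&mut self, device: FleetDevice) -> Result<(), FleetError>;
    fn get(&self, id: &str) -> Result<FleetDevice, FleetError>;
}

/// Capability: push desired state toward agents.
pub trait DesiredStatePublisher {
    fn publish(&mut self, deployment: &FleetDeployment) -> Result<(), FleetError>;
}
```

Nothing in this sketch names NATS, Zitadel, or Kubernetes, which is the property the domain crate is meant to have.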
### `harmony-fleet-adapters-nats`, `-zitadel`, `-kube`
One crate per provider. Each implements the capability traits
above for its specific backend:
- `nats`: `NatsDeviceRegistry`, `NatsDesiredStatePublisher`,
`NatsObservedStateConsumer`.
- `zitadel`: `ZitadelIdentityProvider`, machine-user
provisioning, JWT-bearer minting.
- `kube`: `KubeFleetReflector` writes `Device` and
`Deployment` CRDs as a *reflection* of domain state, not as
the source of truth. CRD types move here from
`harmony-fleet-operator`.
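The adapter split can be sketched as follows. This is an assumption-heavy illustration: the trait shape is hypothetical, the `HashMap` merely stands in for whatever NATS client handle a real adapter would hold, and the string output of `reflect` stands in for actual CRD objects.

```rust
use std::collections::HashMap;

// --- would live in harmony-fleet-domain (hypothetical shapes) ---
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDevice {
    pub id: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FleetError {
    NotFound(String),
}

pub trait DeviceRegistry {
    fn register(&mut self, device: FleetDevice) -> Result<(), FleetError>;
    fn get(&self, id: &str) -> Result<FleetDevice, FleetError>;
}

// --- would live in harmony-fleet-adapters-nats ---
/// In-memory stand-in: a real adapter would talk to NATS here
/// instead of a HashMap (assumption for illustration).
pub struct NatsDeviceRegistry {
    bucket: HashMap<String, FleetDevice>,
}

impl NatsDeviceRegistry {
    pub fn new() -> Self {
        Self { bucket: HashMap::new() }
    }
}

impl DeviceRegistry for NatsDeviceRegistry {
    fn register(&mut self, device: FleetDevice) -> Result<(), FleetError> {
        self.bucket.insert(device.id.clone(), device);
        Ok(())
    }

    fn get(&self, id: &str) -> Result<FleetDevice, FleetError> {
        self.bucket
            .get(id)
            .cloned()
            .ok_or_else(|| FleetError::NotFound(id.to_string()))
    }
}

// --- would live in harmony-fleet-adapters-kube ---
/// Reflection only: reads domain state through the capability trait and
/// renders a CRD-like view. It never writes back into the domain.
pub struct KubeFleetReflector {
    pub reflected: Vec<String>,
}

impl KubeFleetReflector {
    pub fn reflect(&mut self, registry: &dyn DeviceRegistry, ids: &[&str]) {
        self.reflected = ids
            .iter()
            .filter_map(|id| registry.get(id).ok())
            .map(|d| format!("Device/{}", d.id))
            .collect();
    }
}
```

The reflector takes `&dyn DeviceRegistry` rather than a concrete NATS type, so in this sketch the kube adapter does not depend on the nats adapter.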
### `harmony-fleet-deploy`
Stays as the home for `FleetOperatorScore`, `FleetAgentScore`,
`FleetNatsScore`, `FleetCalloutScore`. Updates: imports
`harmony-fleet-domain` for types, uses
`harmony-fleet-adapters-*` to compose Scores against capability
traits rather than reaching directly into NATS/Zitadel client
crates.
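To make "compose against capability traits" concrete, the sketch below writes a deploy-side routine purely in terms of two traits. The function `roll_out`, the trait methods, and both test doubles are hypothetical; a real Score would wrap logic like this in the framework's own types, which are not shown here.

```rust
// Hypothetical shapes throughout; none of these signatures come from the ADR.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDeployment {
    pub name: String,
    pub device_ids: Vec<String>,
}

pub trait DeviceRegistry {
    fn is_enrolled(&self, device_id: &str) -> bool;
}

pub trait DesiredStatePublisher {
    fn publish(&mut self, device_id: &str, deployment: &FleetDeployment) -> Result<(), String>;
}

/// Deploy-side logic written only against capability traits.
/// Publishes to enrolled devices and returns the ids that were skipped.
pub fn roll_out(
    registry: &dyn DeviceRegistry,
    publisher: &mut dyn DesiredStatePublisher,
    deployment: &FleetDeployment,
) -> Result<Vec<String>, String> {
    let mut skipped = Vec::new();
    for id in &deployment.device_ids {
        if registry.is_enrolled(id) {
            publisher.publish(id, deployment)?;
        } else {
            skipped.push(id.clone());
        }
    }
    Ok(skipped)
}

/// Test double standing in for any registry adapter.
pub struct StaticRegistry {
    pub enrolled: Vec<String>,
}

impl DeviceRegistry for StaticRegistry {
    fn is_enrolled(&self, device_id: &str) -> bool {
        self.enrolled.iter().any(|e| e.as_str() == device_id)
    }
}

/// Test double standing in for any transport adapter.
pub struct RecordingPublisher {
    pub sent: Vec<String>,
}

impl DesiredStatePublisher for RecordingPublisher {
    fn publish(&mut self, device_id: &str, deployment: &FleetDeployment) -> Result<(), String> {
        self.sent.push(format!("{}:{}", device_id, deployment.name));
        Ok(())
    }
}
```

Because `roll_out` only sees trait objects, replacing `RecordingPublisher` with a NATS-backed publisher would not change this function.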
### Direction of dependency
The fleet *domain* doesn't depend on the framework. The
framework's *deploy procedures* depend on the fleet's domain.
This inverts today's direction, in which `harmony::modules::fleet`
imports from `harmony_secret`, `harmony_zitadel_auth`, NATS
client crates, kube client crates, etc.
After this ADR is implemented, `harmony::modules::fleet`
disappears entirely. `harmony` core stays focused on framework
primitives.
## Open questions
These are the decision points pending review — flagged so the
review has concrete pivots:
- **Q1.** Is `IdentityProvider` the right capability name, or
should we name the two distinct concerns separately
(`DeviceCredentialMinter`, `OperatorTokenProvider`)? CLAUDE.md
rule says "if reality has two distinct concerns, two
traits."
- **Q2.** Should the `Device` CRD exist at all, or should the
agent publish to a kube `Node` (per the alternative-D
direction)? Today's mid-ground (own CRD that mirrors `Node`)
is arguably the worst of both worlds.
- **Q3.** Where does `ReconcileScore`'s adjacently-tagged enum
live? It's the canonical wire seam between operator and
agent. It should sit in `harmony-reconciler-contracts` (so both
binaries import only that crate); confirm before the move.
- **Q4.** Does this redesign block the v0.1 production push, or
does it land in v0.2 alongside the agent-upgrade work
(ADR-022)? Public API churn after a customer is on it is more
expensive than a 3-day delay before they are. Recommendation:
ship the redesign first.
- **Q5.** Where do runtime tools (the `harmony-fleet` CLI plugin,
the operator's frontend) sit in the dependency graph? If they
depend on `harmony-fleet-domain` only, they build without
pulling in helm/kube/ansible at compile time — which is also
the right shape for the device-side enrollment binary
(currently feature-gated).
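For Q1, the two-trait option could look roughly like the sketch below. The trait names are the ones floated in Q1; the method signatures, the `FakeIdentityBackend` type, and the token format are invented for illustration and say nothing about how Zitadel is actually called.

```rust
// Hypothetical: trait methods and token format are illustrative assumptions.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeviceCredential {
    pub device_id: String,
    pub secret: String,
}

/// Concern 1: mint credentials for devices during enrollment.
pub trait DeviceCredentialMinter {
    fn mint(&self, device_id: &str) -> DeviceCredential;
}

/// Concern 2: give the operator a token for its own calls.
pub trait OperatorTokenProvider {
    fn operator_token(&self) -> String;
}

/// A single backend may still implement both narrow traits.
pub struct FakeIdentityBackend {
    pub issuer: String,
}

impl DeviceCredentialMinter for FakeIdentityBackend {
    fn mint(&self, device_id: &str) -> DeviceCredential {
        DeviceCredential {
            device_id: device_id.to_string(),
            secret: format!("{}/device/{}", self.issuer, device_id),
        }
    }
}

impl OperatorTokenProvider for FakeIdentityBackend {
    fn operator_token(&self) -> String {
        format!("{}/operator", self.issuer)
    }
}

/// Enrollment asks only for the capability it needs, so it cannot
/// accidentally reach for operator tokens.
pub fn enroll(minter: &dyn DeviceCredentialMinter, device_id: &str) -> DeviceCredential {
    minter.mint(device_id)
}
```

The trade-off in this shape is more trait names in exchange for call sites that state exactly which concern they use; a single `IdentityProvider` trait would keep the surface smaller but hide that distinction.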
## Out of scope
- **Alternative D (kube-native devices).** A future v2.0
destination, not v0.1 or v0.2 work. Captured as the long-term
direction; the capability traits in this ADR are the
intentional seam that makes the migration possible later.
- **Topology decomposition.** Whether `K8sBareTopology` /
`K8sAnywhereTopology` should themselves be capability sets is a
separate concern. Tracked as a working draft at
`docs/adr/drafts/topology-proliferation.md`.
## Consequences
If accepted:
- New deployable fleet components author their Scores against
capability traits in `harmony-fleet-domain`, not against
provider clients directly. Swapping NATS for a different
transport becomes a single-crate change.
- CRD types move out of operator code and into
`harmony-fleet-adapters-kube`. Operator depends on adapter
crate; runtime binary stays slim.
- `harmony` core has no fleet code. The framework's `modules/`
directory is reserved for general-purpose primitives (DNS,
K8s, Helm, NATS, PostgreSQL, …); domain-specific code lives
in its own crate tree.
- Future fleet adapters (a different transport, a different
identity provider) are additive: one new crate, no changes to
domain or deploy.
## References
- `ROADMAP/fleet_platform/architecture_review.md` §§4-5 —
comparison matrix and Alternative-B rationale from which this
ADR is extracted.
- `docs/adr/023-deploy-architecture.md` — companion ADR for the
deploy-side rules. This ADR is the domain-side companion.
- `docs/adr/022-fleet-agent-upgrade.md` — the agent-upgrade
procedure, which sits cleanly on top of the
`AgentLifecycle` capability proposed here.