Architecture Decision Record: Fleet Platform — Capability Decomposition

Initial Author: Jean-Gabriel Gill-Couture (with research by Claude)

Initial Date: 2026-05-20

Last Updated Date: 2026-05-20

Status

Draft — under review. Captures the proposed shape for review; not yet locked. If accepted, supersedes the as-built layout of harmony/src/modules/fleet/ documented in ADR-023's first revision.

Context

The fleet platform shipped under feat/iot-walking-skeleton spans three concerns that today share two locations:

  1. Domain logic — what a FleetDevice is, what a FleetDeployment looks like, what the reconciler-contracts wire types mean.
  2. Adapters — concrete NATS, Zitadel, Kubernetes, Helm integrations that drive the domain.
  3. Deploy procedures — how to bring up the operator, agent, NATS, Zitadel as Scores against a Topology.

Today these live in harmony/src/modules/fleet/ (mixed), the harmony-reconciler-contracts crate (wire types only), the harmony-fleet-deploy crate (Scores for deploy), and the harmony-fleet-operator/harmony-fleet-agent binaries (runtime). The boundary between domain and adapter is not enforced at the type level: harmony/src/modules/fleet/setup_score.rs, for example, reaches into Zitadel, NATS, Kube, and Helm directly. Anyone wanting to swap NATS for a different transport would have to touch every fleet file.

ADR-023 already addressed the deploy-side of this (deploy Scores live in *-deploy crates, not in harmony core). This ADR proposes the domain-side decomposition: pull a thin fleet-domain crate above the existing reconciler-contracts, push provider-specific code into adapter crates, and re-direct the deploy crate to consume the domain rather than the framework primitives directly.

Decision (proposed)

Five crates, layered by dependency direction:

harmony-reconciler-contracts      (existing — wire types only)
        ▲
        │
harmony-fleet-domain               (new — domain records + capability traits)
        ▲
        │
harmony-fleet-adapters-*           (new — one crate per provider)
        ▲                          (nats, zitadel, kube)
        │
harmony-fleet-deploy               (existing — bring-up Scores)
harmony-fleet-operator             (existing — daemon)
harmony-fleet-agent                (existing — daemon)

harmony-fleet-domain

The domain crate. Depends only on harmony-reconciler-contracts and harmony_types. Holds:

  • Domain records: FleetDevice, FleetDeployment, FleetState, EnrollmentIntent, DeviceCredential.
  • Capability traits: DeviceRegistry, DesiredStatePublisher, ObservedStateConsumer, IdentityProvider, AgentLifecycle. These are the seam between domain logic and provider-specific implementations.
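As an illustration of what a capability trait in this crate could look like, here is a minimal sketch. It is not the proposed API: the fields of FleetDevice, the FleetError type, the method names, and the InMemoryDeviceRegistry test double are all assumptions made for the example, and the traits are kept synchronous for brevity where the real ones may well be async.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified domain record; the ADR does not specify the real fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDevice {
    pub id: String,
    pub labels: Vec<String>,
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum FleetError {
    NotFound(String),
}

/// Capability trait: the domain only knows that devices can be stored and looked up.
/// Nothing here names NATS, Zitadel, or Kubernetes.
pub trait DeviceRegistry {
    fn register(&mut self, device: FleetDevice) -> Result<(), FleetError>;
    fn get(&self, id: &str) -> Result<FleetDevice, FleetError>;
}

/// In-memory implementation (an assumption of this sketch), handy for testing
/// domain logic without any provider running.
#[derive(Default)]
pub struct InMemoryDeviceRegistry {
    devices: HashMap<String, FleetDevice>,
}

impl DeviceRegistry for InMemoryDeviceRegistry {
    fn register(&mut self, device: FleetDevice) -> Result<(), FleetError> {
        self.devices.insert(device.id.clone(), device);
        Ok(())
    }

    fn get(&self, id: &str) -> Result<FleetDevice, FleetError> {
        self.devices
            .get(id)
            .cloned()
            .ok_or_else(|| FleetError::NotFound(id.to_string()))
    }
}

/// Domain logic written purely against the trait, so it compiles without any adapter crate.
pub fn enroll<R: DeviceRegistry>(registry: &mut R, id: &str) -> Result<FleetDevice, FleetError> {
    registry.register(FleetDevice { id: id.to_string(), labels: vec![] })?;
    registry.get(id)
}
```

Because enroll is generic over DeviceRegistry, the same function would accept a NATS-backed registry from an adapter crate without modification.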

harmony-fleet-adapters-nats, -zitadel, -kube

One crate per provider. Each implements the capability traits above for its specific backend:

  • nats: NatsDeviceRegistry, NatsDesiredStatePublisher, NatsObservedStateConsumer.
  • zitadel: ZitadelIdentityProvider, machine-user provisioning, JWT-bearer minting.
  • kube: KubeFleetReflector writes Device and Deployment CRDs as a reflection of domain state, not as the source of truth. CRD types move here from harmony-fleet-operator.
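The "reflection, not source of truth" rule for the kube adapter can be made concrete with a one-way projection. The sketch below is only an illustration: DeviceResource stands in for the Device CRD (whose real schema is not given here), the FleetDevice fields are invented, and the reflector records what it would apply instead of calling a Kubernetes client.

```rust
/// Hypothetical domain record; in this sketch it is the source of truth.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FleetDevice {
    pub id: String,
    pub online: bool,
}

/// Hypothetical stand-in for the Device CRD. The real CRD schema is not specified in the ADR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeviceResource {
    pub name: String,
    pub phase: String,
}

/// One-way projection from domain state to the CRD shape.
/// No reverse conversion is defined, so the CRD cannot feed back into the domain.
pub fn reflect(device: &FleetDevice) -> DeviceResource {
    DeviceResource {
        // Lowercased here because Kubernetes object names must be lowercase.
        name: device.id.to_lowercase(),
        phase: if device.online { "Online" } else { "Offline" }.to_string(),
    }
}

/// Hypothetical reflector: records what it would apply rather than talking to a cluster.
#[derive(Default)]
pub struct KubeFleetReflector {
    pub applied: Vec<DeviceResource>,
}

impl KubeFleetReflector {
    /// Replaces the reflected set with a fresh projection of the current domain state.
    pub fn reflect_all(&mut self, devices: &[FleetDevice]) {
        self.applied = devices.iter().map(reflect).collect();
    }
}
```

Rebuilding the reflected set from domain state on every pass means a CRD edited by hand in the cluster would simply be overwritten, which is the behaviour the bullet describes.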

harmony-fleet-deploy

Stays as the home for FleetOperatorScore, FleetAgentScore, FleetNatsScore, FleetCalloutScore. Updates: imports harmony-fleet-domain for types, uses harmony-fleet-adapters-* to compose Scores against capability traits rather than reaching directly into NATS/Zitadel client crates.
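To show what "composing against capability traits" might mean in practice, here is a rough sketch. AgentBringUp is a hypothetical score-like struct, not the real FleetAgentScore and not harmony's Score trait; the trait methods and the two test doubles are likewise assumptions made so the example is self-contained.

```rust
/// Hypothetical capability traits, synchronous for brevity.
pub trait IdentityProvider {
    fn mint_device_credential(&self, device_id: &str) -> String;
}

pub trait DesiredStatePublisher {
    fn publish(&mut self, device_id: &str, payload: &str);
}

/// Hypothetical score-like struct. It holds capabilities, not NATS or Zitadel clients.
pub struct AgentBringUp<'a, I: IdentityProvider, P: DesiredStatePublisher> {
    pub identity: &'a I,
    pub publisher: &'a mut P,
}

impl<'a, I: IdentityProvider, P: DesiredStatePublisher> AgentBringUp<'a, I, P> {
    /// Mints a credential for the device, announces the desired state, returns the credential.
    pub fn run(&mut self, device_id: &str) -> String {
        let credential = self.identity.mint_device_credential(device_id);
        self.publisher.publish(device_id, "agent: installed");
        credential
    }
}

/// Test double standing in for a Zitadel-backed adapter.
pub struct StaticIdentity;

impl IdentityProvider for StaticIdentity {
    fn mint_device_credential(&self, device_id: &str) -> String {
        format!("token-for-{device_id}")
    }
}

/// Test double standing in for a NATS-backed adapter; it only records what was sent.
#[derive(Default)]
pub struct RecordingPublisher {
    pub sent: Vec<(String, String)>,
}

impl DesiredStatePublisher for RecordingPublisher {
    fn publish(&mut self, device_id: &str, payload: &str) {
        self.sent.push((device_id.to_string(), payload.to_string()));
    }
}
```

In a real deploy crate the concrete types would come from harmony-fleet-adapters-*, chosen at the composition site, while the bring-up logic itself would only name the traits.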

Direction of dependency

The fleet domain doesn't depend on the framework. The framework's deploy procedures depend on the fleet's domain. This inverts today's direction, in which harmony::modules::fleet imports from harmony_secret, harmony_zitadel_auth, NATS client crates, kube client crates, etc.

After this ADR is implemented, harmony::modules::fleet disappears entirely. harmony core stays focused on framework primitives.

Open questions

These are the decision points pending review — flagged so the review has concrete pivots:

  • Q1. Is IdentityProvider the right capability name, or should we name the two distinct concerns separately (DeviceCredentialMinter, OperatorTokenProvider)? CLAUDE.md rule says "if reality has two distinct concerns, two traits."
  • Q2. Should the Device CRD exist at all, or should the agent publish to a kube Node (per the alternative-D direction)? Today's mid-ground (own CRD that mirrors Node) is arguably the worst of both worlds.
  • Q3. Where does ReconcileScore's adjacently-tagged enum live? It's the canonical wire seam between operator and agent. Should sit in harmony-reconciler-contracts (so both binaries import only that crate); confirm before the move.
  • Q4. Does this redesign block the v0.1 production push, or does it land in v0.2 alongside the agent-upgrade work (ADR-022)? Public API churn after a customer is on it is more expensive than a 3-day delay before they are. Recommendation: ship the redesign first.
  • Q5. Where do runtime tools (the harmony-fleet CLI plugin, the operator's frontend) sit in the dependency graph? If they depend on harmony-fleet-domain only, they build without pulling in helm/kube/ansible at compile time — which is also the right shape for the device-side enrollment binary (currently feature-gated).
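To make the trade-off in Q1 easier to review, the two shapes can be put side by side. This is a sketch under assumptions: the method names, the FakeZitadel adapter, and the two consumer functions are invented for illustration and are not taken from the codebase.

```rust
/// Option A: one trait covering both concerns. Every consumer sees both methods.
pub trait IdentityProvider {
    fn mint_device_credential(&self, device_id: &str) -> String;
    fn operator_token(&self) -> String;
}

/// Option B: two narrow traits, one per concern.
pub trait DeviceCredentialMinter {
    fn mint_device_credential(&self, device_id: &str) -> String;
}

pub trait OperatorTokenProvider {
    fn operator_token(&self) -> String;
}

/// Hypothetical adapter. A single backend can still implement both narrow traits.
pub struct FakeZitadel;

impl DeviceCredentialMinter for FakeZitadel {
    fn mint_device_credential(&self, device_id: &str) -> String {
        format!("device-cred:{device_id}")
    }
}

impl OperatorTokenProvider for FakeZitadel {
    fn operator_token(&self) -> String {
        "operator-token".to_string()
    }
}

/// The enrollment path needs only minting, so it asks for only that capability.
pub fn enroll_device(minter: &impl DeviceCredentialMinter, device_id: &str) -> String {
    minter.mint_device_credential(device_id)
}

/// The operator path needs only its own token.
pub fn operator_auth_header(provider: &impl OperatorTokenProvider) -> String {
    format!("Bearer {}", provider.operator_token())
}
```

Under Option B, a device-side binary that calls only enroll_device never needs an implementation of operator_token, which may matter for the slim enrollment build mentioned in Q5.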

Out of scope

  • Alternative D (kube-native devices). A future v2.0 destination, not v0.1 or v0.2 work. Captured as the long-term direction; the capability traits in this ADR are the intentional seam that makes the migration possible later.
  • Topology decomposition. Whether K8sBareTopology / K8sAnywhereTopology should themselves be capability sets is a separate concern. Tracked as a working draft at docs/adr/drafts/topology-proliferation.md.

Consequences

If accepted:

  • New deployable fleet components author their Scores against capability traits in harmony-fleet-domain, not against provider clients directly. Swapping NATS for a different transport becomes a single-crate change.
  • CRD types move out of operator code and into harmony-fleet-adapters-kube. Operator depends on adapter crate; runtime binary stays slim.
  • harmony core has no fleet code. The framework's modules/ directory is reserved for general-purpose primitives (DNS, K8s, Helm, NATS, PostgreSQL, …); domain-specific code lives in its own crate tree.
  • Future fleet adapters (a different transport, a different identity provider) are additive: one new crate, no changes to domain or deploy.
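The "single-crate change" consequence can be sketched as follows. Both publishers are hypothetical stand-ins that format a message and log it. The subject and topic naming is invented, and the second transport is just a placeholder for "a different transport", not a proposal.

```rust
/// Hypothetical capability trait, synchronous for brevity.
pub trait DesiredStatePublisher {
    fn publish(&mut self, device_id: &str, desired: &str);
}

/// Stand-in for a NATS-backed adapter; the subject layout is an assumption.
#[derive(Default)]
pub struct NatsLikePublisher {
    pub log: Vec<String>,
}

impl DesiredStatePublisher for NatsLikePublisher {
    fn publish(&mut self, device_id: &str, desired: &str) {
        self.log.push(format!("fleet.desired.{device_id} <- {desired}"));
    }
}

/// Stand-in for some other transport with slash-separated topics.
#[derive(Default)]
pub struct OtherTransportPublisher {
    pub log: Vec<String>,
}

impl DesiredStatePublisher for OtherTransportPublisher {
    fn publish(&mut self, device_id: &str, desired: &str) {
        self.log.push(format!("fleet/desired/{device_id} <- {desired}"));
    }
}

/// Domain-side rollout logic. It names only the trait, so it is identical for both transports.
/// Returns the number of devices the desired state was published to.
pub fn roll_out(publisher: &mut dyn DesiredStatePublisher, devices: &[&str], desired: &str) -> usize {
    for device_id in devices {
        publisher.publish(device_id, desired);
    }
    devices.len()
}
```

Switching transports here changes only which struct is constructed at the call site; roll_out itself is untouched, which is the additive property the bullets above claim.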

References

  • ROADMAP/fleet_platform/architecture_review.md §§4-5 — comparison matrix and Alternative-B rationale from which this ADR is extracted.
  • docs/adr/023-deploy-architecture.md — companion ADR for the deploy-side rules. This ADR is the domain-side companion.
  • docs/adr/022-fleet-agent-upgrade.md — the agent-upgrade procedure, which sits cleanly on top of the AgentLifecycle capability proposed here.