The previous e2e harness handrolled k8s manifests in `stack.rs`,
bypassing the Score-Topology-Interpret machinery harmony exists to
provide. This commit:
1. **ADR-023** codifies the rules: deploy with Scores (not
manifests), e2e uses the same Scores as production, one Score
per component, deploy blocks on smoke-test success, deploy logic
lives in `*-deploy` crates, topologies are compile-time,
thiserror over anyhow. CLAUDE.md mirrors the principles.
2. **New `fleet/harmony-fleet-deploy` crate** is the canonical home
for fleet-component Scores:
- `FleetOperatorScore` + helm-chart generator + `install_crds`
moved out of `harmony::modules::fleet::operator` (they should
never have lived in `harmony` core). `FleetServerScore`
(composite of NATS + operator + Zitadel + callout) moved too.
- New `FleetNatsScore` (preset over `NatsHelmChartScore` with
fleet's required values; v1 supports `UserPass` auth, callout
mode reserved on the public API for PR 1.5).
- New `FleetAgentScore` with `FleetAgentTarget::Pod`; `Vm`
target is a future variant that absorbs `FleetDeviceSetupScore`.
- `harmony-fleet-deploy` binary built on the existing
`harmony_cli` crate — no new CLI scaffolding.
3. **Operator runtime binary trimmed**: `Install` and `Chart`
subcommands removed; both jobs now belong to
`harmony-fleet-deploy`. The runtime binary becomes leaner.
4. **E2E harness rewritten** as a thin Score composer:
`harmony-fleet-e2e/src/stack.rs` deploys the stack via
`FleetNatsScore` + `FleetAgentScore`. The inline NATS manifest
factory and the bespoke agent Pod renderer are gone.
- Bring-up runs once per test binary via `shared_stack` +
`tokio::sync::OnceCell` (matches the `fleet_e2e_demo` pattern).
- Stale `e2e-*` namespaces from prior runs get pruned at
startup so the leaks the OnceCell creates don't compound.
5. **`thiserror` for the agent's `CommandServer`** — replaces the
anyhow-based surface with typed `CommandError` /
`CommandServerError`.
6. **Memory** captures eight load-bearing principles (saved to
`~/.claude/projects/.../memory/`) so future sessions don't drift
back into manifest-handrolling.
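
The once-per-test-binary bring-up from item 4 can be sketched with the standard library alone. This is a minimal sketch, not the harness code: `std::sync::OnceLock` stands in for `tokio::sync::OnceCell`, and `Stack`, `bring_up`, `BRING_UPS`, and `stale_namespaces` are hypothetical names chosen for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts bring-ups so the once-only behavior is observable.
static BRING_UPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical handle to a deployed stack.
#[derive(Debug)]
struct Stack {
    namespace: String,
}

/// Expensive setup; a real harness would deploy Scores here.
fn bring_up() -> Stack {
    BRING_UPS.fetch_add(1, Ordering::SeqCst);
    Stack {
        namespace: "e2e-example".to_string(),
    }
}

/// Every test calls this; only the first call runs `bring_up`.
fn shared_stack() -> &'static Stack {
    static STACK: OnceLock<Stack> = OnceLock::new();
    STACK.get_or_init(bring_up)
}

/// Pure helper: which namespaces a startup prune would target.
/// Keeps the namespace in use and anything without the `e2e-` prefix.
fn stale_namespaces<'a>(all: &[&'a str], current: &str) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|ns| ns.starts_with("e2e-") && *ns != current)
        .collect()
}
```

A value stored in a `static` is never dropped, so no teardown runs when the test binary exits; that is why a prune at the next startup is needed.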
Verified: `cargo test -p harmony-fleet-e2e --test ping` green
end-to-end against k3d in 25s warm.
Architecture Decision Record: Deploy Architecture — Scores, Deploy Crates, and the E2E Contract
Initial Author: Jean-Gabriel Gill-Couture
Initial Date: 2026-05-18
Last Updated Date: 2026-05-18
Status
Accepted. Enforces the principles already documented in
CLAUDE.md (Score-Topology-Interpret); this ADR adds the deploy
side of the contract (what a deploy crate is, how e2e harnesses
relate to production deploys, how the CLI surface is shaped) and
the smoke-test-on-deploy semantics.
Context
The harmony codebase has drifted in three related ways that this ADR exists to stop:
- Test harnesses handrolling k8s manifests. Recent e2e harness work (`fleet/harmony-fleet-e2e/src/stack.rs`, since reverted in this PR) inlined `Deployment`/`Service`/`ConfigMap` structs via `k8s_openapi::api::*`. That's the YAML-mud-pit anti-pattern harmony exists to eliminate, dressed up in Rust. The fact that e2e bypassed Scores while production used them meant e2e and prod could diverge silently.
- Deploy logic scattered between `harmony` core and example crates. `FleetOperatorScore` lives in `harmony/src/modules/fleet/operator/`, but its real "how to apply this end-to-end" logic ended up duplicated across `examples/fleet_e2e_demo/src/lib.rs::deploy_operator` and the operator's own `Chart` subcommand. There's no single place a developer goes to deploy the fleet operator.
- `deploy` returning before convergence. Today `Outcome::SUCCESS` means "apply submitted", same as `helm install`. Users are left to figure out whether the result actually works. That's exactly the user-experience problem harmony was built to fix.
The pattern below is the cleanup.
Decision
Nine principles, grouped.
Deployment as Scores
- Deploy with Scores, not handrolled manifests. Capability traits + compile-time bounds are the contract. No `k8s_openapi::api::*` structs outside of `Score::interpret` bodies. Test harnesses, examples, and CLI helpers compose `*Score` types; they never reimplement deploys.
- E2E uses the same Scores as production. Only the `Topology` instance changes (local k3d, remote OKD, bare-metal HA, …). A test harness is a `Score`-composer running against a test Topology. If e2e needs something prod doesn't, add the knob to the Score; don't fork the manifest in the harness.
- One Score per deployable component. Composition is the user-facing primitive: `MyAppScore` pulls in `PostgresScore`, `HttpServerScore`, etc. Don't build monolithic "deploy everything" Scores. Each primitive Score must be independently testable and substitutable.
- Deploy returns only after smoke-test success. Every Score owns a readiness + smoke-test contract that the framework runs and blocks on. Convergence errors must be actionable, in the style of `rustc`'s error messages, not "exit code 1 from helm". The implementation of the smoke-test contract (separate trait? required Score method? companion struct?) is left open for a follow-up ADR; the principle is locked in.
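
The first, second, and fourth principles above could be sketched as follows. This is an illustration under stated assumptions, not harmony's actual API: the `Score` trait shape, the capability traits `HelmInstall` and `ReadinessProbe`, the `DeployError` variants, and both cluster types are hypothetical stand-ins. The smoke-test contract in particular is still open per the text, so a required method is only one of the candidate shapes.

```rust
use std::collections::HashSet;
use std::fmt;

/// Hypothetical capability: the topology can install a named release.
trait HelmInstall {
    fn install(&mut self, release: &str) -> Result<(), DeployError>;
}

/// Hypothetical capability: the topology can probe a release for readiness.
trait ReadinessProbe {
    fn is_ready(&self, release: &str) -> bool;
}

/// Typed, matchable deploy errors (illustrative variants).
#[derive(Debug, PartialEq)]
enum DeployError {
    ApplyRejected { release: String },
    SmokeTestFailed { release: String, hint: &'static str },
}

impl fmt::Display for DeployError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ApplyRejected { release } => write!(f, "apply rejected for `{release}`"),
            Self::SmokeTestFailed { release, hint } => {
                write!(f, "`{release}` was applied but is not ready; hint: {hint}")
            }
        }
    }
}

impl std::error::Error for DeployError {}

/// Illustrative Score shape: generic over the topology it runs against.
trait Score<T> {
    fn apply(&self, topology: &mut T) -> Result<(), DeployError>;
    fn smoke_test(&self, topology: &T) -> Result<(), DeployError>;
}

/// Returns `Ok` only once the smoke test has passed; "apply submitted" is not success.
fn deploy<T, S: Score<T>>(score: &S, topology: &mut T) -> Result<(), DeployError> {
    score.apply(topology)?;
    score.smoke_test(topology)
}

/// One Score per component; the same value is used in tests and production.
struct NatsScore {
    release: &'static str,
}

// The bounds are the contract: a topology lacking either capability
// is rejected at compile time rather than failing at deploy time.
impl<T: HelmInstall + ReadinessProbe> Score<T> for NatsScore {
    fn apply(&self, topology: &mut T) -> Result<(), DeployError> {
        topology.install(self.release)
    }

    fn smoke_test(&self, topology: &T) -> Result<(), DeployError> {
        if topology.is_ready(self.release) {
            Ok(())
        } else {
            Err(DeployError::SmokeTestFailed {
                release: self.release.to_string(),
                hint: "inspect the release's pod events",
            })
        }
    }
}

/// Stand-in for a local test cluster where installs become ready.
#[derive(Default)]
struct LocalCluster {
    installed: HashSet<String>,
}

impl HelmInstall for LocalCluster {
    fn install(&mut self, release: &str) -> Result<(), DeployError> {
        if release.is_empty() {
            return Err(DeployError::ApplyRejected { release: release.to_string() });
        }
        self.installed.insert(release.to_string());
        Ok(())
    }
}

impl ReadinessProbe for LocalCluster {
    fn is_ready(&self, release: &str) -> bool {
        self.installed.contains(release)
    }
}

/// Stand-in for a cluster that accepts the apply but never converges.
struct StuckCluster;

impl HelmInstall for StuckCluster {
    fn install(&mut self, _release: &str) -> Result<(), DeployError> {
        Ok(())
    }
}

impl ReadinessProbe for StuckCluster {
    fn is_ready(&self, _release: &str) -> bool {
        false
    }
}
```

In this sketch the same `NatsScore` value runs unchanged against both clusters. Against `StuckCluster` the apply is accepted, yet `deploy` still returns `SmokeTestFailed`, which is the difference from the "apply submitted" semantics described in Context.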
Where deploy logic lives
- Deploy logic lives in a `*-deploy` crate that depends on both `harmony` and the runtime crate. Runtime binaries (the thing that ships to constrained devices and to in-cluster pods) stay free of the `harmony` dep. Pattern already established by `harmony_agent/deploy`.

  For the fleet stack specifically: one `fleet/harmony-fleet-deploy` crate holds every fleet-component Score (`FleetOperatorScore`, `FleetAgentScore`, `FleetNatsScore`, `FleetCalloutScore`). The same crate is consumed by:

  - production CLI (`harmony-fleet-deploy <component> --topology <name>`)
  - the e2e harness (composes the Scores against a k3d Topology)
  - whatever future control-plane / web tool drives deploys

  Fleet Scores that currently live in `harmony/src/modules/fleet/` are migrated into `harmony-fleet-deploy`; they should never have been in harmony core. This is a one-shot move done in the PR that introduces the deploy crate.
Topology selection
- Topologies are compile-time, selected at runtime. A deploy binary statically lists its supported topologies; the user picks one at deploy time. Adding a brand-new topology backend (AWS, GCP) is a rebuild: an acceptable cost, because dynamic-discovery topologies like `K8sAnywhere` already cover "any physical place that runs k8s". No `Box<dyn Topology>` plugin loaders.
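
One way to read "compile-time, selected at runtime" is a closed enum parsed from the `--topology` argument, with each match arm dispatching to a statically known type. The sketch below assumes hypothetical names throughout (`SupportedTopology`, `LocalK3d`, `RemoteOkd`, the `Topology` trait shape, and the `local-k3d` / `remote-okd` strings); it does not reflect the real `harmony_cli` surface.

```rust
use std::str::FromStr;

/// Hypothetical topology backends compiled into one deploy binary.
#[derive(Debug, PartialEq)]
enum SupportedTopology {
    LocalK3d,
    RemoteOkd,
}

impl SupportedTopology {
    /// The static list a `--topology` flag could offer.
    const ALL: [&'static str; 2] = ["local-k3d", "remote-okd"];
}

/// Returned when the user names a topology this binary was not built with.
#[derive(Debug, PartialEq)]
struct UnknownTopology(String);

impl FromStr for SupportedTopology {
    type Err = UnknownTopology;

    fn from_str(name: &str) -> Result<Self, Self::Err> {
        match name {
            "local-k3d" => Ok(Self::LocalK3d),
            "remote-okd" => Ok(Self::RemoteOkd),
            other => Err(UnknownTopology(other.to_string())),
        }
    }
}

/// Minimal stand-in for a topology trait.
trait Topology {
    fn describe(&self) -> String;
}

struct LocalK3d;
struct RemoteOkd;

impl Topology for LocalK3d {
    fn describe(&self) -> String {
        "local k3d cluster".to_string()
    }
}

impl Topology for RemoteOkd {
    fn describe(&self) -> String {
        "remote OKD cluster".to_string()
    }
}

/// Generic over `T`: monomorphized per topology, no `Box<dyn Topology>`.
fn deploy_to<T: Topology>(topology: &T, component: &str) -> String {
    format!("deploying {component} to {}", topology.describe())
}

/// Runtime selection; every arm calls `deploy_to` with a concrete type.
fn run(topology_name: &str, component: &str) -> Result<String, UnknownTopology> {
    match topology_name.parse::<SupportedTopology>()? {
        SupportedTopology::LocalK3d => Ok(deploy_to(&LocalK3d, component)),
        SupportedTopology::RemoteOkd => Ok(deploy_to(&RemoteOkd, component)),
    }
}
```

Adding a backend here means adding an enum variant and a match arm, so the compiler flags every place that must handle it; that is the rebuild cost the principle accepts.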
Framework evolution
- Extend Scores with companions, not API changes. New capabilities the framework wants to attach to Scores (planning, dry-run, observability, eventually smoke-test) default to a companion type or trait that wraps a Score rather than a new method on `Score`/`Interpret`. The base public API stays simple. The exception is principles every Score must honor (which may force a required method), but only after the principle has been validated in practice via the companion-first iteration.
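
A companion can be as small as a wrapper struct. The following is a minimal sketch, assuming a deliberately simplified `Score` trait and a hypothetical `DryRun` companion; neither is taken from the harmony codebase.

```rust
/// Minimal stand-in for the base Score API (hypothetical shape).
trait Score {
    fn name(&self) -> String;
    fn apply(&self, log: &mut Vec<String>);
}

struct NatsScore;

impl Score for NatsScore {
    fn name(&self) -> String {
        "nats".to_string()
    }

    fn apply(&self, log: &mut Vec<String>) {
        log.push("applied nats".to_string());
    }
}

/// Companion type: adds dry-run behavior by wrapping any Score,
/// without adding a method to the `Score` trait itself.
struct DryRun<S: Score>(S);

impl<S: Score> DryRun<S> {
    /// Describes what would happen without calling `apply`.
    fn plan(&self) -> String {
        format!("would apply {}", self.0.name())
    }

    /// Hands the wrapped Score back for a real deploy.
    fn into_inner(self) -> S {
        self.0
    }
}
```

Existing Score implementations compile unchanged when such a companion is introduced, which is the property the principle is after.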
CLI
- CLI: hybrid, staged. Today (B): first-party tools ship as separate `harmony-*` binaries built on the existing `harmony_cli` crate. Improve that surface. Tomorrow (C): a top-level `harmony` binary discovers `harmony-*` plugin binaries on `$PATH` (kubectl-style) so a third-party `MyAppScore` author gets `harmony deploy my-app` for free. The plugin protocol is not in scope for any current PR; it's a dedicated future effort.
Error handling
- thiserror almost everywhere; anyhow only at binary glue. Library code, public crate boundaries, anything callers might want to match on: typed errors via `thiserror`. `anyhow` is reserved for `main.rs`-level glue where the error is just printed. This was the second drift this PR uncovered.
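
The split can be illustrated with the standard library only. In this sketch the `Display` and `Error` impls are written by hand (roughly what a `thiserror` derive would generate), and `Box<dyn Error>` stands in for `anyhow::Error` at the binary edge. `LookupError`, `lookup`, `lookup_or_default`, and `glue` are hypothetical names, unrelated to the agent's `CommandError`.

```rust
use std::error::Error;
use std::fmt;

/// Library-side error: callers can match on the variants.
#[derive(Debug, PartialEq)]
enum LookupError {
    UnknownCommand(String),
    Timeout { millis: u64 },
}

impl fmt::Display for LookupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnknownCommand(name) => write!(f, "unknown command `{name}`"),
            Self::Timeout { millis } => write!(f, "timed out after {millis} ms"),
        }
    }
}

impl Error for LookupError {}

/// Library code returns the typed error.
fn lookup(command: &str) -> Result<&'static str, LookupError> {
    match command {
        "ping" => Ok("pong"),
        "slow" => Err(LookupError::Timeout { millis: 500 }),
        other => Err(LookupError::UnknownCommand(other.to_string())),
    }
}

/// A caller that recovers from exactly one variant and forwards the rest.
/// This is only possible because the error is typed.
fn lookup_or_default(command: &str) -> Result<&'static str, LookupError> {
    match lookup(command) {
        Err(LookupError::UnknownCommand(_)) => Ok("noop"),
        other => other,
    }
}

/// Binary-level glue: the error is only printed, so an opaque boxed
/// error is enough here.
fn glue(command: &str) -> Result<String, Box<dyn Error>> {
    let reply = lookup(command)?;
    Ok(format!("reply: {reply}"))
}
```

`lookup_or_default` is the kind of caller the principle protects: with an opaque error it would have to inspect the message string instead of matching on a variant.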
Out of scope (deferred, not rejected)
- Score derive macro / deployment DSL. Strategic intent from day one; the framework's value-add concentrates here. Separate design effort.
- Score registry (Crichton-style: https://willcrichton.net/rust-api-type-patterns/registries.html). Real itch — examples and Scores are hard to discover today. Research + ADR first.
- Inventory as capability-defined physical assets. Inventory is massively under-engineered today; the original idea is to represent physical infrastructure (building → cable → switch port → MAC) but most use cases ignore it. Decomposing inventory into a capability set is a deep redesign.
- Plug-in CLI discovery layer (C above). Roadmap item; explicitly named as the fix for the "too many disconnected CLIs" cohesion problem.
- Application features ↔ capabilities relationship. In-progress concept the project lead is personally unsure about. Don't try to resolve in this ADR.
- Concrete smoke-test contract shape. See principle 4: principle locked, implementation deferred. Today's e2e test suite plays the role of the smoke test until the trait/struct shape is decided.
Consequences
- The current `harmony-fleet-e2e/src/stack.rs` (introduced in this PR) is wrong and gets rewritten as a Score composer with no inline k8s manifests.
- `harmony::modules::fleet::operator::*` and any other fleet-deploy modules in `harmony` core move into `fleet/harmony-fleet-deploy`. Callers (`examples/fleet_e2e_demo`, the operator binary itself) get updated paths.
- New Scores (`FleetAgentScore`, `FleetNatsScore`) land in `fleet/harmony-fleet-deploy`. The agent crate gains nothing; it stays a lean runtime binary.
- The deploy crate gets its own `main.rs` driven by `harmony_cli`, exposing one subcommand per component plus an `all` composite.
- Future work (smoke-test contract, Score derive macro, registry, CLI discovery) gets dedicated ADRs/PRs and does not sneak into unrelated work.
References
- `CLAUDE.md`: Score-Topology-Interpret pattern, capability design rules.
- `docs/adr/002-hexagonal-architecture.md`: domain/adapter split this builds on.
- `docs/adr/005-rust-dsl-over-yaml.md`: the original "no YAML-mud-pit" call.
- `harmony_agent/deploy`: existing `*-deploy` crate pattern.
- `fleet/PLAN_requests_over_nats.md`: the working plan for the request/reply work this ADR landed during.