harmony/fleet
Jean-Gabriel Gill-Couture bc2edf4530 feat(podman): init containers with k8s-style run-to-completion semantics
Customer apps frequently need a one-shot setup step (DB migration,
config render, cache warm-up) to succeed before the long-running
service starts. Without init containers each customer either inlines
the step into the service entrypoint (slow, racy, no failure surface)
or bolts on a sidecar that the platform can't introspect. This change
adds k8s-style init containers at the score layer so the contract is
the same one the customer already knows.

Score:
- New `InitContainer { name, image, args, env, volumes, timeout }`
  in `harmony::modules::podman`.
- `PodmanV0Score.init_containers: Vec<InitContainer>` with
  `#[serde(default)]` — pre-init-container wire payloads parse as an
  empty vec and behave unchanged.
- `DEFAULT_INIT_CONTAINER_TIMEOUT = 300s`; timeout serializes as
  whole seconds for operator readability.
- Idempotency is the customer's contract — documented at module
  level: init containers re-run on every reconcile that needs a
  fresh main container set.
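
A minimal sketch of the default-timeout hydration described above, kept dependency-free (no serde). The `InitContainerWire` type, the `timeout_secs` field name and the `timeout_to_wire` helper are illustrative assumptions, not names from this change; only `InitContainer` and `DEFAULT_INIT_CONTAINER_TIMEOUT` come from the commit, and the real struct has more fields.

```rust
use std::time::Duration;

/// Default named in the commit message: 300 seconds.
const DEFAULT_INIT_CONTAINER_TIMEOUT: Duration = Duration::from_secs(300);

/// Hypothetical wire-level view: the timeout travels as whole seconds
/// and may be absent from the payload.
#[derive(Debug, Clone, PartialEq)]
struct InitContainerWire {
    name: String,
    image: String,
    timeout_secs: Option<u64>,
}

/// Simplified hydrated view with a concrete timeout (args, env and
/// volumes are left out of this sketch).
#[derive(Debug, Clone, PartialEq)]
struct InitContainer {
    name: String,
    image: String,
    timeout: Duration,
}

impl From<InitContainerWire> for InitContainer {
    fn from(wire: InitContainerWire) -> Self {
        InitContainer {
            name: wire.name,
            image: wire.image,
            // A missing timeout falls back to the default.
            timeout: wire
                .timeout_secs
                .map(Duration::from_secs)
                .unwrap_or(DEFAULT_INIT_CONTAINER_TIMEOUT),
        }
    }
}

/// Going back to the wire drops any sub-second part.
fn timeout_to_wire(timeout: Duration) -> u64 {
    timeout.as_secs()
}
```

Under these assumptions a payload without a timeout hydrates to 300 s, and a 45.9 s timeout is written back as 45.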

Runtime contract:
- `ContainerRuntime::run_to_completion(spec, timeout) -> RunOutcome`
  added to the domain trait. `RunOutcome::Exited { exit_code }`
  vs `TimedOut { waited }` — distinct arms because the caller's
  failure path is different (operator gets the exit code for
  actionable diagnosis).
- Init containers are NOT surfaced via `list_managed_services`;
  they're removed after they exit so the host's managed-container
  surface stays bounded to long-running services.
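
A sketch of why the two arms are kept distinct: each one leads to a different operator-facing diagnosis. The variant and field names follow the bullet above; the `i32` exit-code type and the `describe_failure` helper are assumptions made for illustration, and the real failure type is `InterpretError`, whose shape is not shown here.

```rust
use std::time::Duration;

/// The two outcomes named in the runtime contract.
#[derive(Debug, Clone, PartialEq)]
enum RunOutcome {
    Exited { exit_code: i32 },
    TimedOut { waited: Duration },
}

/// Hypothetical helper: returns None on success, otherwise a message
/// that carries the container name and the cause.
fn describe_failure(container: &str, outcome: &RunOutcome) -> Option<String> {
    match outcome {
        // Exit code 0 is the only success case.
        RunOutcome::Exited { exit_code: 0 } => None,
        // Non-zero exit: the code itself is the actionable detail.
        RunOutcome::Exited { exit_code } => Some(format!(
            "init container '{}' exited with code {}",
            container, exit_code
        )),
        // Timeout: there is no exit code, only how long we waited.
        RunOutcome::TimedOut { waited } => Some(format!(
            "init container '{}' timed out after {}s",
            container,
            waited.as_secs()
        )),
    }
}
```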

PodmanTopology implementation:
- Pre-remove any prior container with the same name (retry-safe).
- Restart policy forced to `No` — a retrying init defeats the
  run-to-completion contract.
- `tokio::time::timeout` around `podman wait`; force-remove + return
  `TimedOut` on deadline.
- Single 200ms retry on inspect for the libpod race where state can
  briefly read `running` between `wait` returning and conmon writing
  the exit code.
- `INIT_CONTAINER_LABEL` on every init container so operators can
  `podman ps -a --filter label=...` to spot init failures.
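
The single-retry inspect can be sketched as below. This is an illustration of the pattern, not the code in this change: `InspectState` and `exit_code_with_single_retry` are hypothetical names, the inspect call is passed in as a closure, and a blocking `sleep` stands in for whatever async delay the real tokio-based code uses.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Simplified container state as an inspect call might report it.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InspectState {
    Running,
    Exited(i32),
}

/// Reads the exit code, retrying exactly once after `delay` if the
/// first read still says `Running`. Returns None if the second read
/// is still `Running`.
fn exit_code_with_single_retry<F>(mut inspect: F, delay: Duration) -> Option<i32>
where
    F: FnMut() -> InspectState,
{
    if let InspectState::Exited(code) = inspect() {
        return Some(code);
    }
    // Give the runtime a moment to record the exit code.
    sleep(delay);
    match inspect() {
        InspectState::Exited(code) => Some(code),
        InspectState::Running => None,
    }
}
```

With the delay from the bullet above, a caller would pass `Duration::from_millis(200)`.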

Interpret:
- Init containers run sequentially before any service. Non-zero exit
  or timeout fails the deployment with a typed `InterpretError`
  carrying the container name + cause.
- Success message reports both counts.
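
The ordering rule can be sketched as follows. `InitError` and `deploy` are hypothetical stand-ins (the real error is `InterpretError`), and the runtime calls are passed in as closures so the sketch stays self-contained.

```rust
/// Hypothetical stand-in for the typed error: container name + cause.
#[derive(Debug, PartialEq)]
struct InitError {
    container: String,
    cause: String,
}

/// Runs every init container in declaration order, then starts every
/// service. The first init failure aborts the deployment, so no
/// service starts after a failed init. Returns both counts on success.
fn deploy<I, S>(
    init_containers: &[&str],
    services: &[&str],
    mut run_init: I,
    mut start_service: S,
) -> Result<(usize, usize), InitError>
where
    I: FnMut(&str) -> Result<(), String>,
    S: FnMut(&str),
{
    for &name in init_containers {
        run_init(name).map_err(|cause| InitError {
            container: name.to_string(),
            cause,
        })?;
    }
    for &name in services {
        start_service(name);
    }
    Ok((init_containers.len(), services.len()))
}
```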

Tests (in tree):
- 3 new wire-format tests in `podman::score`: roundtrip, default
  timeout hydration, ordering preservation.
- All 10 existing podman::score tests still pass; legacy roundtrip
  test now also asserts `init_containers.is_empty()` as a wire-compat
  canary.

Call-site updates (5 sites) — all existing constructors of
`PodmanV0Score` add `init_containers: vec![]`: harmony_apply_deployment
example, fleet_load_test example, operator e2e, vm_deploy_lifecycle
e2e, vm_isolation e2e.

Deferred: per-version "run-once" semantics (customer can build with a
marker file today); the agent-side handler for surfacing init logs to
the operator dashboard (covered by the logs companion PR's deferred
work).
2026-05-24 21:56:39 -04:00

Harmony Fleet

IoT / decentralized-edge orchestration for harmony. A fleet stack is:

| Component | Crate | Role |
| --- | --- | --- |
| Operator | harmony-fleet-operator | Watches Deployment CRs, writes desired state into NATS JetStream KV, aggregates device state back into CR status. Runtime binary; no harmony dep. |
| Agent | harmony-fleet-agent | One per device. Watches the desired-state KV, drives the local runtime (podman today), publishes heartbeats + per-deployment state, answers device-commands.* request/reply. |
| Auth | harmony-fleet-auth | Shared NATS credential plumbing: TomlShared (dev) and ZitadelJwt (prod with auth-callout). |
| Deploy | harmony-fleet-deploy | The canonical deploy crate. Imports harmony and exposes one *Score per component (FleetOperatorScore, FleetAgentScore, FleetNatsScore, FleetServerScore). Both the production CLI and the e2e harness compose these; see ADR-023. |
| E2E harness | harmony-fleet-e2e | Brings the stack up in a fresh k3d namespace and runs integration tests against it. |

The on-the-wire types both ends agree on (KV bucket names, key formats, command-protocol payloads) live in ../harmony-reconciler-contracts.

Architecture in one line

FleetOperatorScore, FleetAgentScore, etc. are real Rust types with capability-bound Topology parameters. Production deploys, the e2e harness, and any future control-plane tool all compose the same Scores; the only thing that changes is the Topology instance. No handrolled YAML or imperative manifest factories anywhere. Read ADR-023 before adding deploy logic.
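
As a rough illustration of "capability-bound Topology parameters", the sketch below shows one Score type applied to two different topologies. Every name in it (`Topology`, `HelmCapable`, `Score`, `NatsScore`, `LocalK3d`, `RemoteCluster`) is a simplified stand-in invented for this example; the real harmony traits are richer and are described in ADR-023.

```rust
/// Minimal topology identity (hypothetical).
trait Topology {
    fn name(&self) -> String;
}

/// A capability a topology may offer (hypothetical).
trait HelmCapable {
    fn install_chart(&self, chart: &str, namespace: &str) -> String;
}

/// A Score is generic over the topology it is applied to.
trait Score<T: Topology> {
    fn apply(&self, topology: &T) -> String;
}

struct NatsScore {
    namespace: String,
}

/// The bound says what the Score needs: any topology that can install
/// charts will do, and a topology that cannot will not compile.
impl<T: Topology + HelmCapable> Score<T> for NatsScore {
    fn apply(&self, topology: &T) -> String {
        format!(
            "[{}] {}",
            topology.name(),
            topology.install_chart("nats", &self.namespace)
        )
    }
}

struct LocalK3d;
struct RemoteCluster {
    context: String,
}

impl Topology for LocalK3d {
    fn name(&self) -> String {
        "k3d".to_string()
    }
}
impl HelmCapable for LocalK3d {
    fn install_chart(&self, chart: &str, namespace: &str) -> String {
        format!("install chart '{}' into namespace '{}'", chart, namespace)
    }
}

impl Topology for RemoteCluster {
    fn name(&self) -> String {
        format!("remote:{}", self.context)
    }
}
impl HelmCapable for RemoteCluster {
    fn install_chart(&self, chart: &str, namespace: &str) -> String {
        format!(
            "install chart '{}' into namespace '{}' via context '{}'",
            chart, namespace, self.context
        )
    }
}
```

The same `NatsScore` value can be applied to `LocalK3d` in a test harness and to `RemoteCluster` in a production deploy; only the topology argument differs.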


Quickstart — run the e2e ping test

The fastest path to a green fleet stack on your laptop. Requires podman, kubectl, and helm on $PATH; everything else (k3d, the NATS chart, all images) is fetched / built on demand.

HARMONY_FLEET_E2E=1 cargo test -p harmony-fleet-e2e --test ping -- --nocapture

What it does, in order:

  1. Ensures a fleet-e2e k3d cluster exists (creates one if not). NodePort 30423 on the host forwards to NATS inside the cluster.
  2. Builds harmony-fleet-agent in release mode, packages it into localhost/harmony-fleet-agent:e2e, and sideloads the image into the k3d cluster's containerd store.
  3. Mints a per-bring-up namespace e2e-<uuid8> and prunes any leftover e2e-* namespaces from prior runs (NodePort 30423 is cluster-scoped, so a stuck Terminating namespace would block the new bring-up — the prune waits up to 90 s for full cleanup before proceeding).
  4. Deploys NATS via FleetNatsScore (helm chart, JetStream on, static admin/device users, NodePort Service).
  5. Waits for NATS to be reachable from the host on nats://localhost:30423 (admin/e2e-admin).
  6. Deploys one FleetAgentScore { target: Pod } — runs with runtime_enabled = false so it skips podman and only runs the command-server + heartbeat loop.
  7. Waits for the agent Deployment to be Ready.
  8. The test publishes device-commands.<device_id>.ping via FleetCommandsClient::ping and asserts the agent replies with { device_id, agent_version, uptime_s }.
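
The subject and reply shape from step 8 can be sketched as plain Rust types. The helper names and the field types are assumptions made for illustration; the authoritative definitions live in ../harmony-reconciler-contracts and in FleetCommandsClient.

```rust
/// Builds a subject of the form `device-commands.<device_id>.<verb>`.
fn command_subject(device_id: &str, verb: &str) -> String {
    format!("device-commands.{}.{}", device_id, verb)
}

/// Fields the ping reply carries according to step 8.
/// The concrete types here are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct PingReply {
    device_id: String,
    agent_version: String,
    uptime_s: u64,
}

/// A plausible check for a test to make: the reply comes from the
/// device that was pinged and reports a version.
fn reply_matches(reply: &PingReply, expected_device_id: &str) -> bool {
    reply.device_id == expected_device_id && !reply.agent_version.is_empty()
}
```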

Cold first run: ~80 s (release build of the agent dominates). Warm: ~25 s.

Useful env knobs

| Var | Effect |
| --- | --- |
| HARMONY_FLEET_E2E=1 | Required. Without it the test is skipped, which keeps cargo test --workspace cheap on machines without k3d. |
| FLEET_E2E_KEEP=1 | Skip namespace teardown on Drop. Lets you kubectl -n e2e-<…> logs deploy/… after a failure. The next run prunes it. |
| RUST_LOG=info | Or debug for the per-message command dispatch traces inside harmony-fleet-agent::command_server. |
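
A sketch of how an opt-in gate like HARMONY_FLEET_E2E can work. The function names are hypothetical, and treating only the exact value "1" as enabled is an assumption; the harness may accept other values.

```rust
/// Pure decision function: enabled only for the exact value "1".
fn e2e_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads the real environment. A test body would return early
/// (reporting itself as skipped) when this is false.
fn should_run_e2e() -> bool {
    e2e_enabled(std::env::var("HARMONY_FLEET_E2E").ok().as_deref())
}
```

Keeping the decision separate from the environment read makes the gate itself testable without mutating process-wide state.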

Connecting to NATS while the stack is up

# Host-side, via the NodePort
nats://localhost:30423           # user=admin pass=e2e-admin (full access)
nats://localhost:30423           # user=device pass=e2e-device (device permissions)
# In-cluster, from any Pod in the same namespace
nats://fleet-nats.e2e-<uuid8>.svc.cluster.local:4222

Combining FLEET_E2E_KEEP=1 with the harness's stdout line [e2e] NATS: nats://127.0.0.1:30423 … is the route most manual testing will take: leave the harness running and point a NATS client at that URL.

Inspecting the agent

# Find your namespace
kubectl get ns -l harmony.io/managed-by=fleet-e2e

# Tail the agent
kubectl -n e2e-<uuid8> logs deploy/fleet-agent-<device-id> -f

# Tail NATS (StatefulSet, not Deployment)
kubectl -n e2e-<uuid8> logs sts/fleet-nats -c nats -f

# Send a ping by hand (requires the `nats` CLI:
#   https://github.com/nats-io/natscli/releases)
nats --server nats://localhost:30423 --user admin --password e2e-admin \
     request "device-commands.vm-device-00-<uuid8>.ping" ""

Or, if you don't want to install the nats binary, alias it to the nats-box image and run the same subcommands through the alias:

alias natsbox='podman run --network=host --rm docker.io/natsio/nats-box:latest nats --server nats://localhost:30423 --user admin --password e2e-admin'

You should see something like {"device_id":"vm-device-00-<uuid8>","agent_version":"0.1.0","uptime_s":12}.

Cleaning up

The shared OnceCell in harmony-fleet-e2e lives for the test binary's lifetime, so namespaces survive a cargo test exit (the static is never explicitly dropped). The next cargo test invocation prunes them. To force a manual cleanup:

kubectl delete ns -l harmony.io/managed-by=fleet-e2e
# wipe the whole cluster:
k3d cluster delete fleet-e2e
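
The reason namespaces outlive the test binary follows from a Rust rule: destructors of values stored in a `static` are not run at process exit. The sketch below shows the pattern with the standard library's `OnceLock` swapped in for the harness's OnceCell; `SharedStack`, `shared_stack` and the `DROPS` counter are illustrative names, not the harness's types.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the teardown path ran.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the harness's shared stack handle.
struct SharedStack {
    namespace: String,
}

impl Drop for SharedStack {
    fn drop(&mut self) {
        // Namespace teardown would go here. For a value held in a
        // static this is never reached, because statics are not dropped.
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

static STACK: OnceLock<SharedStack> = OnceLock::new();

/// Every caller gets the same instance; it is initialised on first use.
fn shared_stack() -> &'static SharedStack {
    STACK.get_or_init(|| SharedStack {
        namespace: "e2e-1234abcd".to_string(),
    })
}
```

Because the teardown in `Drop` never fires for the shared instance, cleanup has to happen somewhere else, which is why the next run's prune step (or the manual commands above) does the work.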

Production deploys

harmony-fleet-deploy is the binary that puts the fleet stack on a real cluster (OKD, vanilla k8s, anywhere K8sAnywhereTopology can reach). It composes FleetNatsScore + FleetOperatorScore + FleetAgentScore against the topology you point it at.

# Default: K8sAnywhereTopology against whatever KUBECONFIG points at
cargo run -p harmony-fleet-deploy -- \
  --namespace fleet-system \
  --operator-image hub.nationtech.io/harmony/harmony-fleet-operator:dev \
  --agent-image   hub.nationtech.io/harmony/harmony-fleet-agent:dev \
  --agent-device-id fleet-agent-01

# Pick a single component with the harmony_cli filter
cargo run -p harmony-fleet-deploy -- \
  --namespace fleet-system \
  -- --filter FleetOperatorScore --all

harmony-fleet-deploy reads its full config from CLI flags + env vars (FLEET_NAMESPACE, FLEET_OPERATOR_IMAGE, …). The minimal-CLI surface is deliberate — per ADR-023 the long-term answer is a plugin-discovery layer over harmony-* binaries; until that lands, deploy crates stay small and use the existing harmony_cli.

Connecting to the operator

The operator runs as a single-replica Deployment in --namespace (default fleet-system).

# Tail logs
kubectl -n fleet-system logs deploy/harmony-fleet-operator -f

# Port-forward the embedded web dashboard (web-frontend feature)
kubectl -n fleet-system port-forward deploy/harmony-fleet-operator 18080:18080

# Or run the dashboard standalone with seeded fake data — no NATS, no cluster
cargo run -p harmony-fleet-operator --features web-frontend -- serve-web --mock
# browse http://127.0.0.1:18080

Existing manual rehearsal — examples/fleet_e2e_demo

examples/fleet_e2e_demo brings up a fuller stack than the e2e harness — real Zitadel, the auth-callout, libvirt VM agents over SSH — at the cost of a 5-min cold start. It's the manual rehearsal flow; not what you want during the dev loop. See the example's RUNBOOK.md.

The harness and the rehearsal will converge: the follow-up PR lifts FleetCalloutScore + a mock-OIDC fixture into harmony-fleet-deploy, at which point the harness can run the full production auth path in ~30 s instead of 5 min, and fleet_e2e_demo thins down to a caller over the same Scores.


What's next

This branch lands the deploy-architecture cleanup (ADR-023), the per-component Scores, and the ping path. Slated immediately after:

  1. Zitadel + auth callout in harmony-fleet-deploy. New FleetCalloutScore (preset over NatsAuthCalloutScore) plus an in-cluster mock-OIDC fixture so the e2e harness can exercise the real auth-callout code path without paying Zitadel's 5-min cold-start cost. The harness's AuthMode::Callout variant is already on the public API for this.
  2. Operator pod in the e2e harness. FleetOperatorScore is already in the deploy crate; wiring it into the harness gives integration tests against the actual Deployment / Device reconcile loops.
  3. Verb::Logs and Verb::Exec — the next two verbs on the device-commands.* protocol. Same harness, same TDD shape as ping.
  4. CRD types out of harmony core. harmony::modules::fleet::operator::crd is the last fleet-deploy thing still living in harmony. The ReconcileScore payload coupling is the only blocker.
  5. Smoke-test contract. ADR-023 principle 4 — every Score blocks on a smoke test before deploy returns success. Today the e2e suite plays that role; the trait/companion shape lands once it's been validated in practice.

See PLAN_requests_over_nats.md for the full TDD-style plan this branch implements.