feat: scaffold IoT walking skeleton — podman module, operator, and agent #264
Implement the A1 task from the IoT walking-skeleton roadmap:

- CRD (kube-derive): `iot.nationtech.io/v1alpha1/Deployment`, namespaced, with `targetDevices`, `score {type, data}`, `rollout.strategy`, and a status subresource carrying `observedScoreString`.
- Controller: `kube::runtime::Controller` + `finalizer` helper. On Apply, writes `<device_id>.<deployment_name>` into NATS KV bucket `desired-state` and patches `.status.observedScoreString` via server-side apply. Skips KV write + status patch when the score is unchanged to avoid reconcile-loop churn. On Cleanup, removes the per-device keys before releasing the finalizer.
- CLI: `gen-crd` subcommand prints the CRD YAML from the Rust types; `run` (default) starts the controller. `deploy/crd.yaml` is generated by that subcommand: single source of truth, no drift.
- Deploy manifests: `deploy/operator.yaml` (Namespace, SA, ClusterRole, ClusterRoleBinding, Deployment) and generated `deploy/crd.yaml`.

Agent fixes surfaced while aligning with the operator's key layout:

- Watch filter: was `starts_with("desired-state.<id>.")` on `watch_all()`; bucket name is not a key prefix, so it never matched. Now uses `bucket.watch("<id>.>")` with the NATS wildcard and handles `Put`/`Delete`/`Purge` distinctly.
- Multi-server connect: was joining `nats.urls` with `","` into a single malformed URL. Pass the `Vec<String>` to `ConnectOptions::connect`.
- `credentials.type` is now validated (rejects unknown discriminators) so a v0.2 `zitadel` config doesn't silently fall back to shared creds.

Verification on feat/iot-walking-skeleton:

- cargo clippy --no-deps -D warnings: clean (agent + operator).
- cargo fmt --check: clean.
- x86_64 + aarch64 cross-compile: both build.
- podman module unit tests: pass.

`iot/scripts/smoke-a1.sh` drives the A1 acceptance flow end-to-end: spins up NATS and a k3d cluster via podman, applies the generated CRD, runs the operator, applies a Deployment CR, asserts the expected `<device>.<deployment>` key lands in the `desired-state` KV bucket and `.status.observedScoreString` round-trips the same JSON, then deletes the CR and asserts the finalizer removes the KV key. Cleans up on exit.

Two fixes surfaced while running it:

1. `ScorePayload.data: serde_json::Value` generated an empty `{}` schema, which the API server rejects. Attach a `schemars(schema_with = preserve_arbitrary)` helper that emits `x-kubernetes-preserve-unknown-fields: true`, letting the Score payload be any JSON shape.
2. `Patch::Merge` combined with `PatchParams::apply(...).force()` is rejected by kube-rs (force is Apply-only). Use a plain `Merge` patch for the status subresource: simpler and correct for v0.

Code review guide: feat/iot-walking-skeleton
One-line summary. Thin end-to-end thread for the IoT platform (ROADMAP/iot_platform/v0_walking_skeleton.md): kubectl apply Deployment in a
central cluster → operator writes to NATS KV → on-device agent runs the container, all in Rust with Harmony's Score/Topology/Interpret
pattern.
Size. 5 commits, ~3k lines added, almost all new code — two new crates (iot-operator-v0, iot-agent-v0), one new Harmony module
(modules/podman/), one new capability trait (domain::topology::ContainerRuntime), one new inventory constructor, one smoke test script.
The 5 commits, what each is for
┌─────┬───────────────────────┬─────────────────────────────────────────────────────────────────────┬───────────────────────────────────┐
│ #   │ Commit                │ What it does                                                        │ Review weight                     │
├─────┼───────────────────────┼─────────────────────────────────────────────────────────────────────┼───────────────────────────────────┤
│ 1   │ 65ef540 scaffold      │ Two iot crates with stub binaries, modules/podman/ with typed Score │ Skim: structural, low risk        │
│     │                       │ + stub interpret, workspace plumbing                                │                                   │
├─────┼───────────────────────┼─────────────────────────────────────────────────────────────────────┼───────────────────────────────────┤
│ 2   │ e50ab74 operator      │ kube-rs Controller + finalizer, Deployment CRD types, deploy        │ Heavy: this is the whole          │
│     │ controller            │ manifests, gen-crd subcommand                                       │ operator                          │
├─────┼───────────────────────┼─────────────────────────────────────────────────────────────────────┼───────────────────────────────────┤
│ 3   │ 1c91634 smoke test +  │ smoke-a1.sh, x-kubernetes-preserve-unknown-fields on score.data,    │ Medium: fixes are small, script   │
│     │ CRD/patch fixes       │ drop invalid Patch::Merge + .force() combo                          │ is important                      │
├─────┼───────────────────────┼─────────────────────────────────────────────────────────────────────┼───────────────────────────────────┤
│ 4   │ d21bdef CEL           │ Hand-rolled schema_with emitting x-kubernetes-validations that      │ Medium: the schemars::r#gen       │
│     │ validation            │ forces score.type to match a Rust identifier                        │ escape hatch is the only          │
│     │                       │                                                                     │ non-obvious part                  │
├─────┼───────────────────────┼─────────────────────────────────────────────────────────────────────┼───────────────────────────────────┤
│ 5   │ 1112125 agent         │ ContainerRuntime trait, PodmanTopology, PodmanV0Interpret wired to  │ Heavy: largest, most novel        │
│     │ reconciliation        │ podman-api, Inventory::from_localhost, reconciler with 30s tick     │                                   │
└─────┴───────────────────────┴─────────────────────────────────────────────────────────────────────┴───────────────────────────────────┘
Suggested reading order
Don't read chronologically — read by architecture layer.
Confirm the capability shape, the MANAGED_BY_LABEL fleet-safety story, the intentional "Podman-shaped not CRI-shaped" scope comment.
adjacently-tagged serde enum (#[serde(tag = "type", content = "data")]), the existing round-trip tests. This is the only polymorphic variant
today; the shape is designed for OkdApplyV0 / KubectlApplyV0 later without operator changes.
hand-rolled score_payload_schema (commit 4), the finalizer helper usage, the no-op guard in apply() that skips KV write + status patch when
the score is unchanged.
with 5-min timeout per ROADMAP §5.6.
§5.5 string-compare idempotency made explicit.
What to look at closely
CRD schema (commit 3 + 4). The x-kubernetes-preserve-unknown-fields: true extension is scoped to .spec.score.data only — everything else has
a strict schema. Verify this is still true after any rebase: grep -c preserve-unknown iot/iot-operator-v0/deploy/crd.yaml should be exactly
stays generic. See harmony/src/modules/podman/score.rs::deployment_label for how that string flows through.
serde_json::Value schema hack. crd.rs::preserve_arbitrary and score_payload_schema use schemars::r#gen::SchemaGenerator (raw-identifier
escape — gen is a 2024 keyword). This is the only place that synthesises non-derived OpenAPI schema; if schemars ever grows first-class
x-kubernetes-* support, this is the migration target.
Finalizer + status subresource. iot-operator-v0/src/controller.rs::reconcile uses kube::runtime::finalizer::finalizer(...) so delete goes
through Cleanup → KV key removed → finalizer released. Patch::Merge on the status subresource rather than Patch::Apply because .force() is
Apply-only (that's the fix in commit 3). Look for "drift between KV write and status patch" — no transaction across those two. If the KV
write succeeds and the status patch fails, the next reconcile retries both and the no-op guard sees the mismatch and re-writes. Fine for v0.
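The retry behaviour described in the finalizer paragraph can be modelled with in-memory stand-ins. This is only a sketch under stated assumptions, not the operator's code: `World`, `apply`, and the `status_patch_ok` switch are hypothetical names, and the real controller talks to NATS KV and the Kubernetes API instead of a `HashMap`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the two external systems the operator writes to.
#[derive(Default)]
struct World {
    /// Plays the role of the `desired-state` KV bucket.
    kv: HashMap<String, String>,
    /// Plays the role of `.status.observedScoreString`.
    observed: Option<String>,
    /// Counts KV writes so the retry is visible.
    kv_writes: u32,
}

/// Returns Ok(true) when something was written, Ok(false) when the no-op
/// guard skipped the work, and Err when the (simulated) status patch failed.
fn apply(
    world: &mut World,
    key: &str,
    score: &str,
    status_patch_ok: bool,
) -> Result<bool, String> {
    // No-op guard: status already reflects this exact score string.
    if world.observed.as_deref() == Some(score) {
        return Ok(false);
    }
    // Step 1: KV write. There is no transaction spanning steps 1 and 2.
    world.kv.insert(key.to_string(), score.to_string());
    world.kv_writes += 1;
    // Step 2: status patch, which may fail independently of step 1.
    if !status_patch_ok {
        return Err("status patch failed".to_string());
    }
    world.observed = Some(score.to_string());
    Ok(true)
}
```

In this model a failed status patch leaves `observed` stale, so the next call sees the mismatch and repeats both steps. The KV write is a plain overwrite, which is why the repeat is harmless.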
ContainerRuntime surface area. Three methods, no networks/volumes/stacks. Doc comment on the trait explains why — Docker likely fits without
change, Containerd/CRI-O need a separate capability. If a reviewer argues for Docker/Containerd compat today, push back: ROADMAP §6.4
explicitly says the capability must be a "real industry concept, not a tool" but the PostgreSQL exception applies here (the Score author
writes container-runtime-specific configs).
Agent reconcile loop. reconciler.rs::apply compares the incoming serialized JSON byte-string to the last-seen before dispatching — ROADMAP
§5.5 "change detection via string comparison (not content hash), cheap, deterministic." The 30s run_periodic tick re-runs every cached score
so podman rm outside the agent self-heals. remove() iterates the last-seen score's services; if the agent restarts after a delete it'll log
"unknown key — nothing to remove" which is correct.
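The change-detection rule above can be sketched as follows. This is a simplified model rather than the agent's `reconciler.rs`: the `Reconciler` struct, the `Action` enum, and the payloads are hypothetical, and dispatching to podman is reduced to a return value.

```rust
use std::collections::BTreeMap;

/// What the reconciler decides for an incoming KV entry.
#[derive(Debug, PartialEq)]
enum Action {
    Dispatch,
    Skip,
}

/// Hypothetical in-memory reconciler keyed by `<device>.<deployment>`.
#[derive(Default)]
struct Reconciler {
    last_seen: BTreeMap<String, String>,
}

impl Reconciler {
    /// Plain string comparison, not a content hash: any byte difference
    /// (even whitespace) counts as a change.
    fn apply(&mut self, key: &str, payload: &str) -> Action {
        if self.last_seen.get(key).map(String::as_str) == Some(payload) {
            return Action::Skip;
        }
        self.last_seen.insert(key.to_owned(), payload.to_owned());
        Action::Dispatch
    }

    /// Periodic tick: every cached key is re-run, which is what lets a
    /// container removed outside the agent come back.
    fn tick(&self) -> Vec<&str> {
        self.last_seen.keys().map(String::as_str).collect()
    }

    /// Returns false for an unknown key, e.g. after an agent restart.
    fn remove(&mut self, key: &str) -> bool {
        self.last_seen.remove(key).is_some()
    }
}
```

Because the comparison is byte-for-byte, a payload that is semantically identical but serialized differently is dispatched again. That only costs an extra dispatch, provided the dispatched work is itself idempotent.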
Inventory::from_localhost. Minimal — single PhysicalHost in worker_host, hostname as a label, one synthetic CPU, one synthetic MemoryModule.
Everything else empty. If a Score later reaches for inventory.firewall_mgmt or inventory.switch it'll get the ManualManagementInterface /
empty vec, which is correct for a single-host topology.
Smoke test coverage. The test asserts (a) CRD validates, (b) CEL rejects a typo discriminator, (c) operator reconciles (KV put + status
set), (d) agent reconciles (container runs, curl passes), (e) delete propagates (KV gone, container gone). It does not test (a) multi-device,
(b) drift recovery, (c) agent restart, or (d) NATS restart. Those are v0.1 per the roadmap.
Intentional trade-offs the reviewer might flag
trust.").
deserialise.
the same pattern.
podman-api Rust crate, not shell-out").
Not in this branch (explicit v0 scope cuts, per ROADMAP §4)
How to verify locally
# Prerequisite: rootless podman user socket (one-time)
systemctl --user enable --now podman.socket
# Build everything
./build/check.sh # or at least cargo check --all-targets --all-features
# aarch64 cross-compile sanity
cargo build --target aarch64-unknown-linux-gnu \
  -p harmony --features podman \
  -p iot-agent-v0 -p iot-operator-v0
# End-to-end smoke (~1 minute)
./iot/scripts/smoke-a1.sh
# KEEP=1 leaves NATS + k3d cluster up for manual poking afterwards
The smoke test teardown trap removes the demo container unconditionally, even with KEEP=1, so it won't leave a rogue nginx on host:8080.
One reviewer-requested thing I'd like a second opinion on
The ContainerRuntime trait has no notion of networks or volumes. For Docker parity it'd need both eventually. Two options when we get there:
(1) grow the trait, (2) add sibling capability traits (ContainerNetworks, ContainerVolumes) that a topology opts into separately. The
latter keeps Pi-sized topologies from having to implement stubs. Not a blocker for v0; worth an eventual decision.
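Option (2) could look roughly like the sketch below. Only the trait names `ContainerRuntime`, `ContainerNetworks`, and `ContainerVolumes` come from the discussion above; every method name, both topology structs, and the two `deploy_*` functions are hypothetical. The real `ContainerRuntime` has three methods that are not reproduced here.

```rust
/// Core capability. `ensure_container` is a hypothetical stand-in for the
/// real trait's methods.
trait ContainerRuntime {
    fn ensure_container(&mut self, name: &str, image: &str) -> Result<(), String>;
}

/// Optional sibling capability: a topology opts in by implementing it.
trait ContainerNetworks {
    fn ensure_network(&mut self, name: &str) -> Result<(), String>;
}

/// Second optional sibling, declared to show the shape; unused below.
trait ContainerVolumes {
    fn ensure_volume(&mut self, name: &str) -> Result<(), String>;
}

/// A Pi-sized topology implements only the core capability, with no stubs.
#[derive(Default)]
struct PiTopology {
    containers: Vec<String>,
}

impl ContainerRuntime for PiTopology {
    fn ensure_container(&mut self, name: &str, _image: &str) -> Result<(), String> {
        self.containers.push(name.to_string());
        Ok(())
    }
}

/// A larger topology opts into networks as well.
#[derive(Default)]
struct ServerTopology {
    containers: Vec<String>,
    networks: Vec<String>,
}

impl ContainerRuntime for ServerTopology {
    fn ensure_container(&mut self, name: &str, _image: &str) -> Result<(), String> {
        self.containers.push(name.to_string());
        Ok(())
    }
}

impl ContainerNetworks for ServerTopology {
    fn ensure_network(&mut self, name: &str) -> Result<(), String> {
        self.networks.push(name.to_string());
        Ok(())
    }
}

/// Needs only the core capability, so it accepts either topology.
fn deploy_simple<T: ContainerRuntime>(t: &mut T) -> Result<(), String> {
    t.ensure_container("web", "nginx")
}

/// Needs networks too; passing a `PiTopology` here fails at compile time.
fn deploy_networked<T: ContainerRuntime + ContainerNetworks>(t: &mut T) -> Result<(), String> {
    t.ensure_network("frontend")?;
    t.ensure_container("web", "nginx")
}
```

The trade-off shows up in the bounds: a Score that needs networks says so in its signature, and a topology that cannot provide them is rejected by the compiler instead of failing inside a stub at runtime.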
Adds the plumbing so Harmony can both provision a VM to stand in for a fleet device and (re)configure any Linux host to join the fleet. The walking skeleton's "VM-as-device" test path needs all three pieces:

- `domain::topology::HostConfigurationProvider`: new capability trait with `ensure_package`, `ensure_user`, `ensure_file`, `ensure_systemd_unit`, `restart_service`, `ensure_linger`, `ensure_user_unit_active`, and a reachability `ping`. Returns `ChangeReport { changed: bool }` so callers can reconcile-restart only when something actually changed. Trait doc calls out the narrow scope (not a general CM replacement) and the swappability story.
- `modules::linux::AnsibleHostConfigurator` + `LinuxHostTopology`: concrete impl that shells out to `ansible-playbook --stdout-callback json`, one play per trait method, parsing the JSON for the task's `changed` flag. Deliberately the laziest reasonable adapter: when Ansible's error surface becomes painful, this is the piece we replace with a Rust-native impl behind the same trait, with zero Score churn. Runtime requirement: `ansible-playbook` (>= 2.15) on the Harmony runner host.
- `modules::kvm::KvmVmScore` + cloud-init seed ISO generation: thin Score that wraps `KvmExecutor::ensure_vm` with a generated cloud-init seed ISO (hostname + authorized SSH key + sudoer user, nothing more). Uses `xorriso -as mkisofs` to build the ISO; returns the booted VM's IP. Docs note cloud-init is strictly for the VM test rig; customer Pi deployments go through rpi-imager / PXE instead. New `KvmHost` capability + `KvmHostTopology` expose the underlying `KvmExecutor`.
- `modules::iot::IotDeviceSetupScore`: customer-facing Score bound to `T: Topology + HostConfigurationProvider`.
Installs podman + systemd-container, creates the `iot-agent` system user with linger, activates user podman.socket, uploads the agent binary via a base64-in-tmpfile + oneshot unit pattern (docstring flags this as a v0.1 candidate for a proper remote-fetch), writes `/etc/iot-agent/config.toml` and the systemd unit, and restarts only if any of the config/unit/binary-install tasks reported changes. Re-running with a different `group` rewrites the TOML and bounces the agent.

Scope note: this turn stops at one VM. Multi-VM + group routing is the next step; `group` in the config is a label that the agent will carry into its status bucket, but `Deployment.spec.targetGroups` isn't wired anywhere yet. `smoke-a3.sh` (VM-as-device end-to-end) lands in the next commit.

Rewrites AnsibleHostConfigurator to avoid the two coupling points that last year's Kubespray investigation taught us to stay away from: YAML playbook generation and Ansible inventory.

- **No more YAML, no more inventory files.** Every primitive is now one or two `ansible all -i '<ip>,' -m <module> -a '<json>'` ad-hoc invocations. JSON args go straight through Ansible's own module interface; the tmpfile-playbook-and-inventory dance is gone entirely. Harmony owns 100% of orchestration, Ansible owns only per-host idempotent module execution. `ensure_systemd_unit` collapses to two ad-hoc calls (copy + systemd) rather than a multi-task playbook. `ensure_linger` sentinels change-state through the shell module's stdout since ad-hoc has no `changed_when`.
- **Self-installing venv.** New `modules::linux::ansible_venv`: `ensure_ansible_venv()` creates `$HARMONY_DATA_DIR/ansible-venv/` via `python3 -m venv` + `pip install ansible-core==2.17.*` on first use, cached via `tokio::sync::OnceCell`. No more "install ansible before running Harmony" step: python3 + venv is the only host requirement, and we print the exact package names for Arch/Debian/Fedora when python is missing.
- **smoke-a3.sh**: drop `ansible-playbook` from preflight, add `python3`. Example gains `--bootstrap-ansible-only` for warming the venv ahead of the real run (turns a ~60s first-run smoke into deterministic sub-second after bootstrap).

Output parsing uses the `oneline` callback (`host | VERB => {json}`) which is trivially regex-free to split and handles FAILED!/UNREACHABLE! as errors. SSH control sockets are pinned under `$HARMONY_DATA_DIR/ansible-cp` so multiple Harmony processes don't race in /tmp.

Verified: `ensure_ansible_venv()` first call installs ansible-core 2.17.14 into the managed venv (~12s, network-bound); second call is cache-fast (<50ms). Clippy + fmt clean, aarch64 cross-compile green.

Eight fixes surfaced by actually running the VM-as-device flow end to end. All six commit deltas are small and self-contained.

KvmVmScore + cloud-init:

- **Overlay disk**: VM now boots off a per-VM qcow2 backed by the base image instead of writing into the base in-place. Re-runs of the same vm_name reuse the overlay (idempotent); fresh runs wipe the overlay so cloud-init starts clean. Requires `qemu-img`.
- **UUID instance-id**: cloud-init's meta-data now carries a fresh UUID per seed build, so when the overlay gets recreated cloud-init treats it as a first boot and re-runs all per-instance modules. Without this, repeated runs silently skipped user/hostname/ssh setup.
- **xorriso deadlock**: `.status()` with piped stderr filled the pipe buffer and SIGPIPE'd the child; switched to `.output()` which drains both. Also unlink any pre-existing seed ISO before running xorriso, since it otherwise treats the file as overwriteable input "media" and aborts with exit 32.
- **wait_for_ip**: 180s → 300s. First boot of a cloud image on a constrained runner (or CI worker) can take 2-4 minutes.

Ansible adapter, a half-dozen sharp corners of ad-hoc mode that only show up in a live run:

- **`--ssh-common-args=VALUE`** (equals form, single token).
Separate `--ssh-common-args VALUE` form has ansible's argparse re-interpret the `-o …` inside the value as its own `-o` flag and dump a help screen. Lost an afternoon to this decades ago on another project.
- **Skip `-a` when empty**: `-a '{}'` trips ansible-core 2.17's "extra params" check on parameterless modules like `ping`. Pass no `-a` when the JSON dict is empty.
- **`ANSIBLE_LOAD_CALLBACK_PLUGINS=True`**: ad-hoc mode silently ignores `ANSIBLE_STDOUT_CALLBACK` without this. Default callback produces multi-line JSON that's fragile to parse.
- **`ANSIBLE_PIPELINING=True`**: required when `become`-ing an unprivileged user (iot-agent for the user-scope podman.socket), otherwise ansible's temp-file shuffle falls back to an ACL chmod syntax no Linux distro accepts.
- **Parse shell/command oneline shape**: oneline callback emits `host | VERB | rc=N | (stdout) … | (stderr) …` for shell-style modules in addition to the `host | VERB => {json}` shape. Parser now handles both and synthesises a JSON payload from the shell form.
- **Auto-create parent dir in ensure_file**: ansible's `copy` module won't create `/etc/iot-agent/` for you; a `file state=directory` call before every `copy` is idempotent and cheap.
- **ensure_package uses apt directly**: `ansible.builtin.package` is distro-agnostic but doesn't auto-run `apt update`, so a fresh cloud image fails with "no package matching". Switched to `ansible.builtin.apt` with `update_cache=true, cache_valid_time=3600`. Debian-family only for v0 (ROADMAP §5.3); RHEL switch is a future capability refinement.

HostConfigurationProvider surface:

- **`FileSpec.source: FileSource`**: new `Content(String)` vs `LocalPath(PathBuf)`. LocalPath ships binary files over SFTP via ansible's native mechanism instead of passing base64 content through argv (which hit ARG_MAX on the ~10MB agent).
This replaces the whole base64-in-tmpfile + oneshot install-unit dance in IotDeviceSetupScore: the binary now installs in a single idempotent `ensure_file` call that reports `changed` only when bytes differ.

IotDeviceSetupScore:

- Dropped the base64 + oneshot install machinery (80 fewer lines).
- Dropped the explicit primary `group:` on ensure_user: Debian-family useradd auto-creates a group matching the username; setting `group:` required pre-creating it.

smoke-a3.sh: builds iot-agent-v0 `--release` instead of debug (400MB debug binary filled the VM's thin-provisioned 3.5GB cloud rootfs).

Verified end-to-end three times on this host:

run 1: 9 changes (fresh install: package install, user create, binary, config, restart)
run 2: 0 changes (true NOOP, `already configured`)
run 3: 2 changes (group swap: only TOML + agent restart)

Agent reports status.iot-smoke-vm into NATS after each run.

Halfway through the review, many small things and a few bigger things to fix. Overall not terrible. But take the time to step back, understand clearly the code review and revisit the entire p-r with the comments in mind and improve it.
@@ -0,0 +25,4 @@
/// deliberately Ansible-agnostic so a Rust-native impl can be dropped in
/// later without Score changes.
#[async_trait]
pub trait HostConfigurationProvider: Send + Sync {
I have some doubts about this trait.
First it is linux specific because of systemd and other linux specific references (which is not a problem outside the naming)
I also feel it is packing many things into a single interface and it is very likely to cause interface segregation and LSP problems.
Also many things this does require sudo (like creating a user) which might end up in cloud init for a reason or another, I think it could be misleading to have a single trait with implementations all over the place from cloud init to calling an ansible module to install a package, etc.
@@ -0,0 +35,4 @@
pub struct IotDeviceSetupConfig {
    /// Stable device identifier. Written into the agent's TOML and used
    /// as the KV key prefix (`<device_id>.<deployment>`).
    pub device_id: String,
This could very well use a harmony id from harmony_types. I like the id format, it is unique, relatively short and contains a timestamp.
@@ -0,0 +52,4 @@
impl IotDeviceSetupConfig {
    /// Render the agent's `/etc/iot-agent/config.toml` content.
    pub fn render_toml(&self) -> String {
This is ugly. Use a long string with format! or askama templates like we do in other places.
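A minimal sketch of the `format!`-with-template style this comment asks for. Only `IotDeviceSetupConfig`, `device_id`, `group`, and `nats.urls` appear in the PR text; the field types, the `[agent]`/`[nats]` section layout, and the key names are assumptions for illustration, and values are not TOML-escaped.

```rust
/// Hypothetical, trimmed-down version of the setup config.
struct IotDeviceSetupConfig {
    device_id: String,
    group: String,
    nats_urls: Vec<String>,
}

impl IotDeviceSetupConfig {
    /// Render the agent config from one raw-string template instead of a
    /// chain of `push_str` calls. Assumes values contain no quotes or
    /// backslashes (no TOML escaping is performed).
    fn render_toml(&self) -> String {
        // Quote each URL and join them into a TOML array body.
        let urls = self
            .nats_urls
            .iter()
            .map(|u| format!("\"{u}\""))
            .collect::<Vec<_>>()
            .join(", ");
        format!(
            r#"[agent]
device_id = "{device_id}"
group = "{group}"

[nats]
urls = [{urls}]
"#,
            device_id = self.device_id,
            group = self.group,
            urls = urls,
        )
    }
}
```

The whole file layout is visible in one place, which is the readability gain the reviewer is after. A template engine such as askama would add escaping and compile-time checking of the template on top of that.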
@@ -0,0 +87,4 @@
/// Render the systemd unit file content.
pub fn render_systemd_unit(&self) -> String {
    String::from("[Unit]
Are there alternatives to systemd unit files to make sure it restarts on reboot? What about running the agent itself as a podman container? It would probably have to be privileged but that would reduce the configuration burden on the host and centralize our logic around podman instead of spreading it to systemd.
@@ -0,0 +45,4 @@
    cfg: &CloudInitSeedConfig<'_>,
    output_dir: &Path,
) -> Result<PathBuf, KvmError> {
    if which_xorriso().await.is_none() {
Why do we need that? Is it yet another dependency? Can't we pass the cloud-init data any other way when creating the VM? Do we need that for cloud-init, as I'm assuming, or for something else?
@@ -0,0 +121,4 @@
}
fn render_user_data(cfg: &CloudInitSeedConfig<'_>) -> String {
    let mut s = String::new();
unreadable crap
@@ -0,0 +15,4 @@
/// `VirtualMachineHost` capability can be introduced then, and we'll
/// either implement it *in terms of* `KvmHost` or drop `KvmHost`
/// altogether.
pub trait KvmHost {
I tend to disagree with the comment that we need to be tool specific here.
All we really need is:
a VM with a given CPU, ISO, IP address, storage size, and cloud-init. This is absolutely not tool specific. The tool-specific details can be hidden inside the kvm implementation of the VMHost capability.
The way I see it, the VirtualMachineHost capability trait should have a few methods like:
list_vms() -> Vec
ensure_vm(VirtualMachine)
delete_vm(VirtualMachine)
get_vm_info(VirtualMachine) -> VirtualMachineRuntimeInfo // ip address, network method, hypervisor name etc
I don't see what cannot work with kvm/virtualbox/vmware/proxmox/openstack right here. I know it is limited but it's fine. Most users don't care about the details, especially for the original intended use here of CI runners and development environments.
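The four methods listed above could be sketched as a trait with a trivial in-memory backend. Assumptions are labeled in the code: the element type of `list_vms` is not given in the comment and is taken to be `VirtualMachine`, the struct fields are illustrative, the methods are synchronous for brevity, and `InMemoryHost` is a hypothetical test double and not a real hypervisor adapter.

```rust
use std::collections::BTreeMap;

/// Tool-agnostic VM description (fields are illustrative assumptions).
#[derive(Debug, Clone, PartialEq)]
struct VirtualMachine {
    name: String,
    cpus: u32,
    disk_gb: u32,
}

/// Runtime facts reported by the backend (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct VirtualMachineRuntimeInfo {
    ip_address: Option<String>,
    hypervisor: String,
}

/// The capability as proposed in the review, in synchronous form.
trait VirtualMachineHost {
    fn list_vms(&self) -> Vec<VirtualMachine>;
    /// Ok(true) if the VM was created or changed, Ok(false) if already as specified.
    fn ensure_vm(&mut self, vm: VirtualMachine) -> Result<bool, String>;
    fn delete_vm(&mut self, vm: &VirtualMachine) -> Result<(), String>;
    fn get_vm_info(&self, vm: &VirtualMachine) -> Option<VirtualMachineRuntimeInfo>;
}

/// Hypothetical backend that only records specs; kvm, proxmox, etc. would
/// hide their tool-specific details behind the same trait.
#[derive(Default)]
struct InMemoryHost {
    vms: BTreeMap<String, VirtualMachine>,
}

impl VirtualMachineHost for InMemoryHost {
    fn list_vms(&self) -> Vec<VirtualMachine> {
        self.vms.values().cloned().collect()
    }

    fn ensure_vm(&mut self, vm: VirtualMachine) -> Result<bool, String> {
        // Idempotent: an identical spec under the same name is a no-op.
        if self.vms.get(&vm.name) == Some(&vm) {
            return Ok(false);
        }
        self.vms.insert(vm.name.clone(), vm);
        Ok(true)
    }

    fn delete_vm(&mut self, vm: &VirtualMachine) -> Result<(), String> {
        // Deleting an absent VM is treated as success.
        self.vms.remove(&vm.name);
        Ok(())
    }

    fn get_vm_info(&self, vm: &VirtualMachine) -> Option<VirtualMachineRuntimeInfo> {
        self.vms.get(&vm.name).map(|_| VirtualMachineRuntimeInfo {
            ip_address: None,
            hypervisor: "in-memory".to_string(),
        })
    }
}
```

Nothing in the trait signature mentions libvirt, cloud-init, or ISO tooling, which is the point of the comment: a Score bound to `T: VirtualMachineHost` would work unchanged against any backend that implements these four methods.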
@@ -0,0 +43,4 @@
}
async fn ensure_ready(&self) -> Result<PreparationOutcome, PreparationError> {
    // The executor holds the URI — a cheap hypervisor-version query is
This is tricky and is actually an architectural problem in harmony. Ensure ready is executed eagerly on a topology, but this topology won't necessarily be running a kvm related workload this run and/or might be doing the kvm setup in an earlier task before calling the kvm dependent scores. I am pretty sure we have a ROADMAP entry on that topic of topology initialization dependency. Add a reference here to the roadmap entry with a TODO so we capture this use case when we get to this work.
@@ -0,0 +66,4 @@
/// if a domain with this name already exists.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct KvmVmScore {
    pub config: CloudInitVmConfig,
This feels straight up wrong. KvmVmScore (sounds generic) depending on CloudInit (specific) is a crime against good architecture and naming.
@@ -0,0 +133,4 @@
tokio::fs::create_dir_all(&cfg.seed_output_dir)
    .await
    .map_err(|e| InterpretError::new(format!("create seed dir: {e}")))?;
let status = Command::new("qemu-img")
Use args vec, more readable. Also true for all other Command::new calls.
@@ -0,0 +9,4 @@
//! `podman-api` over shelling to `podman` elsewhere — use the mature
//! upstream where it's mature (apt/systemd/user module idempotency),
//! don't adopt its orchestration model (playbooks, inventory, YAML
//! templating, the Kubespray mess).
Don't insult other projects, kubespray is cool and suits a purpose.
@@ -0,0 +58,4 @@
// keeps re-runs cheap: the update is skipped if the cache was
// refreshed within the last hour.
//
// When we grow RHEL-family support, switch on the distro
This comment does not feel correct. Ensure_package is distribution agnostic. For now we choose to support only debian as this is our first concrete target, but this may change soon and the encapsulation is correct: choosing the correct tool based on the distribution is this function's burden. We might move that to the topology and separate the topologies more granularly between debian and rhel and others, but at this moment I think this would be wrong.
There is complexity involved though, as very often (most of the time?) packages providing a utility have different names and ideas depending on the distribution family. For example, installing qemu + kvm tooling uses different package names on the Red Hat family, the Debian family, and the Arch family. We could easily provide a nice cross-distribution score to install the qemu/kvm dependencies. That would probably be cleanest. So right here we have to work towards that. And I confirm using ansible to perform the actual installation is correct as we leverage ansible's strength in low-level module idempotency.
@@ -0,0 +108,4 @@
    host: IpAddress,
    creds: &SshCredentials,
    spec: &FileSpec,
) -> Result<ChangeReport, ExecutorError> {
I'd like to see here a proper rust struct for the file module config built as proper rust code and simply serialized to the ansible module format instead of a bunch of `json!` invocations feeling very fragile. We should research if there already exist ansible rust crates. I doubt we will find mature and solid ones but it's worth looking, we could be surprised.
The file spec is probably the correct one, all that is lacking is a function to serialize it directly to ansible file module json.
@@ -0,0 +170,4 @@
    spec: &SystemdUnitSpec,
) -> Result<ChangeReport, ExecutorError> {
    // Step 1: write the unit file.
    let (unit_path, scope_user) = match &spec.scope {
I was not convinced that we should use ansible for the file copy but here for the systemd unit I am sure that we want to use the ansible builtin systemd module https://docs.ansible.com/projects/ansible/latest/collections/ansible/builtin/systemd_module.html
That note applies as well to the podman setup. It might make a lot of sense to use the podman ansible module to deploy the iot containers. https://docs.ansible.com/projects/ansible/latest/collections/containers/podman/index.html
This is also true for all other non trivial installation tasks. ansible is great at running commands and ensuring a package/service/file is installed correctly.
@@ -0,0 +256,4 @@
    .await
}
pub async fn ensure_linger(
Feels like ansible should have a purpose-built module here too. Don't be lazy and call shell commands. Calling shell commands through ansible is completely useless as they're not truly idempotent in the way that file and package installations are.
@@ -0,0 +377,4 @@
let mut cmd = Command::new(&bins.ansible);
cmd.arg("all")
    .arg("-i")
args vec more readable
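For reference, the `.args([...])` form requested here and for the `qemu-img` call looks like the sketch below. The helper names and the exact `qemu-img` argument list are assumptions (the PR's real invocation is not shown in this thread). The command is only built, never executed, so the snippet has no runtime dependency on `qemu-img`.

```rust
use std::process::Command;

/// Build (without running) a `qemu-img create` call for a qcow2 overlay
/// backed by `base`. Hypothetical helper; the argument list is illustrative.
fn qemu_img_overlay_cmd(base: &str, overlay: &str) -> Command {
    let mut cmd = Command::new("qemu-img");
    // One array literal instead of a chain of `.arg(..)` calls.
    cmd.args(["create", "-f", "qcow2", "-F", "qcow2", "-b", base, overlay]);
    cmd
}

/// Collect a command's arguments as strings, for logging or tests.
fn args_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

With the arguments in a single array, the full command line reads left to right as it would in a shell, and a test can assert on `get_args()` without spawning a process.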
Structural changes (the biggest items from the review):

- `HostConfigurationProvider` split into five narrower capabilities: `HostReachable`, `PackageInstaller`, `FileDelivery`, `UnixUserManager`, `SystemdManager`. Each implementation now only implements what it can actually deliver: a future cloud-init / ignition / podman-agent backend can pick a subset without inheriting systemd assumptions it can't honour. Added an umbrella trait `LinuxHostConfiguration` blanket-impl'd for any type that has all five, so Scores keep a single bound.
- New `VirtualMachineHost` capability in domain/topology/: `list_vms` / `ensure_vm` / `delete_vm` / `get_vm_info`, with generic `VirtualMachineSpec` carrying a typed optional `VmFirstBootConfig` (hostname, admin user, authorized keys). `KvmHost` trait and `KvmHostTopology` deleted; `KvmVirtualMachineHost` is the concrete libvirt implementation. Cloud-init stays a KVM-impl detail; callers never see it.
- `KvmVmScore` + `CloudInitVmConfig` deleted; replaced by a generic `ProvisionVmScore` in `modules::iot::vm_score` bound to `T: VirtualMachineHost`. The Score itself has no knowledge of the hypervisor or its first-boot delivery mechanism.
- `IotDeviceSetupConfig.device_id` is now `harmony_types::id::Id` (timestamp-prefixed, sortable-by-creation, collision-safe).
- `ensure_ready` on `KvmVirtualMachineHost` is a Noop with a TODO pointing at ROADMAP/12-code-review-april-2026.md §12.1 (phased topology). Captures the concern about eagerly probing the hypervisor even when the current run doesn't need KVM.

Code quality fixes from the line-level comments:

- `render_toml` / `render_systemd_unit` / `render_user_data` rewritten as `format!` with raw-string templates (no more push_str chains).
- Every `Command::new(…).arg().arg().arg()` chain in the touched files converted to `.args([…])`.
- Ansible module args are now typed Rust structs (`AptArgs`, `AnsibleFileArgs`, `AnsibleUserArgs`, `AnsibleCopyArgs`, `AnsibleSystemdArgs`, `AnsibleCommandArgs`, `AnsibleStatArgs`) serialized via `serde_json::to_value`. No more `json!` macros with ad-hoc string keys.
- `ensure_linger`: no more shell sentinel. Uses `ansible.builtin.stat` on `/var/lib/systemd/linger/<user>` for the idempotent change-state check, then `ansible.builtin.command loginctl enable-linger` only on miss. `loginctl` is required (not just `file state=touch`) because systemd-logind needs the dbus signal to actually start the user manager; a plain file touch doesn't wake it up and every subsequent `systemctl --user …` fails with "Failed to connect to bus". Documented in-place.
- `ensure_user_unit_active`: picks up the user's UID first via `ansible.builtin.command id -u <user>` and wraps the `systemctl --user enable --now <unit>` invocation in `env XDG_RUNTIME_DIR=/run/user/<UID>`. The systemd module's task-level `environment:` keyword isn't available in ad-hoc mode; this is the cleanest equivalent. Documented the inline-playbook path as a future option for when we get more task-level-env call sites.
- `ensure_package` comment clarified: distro dispatch is this function's job; Debian-family is the first concrete target and extending to RHEL/Fedora/Alpine is an implementation detail, not a capability change.
- Kubespray line removed.

Verified: from a primed `$HARMONY_DATA_DIR/iot/`, smoke-a3.sh still completes all 5 phases (bootstrap + provision + 9 setup changes + initial NATS status + power-cycle recovery).

Ansible's `command` module is a Python-wrapped SSH round trip with zero added value when the operation isn't built around Ansible's idempotency primitives. `russh` is already a workspace dep and gives us the exit code + stdout + stderr in a typed struct, with one round trip.
Moving the two call sites that were using `ansible.builtin.command` to russh directly:

- New `modules::linux::ssh_executor::ssh_exec(host, creds, cmd)` returning `SshCommandOutput { rc, stdout, stderr }`. Loads the private key via `russh::keys::load_secret_key`, authenticates, opens an exec channel, drains all `ChannelMsg` until the channel closes, returns the collected data. Draining past `Eof` matters: some sshd implementations emit `ExitStatus` *after* `Eof`, and an early break loses the rc.
- `ensure_linger`: `test -e /var/lib/systemd/linger/<user>` over russh for the check, then `sudo loginctl enable-linger <user>` only on miss. Two SSH round trips, no Ansible. Same semantics as the previous `stat` + `command` pair but without the Python hop.
- `ensure_user_unit_active`: `id -u <user>` + `sudo -u <user> env XDG_RUNTIME_DIR=/run/user/<uid> systemctl --user enable --now <unit>`. This is the case that couldn't be done cleanly via ad-hoc `ansible.builtin.systemd` in the first place because task-level `environment:` isn't available in ad-hoc; russh makes it a one-liner.

Ansible still owns: `apt` (distro dispatch + cache), `user` (idempotent account management), `copy` (file delivery with content-diff change reporting), `file` (directory/mode), `systemd` (daemon-reload + enable + start as one atomic call). Those are where `ansible`'s value is real; `command` was a category error.

Verified: smoke-a3 PASS end-to-end, with the same 9-change initial setup, NATS status, and power-cycle recovery as before.

Adds the type-safe arch dimension for the aarch64-on-x86_64 emulation work to follow. No behaviour change: every existing call site gets `VmArchitecture::X86_64` via `Default`, and the XML renderer (unchanged in this commit) emits the same bytes it always did.

- `VmArchitecture { X86_64 (default), Aarch64 }` in domain/topology/virtualization.rs, with `as_str()` and `ubuntu_cloudimg_suffix()` helpers (Ubuntu uses `amd64`/`arm64` in filenames, not the `uname -m` spelling).
- `VirtualMachineSpec.architecture` + `#[serde(default)]` for on-disk compat. - `VmConfig.architecture` + `VmConfig.firmware: Option<UefiFirmware>` in modules/kvm/types.rs. `UefiFirmware { code, vars }` is the typed pair libvirt's `<loader>` + `<nvram>` need for aarch64 guests; x86_64 leaves it None. `VmConfigBuilder::architecture()` / `firmware()` setters added. - `KvmVirtualMachineHost::ensure_vm` threads the arch through to VmConfig; firmware wiring is commit 3. Re-exported: `VmArchitecture`, `UefiFirmware` from `modules::kvm`. `VmArchitecture` is a type-alias re-export from domain/topology so the arch enum lives in one place. Verified: cargo check clean, fmt clean, aarch64 cross-compile of harmony + iot crates still green.aarch64 guests boot via UEFI — there is no SeaBIOS equivalent for the arm64 `virt` machine type. Libvirt needs two paths: - CODE (read-only firmware image, shared across VMs) - VARS (writable NVRAM, per-VM) Every distro ships these under a different filename. New module `modules/kvm/firmware.rs`: - `AarchFirmware { code, vars_template }` — typed pair. - `discover_aarch64_firmware()` walks four known-paths groups (Arch `edk2-armvirt`, Arch old naming, Debian/Ubuntu `qemu-efi-aarch64`, Fedora `edk2-aarch64`). First pair where both files exist wins. Miss → `ExecutorError` carrying the per-distro `pacman`/`apt`/`dnf` install command + the full candidate list for diagnosis. - `copy_vars_template_for_vm(fw, dest)` produces the per-VM NVRAM at `$pool/<vm>-VARS.fd` and chmods 0644 so libvirt-qemu's dynamic-ownership chown on VM start works. Wired into `KvmVirtualMachineHost::ensure_vm`: when `spec.architecture == Aarch64`, the topology runs firmware discovery + per-VM copy before composing the `VmConfig`, then hands the resolved `UefiFirmware` to the XML renderer (commit 2 already consumes it). x86_64 path unchanged. 
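The first-pair-wins probe described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `modules/kvm/firmware.rs`: `FirmwarePair`, `first_existing_pair`, and the plain `String` error are hypothetical stand-ins for `AarchFirmware`, `discover_aarch64_firmware`, and `ExecutorError`, and the candidate list is passed in by the caller instead of being hard-coded per distro.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the typed CODE + VARS pair.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FirmwarePair {
    pub code: PathBuf,
    pub vars_template: PathBuf,
}

/// Walks the candidates in order and returns the first pair whose CODE
/// and VARS files both exist. On a miss, the error message lists every
/// path that was probed so the diagnostic matches the actual probe order.
pub fn first_existing_pair(candidates: &[(PathBuf, PathBuf)]) -> Result<FirmwarePair, String> {
    for (code, vars) in candidates {
        // A pair only counts when both halves are present.
        if code.is_file() && vars.is_file() {
            return Ok(FirmwarePair {
                code: code.clone(),
                vars_template: vars.clone(),
            });
        }
    }
    let checked: Vec<String> = candidates
        .iter()
        .map(|(c, v)| format!("{} + {}", c.display(), v.display()))
        .collect();
    Err(format!(
        "no aarch64 UEFI firmware pair found; checked: {}",
        checked.join(", ")
    ))
}
```

Passing the candidates in keeps the probe order and the "checked paths" hint derived from the same list, so the two cannot drift apart.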
Firmware discovery is deliberately a runtime check with a clear error, not a preflight — this lets x86_64-only runs succeed on hosts without AAVMF installed. Commit 4 adds an arch-aware preflight that surfaces it upfront when a caller asks for aarch64. Verified: 26/26 kvm::xml tests still green, cargo check clean, cargo fmt clean.Wire the VmArchitecture story all the way to the user-facing entry points so an arm64 smoke run is a single env flip. Example (`example_iot_vm_setup`): * New `--arch {x86-64|aarch64}` flag (default x86-64) backed by a `CliArch` enum that converts cleanly to `VmArchitecture`. * Preflight and cloud-image bootstrap now call the `_for_arch` variants, and the `VirtualMachineSpec.architecture` field gets the real value instead of `Default::default()`. Smoke script (`iot/scripts/smoke-a3.sh`): * Reads `ARCH=x86-64|aarch64` from env (default x86-64). * When `ARCH=aarch64`, `rustup target add aarch64-unknown-linux-gnu` + `cargo build --target ...` produces an arm64 agent binary; otherwise the existing host-target build path is kept. * Threads `--arch` to the example. * Extends the phase-4 initial-status timeout (60s → 300s) and the phase-5 post-reboot wait (240s → 900s) under TCG, which runs 3-5× slower than native KVM. New `smoke-a3-arm.sh` wrapper: exports `ARCH=aarch64` and a separate `VM_NAME` / NATS container name so an arm smoke run can coexist with an x86 one on the same host without stepping on libvirt state. Topology side (`KvmVirtualMachineHost::ensure_vm`): `wait_for_ip` timeout is now arch-derived — 300s for x86_64, 900s for aarch64 — because first-boot cloud-init under TCG routinely needs 8-12 min on a constrained worker.The on-device agent builds `harmony` with `default-features = false, features = ["podman"]`, which does not pull in the `kvm` feature. 
Cross-compiling iot-agent-v0 for `aarch64-unknown-linux-gnu` to put it on a Pi / arm64 VM currently fails with: error[E0433]: failed to resolve: could not find `kvm` in `modules` --> harmony/src/modules/iot/preflight.rs:18:21 use crate::modules::kvm::firmware::discover_aarch64_firmware; Gate the import and the `discover_aarch64_firmware()` call inside `check_iot_smoke_preflight_for_arch` behind `#[cfg(feature = "kvm")]`. Callers who build `harmony` without kvm (the agent) still get the `qemu-system-aarch64` PATH check — the firmware probe only matters to the host that will actually boot the VM, and that host always builds with `kvm` enabled anyway. Verification: `cargo build --release --target aarch64-unknown-linux-gnu -p iot-agent-v0` now succeeds and produces a valid ELF aarch64 binary (~13 MB).Current Arch edk2-armvirt ships the pair as /usr/share/edk2/aarch64/QEMU_EFI.fd /usr/share/edk2/aarch64/QEMU_VARS.fd (plus a compatibility copy under /usr/share/edk2-armvirt/aarch64/). The previous CANDIDATES list looked for `QEMU_CODE.fd` and `vars-template-pflash.raw` — neither name matches the actual distro layout, so `discover_aarch64_firmware` reported "no firmware found" on a fully-provisioned Arch host. Add the `QEMU_EFI.fd` + `QEMU_VARS.fd` pair at both Arch paths at the top of the probe order; keep the older raw-pflash variant and the speculative CODE/VARS naming as later fallbacks. Sync the error message's "checked paths" hint with the new list so the diagnostic matches what's actually probed. Verified against /usr/share/edk2/aarch64/QEMU_{EFI,VARS}.fd on this host — `discover_aarch64_firmware` now returns the pair and `cargo run -p example_iot_vm_setup -- --arch aarch64 --bootstrap-only` completes (downloads + sha256-verifies the 598 MB arm64 image and caches it under $HARMONY_DATA_DIR/iot/cloud-images/).Three fixes landed during arm smoke debugging. 
Each is a real correctness / perf issue that would bite anyone running aarch64 under TCG via libvirt, independent of any particular firmware. **xml.rs — qemu:commandline overrides for -cpu and -accel** `pauth-impdef=on` is a QEMU property of `-cpu max`, not a libvirt `<feature>` entry. Putting it under `<cpu><feature policy='require' name='pauth-impdef'/>` is rejected by libvirt with: error: unsupported configuration: unknown CPU feature: pauth-impdef Route it instead via `<qemu:commandline>` (with the qemu namespace declared on `<domain>`). QEMU takes the LAST `-cpu` arg as authoritative, so libvirt's `-cpu max` followed by our `-cpu max,pauth-impdef=on` yields max + pauth-impdef. Same mechanism forces MTTCG: despite docs claiming QEMU ≥ 9.1 defaults to `thread=multi` on aarch64, observation on QEMU 10.2 shows cross-arch `-accel tcg` runs single-threaded (`vcpu.1.time` stays at 0 forever). Appending `-accel tcg,thread=multi` creates a real per-vcpu thread and roughly halves cold-boot wall time. Also added a `<rng model='virtio'>` device feeding host `/dev/urandom`. aarch64 cloud-init blocks minutes on first-boot SSH host-key generation without it under TCG (entropy pool never fills on its own). Cheap insurance on x86_64 too. **topology.rs — 30-min wait_for_ip budget for aarch64** Cold boot under TCG on an 8-core x86 host is 10-15 min even with virtio-rng + pauth-impdef + MTTCG. The previous 900s ceiling trips healthy boots; 1800s covers slower CI workers. **smoke-a3.sh — cleanup must pass --nvram** `virsh undefine --remove-all-storage` refuses to remove an aarch64 domain without `--nvram`, because NVRAM files aren't considered "storage." Before this, a failed run left the domain definition behind with yesterday's XML — subsequent runs would replay the stale XML (ensure_vm is idempotent and doesn't redefine when the domain already exists), masking any XML change until a manual `virsh undefine` was issued. 
Also bump REBOOT_STEPS to match the new topology-side budget. Verified: `cargo test -p harmony --lib kvm::xml` passes (26/26), including the 5 aarch64 assertions (namespace, cpu block, pflash wiring, qemu:commandline contents for both -cpu and -accel).QEMU's `virt` machine hardwires pflash unit 0 as a CFI flash device of fixed size 64 MiB. When libvirt's `<loader type='pflash'>` points at a file smaller than that, qemu refuses to start: cfi.pflash01 device '/machine/virt.flash0' requires 67108864 bytes, block backend provides 3145728 bytes Different distros ship the CODE firmware differently: - Pre-padded (upstream QEMU pc-bios/edk2-aarch64-code.fd, Debian/ Ubuntu qemu-efi-aarch64): file is exactly 64 MiB, zero-padded at the tail. Works as-is with libvirt's pflash loader. - Raw edk2 build output (Arch `edk2-aarch64 202508+`): file is ~2-4 MiB, just the firmware volume without pflash padding. Has to be padded before libvirt accepts it. Our discovery previously handed the discovered path straight to libvirt. That works on pre-padded distros and silently fails on raw-output distros. Add `ensure_code_pflash_padded` in modules/kvm/firmware.rs: - If the source is already 64 MiB, return the path unchanged — no copy, no bytes moved. - If smaller, check a cache path (pool_dir/aarch64-code-padded.fd) for a correctly-sized copy newer than the source and reuse it. - Otherwise copy + `File::set_len(64 MiB)` (sparse zero pad, one syscall), chmod 0644, return the cached path. - If larger than 64 MiB, error out — no amount of padding saves us. `ensure_vm_firmware` in topology.rs now runs the discovered code through the padder before handing it to libvirt. One padded copy per pool, reused across every aarch64 VM on that pool. 
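The four padding cases listed above can be sketched as one std-only function. This is an illustrative approximation, not the actual `ensure_code_pflash_padded`: the name `ensure_code_padded`, its signature, the `io::Error` return type, and the "not older than the source" freshness test are assumptions made for the example.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// Size QEMU's `virt` machine requires for pflash unit 0 (64 MiB).
pub const PFLASH_SIZE: u64 = 64 * 1024 * 1024;

/// Hypothetical sketch: returns a path to a CODE image of exactly
/// `PFLASH_SIZE` bytes, writing a padded copy to `cache` only when needed.
pub fn ensure_code_padded(source: &Path, cache: &Path) -> io::Result<PathBuf> {
    let src = fs::metadata(source)?;

    // Already pre-padded: hand the original path over untouched.
    if src.len() == PFLASH_SIZE {
        return Ok(source.to_path_buf());
    }
    // Larger than the flash device: padding cannot help.
    if src.len() > PFLASH_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!(
                "{} is {} bytes, larger than the {}-byte pflash device",
                source.display(),
                src.len(),
                PFLASH_SIZE
            ),
        ));
    }
    // Reuse a cached copy with the right size that is not older than the source.
    if let Ok(cached) = fs::metadata(cache) {
        if cached.len() == PFLASH_SIZE && cached.modified()? >= src.modified()? {
            return Ok(cache.to_path_buf());
        }
    }
    // Copy, set mode 0644, then extend the file with a sparse zero tail.
    fs::copy(source, cache)?;
    fs::set_permissions(cache, fs::Permissions::from_mode(0o644))?;
    OpenOptions::new().write(true).open(cache)?.set_len(PFLASH_SIZE)?;
    Ok(cache.to_path_buf())
}
```

Because `set_len` only extends the file's logical size, the padded copy stays sparse on filesystems that support holes, so the cache costs roughly the size of the original firmware volume on disk.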
Verification path: `cargo test -p harmony --lib kvm::` passes (26 tests; the XML suite is unchanged since this is runtime-only).

`wait_for_ip` returns as soon as libvirt sees a DHCP lease, but the guest may still be minutes away from accepting SSH connections: cloud-init is usually mid-firstboot (SSH host-key generation, runcmd, etc.). Any Score that SSHes in immediately after `ensure_vm` resolves races with sshd startup:

    ansible.builtin.ping failed against 192.168.122.11: UNREACHABLE!
    ssh: connect to host 192.168.122.11 port 22: Connection refused

This is painful on native KVM (seconds) and catastrophic under TCG (1-3 min between DHCP and sshd listening).

When `spec.first_boot.is_some()` (i.e. the caller asked us to run cloud-init and therefore almost certainly intends to SSH next), also block on `wait_for_tcp_port(ip, 22, budget)` before returning. The budget is reused from `wait_for_ip` (300 s x86_64 / 1800 s aarch64) because if cloud-init takes that long to bring SSH up, something is broken that a longer wait wouldn't fix. `wait_for_tcp_port` uses 1 s backoff polling with a 5 s per-attempt TCP connect timeout, so a silently dropped SYN doesn't burn half the budget on a single hung syscall.

Cases without `first_boot` (caller bringing their own pre-baked image and not expecting SSH) get the old behavior: return as soon as DHCP resolves.

@@ -0,0 +1,71 @@
apiVersion: apiextensions.k8s.io/v1

Never write YAML. This must be typed Rust. kube-rs provides all we need to declare a fully typed CRD. The same goes for the operator.yaml file.
@@ -0,0 +78,4 @@
};
obj.extensions.insert(
"x-kubernetes-validations".to_string(),
serde_json::json!([{

Why is this not part of the full CRD? Why add a bit more JSON into it after the fact? Am I missing something?
Consolidate the data types, NATS bucket names, and KV key formats that were scattered across the IoT operator, on-device agent, and harmony's podman module. Each was defined in one place and quoted / reimplemented in the others, which is exactly the kind of contract drift the roadmap v0.1 §2 called for consolidating before we start layering new features on top. New crate `iot/iot-contracts`: * score.rs — `IotScore`, `PodmanV0Score`, `PodmanService` (moved from `harmony::modules::podman::score`). Pure data, no harmony deps. * kv.rs — `BUCKET_DESIRED_STATE`, `BUCKET_AGENT_STATUS` constants, `desired_state_key(device, deployment)`, `status_key(device)`. These values used to be hard-coded in five places (agent main.rs, operator main.rs, operator/deploy/operator.yaml, smoke-a1.sh, smoke-a3.sh). Tests lock the literals so a flip can't slip. * status.rs — typed `AgentStatus { device_id, status, timestamp }`. Replaces the anonymous `serde_json::json!{}` the agent was publishing, so the operator can deserialize the heartbeat payload via a shared struct when §12 v0.1 status aggregation lands. Consumer updates: * `harmony::modules::podman::score` now holds only the `Score<T>` / `Interpret<T>` trait bindings; the pure types are re-exported from iot-contracts. Trait impls can't move because the trait lives in harmony, so this is the cleanest split. * `iot-operator-v0` uses `BUCKET_DESIRED_STATE` and `desired_state_key` — the inline `kv_key` fn now delegates so the existing internal call sites stay untouched. * `iot-agent-v0` uses `BUCKET_DESIRED_STATE`, `BUCKET_AGENT_STATUS`, `status_key`, and `AgentStatus` for the heartbeat publish. No behavior change. Tests: `cargo test -p iot-contracts` passes (8/8). Regression: `smoke-a3.sh` on x86_64 PASSes end-to-end (reboot-reconnect loop included) — wire format is byte-identical to the pre-refactor serialization. 
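For readers who have not opened `kv.rs`, the shared-key idea can be sketched as below. The bucket names and key layouts (`<device>.<deployment>`, `status.<device>`) come from the commit messages in this PR; the exact signatures are assumptions (the real helpers may take typed ids rather than `&str`), and `device_watch_filter` is a hypothetical addition that mirrors the agent's `"<id>.>"` watch subject.

```rust
/// KV bucket holding the desired state written by the operator.
pub const BUCKET_DESIRED_STATE: &str = "desired-state";
/// KV bucket holding agent heartbeats.
pub const BUCKET_AGENT_STATUS: &str = "agent-status";

/// Key under which the operator stores one deployment for one device.
pub fn desired_state_key(device: &str, deployment: &str) -> String {
    format!("{device}.{deployment}")
}

/// Key under which an agent publishes its heartbeat.
pub fn status_key(device: &str) -> String {
    format!("status.{device}")
}

/// Hypothetical helper: NATS wildcard subject matching every
/// desired-state key that belongs to one device.
pub fn device_watch_filter(device: &str) -> String {
    format!("{device}.>")
}
```

With both binaries calling the same functions, a change to the key layout becomes a single edit that either compiles everywhere or fails a literal-pinning test, instead of a silent mismatch between operator and agent.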
Next consumers on deck: operator-side status aggregation (§12 v0.1 #3) and journald log streaming (§12 v0.1 #5), both of which need shared types across the operator/agent boundary and were the reason this extraction was prioritized.

Replaces an 8-link `.arg("-t").arg("ed25519").arg("-N")…` chain with a single `.args([...])` of string literals, plus one trailing `.arg()` for the `&PathBuf` (kept separate so we don't force it through the `IntoIterator<Item=&str>` channel). No behavior change.

`AgentStatus.device_id` and `AgentStatus.timestamp` were stringly typed. Both now carry real types that prevent a whole class of wire-format typos while keeping the on-wire JSON shape intact.

**device_id: String → harmony_types::id::Id**

Agent config + heartbeat payload now share the same `Id` that the example IoT pipeline already uses for `IotDeviceSetupConfig`. Mixing a device id with a deployment name or arbitrary `String` is now a type error. `Id` is re-exported from `iot-contracts` so consumers don't need a direct `harmony_types` dependency just to name the field.

To keep the wire format byte-compatible, `harmony_types::Id` gains `#[serde(transparent)]`. Audit: no consumer in the tree relies on the previous `{"value": "…"}` shape (`Id` is persisted by sqlite via `to_string()`, never serialized directly), so this is a latent-bug fix more than a behavior change.

**timestamp: String → chrono::DateTime<Utc>**

The agent was calling `chrono::Utc::now().to_rfc3339()` and stuffing the String into the payload. It now holds a real `DateTime<Utc>`, which serde-serializes as RFC 3339 anyway. The smoke script's reboot-gate lexical comparison still works: the time-digit prefixes decide the comparison before the difference between the trailing `Z` (chrono default) and `+00:00` (prior format) comes into play.

**Plumbing**

- `iot/iot-agent-v0/src/config.rs`: `AgentSection.device_id: Id`. TOML deserializes the bare string thanks to `#[serde(transparent)]`.
- `iot/iot-agent-v0/src/main.rs`: `watch_desired_state` and `report_status` take `Id` instead of `String`. - `iot/iot-contracts/Cargo.toml`: adds `harmony_types` path dep and `chrono = { workspace, features = ["serde"] }`. **Verification** - `cargo test -p iot-contracts`: 8/8 passes. New assertions pin the wire format: `"device_id":"pi-01"` (not `{"value":"pi-01"}`) and `"timestamp":"2026-04-21T18:15:42Z"` (RFC 3339). - x86_64 smoke-a3.sh PASSes end-to-end including the reboot- reconnect loop — wire format remains compatible with the existing smoke-script parsing.v0 walking skeleton is substantially done (CRD → operator → NATS KV → on-device agent → podman reconcile; VM-as-device for x86_64 and aarch64 via TCG; power-cycle resilience; operator install via Score instead of yaml/kubectl). Time to switch the `ROADMAP/iot_platform` folder from "plan to build the skeleton" to "plan to build on top of the skeleton." - **NEW** `ROADMAP/iot_platform/v0_1_plan.md` — the authoritative forward plan. Five chapters in execution order: 1. Hands-on end-to-end demo the user can drive by hand (imminent, fully detailed: composed smoke, typed-Rust CR applier, natsbox command menu, in-cluster NATS). 2. Status reflect-back + inventory (enrich `AgentStatus`, operator aggregates into `.status.aggregate`). 3. Helm chart packaging (ArgoCD deferred — user's clusters have it already, bringing it into the smoke adds no validation value). 4. Zitadel + OpenBao + per-device auth. 5. Frontend (web / CLI / TUI — deferred). Chapters 2-5 are sketched; they expand to their own docs as each becomes the active chapter. - **EDIT** `ROADMAP/iot_platform/v0_walking_skeleton.md` — add a SHIPPED banner at the top pointing at v0_1_plan.md. Keep the 707-line design diary intact as archaeology; don't rewrite history. 
- Incorporates the post-v0 architectural principles that emerged from review (no yaml in framework paths, minimal ad-hoc topologies, cross-boundary types in harmony-reconciler-contracts, verify before blaming upstream).Roadmap §12.6 ("topology proliferation") is partially resolved by extracting the ad-hoc InstallTopology from iot-operator-v0/install.rs into harmony as a reusable shared type, now that a second consumer (NatsBasicScore, landing next) makes the extraction genuinely load-bearing rather than speculative. What's new: - harmony/src/modules/k8s/bare_topology.rs — K8sBareTopology carries one K8sClient, implements K8sclient + Topology (noop ensure_ready). Constructors: from_client(name, client) for callers building their own client, from_kubeconfig(name) for callers reading the standard KUBECONFIG chain. - modules::k8s::K8sBareTopology re-export. What's gone: - iot-operator-v0/src/install.rs: the ~30-line InstallTopology struct + its async_trait-decorated impls. The crate also drops async-trait and harmony-k8s as direct deps (neither is used now that the topology is shared). - Long "architectural smell" comment from install.rs — the smell is fixed; the explanation belongs at the shared type now (with the history captured in its module doc). Behavior-preserving. cargo check --all-targets --all-features clean. smoke-a1 wire path unchanged. Compounding-value move: every future Score that needs "apply a typed resource against an existing cluster" consumes K8sBareTopology instead of inventing its own Topology impl. That's the pattern v0 Harmony's design is meant to encourage.Harmony's existing NATS story starts at `NatsK8sScore`, which is designed for production multi-site superclusters: TLS-fronted gateways, cert-manager-minted certs, ingress + Route, helm chart with gateway merge blocks, NatsAdmin secret prompts. All of that is overhead for a local smoke or a single-site decentralized deployment that just needs a live JetStream server. 
Add `NatsBasicScore` beside it. Deliberately minimal: - Single replica - Official `nats:*-alpine` image via typed k8s_openapi Deployment - JetStream (-js) on by default, toggle via builder setter - Namespace created if missing - Service: ClusterIP by default, or NodePort via `.node_port(port)` for off-cluster clients (e.g. a libvirt VM connecting through the host's loadbalancer port) Trait bounds are just `Topology + K8sclient` — no `HelmCommand`, no `TlsRouter`, no `Nats` capability. Composes cleanly with `K8sBareTopology` (added in the previous commit) so consumers can `score.create_interpret().execute(&inventory, &topology)` against any cluster `KUBECONFIG` points at. Constructed via a small builder: NatsBasicScore::new("iot-nats", "iot-system") .node_port(4222) .jetstream(true) Under the hood the interpret runs three `K8sResourceScore`s in sequence (namespace → deployment → service). No new machinery — just composition of existing primitives. Deliberately NOT in scope for this Score: - TLS / PKI — use NatsK8sScore when you need those - Gateways / supercluster — use NatsSuperclusterScore - Auth (user/password or JWT) — add a ConfigMap mount when the Chapter 4 auth work lands Tests (4, all passing): default is ClusterIP; node_port() flips Service to NodePort with the right nodePort field; jetstream() toggle controls the `-js` arg. Part of the "compound framework value" mindset: every future Score that wants a local NATS now points at this one type instead of inventing its own yaml.Replaces what would otherwise be a yaml fixture for the hands-on demo. The CRD is already fully typed (DeploymentSpec + ScorePayload + PodmanV0Score + Rollout), so the applier uses those types directly, constructs the CR via kube::Api, and either applies it server-side or prints the JSON for `kubectl apply -f -`. 
CLI: iot_apply_deployment \ --namespace iot-demo \ --name hello-world \ --target-device iot-smoke-vm \ --image docker.io/library/nginx:latest \ --port 8080:80 # apply iot_apply_deployment --image nginx:1.26 # upgrade (same name, new img) iot_apply_deployment --delete # tear down iot_apply_deployment --print ... # JSON to stdout → kubectl -f - Uses server-side apply (PatchParams::apply().force()) so repeated invocations patch the existing CR cleanly — the upgrade path the demo exercises. To expose the CRD types to an external consumer, iot-operator-v0 gains a thin `src/lib.rs` that re-exports the `crd` module. The binary target now imports from the library (`use iot_operator_v0::crd;`) instead of declaring its own `mod crd;` — avoids compiling the types twice. No change in operator runtime behavior. Part of the ROADMAP/iot_platform/v0_1_plan.md Chapter 1 work.Small CLI that installs a single-node NATS server into the cluster KUBECONFIG points at, using harmony's `NatsBasicScore` composed against `K8sBareTopology`. This is the glue between `smoke-a4.sh` and the framework Score: cargo run -q -p example_iot_nats_install -- \ --namespace iot-system \ --name iot-nats \ --node-port 4222 Defaults cover the demo exactly: iot-system namespace, NodePort 4222 so the libvirt VM agent can reach NATS through the k3d loadbalancer port mapping. No reinvented topology, no hand-rolled yaml, no helm shell-out. The actual work (Namespace + Deployment + Service with the right selector/ports/probes) lives inside `NatsBasicScore::Interpret` in harmony where it can be reused by any future consumer. Part of ROADMAP/iot_platform/v0_1_plan.md Chapter 1.Composed demo that brings up operator + in-cluster NATS + ARM (or x86) VM agent, then either hands the full stack off to the user with a command menu (default) or drives an apply + upgrade + delete regression loop (`--auto`). Phases: 1. k3d cluster with NATS port exposed via `-p 4222:4222@loadbalancer`. 2. 
NATS in-cluster via the new `example_iot_nats_install` binary → `NatsBasicScore` → typed k8s_openapi Namespace + Deployment + NodePort Service. 3. CRD install via `iot-operator-v0 install` (Score-based, no yaml). 4. Operator spawned host-side, connects to nats://localhost:4222. 5. VM provisioned via `example_iot_vm_setup` (reused from smoke-a3); agent inside the VM connects to nats://<libvirt-gateway>:4222. 6. Sanity: NATS pod Running, agent heartbeat `status.<device>` present in `agent-status` bucket. 7a. DEFAULT: print a command menu (kubectl watch, typed Rust applier, ssh/console, natsbox one-liners, curl) and block on Ctrl-C with a cleanup trap tearing everything down. 7b. `--auto`: apply nginx:latest, wait for container on the VM, curl, upgrade to nginx:1.26, assert container id CHANGED, curl, delete, assert container gone. Prereqs documented at the top of the script. Handles both x86-64 (native KVM) and aarch64 (TCG emulation) via `ARCH=` env. Design notes captured in ROADMAP/iot_platform/v0_1_plan.md. Uses every piece landed in this branch so far: K8sBareTopology, NatsBasicScore, the typed CR applier, the Score-based CRD install.Kubernetes NodePort Services must use a port in the apiserver's configured nodeport range (default 30000-32767). NatsBasicScore's first cut accepted any port via `.node_port(port)`, which was fine for strict use of the capital-N NodePort Service type, but made the demo's "use NATS client port 4222 directly from the host" story awkward. Replace the `node_port: Option<i32>` field with a proper `NatsServiceType` enum (ClusterIP | NodePort(i32) | LoadBalancer). Three builder methods — one per variant. LoadBalancer is the right idiom for the demo: k3d's built-in `klipper-lb` fronts LoadBalancer Services on their `port` (not their nodePort), so `k3d cluster create -p 4222:4222@loadbalancer` delivers external traffic straight to the Service's client port. No nodeport range juggling. 
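The enum-plus-builder shape described above might look roughly like this. It is a std-only sketch: the real `NatsBasicScore` produces typed `k8s_openapi` resources, whereas here `k8s_type`, `pinned_node_port`, `server_args`, and `in_default_node_port_range` are hypothetical accessors added only to make the behaviour visible. The range check uses the default apiserver range quoted in the commit message.

```rust
/// How the NATS Service is exposed (variant names follow the commit message).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum NatsServiceType {
    #[default]
    ClusterIP,
    NodePort(i32),
    LoadBalancer,
}

impl NatsServiceType {
    /// Value for the Service's `spec.type` field.
    pub fn k8s_type(self) -> &'static str {
        match self {
            Self::ClusterIP => "ClusterIP",
            Self::NodePort(_) => "NodePort",
            Self::LoadBalancer => "LoadBalancer",
        }
    }

    /// Only the NodePort variant pins a nodePort; otherwise the apiserver decides.
    pub fn pinned_node_port(self) -> Option<i32> {
        match self {
            Self::NodePort(port) => Some(port),
            _ => None,
        }
    }
}

/// Hypothetical helper: is `port` inside the default nodeport range?
pub fn in_default_node_port_range(port: i32) -> bool {
    (30000..=32767).contains(&port)
}

#[derive(Debug, Clone)]
pub struct NatsBasicScore {
    pub name: String,
    pub namespace: String,
    pub service_type: NatsServiceType,
    pub jetstream: bool,
}

impl NatsBasicScore {
    /// ClusterIP and JetStream-on are the defaults.
    pub fn new(name: &str, namespace: &str) -> Self {
        Self {
            name: name.to_string(),
            namespace: namespace.to_string(),
            service_type: NatsServiceType::default(),
            jetstream: true,
        }
    }
    pub fn cluster_ip(mut self) -> Self {
        self.service_type = NatsServiceType::ClusterIP;
        self
    }
    pub fn node_port(mut self, port: i32) -> Self {
        self.service_type = NatsServiceType::NodePort(port);
        self
    }
    pub fn load_balancer(mut self) -> Self {
        self.service_type = NatsServiceType::LoadBalancer;
        self
    }
    pub fn jetstream(mut self, enabled: bool) -> Self {
        self.jetstream = enabled;
        self
    }
    /// Extra server arguments: `-js` only when JetStream is enabled.
    pub fn server_args(&self) -> Vec<&'static str> {
        if self.jetstream { vec!["-js"] } else { Vec::new() }
    }
}
```

Modelling the exposure mode as an enum makes the invalid combination (a LoadBalancer or ClusterIP Service with a pinned nodePort) unrepresentable, which an `Option<i32>` field alongside a separate type flag would not.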
Signatures: NatsBasicScore::new(name, namespace) // ClusterIP default .node_port(30422) // NodePort(30422) .load_balancer() // LoadBalancer .jetstream(true) .image("docker.io/library/nats:2.10-alpine") Tests: 5 pass. New assertion: `load_balancer()` produces a Service with type LoadBalancer and no pinned nodePort (apiserver assigns). Consumers: - `example_iot_nats_install` gets a `--expose {cluster-ip | node-port | load-balancer}` flag (default `load-balancer` since that's what the demo wants). The legacy `--node-port N` flag survives as the NodePort port value. - `smoke-a4.sh` asks for `--expose load-balancer`, matching its `-p 4222:4222@loadbalancer` k3d port mapping.Ubuntu 24.04 `useradd --system` does not allocate `/etc/subuid` + `/etc/subgid` ranges. Rootless podman silently fails on image-layer unpack: potentially insufficient UIDs or GIDs available in user namespace (requested 0:42 for /etc/gshadow): ... lchown /etc/gshadow: invalid argument `smoke-a1.sh` didn't hit this because it runs the agent on the *host* user, which has subuid/subgid populated by default. `smoke-a4.sh` drives a podman pull inside the VM — the FIRST time we actually exercise rootless-podman-on-a-fresh-system, and the failure surfaces immediately. The fix belongs in harmony, not in ad-hoc cloud-init scripts. Add `UnixUserManager::ensure_subordinate_ids` alongside the existing `ensure_user` + `ensure_linger` methods: - `domain/topology/host_configuration.rs`: new trait method. Doc explains why every rootless-container-runtime consumer needs it. - `modules/linux/ansible_configurator.rs`: impl follows `ensure_linger`'s pattern — a grep probe on /etc/subuid+/etc/subgid, then a single `usermod --add-subuids 100000-165535 --add-subgids 100000-165535` only when missing. Idempotent, no-ops on re-run. - `modules/linux/topology.rs`: forwarder for `LinuxHostTopology`. - `modules/iot/setup_score.rs`: call the new method right after `ensure_linger` in `IotDeviceSetupScore`. 
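The probe-then-fix logic behind `ensure_subordinate_ids` can be illustrated with two pure functions. This is a sketch under assumptions: the actual implementation runs a grep probe and `usermod` on the remote host, while here the file contents are passed in as strings, and the names `has_subordinate_range` and `usermod_command_if_missing` are hypothetical. It also only matches entries keyed by login name, not by numeric UID.

```rust
/// Returns true when `contents` (the text of /etc/subuid or /etc/subgid)
/// has a well-formed `name:start:count` entry for `user` with count > 0.
pub fn has_subordinate_range(contents: &str, user: &str) -> bool {
    contents.lines().any(|line| {
        let mut fields = line.trim().split(':');
        matches!(
            (fields.next(), fields.next(), fields.next()),
            (Some(name), Some(start), Some(count))
                if name == user
                    && start.parse::<u64>().is_ok()
                    && count.parse::<u64>().map_or(false, |c| c > 0)
        )
    })
}

/// Hypothetical helper: the command to run, or None when both files
/// already carry a range for `user` (the idempotent no-op path).
pub fn usermod_command_if_missing(subuid: &str, subgid: &str, user: &str) -> Option<String> {
    if has_subordinate_range(subuid, user) && has_subordinate_range(subgid, user) {
        return None;
    }
    Some(format!(
        "usermod --add-subuids 100000-165535 --add-subgids 100000-165535 {user}"
    ))
}
```

Returning `None` on the already-configured path is what lets the caller report "no change" on a re-run, in line with the idempotency the commit message describes.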
Any future consumer that runs rootless podman reaches for the same primitive. Verified: `cargo check --all-features` clean. End-to-end smoke-a4 regression pending (re-running after this commit).Docker Hub's unauthenticated rate limit (100 pulls per 6h per IP, counted per-manifest-query) is the most reliable way for a CI-style smoke loop to produce false negatives. The NATS pod failing with '429 Too Many Requests' after a handful of runs today was that — not a real regression. Fix inside the smoke: before running the install Score, sideload the NATS image into the k3d cluster via a podman→docker→k3d bridge: - If the image isn't already in docker's store: - If it's not in podman's store either, podman pull (this is the one-time hit we can't avoid). - podman save → docker load. - k3d image import into the cluster's containerd. Steady-state this is a few-hundred-ms operation (no Hub calls, no registry traffic). Require docker in the preflight list since we depend on it for the cross-runtime bridge. Also bump the Available-wait from 60 s to 120 s — the post-import pod spin-up is fast but the scheduler + loadbalancer update take longer than I initially budgeted. VM-side nginx pulls are still at Hub's mercy; addressing that requires either (a) docker login before the smoke, (b) an authenticated registry mirror, or (c) arch-specific image pre-seeding into the VM. All Chapter-2+ follow-ups.Chapter 2 groundwork. The on-wire AgentStatus the agent publishes every 30 s was only carrying device_id + status + timestamp — not enough for the operator to answer "how are my deployments doing." Enrich it so the operator can aggregate into a useful DeploymentStatus.aggregate subtree on the CR (second commit). **harmony-reconciler-contracts/src/status.rs** - `AgentStatus.deployments: BTreeMap<String, DeploymentPhase>` — keyed by deployment name (CR's metadata.name). Each phase carries `{ phase: Running|Failed|Pending, last_event_at, last_error }`. 
- `AgentStatus.recent_events: Vec<EventEntry>` — ring buffer of the most recent reconcile events on this device. Each entry is `{ at, severity: Info|Warn|Error, message, deployment: Option }`. Bounded agent-side to keep JetStream per-message size sane. - `AgentStatus.inventory: Option<InventorySnapshot>` — hostname, arch, os, kernel, cpu_cores, memory_mb, agent_version. Published once on startup. - All three new fields are `#[serde(default)]` — mixed-fleet upgrades don't break: an old agent's payload deserializes into the new struct (deployments empty, events empty, inventory None); a new agent's payload deserializes into an old operator just losing the fields. New tests (kept forward-compat front and center): - `minimal_status_roundtrip` — empty maps / None - `enriched_status_roundtrip` — full population - `old_wire_format_parses_into_enriched_struct` — pre-Chapter-2 payload must still parse (the upgrade guarantee) - `wire_keys_present` — literal wire-format pins for smoke greps **iot-agent-v0** Reconciler gains a `StatusState { deployments, recent_events }` side map with a bounded ring buffer (`EVENT_RING_CAP = 32`). Every code path that changes deployment state now also records phase + event: - `apply()`: Pending → Running on success, Failed + error event on failure. - `remove()`: drops phase, emits "deployment deleted" info event. - `tick()` (periodic reconcile): keeps phase at Running on noop; flips to Failed + event on error (deliberately no event on successful no-change ticks — 30 s cadence would drown the ring). New helper `deployment_from_key(key)` unwraps `<device>.<deployment>` into just the deployment name. `short(s)` truncates error strings to 512 chars so the payload stays well under NATS JetStream limits. `report_status()` in main.rs now snapshots the reconciler's status state on every heartbeat and publishes the full enriched payload alongside a startup-captured InventorySnapshot. 
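The bounded event ring and the two helpers mentioned above can be sketched with the standard library alone. Treat this as an approximation: the real `EventEntry` carries `at`, `severity`, `message`, and `deployment`, while this sketch stores plain strings; `EventRing` and `MAX_ERROR_CHARS` are hypothetical names; and `deployment_from_key` assumes device ids contain no `.` so that the first dot separates device from deployment.

```rust
use std::collections::VecDeque;

/// Maximum number of events kept per device (value from the commit message).
pub const EVENT_RING_CAP: usize = 32;
/// Maximum characters kept from an error string (value from the commit message).
pub const MAX_ERROR_CHARS: usize = 512;

/// Truncates on a character boundary, so multi-byte UTF-8 is never split.
pub fn short(s: &str) -> String {
    s.chars().take(MAX_ERROR_CHARS).collect()
}

/// Extracts the deployment name from a `<device>.<deployment>` key.
pub fn deployment_from_key(key: &str) -> Option<&str> {
    key.split_once('.')
        .map(|(_, deployment)| deployment)
        .filter(|deployment| !deployment.is_empty())
}

/// Simplified ring buffer: the oldest event is dropped once the cap is hit.
#[derive(Debug, Default)]
pub struct EventRing {
    events: VecDeque<String>,
}

impl EventRing {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn push(&mut self, message: &str) {
        if self.events.len() == EVENT_RING_CAP {
            self.events.pop_front();
        }
        self.events.push_back(short(message));
    }

    /// Oldest-to-newest copy, suitable for embedding in a heartbeat payload.
    pub fn snapshot(&self) -> Vec<String> {
        self.events.iter().cloned().collect()
    }
}
```

Capping both the number of events and the length of each message gives the heartbeat payload a fixed upper bound, regardless of how noisy a failing deployment becomes.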
Inventory reads `/proc/sys/kernel/osrelease` + `/proc/meminfo` + `std::env::consts::ARCH` with graceful fallbacks, so no new sys-info crate dep is needed.

Verified: `cargo test -p harmony-reconciler-contracts --lib` 7/7 green (5 new). Operator consumption of the new fields lands in the next commit.

The operator watches the `agent-status` bucket, keeps a per-device snapshot in memory, and folds it into each Deployment CR's `.status.aggregate` subtree every 5 seconds. The answer to the user's stated requirement ("CRD .status reflect-back: per-device succeeded/failed counts + recent log lines") now lives in the CR itself, observable via `kubectl get -o jsonpath` or any UI that speaks k8s status subresources.

**Shape (in iot/iot-operator-v0/src/crd.rs)**

    DeploymentStatus {
      observed_score_string,   // unchanged; controller change-detect
      aggregate: Option<{
        succeeded: u32,        // devices with Phase::Running
        failed: u32,           // devices with Phase::Failed
        pending: u32,          // devices with Phase::Pending or
                               // reported-but-no-phase-entry-yet
        unreported: u32,       // target devices that never heartbeated
        last_error: Option<{   // most recent failing device + short msg
          device_id, message, at
        }>,
        recent_events: Vec<{   // last-N events across the fleet, newest first
          at, severity, device_id, message, deployment
        }>,
        last_heartbeat_at,     // freshness signal for the whole fleet
      }>
    }

**New module** `iot/iot-operator-v0/src/aggregate.rs`

- `watch_status_bucket`: subscribes to `status.>` on the agent-status bucket, maintains a `BTreeMap<device_id, AgentStatus>` in memory. Malformed payloads + malformed keys log-and-skip; the snapshot map is always the latest good shape.
- `aggregate_loop`: 5 s ticker. Per tick: list Deployment CRs, clone the snapshot (no lock held across network calls), compute each CR's aggregate, JSON-Merge-Patch `.status.aggregate`. Merge patch composes cleanly with the controller's `observedScoreString` patch: neither clobbers the other.
- `compute_aggregate` pure fn: classification logic is in one place, four unit tests pin its behaviour (counts + unreported, reported-but-no-phase-entry = pending, event filter matches deployment name only, status-key parser).

**Operator wiring** (`main.rs`)

`run()` now opens *both* KV buckets at startup, spawns the controller and the aggregator concurrently via `tokio::select!`. Either returning an error tears the process down — kube-rs's Controller already absorbs transient reconcile errors internally, so anything escaping is genuinely fatal.

**Controller tweak**

The apply path's `patch_status` was rebuilding the whole `DeploymentStatus` struct, which would clobber the aggregator's writes. Switched to raw JSON-Merge-Patch for the `observedScoreString` field only. Behaviour preserved, aggregate subtree left intact.

**Smoke assertion** (smoke-a4.sh --auto)

After apply + curl succeeds, the --auto path now asserts `kubectl get deployment.iot.nationtech.io ... -o jsonpath='{.status.aggregate.succeeded}'` reaches 1 within 60 s. Proves the full agent → status bucket → operator aggregate → CRD status loop, end to end.

Verified locally: `cargo test -p iot-operator-v0 --lib` 4/4 green, `cargo check --all-targets --all-features` clean.

Two changes that compose into one win: the smoke no longer needs a functional Docker Hub to exercise the agent → podman → container loop.

**harmony/src/modules/podman/topology.rs — IfNotPresent for image pull**

`PodmanTopology::ensure_service_running` was calling `podman pull` on every reconcile, even when the image was already in the local store. For a long-lived device agent reconciling against a public registry, that's a guaranteed rate-limit collision: Docker Hub caps unauthenticated pulls at 100 manifests per 6 h per IP, and an agent ticking every 30 s chews through that allowance in under an hour (100 pulls × 30 s = 50 min).

Change the pull path to check the local store first:

    if images.get(image).exists().await? {
        return Ok(());
    }
    // else: pull

Matches Kubernetes' `imagePullPolicy: IfNotPresent` semantics. Correct default for the IoT platform: upgrades change the image STRING (tag or digest), so they still hit the pull branch — "use local if available, pull the new thing if the reference changed."

**iot/scripts/smoke-a4.sh — tarball sideload in place of registry**

An earlier iteration of this smoke stood up a local `registry:2` container and pushed tagged images into it. That pattern itself needs to pull `registry:2` from Docker Hub — cute demo, still Hub-dependent. Gone now.

New phase 4.5 / 5c pair:

    4.5: podman save the cached `nginx:alpine` under two local tags (`localdev/nginx:v1`, `localdev/nginx:v2`) into a tarball on the host.
    5c:  scp the tarball to the VM, `podman load` it into the iot-agent user's rootless store.

Paired with the new IfNotPresent semantics, the agent's reconcile sees both images already present and never touches a registry. The upgrade test still works because `v1` and `v2` are distinct tag strings → spec drift → container id changes.

Dropped the `docker` preflight (no more k3d-side registry transfer) and the `LOCAL_REGISTRY_*` env vars.

Verified end-to-end: x86 smoke-a4 --auto PASS.
- apply v1 → container up → curl 200
- .status.aggregate.succeeded = 1 (Chapter 2 aggregator working)
- apply v2 → container id changes (upgrade confirmed)
- delete → container removed

Aarch64 run next.

push_str("…") → push('…'), and drop redundant .trim() before .split_whitespace() in /proc/meminfo parsing.

First milestone of the aggregation rework. Lands the contract layer without any runtime side effects: the agent + operator still run their legacy paths unchanged.

New types (module `fleet`):
- DeviceInfo: routing labels + inventory, rewritten on label change. Stored in KV `device-info` at `info.<device_id>`.
- DeploymentState: current phase per (device, deployment). Stored in KV `device-state` at `state.<device>.<deployment>`.
  Authoritative snapshot; operator rebuilds counters from it on cold-start.
- HeartbeatPayload: tiny liveness ping in KV `device-heartbeat`. Payload capped by a test (< 96 bytes) so it stays cheap at 1M-device rates.
- StateChangeEvent: `from: Option<Phase>, to: Phase, sequence` emitted on each transition to JS stream `device-state-events` on subject `events.state.<device>.<deployment>`. Operator folds these events into in-memory counters.
- LogEvent: shorter-retention user-facing event log to JS stream `device-log-events` on subject `events.log.<device>`.

Transport constants + key/subject helpers in `kv` with cross-component wire-stability tests so a rename here gets caught.

10 new tests (roundtrip serde, forward-compat parse, size bound, key/subject format). Legacy `AgentStatus` tests + constants stay green; retirement is scheduled for M8 once the live path has switched over.

Agent now writes the new per-concern KV shapes + event streams alongside the legacy AgentStatus. Nothing consumes the new data yet — the legacy aggregator still drives CR .status from `agent-status`. M3 will add the operator-side cold-start + consumer paths in parity mode; M5 flips the CR-patch source once counters verify against the legacy aggregator.

New module `fleet_publisher.rs` owns:
- Opening + idempotent-creating the three new KV buckets (`device-info`, `device-state`, `device-heartbeat`) and two JetStream streams (`device-state-events`, `device-log-events`).
- Publish methods for DeviceInfo, HeartbeatPayload, DeploymentState (KV put), StateChangeEvent + LogEvent (stream publish), and delete for deployment-state cleanup.
- Log-and-swallow failure mode. The operator re-walks KV on cold-start, so a missed event publish is self-healing on the next transition or operator restart.
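For orientation, the key and subject layout listed for the M1 contract types can be pinned with small helpers like the ones below. This is a sketch only: the bucket names, stream names, and string formats come from the commit messages, while the constant and function names are hypothetical and need not match what the `kv` module actually exports.

```rust
// Bucket and stream names as quoted in the commit messages.
// The constant names themselves are hypothetical.
const BUCKET_DEVICE_INFO: &str = "device-info";
const BUCKET_DEVICE_STATE: &str = "device-state";
const STREAM_STATE_EVENTS: &str = "device-state-events";
const STREAM_LOG_EVENTS: &str = "device-log-events";

/// KV key for a device's DeviceInfo record: `info.<device_id>`.
fn device_info_key(device_id: &str) -> String {
    format!("info.{device_id}")
}

/// KV key for one (device, deployment) phase: `state.<device>.<deployment>`.
fn device_state_key(device_id: &str, deployment: &str) -> String {
    format!("state.{device_id}.{deployment}")
}

/// Stream subject for a phase transition: `events.state.<device>.<deployment>`.
fn state_event_subject(device_id: &str, deployment: &str) -> String {
    format!("events.state.{device_id}.{deployment}")
}

/// Stream subject for the user-facing event log: `events.log.<device>`.
fn log_event_subject(device_id: &str) -> String {
    format!("events.log.{device_id}")
}
```

Asserting the literal output of each helper in a unit test is one way to get the "a rename here gets caught" property the commit describes.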
Reconciler grew:
- `device_id`: Id + `fleet`: Option<Arc<FleetPublisher>>
- per-(deployment) monotonic sequence counter in StatusState
- `set_phase` detects actual transitions (prev_phase vs new) and emits a DeploymentState KV write + StateChangeEvent stream publish only on change. No-op re-confirmation still bumps the sequence (lets operator detect duplicate events via sequence comparison) but stays off the wire.
- `drop_phase` deletes the device-state KV entry.
- `push_event` also publishes a LogEvent to the stream.

main.rs:
- Builds FleetPublisher after connect_nats, passes into Reconciler.
- Publishes DeviceInfo once at startup (empty labels — populated by the selector-targeting branch once it merges).
- Spawns a heartbeat loop on 30 s cadence.
- Legacy `report_status` AgentStatus task kept running unchanged.

8 unit tests added for the transition-detection + sequence + ring-buffer invariants (drive set_phase / drop_phase / push_event with fleet: None). 18 contract tests from M1 still green.

New module `fleet_aggregator` spawns a 5 s tick task that:
- Walks the Chapter 4 KV buckets (`device-info`, `device-state`) every tick.
- Computes per-CR phase counters via `compute_counters` (pure function, unit tested).
- Computes the legacy aggregator's counts from the same `agent-status` snapshot map the legacy task is already maintaining.
- Compares the two per CR and logs per-tick at DEBUG level (matches) or WARN (mismatches), with running totals at INFO every 60 s.

Explicit `cr_targets_device` predicate is the one-line plug point for the selector-based rewrite coming from the review-fix branch: swap `target_devices.contains()` for `target_selector.matches(&info.labels)`, everything else in the aggregator is label/selector-agnostic.

Refactored `aggregate::run` to accept the `StatusSnapshots` map from outside so the parity-check task reads the same agent-status view the legacy aggregator writes to.
Added `aggregate::new_snapshots()` helper so `main` owns the one shared Arc.

The task is strictly read-only: no CR patches, no side effects. M5 flips `.status.aggregate` over to the new counter-driven path once M4 replaces the periodic re-walk with the event-stream consumer and the parity check has stayed green under load.

5 unit tests cover the pure counter logic (target match, multi-CR fan-in, zero-target CR, phase dispatch).

Replaces M3's per-tick KV re-walk with an incremental JetStream consumer on `device-state-events`. Cold-start still walks KV once to seed counters; steady state consumes events and applies `from -= 1; to += 1` diffs.

New in `fleet_aggregator`:

FleetState (shared via Arc<Mutex<_>>):
- counters: per-deployment phase counts.
- phase_of: per-(device, deployment) current phase, for duplicate + resync detection.
- latest_sequence: per-(device, deployment) highest sequence applied, drops stale and duplicate deliveries.
- deployment_namespace: name → namespace map refreshed each parity tick from the CR list (events carry only the deployment name, matching the `<device>.<deployment>` KV key format).

apply_state_change_event():
- Idempotent for duplicate sequence numbers.
- Idempotent for out-of-order lower-sequence events.
- On from-phase disagreement with our belief, trusts the event and re-syncs (logs warn — parity check will catch any resulting drift against the legacy aggregator).
- Counter decrement saturates at zero so replays can't underflow.

run_event_consumer():
- Durable JetStream pull consumer on STATE_EVENT_WILDCARD, DeliverPolicy::New (cold-start already seeded state from KV — replaying from the beginning would double-count).
- Explicit ack; malformed payloads are logged + acked to avoid infinite redelivery.

parity_tick() no longer walks KV — it reads live counters from the shared FleetState and compares with the legacy aggregator's per-CR fold. Same match/mismatch/running-totals logging as M3.
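The event-apply invariants listed for `apply_state_change_event()` can be illustrated with a std-only sketch. Everything here is a simplified stand-in rather than the operator's code: the `Outcome` enum, the `PhaseCounters` layout, and the field types are assumptions. One interpretation is also assumed: on a from-phase disagreement the sketch decrements the phase it had actually counted for that pair and adopts the event's `to`, which keeps the counters consistent with `phase_of`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Phase {
    Pending,
    Running,
    Failed,
}

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct PhaseCounters {
    pending: u32,
    running: u32,
    failed: u32,
}

impl PhaseCounters {
    fn slot(&mut self, phase: Phase) -> &mut u32 {
        match phase {
            Phase::Pending => &mut self.pending,
            Phase::Running => &mut self.running,
            Phase::Failed => &mut self.failed,
        }
    }
}

struct StateChangeEvent {
    device: String,
    deployment: String,
    from: Option<Phase>,
    to: Phase,
    sequence: u64,
}

/// Hypothetical result type so callers (and tests) can see what happened.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Ignored,  // duplicate or stale sequence
    Applied,  // event agreed with our belief
    Resynced, // event's `from` disagreed; we trusted the event
}

#[derive(Default)]
struct FleetState {
    counters: HashMap<String, PhaseCounters>,
    phase_of: HashMap<(String, String), Phase>,
    latest_sequence: HashMap<(String, String), u64>,
}

impl FleetState {
    fn apply_state_change_event(&mut self, ev: &StateChangeEvent) -> Outcome {
        let key = (ev.device.clone(), ev.deployment.clone());
        // Drop duplicates and out-of-order deliveries.
        if let Some(&seen) = self.latest_sequence.get(&key) {
            if ev.sequence <= seen {
                return Outcome::Ignored;
            }
        }
        self.latest_sequence.insert(key.clone(), ev.sequence);

        let counters = self.counters.entry(ev.deployment.clone()).or_default();
        // Record the new phase and learn what we previously believed.
        let believed = self.phase_of.insert(key, ev.to);
        if let Some(prev) = believed {
            // Saturating decrement so replays cannot underflow.
            let slot = counters.slot(prev);
            *slot = slot.saturating_sub(1);
        }
        *counters.slot(ev.to) += 1;

        if believed == ev.from {
            Outcome::Applied
        } else {
            Outcome::Resynced
        }
    }
}
```

Returning an outcome instead of logging inside the function keeps it pure, which is what makes the duplicate, out-of-order, and resync cases cheap to unit test.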
8 new unit tests cover the event-apply invariants: first transition (no from), transition (from+to), duplicate sequence, out-of-order sequence, from-disagreement resync, unknown-deployment ignore, cold-start seeding, underflow saturation. Plus the 5 M3 tests from before — 13 aggregator tests total, all green.

`bucket.watch_all_from_revision(0)` sends the JetStream consumer request with DeliverByStartSequence and an optional-missing start sequence, which the server rejects with error 10094:

    consumer delivery policy is deliver by start sequence, but optional start sequence is not set

`watch_with_history(">")` uses DeliverPolicy::LastPerSubject instead — replays the current value of every key, then streams live updates. Same cold-start-plus-steady-state semantics, correct wire.

Caught by smoke-a4 --auto: state watcher exited immediately on startup, no deployments ever reconciled.

Kills the "CRD owns a list of device ids" smell. Deployment CR now carries a standard K8s LabelSelector; Device is a first-class cluster-scoped CR (like Node). Matching, desired-state KV writes, and status aggregation all run off selector evaluation against the Device cache — no list of device ids anywhere in the CRD spec.

Cross-resource model:
- Agent publishes DeviceInfo (with labels) to NATS `device-info` KV.
- device_reconciler watches that bucket → server-side-applies a cluster-scoped Device CR with metadata.labels + spec.inventory.
- Deployment controller is now just validation + finalizer cleanup.
- fleet_aggregator watches Deployment CRs + Device CRs + device-state KV, maintains in-memory selector → target device sets, writes/deletes `desired-state.<device>.<deployment>` KV on match changes, patches `.status.aggregate` at 1 Hz with matchedDeviceCount + phase counters.
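Selector evaluation against a device's labels follows the standard Kubernetes rules: every `matchLabels` pair and every `matchExpressions` requirement must hold. The sketch below shows those rules with simplified stand-in types; it is not the `k8s_openapi` `LabelSelector` (which uses optional fields and string operators), and the `matches` method name is an assumption.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the operator strings "In", "NotIn", "Exists", "DoesNotExist".
enum Operator {
    In,
    NotIn,
    Exists,
    DoesNotExist,
}

struct Requirement {
    key: String,
    operator: Operator,
    values: Vec<String>,
}

#[derive(Default)]
struct LabelSelector {
    match_labels: BTreeMap<String, String>,
    match_expressions: Vec<Requirement>,
}

impl LabelSelector {
    /// True when all matchLabels and all matchExpressions hold (logical AND).
    /// An empty selector therefore matches every device.
    fn matches(&self, labels: &BTreeMap<String, String>) -> bool {
        let labels_ok = self
            .match_labels
            .iter()
            .all(|(k, v)| labels.get(k) == Some(v));

        let exprs_ok = self.match_expressions.iter().all(|req| {
            let actual = labels.get(&req.key);
            match req.operator {
                Operator::In => actual.map_or(false, |v| req.values.contains(v)),
                // NotIn also matches when the key is absent.
                Operator::NotIn => !actual.map_or(false, |v| req.values.contains(v)),
                Operator::Exists => actual.is_some(),
                Operator::DoesNotExist => actual.is_none(),
            }
        });

        labels_ok && exprs_ok
    }
}
```

With the agent publishing `device-id=<id>` as a default label, a single-device target is just a selector whose `match_labels` holds that one pair.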
Applied CRD shape verified on a live k3d cluster:

    kubectl get crd deployments.iot.nationtech.io -o json
      .spec.versions[0].schema.openAPIV3Schema.properties.spec
        → rollout / score / targetSelector (matchLabels + matchExpressions)
      .spec.versions[0].schema.openAPIV3Schema.properties.status.aggregate
        → matchedDeviceCount / succeeded / failed / pending / lastError
    kubectl get crd devices.iot.nationtech.io -o json
      .spec.scope = "Cluster"
      .spec.versions[0].schema.openAPIV3Schema.properties.spec
        → inventory (nullable, camelCased fields)

Load-test run: DEVICES=20 GROUP_SIZES=10,5,5 DURATION=20 all 3 CRs hit expected matched=N / succeeded+failed+pending=N.

Other changes:
- k8s-openapi gets the `schemars` feature so LabelSelector derives JsonSchema.
- InventorySnapshot uses `#[serde(rename_all = "camelCase")]` for consistency with the rest of the CRD schema.
- agent publishes `device-id=<id>` as a default label so the example_iot_apply_deployment `--target-device <id>` shorthand works out-of-the-box (implemented as `--selector device-id=<id>`).
- example_iot_apply_deployment gains `--selector key=value` repeatable flag.
- load-test.sh explore banner exposes Device CR commands + new matchedDeviceCount column.

Roadmap:
- v0_1_plan.md Chapter 2: rewrite to describe the shipped selector + Device CRD model (matchedDeviceCount, LabelSelector, per-concern KV). Drop AgentStatus / observed_score_string / target_devices references. Update "State of the world" preamble to match 2026-04-23 reality.
- chapter_4_aggregation_scale.md: SUPERSEDED banner at top with a clear what-was-kept vs. what-was-dropped summary. Original body preserved as decision-trail archaeology.

Code review pass on the iot crates, behavior-preserving:
- fleet_aggregator: owned_targets is now keyed by DeploymentName (matches the KV key space — globally unique, no namespace).
  The old DeploymentKey keying created an orphan-leak on operator restart: seed_owned_targets stashed entries under a sentinel namespace ("") that on_deployment_upsert never merged. Now seeding populates the map correctly so restart + selector change diffs properly.
- fleet_aggregator: reuse the Client passed into run() for the patch_api instead of calling Client::try_default() a second time.
- fleet_aggregator: delete _use_list_params / _use_deployment_spec placeholder scaffolding + unused ListParams / DeploymentSpec / ScorePayload imports. Inline one-liner serialize_score.
- fleet_aggregator: clean up `then(|| ...)` → filter/map split.
- device_reconciler: `is_label_value(v).then_some(()).is_some()` → plain `is_label_value(v)`.
- crd: delete speculative DeviceStatus + DeviceCondition (no one writes to them; the comment in DeviceSpec documents where they'd land when a heartbeat-reflection reconciler shows up).
- controller: compute `obj.name_any()` once in cleanup().

All 24 tests green. End-to-end load test (20 devices / 3 groups / 20s) PASS after the changes.

Three production-path improvements bundled into one chart change, all verified end-to-end (helm lint + load-test pass):

1. Switch from `HelmResourceKind::from_serializable(...)` to the typed `HelmResourceKind::{Namespace, ServiceAccount, ClusterRole, ClusterRoleBinding, Crd}` variants added to the shared harmony helm module. Serialization output is byte-equivalent; IDE discoverability + type-safety go up.

2. Annotate both CRDs with `helm.sh/resource-policy: keep`. Without this, `helm uninstall iot-operator-v0` cascade-deletes the CRDs; the kube GC then deletes every Deployment CR and every Device CR; the operator finalizer fires on each deletion and wipes the `desired-state` KV; agents tear down every container. One typo on uninstall would be fleet-wide catastrophe. `keep` makes uninstall data-preserving and idempotent — wipe requires an explicit `kubectl delete crd …`.

3. Lock down the operator Pod's securityContext:
   - `runAsNonRoot: true`
   - `readOnlyRootFilesystem: true`
   - `allowPrivilegeEscalation: false`
   - `capabilities: drop [ALL]`
   - `seccompProfile: RuntimeDefault`

   Deliberately *no* `runAsUser` — OpenShift's `restricted-v2` SCC assigns namespace-specific UIDs and rejects fixed ones. The image's `USER 65532:65532` (Dockerfile) gives vanilla k8s a non-root UID; OpenShift's SCC overrides with its own. Same chart works on both without custom SCC bindings.

Dockerfile adds `USER 65532:65532` — required for vanilla k8s to accept `runAsNonRoot: true` without a Pod-level `runAsUser`. 65532 is the distroless/chainguard `nonroot` convention; arbitrary but safe (no overlap with common system UIDs).

Tests: 2 chart unit tests locking in the keep annotation + SC shape. End-to-end load test at 20 devices / 3 CRs: pod comes up clean under the restricted SC, all aggregates correct, zero operator warnings.

Addresses the review point that NatsBasicScore was a parallel typed-k8s_openapi path — reinventing probes, resource shapes, pod anti-affinity, JetStream storage — instead of reusing what NatsK8sScore already does via the upstream nats/nats helm chart. Every shape the project will ever ship (supercluster, single node, TLS, gateway, leaf nodes) is expressible as values on that chart. Parallel resource construction was churn waiting to diverge.

The shape now:

    HelmChartScore              [existing helm-install primitive]
          ▲
          │ pins chart + repo
          │
    NatsHelmChartScore (new)    [exposes values_yaml only]
       ▲             ▲
       │             │
    NatsBasicScore   NatsK8sScore
    (single node)    (supercluster + TLS + gateways)

Changes:
- Delete harmony/src/modules/nats/node.rs (279 lines of typed k8s_openapi Deployment/Service/Namespace — gone).
- New harmony/src/modules/nats/helm_chart.rs: NatsHelmChartScore pins chart_name = "nats/nats" and its official repository; values_yaml is the only varying input.
  Implements Score<T> for any topology with HelmCommand; caller hands it to K8sBareTopology / HAClusterTopology / K8sAnywhereTopology.
- Rewrite score_nats_basic.rs as a thin preset: build a minimal single-node values_yaml (fullnameOverride, replicaCount=1, cluster.enabled=false, jetstream on/off, service type via the chart's `service.merge.spec.type` knob, optional image override). 10 unit tests on render_values covering every builder combination + image-ref splitting. Score bound moves from `T: K8sclient` to `T: HelmCommand` since installation is now helm-based.
- score_nats_k8s.rs: last step in deploy_nats switches from a hand-constructed HelmChartScore to NatsHelmChartScore::new(...). Supercluster values_yaml construction untouched — a supercluster is just a more elaborate values file against the same chart.
- bare_topology.rs: add `impl HelmCommand for K8sBareTopology` so the in-load-test flow (K8sBareTopology → NatsBasicScore → NatsHelmChartScore → HelmChartScore) compiles. Returns a bare `helm` command; KUBECONFIG resolution mirrors how HAClusterTopology does it.
- mod.rs: export NatsHelmChartScore + the re-shaped NatsServiceType.
- load-test.sh: the nats/nats chart provisions a StatefulSet, not a Deployment. Wait on `pod -l app.kubernetes.io/name=nats` instead of `deployment/iot-nats` — works across workload kinds.

Tests:
- 2 helm_chart unit tests (chart+repo pinning, default install-upgrade semantics)
- 10 score_nats_basic unit tests covering every values shape
- Full load-test.sh e2e (20 devices / 3 CRs / 20s): PASS.

The IoT vocabulary was anchoring the codebase to one customer's domain. The reconciler pattern is generic — operator in k8s, NATS KV as desired-state bus, agents reconciling podman / OKD / KVM / anything that can register. "Fleet" captures that neutrally; IoT stays acknowledged in docs as the first customer use case.

Done now, while nothing is deployed. After a partner fleet lands, changing the CRD group alone is a multi-quarter migration.
Scope (nothing left over):

Paths + crates
- iot/ → fleet/
- iot/iot-operator-v0 → fleet/harmony-fleet-operator
- iot/iot-agent-v0 → fleet/harmony-fleet-agent
- harmony/src/modules/iot → harmony/src/modules/fleet
- ROADMAP/iot_platform → ROADMAP/fleet_platform
- examples/iot_{vm_setup, load_test, nats_install} → examples/fleet_*
- -v0 suffix dropped on the operator + agent crates (semver in Cargo.toml already tracks version)

Rust identifiers
- enum IotScore (podman score payload) → ReconcileScore
- struct IotDeviceSetupScore/Config → FleetDeviceSetupScore/Config
- InterpretName::IotDeviceSetup → InterpretName::FleetDeviceSetup
- HarmonyIotPool → HarmonyFleetPool (libvirt pool)
- HARMONY_IOT_POOL_NAME (default "harmony-iot") → HARMONY_FLEET_POOL_NAME ("harmony-fleet")
- IotSshKeypair → FleetSshKeypair
- ensure_iot_ssh_keypair / ensure_harmony_iot_pool / check_iot_smoke_preflight_for_arch → fleet-prefixed variants

Wire / config surfaces
- CRD group `iot.nationtech.io` → `fleet.nationtech.io`
- Finalizer `iot.nationtech.io/finalizer` → `fleet.nationtech.io/finalizer`
- Shortnames iotdep/iotdevice → fleetdep/fleetdev
- Env var IOT_AGENT_CONFIG → FLEET_AGENT_CONFIG
- Env var IOT_VM_ADMIN_PASSWORD → FLEET_VM_ADMIN_PASSWORD
- Binary /usr/local/bin/iot-agent → /usr/local/bin/fleet-agent
- Systemd user `iot-agent` → `fleet-agent`
- VM admin user `iot-admin` → `fleet-admin`

Defaults
- Namespaces iot-system/iot-demo/iot-load → fleet-system/fleet-demo/fleet-load
- Helm release iot-nats → fleet-nats
- Helm release iot-operator-v0 → harmony-fleet-operator
- Container image localhost/iot-operator-v0:latest → localhost/harmony-fleet-operator:latest
- On-disk cache $HARMONY_DATA_DIR/iot/ → $HARMONY_DATA_DIR/fleet/ (cloud-images, ssh keypairs, libvirt pool)

What stayed
- harmony-reconciler-contracts — already neutrally named
- Wire types (DeviceInfo, DeploymentState, HeartbeatPayload, DeploymentName) — already neutral
- KV buckets (device-info, device-state, device-heartbeat,
  desired-state) — already neutral
- CRD kind names (Deployment, Device) — already neutral
- NatsBasicScore / NatsHelmChartScore / HelmChart / etc. — framework-scope, unchanged

Verification
- cargo check --workspace --all-targets: clean
- All harmony lib tests (114), fleet-operator (6), fleet-agent (7), harmony-reconciler-contracts (13): green
- End-to-end load-test (20 devices / 3 CRs / 20s under fleet/scripts/load-test.sh): PASS. Image built as localhost/harmony-fleet-operator:latest, chart installed as release harmony-fleet-operator in namespace fleet-system, all CR aggregates correct.

Zero stragglers: grep across the tree for \biot\b / IOT_ / \bIot[A-Z] returns empty (excluding docs explicitly talking about IoT as the first customer's domain).

New first step (1/7): read /etc/fleet-agent/config.toml off the device and compare against the rendered desired config. Three branches:
- missing → info, first install
- matches → warn, converge anyway
- differs → warn + unified diff (similar::TextDiff with 2-line context radius, '-/+' marker style) + inquire::Confirm prompt defaulting to N. Aborts with InterpretError if declined.

Existing 6 steps renumbered to 2/7-7/7. The diff replaces the previous "dump both full configs" approach which was unreadable even for one-line differences.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ZitadelScore:
- Auto-provisions an `iam-admin-pat` Kubernetes secret via the chart's FirstInstance.Org.Machine.Pat block. ZitadelSetupScore depended on this secret existing; without the chart values, the prior code path was non-functional.
- New `external_port: Option<u32>` field. Controls Zitadel's emitted issuer URL when the host port mapping isn't 80/443 (k3d typically maps 8080:80). Without it, JWT-bearer audience validation 500s with `Errors.Internal` because the assertion's `aud` doesn't match the chart-default issuer at port 80.
ZitadelSetupScore is extended for the JWT-bearer flow needed by the NATS auth callout:
- API apps (resource servers — required for project-id audience scope)
- Project roles (`POST .../projects/{id}/roles`, idempotent)
- Machine users with KEY_TYPE_JSON keys (provisioned + cached device-side; Zitadel does not expose the key material on subsequent reads, so the local cache is the source of truth)
- User grants (project + role keys)

Cache (ZitadelClientConfig) gains projects, machine_user_ids, machine_keys, and user_grants — keyed for idempotency across re-runs. Backwards compatible with existing harmony_sso example: the new fields have `#[serde(default)]` and prior callers just need empty vecs.

Refresh upgrade-by-default in helm chart (separate commit) lets ExternalPort changes propagate to existing releases on re-run.

harmony-nats-callout becomes a deployable service, not just a library:
- New [[bin]] target with env+secret-file driven config and SIGINT/SIGTERM-aware shutdown.
- Dockerfile (single-stage archlinux:base, non-root, matches harmony-fleet-operator convention).
- Refactored handler into a pure `decide()` function so the entire authorization decision tree is unit-testable without async-nats.
- New `roles` module with role resolution + a `validate_device_id` security gate that rejects NATS subject metacharacters in device_id (.>* whitespace) — closes a real escalation path through the `{device_id}` placeholder in the per-device permissions block.
- Configurable role claim path + admin/device role names; admin wins when both are present (privilege-escalation invariant).

57 unit tests cover every reachable branch of the security decision tree; 4 e2e tests in nats/integration-test-callout exercise real NATS in podman with: device pubsub on own subjects, cross-device subject isolation, admin-can-read-anything, and JWT-without-role rejection.

harmony/src/modules/nats_auth_callout/:
- New `NatsAuthCalloutScore` deploys the callout as a K8s Deployment + Secret.
  fsGroup + 0o440 secret mode so the non-root container can read its mounted seed/password without leaving them in env vars.
- `render_auth_callout_block` helper produces the YAML for NATS Helm `config.merge.authorization.auth_callout` so both halves stay in sync.

examples/fleet_auth_callout/:
- `bring_up_stack()` orchestrates k3d -> Zitadel + Postgres -> CoreDNS rewrite -> project + roles + machine users with JWT keys -> NATS Helm with auth_callout block -> callout image build + sideload -> NatsAuthCalloutScore deploy. Idempotent across re-runs (issuer NKey persisted in a K8s secret so user JWTs survive restarts).
- `mint_access_token()` RFC 7523 JWT-bearer client. Uses Host header with port so Zitadel emits a matching issuer.
- main.rs prints URLs/creds/keyIds and waits for Ctrl-C.
- Three #[tokio::test] functions sharing one cluster via OnceCell: admin_can_read_any_device_subject, device_can_only_access_own_subjects, unknown_role_is_rejected. All green on real k3d.

The fleet agent's NATS connection is the load-bearing piece of the "never lose connectivity to a device" guarantee. This commit makes that hold even when Zitadel access tokens expire across NATS pod restarts and network partitions.

New `[credentials]` config variants (externally-tagged):

    type = "toml-shared" { nats_user, nats_pass }   # v0/dev
    type = "zitadel-jwt" { key_path, oidc_issuer_url, audience, ... }

A `CredentialSource` enum dispatches per variant:
- TomlShared returns the same user/pass each call.
- ZitadelJwt mints an access token from Zitadel via the JWT-bearer flow (RFC 7523). The keyfile at `key_path` is the only durable secret on the device; the bearer token is short-lived and refreshed in-memory when the cached value is within 5 minutes of expiry. Two concurrent refreshes are race-safe — the second writer's mint is wasted but produces a correct token.
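The "refresh when within 5 minutes of expiry" rule for the ZitadelJwt variant can be sketched as a small synchronous cache. This is an illustration under stated assumptions: `TokenCache`, `Token`, and `REFRESH_MARGIN` are hypothetical names, the `mint` closure stands in for the real JWT-bearer exchange, and the agent's actual implementation is async and lives behind the `CredentialSource` enum. The clock is passed in so the logic is testable without sleeping.

```rust
use std::time::{Duration, Instant};

/// Refresh margin quoted in the commit message (constant name is hypothetical).
const REFRESH_MARGIN: Duration = Duration::from_secs(5 * 60);

#[derive(Debug, Clone, PartialEq)]
struct Token {
    value: String,
    expires_at: Instant,
}

/// Hypothetical in-memory cache; only the keyfile is durable on the device.
#[derive(Default)]
struct TokenCache {
    cached: Option<Token>,
}

impl TokenCache {
    /// Return the cached token, minting a fresh one first when the cache is
    /// empty or the cached token is within REFRESH_MARGIN of expiry.
    fn next_credential(&mut self, now: Instant, mint: impl FnOnce() -> Token) -> &Token {
        let stale = match &self.cached {
            Some(token) => now + REFRESH_MARGIN >= token.expires_at,
            None => true,
        };
        if stale {
            self.cached = Some(mint());
        }
        self.cached.as_ref().expect("cache was just filled")
    }
}
```

Because the check runs on every call, wiring `next_credential` into a per-reconnect auth callback means a reconnect after a long outage mints a new token before the handshake rather than retrying with an expired one.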
The agent's `connect_nats` is rewritten on top of async-nats's `with_auth_callback`, which is invoked on every (re)connect attempt:
- async-nats reconnects automatically on disconnect (default behaviour of ConnectOptions) — we don't need a watchdog.
- Each reconnect attempt invokes the callback, which calls `next_credential()`. If the cached token is expired, a fresh one is minted before the reconnect proceeds. So a Pi that loses NATS while its token has just expired will pick up a brand-new token on the next reconnect attempt with no operator intervention.
- An `event_callback` surfaces Connected / Disconnected / SlowConsumer / ServerError events into tracing — operators can see exactly when reconnects happen, which is non-negotiable for an out-of-warranty device fleet.

A subtle constraint drove the trait shape: async-nats's `with_auth_callback` requires the returned future to be `Send + Sync`, which `#[async_trait]`'s erased `Pin<Box<dyn Future + Send>>` does not satisfy. The credential source is therefore an enum (concrete dispatch) rather than `dyn CredentialSource`. Two variants is small enough that enum dispatch beats trait-object plumbing.

Out of scope, tracked for follow-up: a separate daemon for SSH access to the Pi via Tailscale/Headscale ("secure backdoor"), and the device-join-request + admin-approve flow that would replace the current admin-PAT bootstrap pattern.

The Pi onboarding flow can now mint a per-device Zitadel machine user on the operator's machine and ship the resulting JWT key to the Pi — the agent then authenticates to NATS via JWT-bearer instead of shared nats_user/nats_pass.

`FleetDeviceSetupConfig.auth: FleetDeviceAuth` replaces the previous flat `nats_user` / `nats_pass` fields. Two variants:
- TomlShared { nats_user, nats_pass } — legacy / dev fallback.
- ZitadelJwt { machine_key_json, oidc_issuer_url, audience, ... } — per-device JWT-bearer.
The Score:
* Drops `machine_key_json` to /etc/fleet-agent/zitadel-key.json (mode 0640, owner fleet-agent — matches the agent's secret-mount conventions).
* Renders [credentials] type = "zitadel-jwt" pointing at that keyfile + the issuer + audience the agent's CredentialSource needs.

A change to either the keyfile content or the TOML triggers an agent restart, same as binary / unit drift.

`fleet_rpi_setup --bootstrap-token <PAT>` activates the Zitadel path. The bootstrap PAT is held in the CLI's memory only; it never lands on the Pi. New flags: --zitadel-issuer-url, --zitadel-project-id, --zitadel-device-role (default `device`), --danger-accept-invalid-certs.

`zitadel_bootstrap` is a slim ManagementAPI client that, idempotently per device:
1. Find-or-create machine user `device-${device_id}`.
2. Find-or-skip a project role grant (defaults to `device`).
3. Always mint a fresh JSON key and return its content. (Zitadel doesn't expose the private half of an existing key, so reusing isn't possible — stale keys remain valid until expiry, which is fine because each setup run overwrites the on-device keyfile.)

Three new render_toml tests cover the zitadel-jwt path; eleven existing agent tests still pass.

Out of scope, tracked: device-join-request + admin-approve flow that would replace bootstrap-PAT entirely (closer to the OKD node-approval pattern). Long-lived admin PAT is acceptable for the demo per product call.

Adds `examples/fleet_staging_deploy/` — the operator-side, run-once-per-customer harness that brings up the fleet platform's central services on a real OKD/K8s cluster. Complements the existing `fleet_auth_callout` (k3d local-dev harness, kept unchanged) and `fleet_rpi_setup` (per-device onboarding).
`FleetDomainConfig` is the single source of truth for hostnames:

    base_domain = "customer1.nationtech.io"
      → zitadel.<base>   (Zitadel HTTPS via OKD HAProxy edge-TLS)
      → nats.<base>      (NATS WSS through the same ingress)

Nothing is hardcoded; the operator supplies one --base-domain flag and the deploy is fully parameterized. Re-running is idempotent (rides the helm-upgrade-by-default + ZitadelSetupScore search-then-create + persisted issuer-NKey-secret idempotency layers).

NATS values render under config.merge.{auth_callout, accounts, system_account}, with WSS via `websocket: { enabled, port: 8443, ingress: { className: openshift-default, ... } }` and the OKD-flavored HAProxy edge-TLS annotations:

    route.openshift.io/termination: edge
    haproxy.router.openshift.io/timeout: "1h"

(Switch to `reencrypt` when the customer wants pod-to-edge TLS; gateway-api migration is on their roadmap, separate from the demo.)

bring_up_staging():
- Deploys ZitadelScore (external_secure: true, no external_port → 443).
- Waits for HTTPS .well-known.
- Provisions the project + API app + roles via ZitadelSetupScore hitting Zitadel through the public ingress (port 443, TLS verified). No machine users provisioned — fleet_rpi_setup mints them on demand per device, so the staging deploy stays device-count-agnostic.
- Persists / reads the issuer NKey seed in the `callout-issuer-seed` K8s secret (so re-runs don't invalidate user JWTs already in flight on customer Pis).
- Deploys NATS via NatsHelmChartScore with the WSS values.
- Deploys NatsAuthCalloutScore (oidc_audience = project_id; external_secure path means no danger_accept_invalid_certs).

main.rs ends by printing the exact `cargo run -p example-fleet-rpi-setup ...` invocation the operator runs against a Pi, with the project_id and zitadel/nats URLs filled in.
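Deriving every hostname from one base domain might look like the sketch below. Only the `zitadel.<base>` and `nats.<base>` patterns come from the commit message; the struct layout, the method names, and the assumption that both URLs use the default HTTPS/WSS port behind the ingress are illustrative.

```rust
/// Simplified stand-in: one field, every hostname derived from it.
struct FleetDomainConfig {
    base_domain: String,
}

impl FleetDomainConfig {
    /// Hostname for Zitadel: `zitadel.<base>`.
    fn zitadel_host(&self) -> String {
        format!("zitadel.{}", self.base_domain)
    }

    /// Hostname for NATS over WebSocket: `nats.<base>`.
    fn nats_host(&self) -> String {
        format!("nats.{}", self.base_domain)
    }

    /// Hypothetical helper: issuer URL on the default HTTPS port.
    fn zitadel_issuer_url(&self) -> String {
        format!("https://{}", self.zitadel_host())
    }

    /// Hypothetical helper: WSS URL on the default port, via the same ingress.
    fn nats_wss_url(&self) -> String {
        format!("wss://{}", self.nats_host())
    }
}
```

Keeping the derivations as methods on one struct means the printed `fleet_rpi_setup` invocation and the rendered chart values cannot disagree about a hostname.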
Three unit tests cover the domain config + NATS values rendering (WSS + edge-TLS annotations + auth_callout under merge).

Two bugs surfaced when the agent went live against NATS JetStream KV in the VM-based e2e rehearsal:

1. The default `device` role only allowed flat `device-state.<id>` / `device-commands.<id>` subjects. The agent's actual data plane is JetStream KV, which puts every operation on `$KV.<bucket>.<key>` subjects with control-plane traffic on `$JS.API.>` and `$JS.ACK.>`. With the old role config, the very first KV publish died with `Permissions Violation for Publish to "$JS.API.INFO"`.

   The role now allows `$JS.API.>` + `$JS.ACK.>` plus the four per-device data subjects derived from harmony_reconciler_contracts::kv (info.<id>, state.<id>.<dep>, heartbeat.<id>, desired-state.<id>.<dep>). The legacy direct `device-state.<id>` / `device-commands.<id>` subjects are kept so non-JetStream callers of NatsAuthCalloutScore still work.

   A new unit test (`device_role_covers_reconciler_contract_kv_subjects`) imports the contract crate as a dev-dep and asserts each contract-produced subject is matched, plus that cross-device subjects are *not* matched. This locks the role config to the contract surface so future renames break the test before they break prod.

2. Zitadel's `client_id` claim for a machine user equals the userName verbatim. Both `fleet_rpi_setup` and `fleet_e2e_demo` create the user as `device-{device_id}`, so the JWT carries `device-vm-device-00` while the agent's KV keys use the bare `vm-device-00`. The callout was interpolating the prefixed string into permissions, producing rules that never matched what the agent actually publishes.

   Adds `device_id_prefix_strip` (env: `DEVICE_ID_PREFIX_STRIP`, defaults empty so existing deployments are unaffected). When set, the validator strips the prefix from the extracted claim before permission interpolation.
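The prefix-strip step can be illustrated with a small sketch. The function names below are hypothetical and the real validator may be structured differently; only the `device-` prefix, the empty default, and the legacy `device-state.<id>` / `device-commands.<id>` subjects are taken from the commit message.

```rust
/// Remove `prefix` from the extracted claim when it is present.
/// An empty prefix (the default) leaves the claim untouched, as does
/// a claim that does not start with the prefix.
fn strip_device_id_prefix<'a>(claim: &'a str, prefix: &str) -> &'a str {
    claim.strip_prefix(prefix).unwrap_or(claim)
}

/// Interpolate the bare device id into the legacy per-device subjects.
fn legacy_device_subjects(device_id: &str) -> Vec<String> {
    vec![
        format!("device-state.{device_id}"),
        format!("device-commands.{device_id}"),
    ]
}

fn main() {
    // The JWT carries the Zitadel userName; the agent uses the bare id.
    let claim = "device-vm-device-00";
    let device_id = strip_device_id_prefix(claim, "device-");
    for subject in legacy_device_subjects(device_id) {
        println!("{subject}");
    }
}
```

Stripping before interpolation keeps the permission templates unaware of either naming convention, which is the property the commit describes.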
The fleet_auth_callout example wires it to `device-` so the e2e harness stays end-to-end correct without reaching into either naming convention.

Verified end-to-end: both VM agents now publish DeviceInfo / heartbeat through JetStream KV with no permission errors and zero service restarts since the rollout.

The agent's data plane was JetStream-KV-only, so live observers that don't want to consume the JS stream had no signal to subscribe to. The walking-skeleton e2e admin test was failing as a result: admin subscribes to `device-state.>` (the per-device direct subject) and saw nothing in 30s.

This commit adds a small core-NATS publish on `device-state.<id>` alongside the existing KV writes:

- `FleetPublisher::publish_state_pulse()` emits a tiny `{device_id, kind: "heartbeat", at}` payload on `device-state.<device_id>`, called from the heartbeat loop so observers see traffic on the same 30s cadence as the KV heartbeat write, but on a non-JetStream subject anyone can subscribe to.
- `write_deployment_state()` now fans out the same payload it puts in the KV bucket on the direct subject, so live admin tooling picks up reconcile transitions immediately without watching the KV stream.

Also threads `device_id_prefix_strip = "device-"` through the fleet_e2e_demo bring-up. The bring-up has its own NatsAuthCalloutScore construction (parallel to fleet_auth_callout's `bring_up_stack`), and was missing the prefix-strip line, so the deployed callout was interpolating permissions against `device-vm-device-00` instead of the bare device id the agent uses. Locks the regression with a unit test (`device_id_prefix_strip_lands_as_env_value`) on the deployment manifest builder.

Verified end-to-end in the VM rehearsal:

    test both_devices_heartbeat_within_60s ... ok
    test admin_jwt_reads_any_device_subject ... ok

Bumps coverage on harmony-fleet-auth from 5 to 18 unit tests.
The new tests lock the corners we burned cycles on while debugging the live system:

* cache freshness boundary (within-leeway, outside-leeway, no-cache, non-zitadel variant)
* assertion claim shape (iss/sub/aud/exp/iat) and the 60-second lifetime constant Zitadel enforces server-side
* scope string content (plural-projects-roles + singular-project-id URN + openid base)
* token URL strips trailing slashes (the //oauth/v2/token 404 waiting to bite the next operator)
* MachineKeyFile JSON parsing under Zitadel's wire shape

Refactor: build_assertion now delegates to build_assertion_claims + build_assertion_header (pure, no signing). This lets the claim/header shape be unit-tested without an RSA private-key fixture; the sign-and-decode end-to-end is still covered by the e2e harness.

No new deps. wiremock is not needed: every meaningful assertion is on pure logic.

Removes the hand-typed ScorePayload struct and its custom schemars schema function. DeploymentSpec.score is now typed as the strongly typed ReconcileScore enum already used by the agent, eliminating duplication and ensuring the CRD schema is derived automatically.

- Add JsonSchema derive to PodmanService, PodmanV0Score, ReconcileScore
- Enable podman feature on harmony dependency in operator
- Re-export ReconcileScore/PodmanV0Score/PodmanService from crd module
- Update harmony_apply_deployment and fleet_load_test examples
- Remove TODO comment from harmony_apply_deployment

Wire format is unchanged (adjacently tagged {type, data}), so the operator -> NATS KV -> agent path remains fully backward compatible.

Collapses the load-test harness's chart-gen + helm-install dance into first-class Harmony Scores. Customer-facing path:

    let score = FleetServerScore::new(nats, operator);
    score.create_interpret().execute(&Inventory::empty(), &topology).await?;

FleetOperatorScore renders the operator chart (CRDs + RBAC + ServiceAccount + Deployment) into a tempdir and delegates to HelmChartScore.
FleetServerScore composes it with NatsBasicScore via fail-fast `?` chaining; Zitadel + Argo hang off the same chain when their Scores land. Structural change: CRD type definitions and chart-builder moved from fleet/harmony-fleet-operator/src/{crd,chart}.rs into harmony/src/modules/fleet/operator/. Harmony can't depend on the operator crate (cycle), so the score-side code lives in harmony and the operator binary imports the types right back via `harmony::modules::fleet::operator::*`. Considered keeping CRDs in the operator crate with the score either there or in a sibling crate, but putting customer-facing scores in harmony/src/modules/fleet/ matches the existing convention (FleetDeviceSetupScore, ProvisionVmScore) and keeps the CRDs reachable from future harmony scores (e.g. an inventory aggregator reading Device CRs) without dragging in the operator binary. The operator's `chart` subcommand stays as a developer convenience (routes through harmony::modules::fleet::operator::build_chart) so `cargo run -p harmony-fleet-operator -- chart` still produces an identical chart on disk for inspection. Existing examples (fleet_load_test, harmony_apply_deployment) updated to import CRD types from harmony directly. load-test.sh phase 3c collapses to a single `cargo run -p example_fleet_server_install` invocation; phase 2b's NATS install still runs separately so the host-side NATS reachability probe sits where it always did. Idempotency: re-running short-circuits via HelmChartScore::find_installed_release on both inner installs. Verified: cargo fmt --check, cargo clippy, cargo test all pass; the 4 fleet operator unit tests (2 migrated from operator crate, 2 new on FleetOperatorScore defaults/builders) pass under `cargo test -p harmony`; operator chart subcommand produces an identical chart structure post-refactor. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

run_server_install.sh now unconditionally sources examples/fleet_server_install/env.sh after computing REPO_ROOT, so the example's env knobs (KUBECONFIG, RUST_LOG, NO_ZITADEL, ZITADEL_HOST, …) are picked up without the user having to source manually before invoking the script. The script's `${VAR:-default}` block only fills in values env.sh leaves unset.

env.sh keeps a (commented-out) KUBECONFIG hint and the new optional Zitadel knobs documented post-source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extends NatsK8sScore additively (every new field optional, defaults preserve supercluster shape):

    pub gateway: Option<GatewayConfig>        // None = single-instance
    pub auth_callout: Option<AuthCalloutCfg>  // delegate auth to callout
    pub websocket: Option<WebSocketRouteCfg>  // public WS Route + edge TLS

Render-side:

* `gateway = None` → cluster.enabled=false, replicas=1, gateway block disabled, no `tlsCA`, no service.ports.gateway
* `auth_callout = Some` → emits authorization.auth_callout block (using harmony's existing render_auth_callout_block convention) + accounts.<account>.users for the bypass user the callout connects as + accounts.SYS + system_account: SYS. Drops the legacy testUser + default_permissions; the callout is the sole authority.
* `websocket = Some` → enables config.websocket.enabled with no_tls (the Route owns TLS termination).

Routes:

* `gateway` Route stays gated to gateway.is_some(). Passthrough on 7222, host = cluster.dns_name. Preserves supercluster behavior.
* `websocket` Route is new. Edge-TLS termination on port 8080 (chart's WS listener), Redirect insecure-edge policy, host from WebSocketRouteCfg. The cert-manager.io/cluster-issuer annotation drives the Route certificate.
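The Route gating described above can be sketched as a pure function. All types here (`NatsRouteInputs`, `RouteSketch`, `Termination`) are hypothetical simplifications of the real `GatewayConfig` / `WebSocketRouteCfg`; only the ports, termination modes, and the "emit a Route only when the field is `Some`" rule come from the commit message.

```rust
#[derive(Debug, PartialEq)]
enum Termination {
    Passthrough,
    Edge,
}

#[derive(Debug, PartialEq)]
struct RouteSketch {
    name: &'static str,
    host: String,
    port: u16,
    termination: Termination,
}

/// Hypothetical, trimmed-down view of the two optional fields that
/// decide which Routes exist. Each carries only the Route host.
struct NatsRouteInputs {
    gateway_dns_name: Option<String>, // stands in for Option<GatewayConfig>
    websocket_host: Option<String>,   // stands in for Option<WebSocketRouteCfg>
}

/// Emit a Route only for the features that are switched on.
fn routes(inputs: &NatsRouteInputs) -> Vec<RouteSketch> {
    let mut out = Vec::new();
    if let Some(host) = &inputs.gateway_dns_name {
        out.push(RouteSketch {
            name: "gateway",
            host: host.clone(),
            port: 7222,
            termination: Termination::Passthrough,
        });
    }
    if let Some(host) = &inputs.websocket_host {
        out.push(RouteSketch {
            name: "websocket",
            host: host.clone(),
            port: 8080,
            termination: Termination::Edge,
        });
    }
    out
}

fn main() {
    // Single-instance shape: no gateway, public WebSocket Route only.
    let single = NatsRouteInputs {
        gateway_dns_name: None,
        websocket_host: Some("nats.example.test".to_string()),
    };
    for route in routes(&single) {
        println!("{route:?}");
    }
}
```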
OKDRouteScore gains an `annotations: BTreeMap<String, String>` field (default empty) + `with_annotation()` builder so callers can attach the cert-manager annotation without reaching for K8sResourceScore manually.

Side-effect: `harmony` lib's default features now include `podman`. The CRD types in `modules::fleet::operator::crd` embed `ReconcileScore` from `modules::podman` unconditionally; without the feature on by default, harmony's lib-only builds fail. Existing explicit `features = ["podman"]` callers are unaffected.

K8sAnywhereTopology's `Nats::deploy` impl populates the new fields with `gateway = Some(default)` so the capability path keeps the supercluster behavior it had before this commit.

Replaces the volume-mounted Secret (`/etc/callout/{issuer-nkey-seed, nats-auth-pass}`) with `valueFrom.secretKeyRef` env vars (`ISSUER_NKEY_SEED`, `NATS_AUTH_PASS`). The callout binary's `read_secret` helper already supports both `<NAME>_FILE` and `<NAME>`; it just falls through to env when the `_FILE` variant is absent.

Also drops the pod-level `securityContext` block that pinned `runAsUser: 65532, runAsGroup: 65532, fsGroup: 65532`. OKD's restricted-v2 SCC rejects pods that pin UID/GID outside the namespace's allocated range; the SCC will assign appropriate values from that range when the fields are unset. Container-level hardening (runAsNonRoot, no-privilege-escalation, RO root fs, capabilities drop ALL) stays intact.

Tests rewritten to assert the new shape: env vars come from Secret key refs, no volumes, no pinned UID/GID/fsGroup. 7 callout tests green.

`FleetServerScore` now composes:

* `nats: NatsK8sScore` replaces NatsBasicScore. Same Score that knows about OKD Routes, the auth_callout block in NATS Helm values, and the WS edge-TLS wiring. The NatsBasicScore-using `fleet_server_install` example registers the simple inner Scores directly (no FleetServerScore wrapper), which keeps the basic k3d-style install working without forcing it through the K8s-flavor Score.
* `identity_setup: Option<ZitadelSetupScore>` runs after the Zitadel helm install. Provisions project + roles + machine users via Zitadel's management API. The keys it produces are what the operator authenticates with.
* `auth_callout: Option<NatsAuthCalloutScore>` deploys the callout pod. Pair with `nats.auth_callout = Some(...)` so the rendered NATS values delegate to the same issuer pubkey.

Execute order:

    identity (helm) → identity_setup (API) → nats (with auth_callout block in values) → auth_callout (pod) → operator

The operator goes last so it doesn't burn reconnect attempts while the rest comes up; its `connect_with_retry` covers any small remaining race.

Trait bounds widen to include `Nats + TlsRouter` (for NatsK8sScore's Route + capability path).

Post-install summary lines added: NATS WS public URL when set, and a kubectl pointer to the callout deployment.

ZitadelScore gains two fields, both with defaults that preserve the previous hardcoded behavior:

    pub namespace: String       // default "zitadel"
    pub cluster_issuer: String  // default "letsencrypt-prod"

The hardcoded `NAMESPACE` const becomes `pub const DEFAULT_NAMESPACE` and the YAML's `cert-manager.io/cluster-issuer` annotation now substitutes `{cluster_issuer}` from the field. Existing struct-literal ZitadelScore call sites (5 examples) updated to fall through to `..Default::default()` so older callers compile unchanged.

New example: `examples/fleet_staging_install`. One-shot install of the fleet stack on OKD-shaped clusters, composing in order:

1. ZitadelScore (helm) into `--zitadel-namespace`
2. ZitadelSetupScore (project + roles + fleet-ops + fleet-operator machine users)
3. NatsK8sScore: single-instance + auth_callout + WS Route
4. NatsAuthCalloutScore: env-var-only Secret config
5. FleetOperatorScore: credentials TOML inlining the operator's JSON keyfile via key_json (no volume mounts)

Public hostnames derive from one CLI flag: `--base-domain`.
The demo uses `cb1.nationtech.io` → sso-staging.cb1.nationtech.io and nats-fleet-staging.cb1.nationtech.io. cert-manager `--cluster-issuer` defaults to `letsencrypt-prod`. Image refs (`--operator-image`, `--callout-image`) are required (private registry, no sensible default).

Generates the issuer NKey + auth pass at install time; the callout's Secret consumes them via env-from-secret-key. One TOML file end-to-end: the operator pod's only mounted Secret is the credentials TOML, single-key, no volumes.

Idempotency note: re-running ZitadelSetupScore with the same project name short-circuits via the cached client-config. Re-runs of NATS / operator / callout are idempotent at the Helm/K8sResourceScore level.

`PodmanService.env: Vec<(String, String)>` made schemars emit `items: [{type: string}, {type: string}]` (OpenAPI tuple validation), which k8s apiextensions rejects with "Forbidden: items must be a schema object and not an array"; install of the operator's `deployments.fleet.nationtech.io` CRD blew up at the Helm step.

Introduces `EnvVar { name, value }` in `domain::topology` (with `From<(String,String)>` for ergonomics) and switches both `PodmanService.env` and `ContainerSpec.env` to `Vec<EnvVar>`. schemars now produces `items: { type: object, properties: { name, value } }` which validates cleanly.

Adds `env_schema_is_object_not_tuple_for_crd_validation` to lock the schema shape: if anyone reverts to a tuple the test fails before the operator install does.

`FleetDeviceSetupScore` gains `FleetDeviceAuth::ZitadelEnroll`, which resolves the device's Zitadel machine user + JSON key inline, then falls through to the existing keyfile-drop flow exactly as if a pre-resolved `ZitadelJwt` had been passed.
Two operator workflows fall out of this:

* Dev-on-device: developer runs the score on a Pi with display attached, browser opens locally to Zitadel SSO, dev signs in with their personal account (must hold IAM_OWNER or equivalent), score mints credentials for that one device and brings up the agent.
* Production-via-SSH: operator runs from a workstation, targets each device over SSH. Browser opens once on the workstation; the resulting access token is in-memory only for v0 (per-batch token caching tracked in ROADMAP/fleet_platform/device_enrollment_token_caching.md).

Implementation:

* `harmony/src/modules/zitadel/admin_auth.rs`: RFC 8628 device-code flow against Zitadel. Tries `webbrowser::open`, falls back to printing the URL (SSH sessions just see the URL). Minimum scope set is `openid urn:zitadel:iam:org:project:id:zitadel:aud`, which is enough to call `/management/v1/*` and nothing more.
* `harmony/src/modules/zitadel/setup.rs`: `mint_device_credentials` helper that reuses the existing find-or-create methods (project, machine user, user grant) plus `create_machine_key`. Idempotent on user + grant; always mints a new key because Zitadel does not return existing key material.
* `harmony/src/modules/fleet/setup_score.rs`: new `ZitadelEnroll` variant + `AdminAuth::{Sso, Token}`. Resolution runs at the top of execute(); the rest of the score sees a single shape. render_toml's match collapses both Zitadel variants into one arm (they share the issuer/audience/danger fields).
* `harmony/src/modules/fleet/assets.rs`: Debian bookworm arm64 generic-cloud image fetcher. This is the same Debian base Raspberry Pi OS is built on; Pi OS itself is locked to Pi hardware (Broadcom firmware) and won't boot in generic KVM. No sha pin (Debian's `latest/` URL rotates per point release); swap to a dated subdir if you need cryptographic provenance.
* `examples/fleet_device_enroll/`: single CLI covering both workflows + a `--launch-pi-vm` switch that boots a Pi-equivalent VM with one command and prints the SSH details + suggested follow-up enrollment command. README walks the three flows.

Tests: `render_toml_zitadel_enroll_renders_same_as_zitadel_jwt` locks the byte-equivalence between the unresolved (Enroll) and resolved (Jwt) variants. This is the invariant `execute()` relies on so TOML rendering is independent of when admin auth resolves.

Adds `webbrowser` as a regular dependency on `harmony` (small, no feature gate).

`harmony`'s `kvm` feature pulls in `libvirt`, which doesn't link on aarch64-unknown-linux-gnu (no aarch64 `libvirt-dev` package on most distros). The device-side workflow needs a binary that runs ON the Pi and only does enrollment (no VM rehearsal), but the example was unconditionally enabling `kvm`, so the cross-compile failed at link time with `undefined reference to virStoragePoolFree` etc.

Fixes by gating the rehearsal bits behind a new `vm-rehearsal` Cargo feature (default-on for workstation builds, opt-out via `--no-default-features` for device builds):

* `Cargo.toml`: harmony dep is now `default-features = false, features = ["podman"]` (podman is needed unconditionally because the operator CRD types depend on it). New `vm-rehearsal` feature enables `harmony/kvm` on demand.
* `main.rs`: every libvirt-touching import, CLI flag (`--launch-pi-vm`, `--vm-rehearsal`, `--vm-*`), CLI branch, and helper function (`boot_*_vm`, `RehearsalImage`) is now `#[cfg(feature = "vm-rehearsal")]`. With the feature off, none of it is referenced and nothing tries to link libvirt.
* README: documents both build flavors with copy-paste commands.

Workstation build (unchanged):

    cargo build --release -p example_fleet_device_enroll

Device-side build (the new path):

    cargo build --release --target aarch64-unknown-linux-gnu \
        -p example_fleet_device_enroll --no-default-features

Two related issues from a real run.
(1) Image was Debian 12 bookworm: released June 2023, glibc 2.36, two releases old by mid-2026. Bumping to Debian 13 trixie (current stable since Aug 2025, glibc 2.41) keeps the rehearsal kernel + userland roughly aligned with what's likely sitting on a fresh Pi imaged today. URL pattern is unchanged (`cloud.debian.org/.../latest/`), still no sha pin (latest/ rotates per point release; swap to a dated subdir if cryptographic provenance matters). The `cdrom` is still attached as virtio-blk read-only; that fix is independent and still required (Debian's cloud-arm64 kernel ships without ahci.ko).

Renames in `harmony::modules::fleet`:

    ensure_debian_bookworm_arm64_cloud_image → ensure_debian_trixie_arm64_cloud_image
    DEBIAN_BOOKWORM_CLOUDIMG_ARM64_{URL,FILENAME} → DEBIAN_TRIXIE_CLOUDIMG_ARM64_{URL,FILENAME}

(2) The device-side `--target aarch64-unknown-linux-gnu` cross-compile produced a binary that linked against the workstation's glibc (2.41 on a current Arch host). Running it on the rehearsal VM (Debian 12 / 13) blew up immediately:

    /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.39' not found

This is fundamental to the gnu target: the binary depends dynamically on whatever glibc the host happens to have. The fix isn't a workaround on the harmony side; it's switching the device build to `aarch64-unknown-linux-musl`, which produces a fully-static binary that runs on any aarch64 Linux regardless of the device's libc generation.

README updated with the musl recipe (rustup target, cargo config linker, optional `cross` shortcut) and the rationale for why musl beats gnu for device-side cross-compiles. Workstation build is unchanged.

The SSO login from `fleet_device_enroll` was hitting Zitadel with the app name (`harmony-cli`) as the OAuth client_id, getting back:

    400 Bad Request: invalid_client: no active client not found

Two real problems behind that error:

* `fleet_staging_install` never created the device-code OIDC app in the first place.
  Its `applications: vec![]` was empty; the only Zitadel resources provisioned were the API app, the project roles, and the machine users. The `harmony-cli` device-code app that the enrollment example assumed was provisioned simply did not exist. Adds it via `ZitadelApplication { app_type: DeviceCode }` so a fresh staging install yields a real OIDC app.
* `--admin-oidc-client-id` defaulted to the literal string `"harmony-cli"`, which is the app's *display name*, not the client_id. Zitadel issues numeric client_ids of the form `<number>@<project>` when the app is created, and that is what OAuth endpoints want. Defaulting to the name was misleading: it produces no warning, just a confusing 400 from Zitadel about a "client not found" that the operator can't easily map back to "wrong field passed to the flag". Removes the default; the flag is now required when SSO is in use (skipped only with `--admin-token`). Help text and README spell out the distinction explicitly.

The staging install now reads the resolved client_id from `ZitadelClientConfig::client_id(...)` and prints it in the success banner, alongside a copy-paste-ready `fleet_device_enroll` invocation. README also documents the post-install lookup path (`jq -r '.apps."harmony-cli"' ~/.local/share/harmony/zitadel/client-config.json`) and adds the `invalid_client` error to the troubleshooting list.

Two ergonomic fixes for the dev-on-device workflow.

(1) Ansible local connection. `LinuxHostTopology` always went through SSH, so running `fleet_device_enroll` with `--target ssh://you@127.0.0.1` required the operator to set up sshd loopback access on their own Pi, which is clunky for a dev who's sitting in front of the device. Adds `LinuxLocalhostTopology` that drives the same `LinuxHostConfiguration` trait surface using ansible's `-c local` connection (no SSH at all) plus direct `sh -c` subprocess calls for the loginctl / systemctl --user paths.
The configurator now takes a unified `AnsibleConnection<'a>` enum (`Ssh { host, creds }` | `Local { sudo_password }`) instead of a `(host, creds)` pair. Internal `host_exec`/`host_sudo_exec` helpers branch by transport and return the same `SshCommandOutput` shape either way, so the public methods (ping, ensure_package, ensure_file, etc.) are transport-agnostic.

`fleet_device_enroll` switches `--target` to optional: omitted → local, present → SSH. No magic `localhost` string, no special-case for 127.0.0.1. README + the flag's help text describe both modes.

(2) Auto-install `python3-venv` on Debian. First-run venv creation fails on stock Debian/Ubuntu with `ensurepip is not available` because Debian splits venv into the `python3-venv` apt package. `ensure_ansible_venv` now detects that failure, checks for `/etc/debian_version`, runs `sudo apt-get update && sudo apt-get install -y python3-venv`, and retries. Idempotent on re-runs (apt is a noop when already installed). On non-Debian or genuinely broken environments, the operator gets a clear error pointing at the right install command per distro family.

Sudo prompts for a password if not configured passwordless; that's fine, the operator expects it.

`EnvFilter::from_default_env()` returns the empty filter when `RUST_LOG` isn't set, which silences every log line. The systemd unit installed by `FleetDeviceSetupScore` does pass `RUST_LOG=info`, but a hand-launched binary, an overridden unit, or any other invocation path produced a silent agent, including the dev-on-device run the user just hit.

Switches to `try_from_default_env().unwrap_or_else(|_| EnvFilter::new("info"))` so:

* RUST_LOG unset → info-level by default (what the operator wants the moment they look for logs).
* RUST_LOG set → respected as before (`RUST_LOG=debug` for troubleshooting, `RUST_LOG=warn` if it's too chatty, etc.).
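The set/unset behaviour of that fallback can be modelled with the standard library alone. This is only a sketch of the decision, not the agent's code: `resolve_log_directive` is a hypothetical helper, and it does not model directive parsing, which in the real `EnvFilter` call would also fall back to `info` on a parse error.

```rust
use std::env;

/// Honor RUST_LOG when it is set; otherwise fall back to "info" so a
/// hand-launched binary is never silent.
fn resolve_log_directive(rust_log: Option<String>) -> String {
    rust_log.unwrap_or_else(|| "info".to_string())
}

fn main() {
    // `env::var` returns Err when the variable is unset, which `.ok()`
    // turns into None and therefore into the "info" default.
    let directive = resolve_log_directive(env::var("RUST_LOG").ok());
    println!("effective log directive: {directive}");
}
```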
The systemd unit's existing `Environment=RUST_LOG=info` line is left in place: explicit + harmless, and lets a customer toggle the unit's verbosity without rebuilding the binary.

NATS server-level `jetstream: { ... }` config doesn't extend to explicit accounts; each one has to opt in individually with `jetstream: enabled` (or a per-account quota object). The rendered values block declared `FLEET` and `SYS` accounts but never enabled JetStream on `FLEET`, so the operator's first call to create its desired-state KV bucket died immediately with:

    JetStream error: JetStream not enabled for account (code 503, error code 10039)

Adds `jetstream: enabled` to the callout account block in `render_values_yaml`. SYS deliberately stays without it, since the system account doesn't host streams. Reference: https://docs.nats.io/nats-concepts/jetstream/account_jetstream

Adds `auth_callout_account_has_jetstream_enabled` regression test that:

* asserts `jetstream: enabled` appears under the callout account block in the rendered YAML;
* defense-in-depth: asserts `jetstream:` does NOT appear under SYS, so a future regex slip can't silently flip system-account JetStream on.

Workspace warning count: 408 → 105. Three buckets cleared:

* Auto-fixable (`cargo fix` + `cargo clippy --fix`): unused imports removed, unused variables prefixed with `_`, deprecated method calls updated. Applied across harmony, harmony-k8s, harmony-agent, harmony_inventory_agent, the fleet/ workspace, and ~15 examples.
* Generated code (opnsense-api/src/generated/): 269 snake_case warnings + ~10 unreachable-pattern warnings come from CamelCase-preserving bindings to OPNsense's HAProxy/Caddy XML schemas. Scoped a single `#[allow(non_snake_case, unreachable_patterns)]` at `pub mod generated;` rather than fighting the codegen; renaming would break serde round-trips and the codegen would regenerate them anyway.
* opnsense-codegen parser's defensive `let...else` guards on `XmlNode` (currently single-variant): file-level `#![allow(irrefutable_let_patterns)]` with a comment explaining why we keep the `else` arms (they re-arm if the IR grows a second variant).

`harmony_inventory_agent::local_presence::{DiscoveryEvent, discover_agents}` re-exports were stripped twice by the auto-fix passes (consumers live in another crate, so the local crate looks "unused" to lint). Anchored with explicit `pub use` + an `#[allow(unused_imports)]` annotation noting why.

All 151 harmony lib tests still pass. Remaining ~105 warnings are mostly real dead code in non-fleet modules + a handful of unused-imports/variables clippy couldn't auto-resolve; cleared in the next pass.

Picks up where the auto-fix pass left off. Workspace warning count goes from 105 to 0 across `cargo build --workspace --all-targets`.

Three categories of fixes:

1. Mechanical fixes the auto-pass couldn't handle (unused imports inside braced multi-name `use` statements, unused variables that needed an underscore prefix without breaking other references): batched via a small Python script, then 6 manual edits where the warning location and the actual identifier were on different lines.
2. Dead code that's intentionally kept around for future wiring or debug visibility gets `#[allow(dead_code)]` at the right scope:
   - 19 individual items (struct fields, methods, free functions, type aliases, enum variants), e.g. `default_namespace` / `default_cluster_issuer` in zitadel/mod.rs (used via serde defaults, opaque to rustc), `score` fields on the OKD bootstrap interpret types, `crd_exists` methods on the prometheus alerting scores, the `harmony_inventory_agent::local_presence::{DiscoveryEvent, discover_agents}` re-exports.
   - 5 module-level allows for files where most items are aspirational scaffolding (harmony_agent's replica workflow, opnsense-config dnsmasq, three opnsense-api examples).
3. 
Special cases that needed real fixes, not allows:
   - `opnsense-config-xml/src/data/haproxy.rs`: deprecated `rand::thread_rng` / `Rng::gen` updated to `rng()` / `random`.
   - `harmony_secret/src/lib.rs`: the `secrete2etest` integration test gate is now declared in Cargo.toml's `[lints.rust] unexpected_cfgs.check-cfg`; the gated test module is structured so its dead `TestSecret`/`TestUserMeta` types come along for the cfg ride and don't show up as unconditional dead code.
   - `harmony/src/modules/nats/score_nats_k8s.rs:241`: `K8sIngressScore { name: todo!(), ... }`'s unreachable expression annotated.
   - `harmony/src/domain/topology/k8s_anywhere/k8s_anywhere.rs:982`: wrap the dead-after-`return Ok(Noop)` branch in `#[allow(unreachable_code)] { ... }`. Behavior unchanged.
   - `examples/try_rust_webapp/Cargo.toml`: `autobins = false` so `src/main.rs` isn't auto-registered as both bin AND example.

All 16 lib-test suites pass: 437 tests, 0 failed, 13 ignored.

Ready for `-Dwarnings` in CI as a follow-up; the gate makes sense once we're sure no contributor's local builds slip warnings back in.

Working document for the architectural redesign of the fleet platform before v0.1 ships to production. Captures four sections of research:

§1: Current state inventory. Markdown-bullet map of every public type, score, trait, and module across `harmony/modules/fleet/`, `harmony-reconciler-contracts`, and `fleet/harmony-fleet-*/`. Sorted by domain meaning (identity, desired state, observed state, setup, plumbing) rather than location, so the cross-cutting concerns become visible.
Includes a text "diagram" of the dependency graph showing the two problematic edges: runtime crates importing CRD types from the framework crate (`harmony-fleet-operator` ← `harmony::modules::fleet::operator::crd` verified at `controller.rs:37`, `device_reconciler.rs:21`, `main.rs:9`) and the agent importing podman wire types from the framework crate (`harmony-fleet-agent` ← `harmony::modules::podman` verified at `main.rs:21-22`, `reconciler.rs:11`).

§2: Theory review. Pulls principles from JG's *Pour l'amour des compilateurs* talk (2026-04-30), its references (Crichton, Feldman, Maguire, Goedecke, Fowler), and harmony's own load-bearing ADRs (002 hexagonal, 003 infrastructure abstractions, 015 higher-order topologies, 016 agent + global mesh, 018 template hydration). Synthesizes eight design principles for the redesign, including Goedecke's guardrail that "type-driven" ≠ "type-everything" so we don't over-fit the cardinality argument.

§3: Ten concrete shape problems (P1–P10), framed as cardinality mismatches, leaky boundaries, and "is this resolved yet" branches rather than bugs. P1 is the placement issue JG flagged in code review; P2 is `FleetDeviceAuth`'s mixed resolved/unresolved states; P10 is the credential-shape staircase across operator workstation / operator pod / agent.

§4: Five design alternatives, each scored against P1–P10:

A. Move + thin façade (conservative cleanup).
B. Resolved-only at boundaries + capability traits (principled incremental).
C. Dataflow reframe (events in, state out).
D. Fleet as kube control plane, period (deliberately weird).
E. Algebra of fleets (deliberately mathematical).

A is too little, C/D/E are right-shape but wrong-timing for the 3-day window. B is the working recommendation, with explicit awareness that D is the v2.0 destination and the capability traits in B are the seam that lets us migrate without breaking callers.
§5 sketches a concrete shape for B: a new `harmony-fleet/` domain crate with no framework dependency; `harmony-fleet-adapters-*` crates for NATS/Zitadel/kube; the existing operator/agent/auth crates wire the adapters together; and the framework's `harmony::modules::fleet` collapses to a re-export module that goes away by v0.2.

§6: Five open questions for JG's review before locking the choice.

§7: An explicit "spike one slice, then commit or back out" process so we don't lock the wrong shape.

Not an ADR yet. The ADR happens after JG agrees on which alternative is the working hypothesis and the spike confirms the shape feels better in code than on paper.

Adds OIDC login support to the harmony-fleet-operator web dashboard using Zitadel SSO.

PKCE was the recommended option because the app does not need to hold a client secret. The server generates a random code verifier and sends only its SHA-256 hash (the code challenge) to Zitadel. At token exchange the server presents the verifier, and Zitadel validates authenticity by recomputing the hash and comparing it with the challenge it received earlier.

PKCE auth flow:

1. User visits a protected dashboard route, like /devices.
2. If no valid harmony_fleet_session cookie exists, the app redirects to /login.
3. /login creates:
   - random state
   - random pkce_code_verifier
   - derived code_challenge = base64url(sha256(pkce_code_verifier))
4. The app stores state and pkce_code_verifier in a temporary HTTP-only login-attempt cookie.
5. The browser is redirected to Zitadel's authorize endpoint with:
   - client_id
   - redirect_uri
   - scope
   - state
   - code_challenge
   - code_challenge_method=S256
6. After SSO login, Zitadel redirects back to /auth/callback?code=...&state=....
7. The callback handler:
   - parses the raw query into a strict success/failure enum
   - reads the temporary login-attempt cookie
   - validates the returned state
   - exchanges code + pkce_code_verifier for tokens
   - validates the returned ID token using OIDC discovery/JWKS
   - creates a local harmony_fleet_session cookie
   - redirects to /
8. Protected routes validate the local dashboard session cookie on each request.
9. /logout clears the dashboard session cookie and redirects to /login.

---

Auth middleware responses depending on request type:

- normal browser request: redirect to /login
- SSE request: 401 authentication required
- HTMX request: 401 with HX-Redirect: /login (letting HTMX perform the redirect is more idiomatic here than issuing it from Axum)

Reviewed-on: #284
Reviewed-by: johnride <jg@nationtech.io>
Co-authored-by: Reda Tarzalt <tarzaltreda@gmail.com>
Co-committed-by: Reda Tarzalt <tarzaltreda@gmail.com>

The previous e2e harness handrolled k8s manifests in `stack.rs`, bypassing the Score-Topology-Interpret machinery harmony exists to provide. This commit:

1. **ADR-023** codifies the rules: deploy with Scores (not manifests), e2e uses the same Scores as production, one Score per component, deploy blocks on smoke-test success, deploy logic lives in `*-deploy` crates, topologies are compile-time, thiserror over anyhow. CLAUDE.md mirrors the principles.
2. **New `fleet/harmony-fleet-deploy` crate** is the canonical home for fleet-component Scores:
   - `FleetOperatorScore` + helm-chart generator + `install_crds` moved out of `harmony::modules::fleet::operator` (they should never have lived in `harmony` core). `FleetServerScore` (composite of NATS + operator + Zitadel + callout) moved too.
   - New `FleetNatsScore` (preset over `NatsHelmChartScore` with fleet's required values; v1 supports `UserPass` auth, callout mode reserved on the public API for PR 1.5).
   - New `FleetAgentScore` with `FleetAgentTarget::Pod`; `Vm` target is a future variant that absorbs `FleetDeviceSetupScore`.
   - `harmony-fleet-deploy` binary built on the existing `harmony_cli` crate, with no new CLI scaffolding.
3. **Operator runtime binary trimmed**: `Install` and `Chart` subcommands removed; both jobs now belong to `harmony-fleet-deploy`. The runtime binary becomes leaner.
4. **E2E harness rewritten** as a thin Score composer: `harmony-fleet-e2e/src/stack.rs` deploys the stack via `FleetNatsScore` + `FleetAgentScore`.
   The inline NATS manifest factory and the bespoke agent Pod renderer are gone.
   - Bring-up runs once per test binary via `shared_stack` + `tokio::sync::OnceCell` (matches the `fleet_e2e_demo` pattern).
   - Stale `e2e-*` namespaces from prior runs get pruned at startup so the leaks the OnceCell creates don't compound across runs.
5. **`thiserror` for the agent's `CommandServer`**: replaces the anyhow-based surface with typed `CommandError` / `CommandServerError`.
6. **Memory** captures eight load-bearing principles (saved to `~/.claude/projects/.../memory/`) so future sessions don't drift back into manifest-handrolling.

Verified: `cargo test -p harmony-fleet-e2e --test ping` green end-to-end against k3d in 25s warm.
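
For reviewers unfamiliar with the once-per-test-binary bring-up from item 4, here is a minimal sketch of the idea. It is an illustration under stated assumptions, not the harness code: it substitutes the synchronous, std-only `std::sync::OnceLock` for `tokio::sync::OnceCell`, and `Stack`, `bring_up_stack`, the `BRING_UPS` counter, the `e2e-example` namespace, and this `shared_stack` signature are all hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical handle to a deployed test stack (illustration only).
#[derive(Debug)]
struct Stack {
    namespace: String,
}

/// Counts how many times the expensive bring-up actually ran.
static BRING_UPS: AtomicUsize = AtomicUsize::new(0);

/// Process-wide cell: initialized by the first caller, shared by all others.
static SHARED: OnceLock<Stack> = OnceLock::new();

/// Stand-in for the real deployment, assumed to be slow and side-effecting.
fn bring_up_stack() -> Stack {
    BRING_UPS.fetch_add(1, Ordering::SeqCst);
    Stack {
        namespace: "e2e-example".to_string(),
    }
}

/// Every test calls this; only the first call pays for bring-up.
fn shared_stack() -> &'static Stack {
    SHARED.get_or_init(bring_up_stack)
}

fn main() {
    let a = shared_stack();
    let b = shared_stack();
    // Both calls return the same instance, and bring-up ran exactly once.
    assert!(std::ptr::eq(a, b));
    assert_eq!(BRING_UPS.load(Ordering::SeqCst), 1);
    println!("stack in namespace {} brought up once", a.namespace);
}
```

Because the cell is a `static`, nothing tears the stack down when the test binary exits, which is why pruning stale `e2e-*` namespaces at startup matters. With `tokio::sync::OnceCell` the accessor would presumably be `async` and await `get_or_init` with an async initializer instead.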