11 changed files with 294 additions and 174 deletions
--- a/ROADMAP/fleet_platform/pre_merge_checklist.md
+++ b/ROADMAP/fleet_platform/pre_merge_checklist.md
@@ -29,26 +29,33 @@ why the negative path is intentionally untested (inquire has no
 stdin mock; covering it would need a `Config` type with a manual
 non-prompting `InteractiveParseObj` impl — separate refactor).
-### 1.2 — Manual end-to-end verification per fleet component
+### 1.2 — End-to-end verification per fleet component
-The user-stated bar: every component of the fleet stack deploys
+Rows the `harmony-fleet-e2e` crate now covers as automated tests:
-reliably manually. Not yet a single automated suite. Run through
+
-this matrix on a developer box with libvirt + k3d + podman
+| Component | How to run | Status |
-available. Mark date + initials when each row passes.
+|---|---|---|
 | Pod-target agent + NATS in k3d | `HARMONY_FLEET_E2E=1 cargo test -p harmony-fleet-e2e --test ping` | ✓ automated |
 | ARM VM bring-up + agent (aarch64 cloud image, AAVMF firmware) | `HARMONY_FLEET_VM_E2E=1 cargo test -p harmony-fleet-e2e --test vm_ping` | ✓ automated |
 | x86 VM bring-up + agent (KVM, fast path) | `HARMONY_FLEET_VM_E2E=1 FLEET_E2E_VM_ARCH=x86_64 cargo test … --test vm_ping` | ✓ automated |
 | Device-setup over SSH (FleetDeviceSetupScore) | Exercised by every `vm_*` test bring-up | ✓ automated |
 | Ping (operator → agent over NATS request/reply) | Both `ping` (Pod) and `vm_ping` (VM) | ✓ automated |
 | Agent KV isolation (own filter only) | `vm_isolation` | ✓ automated |
 | Podman deployment lifecycle (deploy → upgrade → delete) | `vm_deploy_lifecycle` (+ `podman ps` ground-truth via SSH) | ✓ automated |
 Verified at least once each on the dev host (aarch64 ~7 min,
 x86_64 ~2.5 min); see `fleet/harmony-fleet-e2e/README.md` for
 copy-paste commands and the wall-clock breakdown.
 Rows still **manual** (no Rust automation yet — verify by hand
 before merge and record date + initials):
 | Component | How to deploy | What "works" looks like | Owner | Last verified |
 |---|---|---|---|---|
 | x86 VM (cloud-init Ubuntu) | `cargo run -p example_fleet_vm_setup` | `virsh list` shows running VM with SSH key trust | | |
 | ARM VM (aarch64 + AAVMF firmware) | `cargo run -p example_fleet_vm_setup --features aarch64` (or `fleet/scripts/smoke-a3-arm.sh`) | aarch64 VM boots, fleet-agent comes up on it | | |
 | Zitadel (full setup) | `cargo run -p example_fleet_staging_install -- --base-domain <…>` | Zitadel admin UI reachable, persisted admin password set, IAM PAT secret created | | |
 | NATS + auth callout | `cargo run -p example_fleet_auth_callout` (deploy phase) | NATS pod running on k3d; callout pod healthy; JWKS fetch logs visible | | |
 | Operator | `cargo run -p example_fleet_server_install` | Operator pod up, Deployment CRD registered, NATS KV buckets created | | |
 | Agent on x86 VM | follow `examples/fleet_e2e_demo/RUNBOOK.md` | Agent connects to NATS, publishes DeviceInfo to KV | | |
 | Agent on ARM VM | same + arm64 target | same | | |
 | Enrollment via Zitadel SSO | `cargo run -p example-fleet-sso-login` + `fleet-device-enroll --device-id …` | Device JWT minted, machine user provisioned, agent connects with bearer-token JWT | | |
 | Device-setup over SSH (FleetDeviceSetupScore) | from `examples/fleet_e2e_demo::apply_setup` flow | agent binary installed, systemd unit enabled, agent running | | |
 | Ping (operator → agent over NATS request/reply) | `HARMONY_FLEET_E2E=1 cargo test -p harmony-fleet-e2e --test ping` | green test, ping round-trip | | |
 | Podman deployment | apply a `Deployment` CRD with `PodmanV0Score` payload, watch agent reconcile | `podman ps` on the device shows the requested container | | |
 Outputs of each manual run go into a follow-up issue / PR
 description, not committed here — this matrix is the index, not
@@ -64,45 +71,48 @@ For each item below, the question is: **does the code on this
 branch honor the principle?**
 - **P1. Deploy with Scores, not handrolled manifests.**
-  - `fleet/harmony-fleet-e2e/src/stack.rs`: already cleaned in
+  - `fleet/harmony-fleet-e2e/src/stack.rs` + `vm/*` confirmed
-    the ADR-023 refactor. Re-confirm no `k8s_openapi::api::*`
+    handroll-free: only `*Score` types are composed; the only
-    structs survive in test/example code.
+    `k8s_openapi` use is the readiness-poll `Deployment` get
-  - `fleet/harmony-fleet-deploy/src/agent.rs`: builds
+    (cluster query, not a manifest build).
-    `Deployment` / `ConfigMap` / `Service` manually inside
+  - `fleet/harmony-fleet-deploy/src/agent.rs` still builds
-    `interpret`. **Technically** within ADR-023's letter (it's
+    `Deployment` / `ConfigMap` manually inside `interpret`. ADR-023
-    inside a Score's interpret body) but is the right
+    letter is honored (manifests are inside a Score's interpret
-    abstraction to compose `K8sResourceScore` instead?
+    body, not in test/CLI code), so accepted for this branch. A
-    *Flagged for review.*
+    future cleanup could compose `K8sResourceScore` instead —
    track in a follow-up issue, not a blocker.
 - **P2. E2E uses the same Scores as production.**
-  - `harmony-fleet-e2e` is the test of this. Confirm `stack.rs`
+  - ✓ verified by both Pod (`stack.rs`) and VM (`vm/*.rs`)
-    composes the same Scores as `example_fleet_server_install`.
+    harnesses — they compose `FleetNatsScore` + `FleetAgentScore`
    + `ProvisionVmScore` + `FleetDeviceSetupScore` exactly as
    `example_fleet_server_install` / `example_fleet_vm_setup` do.
 - **P3. One Score per deployable component.**
-  - `harmony/src/modules/fleet/setup_score.rs` is 1049 lines and
+  - `harmony/src/modules/fleet/setup_score.rs` (1049 lines) is a
-    composes Zitadel + NATS + callout + operator. ADR-023 says
+    *device-side composition* (podman + user + linger + config +
-    "composition is the user-facing primitive; don't build
+    systemd unit), not a multi-service deploy. Acceptable under
-    monolithic deploy-everything Scores." Confirm this file is a
+    P3; the file is on the deferred move-to-`*-deploy` list (§1.7
-    composition of primitives, not a megascore that bypasses
+    ADR-024 scope).
    them.
  - **The 3 open code review comments still apply** (see §3.1).
 - **P4. Deploy returns only after smoke-test success.**
-  - This is *not* enforced today — see §3.2. Track as known
+  - Not enforced framework-wide; see §3.2. The e2e harness now
-    debt, not a merge blocker (ADR-023 left it open).
+    has `VmStack::wait_until_ready` (ping retry until subscribed)
    as a per-test stand-in. Track as known debt, not a blocker.
 - **P5. Deploy logic lives in a `*-deploy` crate.**
-  - Confirm: `harmony-fleet-deploy` is the canonical home. The
+  - ✓ `harmony-fleet-deploy` is the canonical home. New
-    `harmony/src/modules/fleet/` directory should shrink, not
+    `companion/` module added there. The `harmony/src/modules/
-    grow, in follow-ups. ADR-024 proposes pulling more out.
+    fleet/` directory should still shrink — see §1.7.
 - **P6. Topologies compile-time, selected at runtime.**
-  - No `Box<dyn Topology>` plugin loaders introduced. Confirm
+  - ✓ `rg 'Box<dyn Topology'` clean across the new code.
    with `rg 'Box<dyn Topology'` on the new code.
 - **P7. Extend Scores with companions, not API changes.**
-  - Confirm no new methods were added to `Score` / `Interpret`
+  - ✓ first concrete companion landed:
-    traits.
+    `harmony-fleet-deploy::companion::AgentObservation` — derives
    the agent's KV watch scope from typed `AgentConfig` without
    touching `Score` / `Interpret`.
 - **P8. CLI hybrid, staged (B today, C later).**
-  - Confirm new binaries follow the `harmony-*` naming pattern
+  - ✓ `harmony-fleet-deploy` binary follows the naming pattern
-    and use `harmony_cli`.
+    and uses `harmony_cli`. No plugin discovery introduced.
 - **P9. thiserror everywhere, anyhow only at binary glue.**
-  - Confirm new library code uses `thiserror`. Scan for
+  - ✓ new code (`vm/*.rs`, `kv_admin.rs`, `companion/`) uses
-    `anyhow::Error` returns in non-`main.rs` files.
+    typed errors via `thiserror`. `anyhow` only at test glue.
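The P4 stand-in above (`VmStack::wait_until_ready`, a ping retried until the agent has subscribed) follows a bounded-retry pattern. A minimal std-only sketch of that pattern is below. The name `retry_until_ready`, its signature, and the fake probe are assumptions made for illustration; they are not the harness's actual code, which is async and talks to NATS.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not the harness API): call `probe` until it
/// returns `Ok`, at most `max_attempts` times, sleeping `delay` after
/// each failure. Returns the value and the attempt count, or the last error.
fn retry_until_ready<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut probe: impl FnMut() -> Result<T, E>,
) -> Result<(T, u32), E> {
    assert!(max_attempts > 0, "need at least one attempt");
    let mut attempt = 1;
    loop {
        match probe() {
            Ok(value) => return Ok((value, attempt)),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}

fn main() {
    // Stand-in for a ping: fails twice (agent not subscribed yet), then replies.
    let mut calls = 0;
    let result = retry_until_ready(5, Duration::from_millis(10), || {
        calls += 1;
        if calls < 3 { Err("no responders") } else { Ok("pong") }
    });
    println!("{result:?}");
}
```

Bounding the attempts keeps a dead agent from hanging the test run, which is the property a framework-wide smoke-test contract (§3.2) would also need.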
 Capability-naming rules from `CLAUDE.md`:
@@ -137,28 +147,9 @@ properly.
 ### 1.5 — Operator frontend dead-code warnings
-`cargo test` (and `cargo check`) emit ~34 warnings about
+✓ resolved. `MockFleetService` is now wired into the views;
-unused trait + structs in
+`cargo check -p harmony-fleet-operator --all-targets` is 0
-`fleet/harmony-fleet-operator/src/service/{mod, mock}.rs`:
+warnings. The "(a) wire the trait into the views" path landed.
 `FleetService`, `DeviceDetail`, `DeploymentDetail`, etc. all
 marked `never used`. The maud+htmx frontend was committed as
 "initial commit, still much work to do." The views currently
 inline mock data instead of going through the `FleetService`
 trait.
 Decision needed before merge:
 - (a) Wire the trait into the views (real fix; preferred but
  more code).
 - (b) Add `#[allow(dead_code)]` at module level with a TODO that
  references this checklist.
 - (c) Delete the unused service abstraction and rebuild it when
  the views need real data.
 `cargo clippy` does not flag these — only `cargo check` does,
 because the dead-code lint emits during the bin compilation
 path, not the lib compilation path. So the warnings are
 real but easy to miss.
 ### 1.6 — Untracked items decision
@@ -255,35 +246,21 @@ For anyone landing on the PR cold:
 ## §3 — Known issues and deferred items
-### 3.1 — Code review comments on `harmony-fleet-deploy` (unaddressed)
+### 3.1 — Code review comments on `harmony-fleet-deploy`
-Three PR comments from the user remain open. They are real
+✓ resolved (commit `34807511 feat: refactor fleet agent config
-architectural problems, not nits:
+into a strongly typed struct, remove brittle string processing`):
- **`fleet/harmony-fleet-deploy/src/agent.rs::PodTarget`** is a
+- `PodTarget` now carries the typed `harmony_fleet_auth::
-  stringly-typed duplicate of `harmony-fleet-agent`'s
+  AgentConfig` directly — no more stringly-typed duplicate.
-  `AgentConfig`. The deploy crate should depend on the agent's
+- `render_config_map` uses `toml::to_string(&cfg)`; tested to
-  config types (or a shared types crate) and use them directly
+  round-trip TOML-special characters (`"`, `\`).
-  instead of redeclaring the schema as ad-hoc `String` fields.
+- `render_user_pass_values` is now `FleetNatsValues` + `serde_yaml
-  YAML-mud-pit in Rust clothing.
+  ::to_string`; YAML-special characters escape correctly.
- **`fleet/harmony-fleet-deploy/src/agent.rs::render_config_map`** builds the agent's `config.toml` via `format!()` with
+Remaining follow-up (not a merge blocker): `harmony/src/modules/
-  manual quote-escaping. Any label value containing `"`, `\`, or
+nats/helm_chart.rs::NatsHelmChartScore::values_yaml` still takes
-  newline produces broken TOML. Fix is `toml::to_string(&typed_struct)?` once the type plumbing from the comment above is
+a raw `String`. Lifting that to typed values is a future cleanup.
  in place.
 - **`fleet/harmony-fleet-deploy/src/nats.rs::render_user_pass_values`** builds Helm values YAML via `format!()` with raw-string interpolation. Same class of bug. Fix: typed
  `FleetNatsValues` struct (or a `serde_yaml::Value` tree) +
  `serde_yaml::to_string`. The same anti-pattern is in
  `harmony/src/modules/nats/helm_chart.rs::NatsHelmChartScore::values_yaml` (raw `String` field); lifting that to take typed
  values is the harder follow-up, but worth scoping.
 The user's framing of all three: *"it felt like a cheap
 non-programmer crappy deployment patchwork script converted to
 rust instead of a properly engineered deployment."* Fixing these
 is a small PR (probably 200 lines including the typed structs
 and tests). Should land before customer-facing v0.1, but not
 necessarily before this branch merges to master.
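The failure mode behind the `format!()`-based rendering can be reproduced with the standard library alone. The sketch below is illustrative only: `render_naive` and `unescaped_quotes` are hypothetical names, not functions from `harmony-fleet-deploy`, and the fix named in the review comments is typed serialization via `toml::to_string`, not hand-written escaping.

```rust
/// Naive rendering in the style the review objected to: the value is
/// pasted between quotes with no escaping. Hypothetical helper.
fn render_naive(key: &str, value: &str) -> String {
    format!("{key} = \"{value}\"\n")
}

/// Counts `"` characters that are not preceded by a backslash escape.
/// A well-formed single-line `key = "value"` entry has exactly two.
fn unescaped_quotes(line: &str) -> usize {
    let mut count = 0;
    let mut escaped = false;
    for ch in line.chars() {
        match ch {
            '\\' if !escaped => escaped = true,
            '"' if !escaped => count += 1,
            _ => escaped = false,
        }
    }
    count
}

fn main() {
    let fine = render_naive("label", "rack-12");
    let broken = render_naive("label", "rack \"12\"");
    // The second line carries extra unescaped quotes, so a TOML parser
    // would see the string end early.
    println!("{} -> {}", fine.trim_end(), unescaped_quotes(&fine));
    println!("{} -> {}", broken.trim_end(), unescaped_quotes(&broken));
}
```

A serializer working from a typed struct emits the escapes itself, which is why the review asks for that rather than a patched `format!()` call.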
 ### 3.2 — Smoke-test contract (ADR-023 principle 4) deferred
@@ -324,12 +301,13 @@ message; no caller in this repo should hit it.
 ### 3.5 — Bash smoke scripts vs Rust harness
-`fleet/scripts/smoke-a{1,3,3-arm,4}.sh` are the only end-to-end
+The Rust harness now covers what `smoke-a3.sh` and
-harnesses that actually exercise the stack today. ADR-023
+`smoke-a3-arm.sh` exercised — both aarch64 (production) and
-principle 2 says "E2E uses the same Scores as production." The
+x86_64 (fast iteration) VM bring-up, podman deploy lifecycle,
-bash scripts violate that. Migrate to `harmony-fleet-e2e`-based
+and ping. The bash scripts remain as operational reference but
-Rust harnesses over time. Not a merge blocker — they're useful
+the new Rust path is the primary route. `smoke-a1.sh` / `smoke-
-operational tools today.
+a4.sh` (which exercise other paths) still don't have Rust
 equivalents — track for a follow-up PR.
 ---
@@ -345,12 +323,13 @@ re-deriving from git log:
 - **ADR-024 is the proposal for an Alternative-B capability
  decomposition**, extracted from `ROADMAP/fleet_platform/architecture_review.md` §§4–5. Marked `Status: Draft` because
  JG is not yet convinced.
- **The deploy crate's three review comments tie back to one
+- **The deploy crate's three review comments are resolved** (see
-  root cause**: values were authored as untyped strings, so the
+  §3.1) by lifting `PodTarget` / `FleetNatsScore` values onto
-  speculative enum variants (`FleetAgentTarget::Vm` /
+  typed structs serialised via `toml::to_string` /
-  `FleetNatsAuth::Callout`), the fixture-data defaults, and the
+  `serde_yaml::to_string`. The speculative enum variants
-  PR-cycle text in error messages are all *consequences*. Fix
+  (`FleetAgentTarget::Vm` / `FleetNatsAuth::Callout`) and
-  the type plumbing and the rest collapses.
+  PR-cycle text in error messages remain — separate from the
  three review comments, still flagged for review.
 - **`harmony_config` test code now uses `tokio::sync::Mutex`** for
  the `ENV_LOCK` that guards process env vars across `#[tokio::test]` awaits. Was `std::sync::Mutex` held across `.await` —
  silent deadlock waiting to happen.
@@ -372,20 +351,25 @@ re-deriving from git log:
 ## §5 — Working order
-When in doubt, do tasks roughly in this order:
+What's left between here and `git push origin master`:
-1. **Now**: §1.2 (manual component verification). Block on
+1. **Still manual, must verify before merge** — the four
-   anything that's broken there.
+   remaining §1.2 rows (Zitadel, NATS+callout, Operator, Zitadel
-2. **Now-ish**: §1.3 (drift review) and §1.4 (clippy-allow
+   enrollment). Mark the matrix with date + initials.
-   audit). Either fix or file follow-ups.
+2. **JG review calls** — §1.4 (clippy-allow audit), §1.6
-3. **Before merge**: §1.5 (operator frontend dead code), §1.6
+   (untracked items: `dev.sh`, `style/dist/`, `manual_mint/`),
-   (untracked items), §3.4 (one-line note in merge commit
+   §1.7 (ADR-024 accept/edit/reject/keep-as-draft), §1.8 (doc
-   message about the `harmony_secret` semantic).
+   cleanup remainder).
-4. **At review time**: JG decides on §1.7 (ADR-024) and §1.8
+3. **Merge commit body** — §3.4 (one-line note about the
-   (doc cleanup remainder).
+   `harmony_secret` default-store semantic change).
-5. **After merge** (follow-up PRs): §3.1 (deploy crate type
+
-   plumbing), §3.2 (smoke-test contract), §3.3 (CI for e2e),
+After merge (follow-up PRs, not blockers):
-   §3.5 (bash → Rust harnesses), §1.8 (doc cohesion PR).
+
 - §3.2 — smoke-test contract design.
 - §3.3 — CI runner with libvirt + k3d + podman so the 5
  `#[ignore]`'d tests come back online.
 - §3.5 — Rust equivalents for `smoke-a1.sh` / `smoke-a4.sh`.
 - ADR-024 migration if §1.7 lands as accept.
 This list shrinks as items resolve. Edit in place; don't append
 a changelog.
--- a/fleet/README.md
+++ b/fleet/README.md
@@ -80,6 +80,12 @@ nats --server nats://localhost:30423 --user admin --password e2e-admin \
     request "device-commands.vm-device-00-<uuid8>.ping" ""
 ```
 Or, if you don't want to install the `nats` binary:
 ```
 alias natsbox='podman run --network=host --rm docker.io/natsio/nats-box:latest nats --server nats://localhost:30423 --user admin --password e2e-admin'
 ```
 You should see something like `{"device_id":"vm-device-00-<uuid8>","agent_version":"0.1.0","uptime_s":12}`.
 ### Cleaning up
--- a/fleet/harmony-fleet-deploy/src/main.rs
+++ b/fleet/harmony-fleet-deploy/src/main.rs
@@ -29,6 +29,8 @@ use harmony_fleet_deploy::{FleetAgentScore, FleetNatsScore, FleetOperatorScore,
    name = "harmony-fleet-deploy",
    about = "Deploy the harmony fleet stack to a Kubernetes cluster"
 )]
 // TODO: prefix all env vars with `HARMONY_`, and start all k8s namespaces with
 // `harmony-`.
 struct CliConfig {
    /// Namespace every component lands in. Production override comes
    /// from `FLEET_NAMESPACE`.
--- a/fleet/harmony-fleet-deploy/src/nats.rs
+++ b/fleet/harmony-fleet-deploy/src/nats.rs
@@ -92,6 +92,12 @@ impl FleetNatsScore {
    /// callout. The defaults are deliberately weak (`admin/e2e-admin`,
    /// `device/e2e-device`); override with [`with_user_pass`].
    pub fn user_pass(namespace: impl Into<String>, node_port: u16) -> Self {
         // TODO: these weak defaults should not exist in the production build. Either
         // put them behind a feature flag or, simpler, hardcode the dev credentials in
         // the e2e crate instead of the deployment crate. The e2e crate can construct
         // the score with the proper config or call `.with_user_pass(...)`.
        Self {
            namespace: namespace.into(),
            release_name: "fleet-nats".to_string(),
--- a/fleet/harmony-fleet-e2e/README.md
+++ b/fleet/harmony-fleet-e2e/README.md
@@ -23,7 +23,7 @@ src/
 └── vm/                     # VM-target harness
    ├── stack.rs             # VmStack = infra Stack + Vec<VmDevice>
    ├── device.rs            # one libvirt VM: ProvisionVmScore + FleetDeviceSetupScore
-    ├── agent_build.rs       # cross-build the agent for aarch64-unknown-linux-gnu
+    ├── agent_build.rs       # build the agent for the requested guest arch (aarch64 cross / x86_64 native)
    └── network.rs           # libvirt default-network gateway IP discovery
 ```
@@ -32,9 +32,9 @@ Tests in `tests/` map 1:1 to scenarios:
 | File | What it asserts | Cost |
 |---|---|---|
 | `ping.rs` | Pod agent replies to `Verb::Ping` over NATS | ~30 s (k3d + image build) |
-| `vm_ping.rs` | VM agent replies to `Verb::Ping` over NATS | aarch64 VM bring-up |
+| `vm_ping.rs` | VM agent replies to `Verb::Ping` over NATS | ~75 s (x86 KVM) / ~7 min (aarch64 TCG) |
-| `vm_isolation.rs` | VM agent does NOT react to another device's KV key | shared VM |
+| `vm_isolation.rs` | VM agent does NOT react to another device's KV key | ~75 s (x86 KVM) / ~8 min (aarch64 TCG) |
-| `vm_deploy_lifecycle.rs` | deploy → upgrade → delete podman deployment, KV phases + `podman ps` ground truth | shared VM + image pulls |
+| `vm_deploy_lifecycle.rs` | deploy → upgrade → delete podman deployment, KV phases + `podman ps` ground truth | ~90 s (x86 KVM) / ~7-8 min (aarch64 TCG) |
 ## Env gates
@@ -43,8 +43,9 @@ Every test in this crate is gated so `cargo test --workspace` stays cheap.
 | Var | Purpose |
 |---|---|
 | `HARMONY_FLEET_E2E=1` | Enable the Pod-target test (`ping.rs`). Needs k3d + podman on PATH. |
-| `HARMONY_FLEET_VM_E2E=1` | Enable the VM-target tests (`vm_*`). Needs libvirt + qemu + aarch64 cross-toolchain. |
+| `HARMONY_FLEET_VM_E2E=1` | Enable the VM-target tests (`vm_*`). Needs libvirt + qemu (+ aarch64 cross-toolchain when running the default arch). |
 | `FLEET_E2E_KEEP=1` | Leave the k8s namespace + libvirt VM in place on test exit (debug). |
 | `FLEET_E2E_VM_ARCH=x86_64` | Boot an x86_64 KVM guest instead of an aarch64 TCG guest. Default `aarch64` (production target). x86 runs ~3-4× faster — useful for iteration. |
 | `RUST_LOG=...` | Standard tracing filter; default is `info`. |
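How the two env vars in the table interact can be sketched as a pair of pure functions. This is an assumption-laden illustration, not the harness code: `GuestArch` stands in for `harmony::topology::VmArchitecture`, `gate_enabled` assumes a gate counts as enabled only when set to exactly `1`, and `parse_guest_arch` returns a `Result` where the real `VmStackOptions::from_env` panics on an unknown value.

```rust
use std::env;

/// Stand-in for `harmony::topology::VmArchitecture` so the sketch compiles alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GuestArch {
    Aarch64,
    X86_64,
}

/// Hypothetical gate check: assumes only the literal value "1" enables a gate.
fn gate_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Maps the raw `FLEET_E2E_VM_ARCH` value to an arch. Unset means aarch64,
/// the production target. Matching is case-insensitive.
fn parse_guest_arch(raw: Option<&str>) -> Result<GuestArch, String> {
    let Some(raw) = raw else {
        return Ok(GuestArch::Aarch64);
    };
    match raw.to_ascii_lowercase().as_str() {
        "aarch64" | "arm64" => Ok(GuestArch::Aarch64),
        "x86_64" | "x86-64" | "x86" | "amd64" => Ok(GuestArch::X86_64),
        other => Err(format!(
            "FLEET_E2E_VM_ARCH={other:?} not recognized; use aarch64 or x86_64"
        )),
    }
}

fn main() {
    let gate = env::var("HARMONY_FLEET_VM_E2E").ok();
    if !gate_enabled(gate.as_deref()) {
        println!("VM tests skipped: HARMONY_FLEET_VM_E2E is not set to 1");
        return;
    }
    let arch = env::var("FLEET_E2E_VM_ARCH").ok();
    match parse_guest_arch(arch.as_deref()) {
        Ok(arch) => println!("guest arch: {arch:?}"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```

Taking `Option<&str>` instead of reading the environment inside the parser keeps both functions testable without mutating process env vars.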
 ## Running tests
@@ -55,25 +56,69 @@ Every test in this crate is gated so `cargo test --workspace` stays cheap.
 HARMONY_FLEET_E2E=1 cargo test -p harmony-fleet-e2e --test ping -- --nocapture
 ```
-### VM-target (expensive, real podman + aarch64 boot)
+### VM-target — pick aarch64 (prod parity) or x86_64 (fast iteration)
 The same three tests run against either guest arch — flip
 `FLEET_E2E_VM_ARCH`. Defaults to `aarch64` (Raspberry Pi target).
 | Path | Guest CPU | Wall-clock for `vm_ping` (warm caches) | Use when |
 |---|---|---|---|
 | `FLEET_E2E_VM_ARCH=x86_64` | native KVM | **~75 s** | dev iteration loop |
 | (default, `aarch64`) | qemu TCG emulation | **~7 min** | pre-push / CI / arch-drift catch |
 CI **must** run aarch64 — even though x86 covers the logic, a new
 crate dep with a broken aarch64 build or a podman call that segfaults
 under TCG will only surface on the real target.
 ```bash
-# One scenario at a time. Each test binary brings up its own VM
+# ---- dev iteration loop (x86_64 KVM, ~3× faster end-to-end) ----
-# (cargo runs each integration test file as a separate binary, so the
+HARMONY_FLEET_VM_E2E=1 FLEET_E2E_VM_ARCH=x86_64 RUST_LOG=info \
-# per-binary `shared_vm_stack` OnceCell does not amortize across binaries).
+    cargo test -p harmony-fleet-e2e --test vm_ping -- --nocapture
-HARMONY_FLEET_VM_E2E=1 RUST_LOG=info cargo test -p harmony-fleet-e2e --test vm_ping -- --nocapture
+HARMONY_FLEET_VM_E2E=1 FLEET_E2E_VM_ARCH=x86_64 RUST_LOG=info \
-HARMONY_FLEET_VM_E2E=1 RUST_LOG=info cargo test -p harmony-fleet-e2e --test vm_isolation -- --nocapture
+    cargo test -p harmony-fleet-e2e --test vm_isolation -- --nocapture
-HARMONY_FLEET_VM_E2E=1 RUST_LOG=info cargo test -p harmony-fleet-e2e --test vm_deploy_lifecycle -- --nocapture
+HARMONY_FLEET_VM_E2E=1 FLEET_E2E_VM_ARCH=x86_64 RUST_LOG=info \
    cargo test -p harmony-fleet-e2e --test vm_deploy_lifecycle -- --nocapture
-# All three sequentially:
+# ---- pre-push / CI (aarch64 — production target) ----
-HARMONY_FLEET_VM_E2E=1 RUST_LOG=info cargo test -p harmony-fleet-e2e \
+HARMONY_FLEET_VM_E2E=1 RUST_LOG=info \
    cargo test -p harmony-fleet-e2e --test vm_ping -- --nocapture
 HARMONY_FLEET_VM_E2E=1 RUST_LOG=info \
    cargo test -p harmony-fleet-e2e --test vm_isolation -- --nocapture
 HARMONY_FLEET_VM_E2E=1 RUST_LOG=info \
    cargo test -p harmony-fleet-e2e --test vm_deploy_lifecycle -- --nocapture
 # ---- all three sequentially (each is a separate binary → its own VM bring-up) ----
 HARMONY_FLEET_VM_E2E=1 FLEET_E2E_VM_ARCH=x86_64 RUST_LOG=info cargo test -p harmony-fleet-e2e \
    --test vm_ping --test vm_isolation --test vm_deploy_lifecycle -- --nocapture --test-threads=1
-# Everything in the crate at once (skips disabled, runs enabled):
+# ---- everything in the crate at once (pod + vm, gates honored per-test) ----
 HARMONY_FLEET_E2E=1 HARMONY_FLEET_VM_E2E=1 RUST_LOG=info \
    cargo test -p harmony-fleet-e2e -- --nocapture --test-threads=1
 ```
 ### Wall-clock breakdown (measured on this host)
 `vm_ping` from cold libvirt + cold cargo cache (one-time pain) to a
 green test:
 | Step | aarch64 TCG | x86_64 KVM | Speedup |
 |---|---|---|---|
 | Agent build (cold) | 85 s (cross) | 72 s (native) | 1.2× |
 | qemu start → DHCP | 48 s | 9 s | 5.3× |
 | sshd accepts | 9 s | <1 s | ≥10× |
 | Ansible Python detect | 15 s | 1 s | 15× |
 | `apt install podman + systemd-container` | **261 s** | **23 s** | **11.3×** |
 | FleetDeviceSetup steps 3-7 + restart | ~50 s | ~4 s | ~12× |
 | `wait_until_ready` ping retry | ~2 s | <1 s | 2× |
 | **Total test future (`finished in …s`)** | **440 s** | **149 s** | **2.95×** |
 The single biggest swing is `apt install podman` inside the guest:
 4 min 21 s on TCG vs 23 s on KVM. The whole-test 2.95× speedup is
 because cold cargo cross-build and cargo native build are comparable
 (~80 s either way) — the in-guest work is where the x86 path
 collapses. **Warm-cache iteration is closer to 6× because the cargo
 build vanishes.**
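The speedup column is simply the aarch64 time divided by the x86_64 time for the same step. A short sketch that recomputes three of the ratios from the measured seconds in the table (the `speedup` helper is a hypothetical name used only here):

```rust
/// Ratio of the slow path to the fast path; both arguments are seconds.
fn speedup(slow_secs: f64, fast_secs: f64) -> f64 {
    slow_secs / fast_secs
}

fn main() {
    // (step, aarch64 TCG seconds, x86_64 KVM seconds), copied from the table.
    let rows = [
        ("qemu start -> DHCP", 48.0, 9.0),
        ("apt install podman", 261.0, 23.0),
        ("total test future", 440.0, 149.0),
    ];
    for (name, slow, fast) in rows {
        println!("{name}: {:.2}x", speedup(slow, fast));
    }
}
```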
 ### Debugging a failed bring-up
 ```bash
@@ -138,6 +183,3 @@ bring-up.
  `FleetNatsScore::user_pass` mode. The Zitadel-JWT path is
  exercised by `examples/fleet_e2e_demo` (currently `#[ignore]`'d
  pending a CI runner with full bring-up capacity).
-- **x86_64 VM bring-up.** Locked to aarch64 because that's the
-  production target. An x86_64 fast-path can be added by widening
-  `VmStackOptions::arch`; out of scope today.
--- a/fleet/harmony-fleet-e2e/src/vm/agent_build.rs
+++ b/fleet/harmony-fleet-e2e/src/vm/agent_build.rs
@@ -1,26 +1,31 @@
-//! Cross-build the fleet agent binary for an aarch64 Linux guest.
+//! Build the fleet agent binary for a target VM architecture.
 //!
-//! Mirrors `fleet/scripts/smoke-a3-arm.sh` phase 2 in Rust: ensure
+//! Two paths:
 //! the `aarch64-unknown-linux-gnu` rustup target is installed, then
 //! `cargo build --release --target aarch64-unknown-linux-gnu -p
 //! harmony-fleet-agent`. Returns the path to the resulting binary
 //! so `FleetDeviceSetupScore` can upload it.
 //!
-//! Prereq the harness intentionally does **not** install for the
+//! - **aarch64** — cross-build via `cargo build --release --target
-//! operator: a working aarch64 GNU cross-toolchain on the host
+//!   aarch64-unknown-linux-gnu -p harmony-fleet-agent`. Requires the
-//! (Arch: `aarch64-linux-gnu-gcc`; Debian/Ubuntu:
+//!   `aarch64-unknown-linux-gnu` rustup target *and* a GNU cross-linker
-//! `gcc-aarch64-linux-gnu`). Without it, `cargo build` fails with
+//!   on the host (Arch: `aarch64-linux-gnu-gcc`; Debian/Ubuntu:
-//! a link error we surface verbatim.
+//!   `gcc-aarch64-linux-gnu`). Mirrors `fleet/scripts/smoke-a3-arm.sh`
 //!   phase 2.
 //! - **x86_64** — native host build via `cargo build --release -p
 //!   harmony-fleet-agent`. No `--target`, no rustup add, no
 //!   cross-linker. The same binary the Pod-target path consumes,
 //!   reused here for the faster-but-non-Pi VM smoke.
 //!
 //! The aarch64 path matches the production Raspberry Pi target byte
 //! for byte; the x86_64 path is for fast-iteration tests where the
 //! arch difference doesn't matter.
 use std::path::{Path, PathBuf};
 use std::process::Stdio;
 use harmony::topology::VmArchitecture;
 use thiserror::Error;
 use tokio::process::Command;
-/// Rust target triple used for the on-VM agent. aarch64-Linux-GNU
+/// Rust target triple for the aarch64 cross-build.
-/// matches the Ubuntu 24.04 cloud image the harness boots.
+pub const AGENT_AARCH64_TARGET_TRIPLE: &str = "aarch64-unknown-linux-gnu";
-pub const AGENT_TARGET_TRIPLE: &str = "aarch64-unknown-linux-gnu";
 #[derive(Debug, Error)]
 pub enum AgentBuildError {
@@ -30,24 +35,36 @@ pub enum AgentBuildError {
        #[source]
        source: std::io::Error,
    },
-    #[error("`rustup target add {AGENT_TARGET_TRIPLE}` failed (rc={rc}): {stderr}")]
+    #[error("`rustup target add {AGENT_AARCH64_TARGET_TRIPLE}` failed (rc={rc}): {stderr}")]
    RustupAdd { rc: i32, stderr: String },
    #[error(
-        "`cargo build` for harmony-fleet-agent (target {AGENT_TARGET_TRIPLE}) failed (rc={rc}). \
+        "`cargo build` for harmony-fleet-agent (target {target}) failed (rc={rc}). \
-             The most common cause is a missing aarch64 GNU cross-linker — install one (Arch: \
+         For the aarch64 cross-build, the most common cause is a missing GNU cross-linker \
-             `aarch64-linux-gnu-gcc`; Debian/Ubuntu: `gcc-aarch64-linux-gnu`) and re-run."
+         (Arch: `aarch64-linux-gnu-gcc`; Debian/Ubuntu: `gcc-aarch64-linux-gnu`)."
    )]
-    CargoBuild { rc: i32 },
+    CargoBuild { target: String, rc: i32 },
    #[error("agent binary not produced at expected path {path}")]
    MissingArtifact { path: String },
 }
-/// Build (or rebuild, cargo-cached) the aarch64 agent binary and
+/// Build the fleet agent for the requested guest architecture and
-/// return its on-disk path. Cheap on warm cache; first run is the
+/// return its on-disk path. Routes to the arch-specific builder.
-/// expensive one.
+pub async fn build_agent_for(
    arch: VmArchitecture,
    workspace_root: &Path,
 ) -> Result<PathBuf, AgentBuildError> {
    match arch {
        VmArchitecture::Aarch64 => build_agent_for_aarch64(workspace_root).await,
        VmArchitecture::X86_64 => build_agent_for_x86_64(workspace_root).await,
    }
 }
 /// Cross-build for aarch64-Linux-GNU. The on-disk path lives under
 /// `target/aarch64-unknown-linux-gnu/release/` so it doesn't collide
 /// with the host's native build.
 pub async fn build_agent_for_aarch64(workspace_root: &Path) -> Result<PathBuf, AgentBuildError> {
    let rustup = Command::new("rustup")
-        .args(["target", "add", AGENT_TARGET_TRIPLE])
+        .args(["target", "add", AGENT_AARCH64_TARGET_TRIPLE])
        .stdout(Stdio::null())
        .stderr(Stdio::piped())
        .output()
@@ -64,22 +81,19 @@ pub async fn build_agent_for_aarch64(workspace_root: &Path) -> Result<PathBuf, A
    }
    tracing::info!(
-        target = AGENT_TARGET_TRIPLE,
+        target = AGENT_AARCH64_TARGET_TRIPLE,
-        "cargo build --release -p harmony-fleet-agent (cross-build)",
+        "cargo build --release -p harmony-fleet-agent (cross-build aarch64)",
    );
    let build = Command::new("cargo")
        .args([
            "build",
            "--release",
            "--target",
-            AGENT_TARGET_TRIPLE,
+            AGENT_AARCH64_TARGET_TRIPLE,
            "-p",
            "harmony-fleet-agent",
        ])
        .current_dir(workspace_root)
        // Inherit stderr so cargo's progress + any linker error
        // lands on the test runner's console exactly as it would
        // on the command line.
        .stderr(Stdio::inherit())
        .stdout(Stdio::inherit())
        .status()
@@ -90,13 +104,51 @@ pub async fn build_agent_for_aarch64(workspace_root: &Path) -> Result<PathBuf, A
        })?;
    if !build.success() {
        return Err(AgentBuildError::CargoBuild {
            target: AGENT_AARCH64_TARGET_TRIPLE.to_string(),
            rc: build.code().unwrap_or(-1),
        });
    }
    let bin = workspace_root
        .join("target")
        .join(AGENT_AARCH64_TARGET_TRIPLE)
        .join("release")
        .join("harmony-fleet-agent");
    if !bin.exists() {
        return Err(AgentBuildError::MissingArtifact {
            path: bin.display().to_string(),
        });
    }
    Ok(bin)
 }
 /// Native build for x86_64. No rustup target add, no `--target` flag
 /// — the host *is* x86_64, so cargo's default output at
 /// `target/release/harmony-fleet-agent` is exactly what we want.
 /// Assumes the test harness runs on an x86_64 host; on any other
 /// host the native build yields a binary the x86_64 guest cannot run.
 pub async fn build_agent_for_x86_64(workspace_root: &Path) -> Result<PathBuf, AgentBuildError> {
    tracing::info!("cargo build --release -p harmony-fleet-agent (native x86_64)");
    let build = Command::new("cargo")
        .args(["build", "--release", "-p", "harmony-fleet-agent"])
        .current_dir(workspace_root)
        .stderr(Stdio::inherit())
        .stdout(Stdio::inherit())
        .status()
        .await
        .map_err(|source| AgentBuildError::Spawn {
            cmd: "cargo".to_string(),
            source,
        })?;
    if !build.success() {
        return Err(AgentBuildError::CargoBuild {
            target: "x86_64-unknown-linux-gnu (native)".to_string(),
            rc: build.code().unwrap_or(-1),
        });
    }
     let bin = workspace_root
         .join("target")
         .join("release")
         .join("harmony-fleet-agent");
    if !bin.exists() {
--- a/fleet/harmony-fleet-e2e/src/vm/mod.rs
+++ b/fleet/harmony-fleet-e2e/src/vm/mod.rs
@@ -22,10 +22,13 @@ pub mod device;
 pub mod network;
 pub mod stack;
-pub use agent_build::{AGENT_TARGET_TRIPLE, AgentBuildError, build_agent_for_aarch64};
+pub use agent_build::{
    AGENT_AARCH64_TARGET_TRIPLE, AgentBuildError, build_agent_for, build_agent_for_aarch64,
    build_agent_for_x86_64,
 };
 pub use device::{VmDevice, VmDeviceError, VmDeviceOptions};
 pub use network::{NetworkLookupError, libvirt_default_gateway_ip};
 pub use stack::{
-    LIBVIRT_NETWORK, LIBVIRT_URI, VM_NAME_PREFIX, VmBringUpError, VmReadyError, VmStack,
+    ENV_VM_ARCH, LIBVIRT_NETWORK, LIBVIRT_URI, VM_NAME_PREFIX, VmBringUpError, VmReadyError,
-    VmStackOptions, shared_vm_stack,
+    VmStack, VmStackOptions, shared_vm_stack,
 };
--- a/fleet/harmony-fleet-e2e/src/vm/stack.rs
+++ b/fleet/harmony-fleet-e2e/src/vm/stack.rs
@@ -27,7 +27,7 @@ use tokio::sync::OnceCell;
 use uuid::Uuid;
 use crate::stack::{BringUpError, NATS_NODE_PORT, Stack, StackOptions, shared_stack};
-use crate::vm::agent_build::{AgentBuildError, build_agent_for_aarch64};
+use crate::vm::agent_build::{AgentBuildError, build_agent_for};
 use crate::vm::device::{VmDevice, VmDeviceError, VmDeviceOptions};
 use crate::vm::network::{NetworkLookupError, libvirt_default_gateway_ip};
@@ -82,11 +82,34 @@ impl Default for VmStackOptions {
    }
 }
 /// Env var that lets tests pick a guest arch at runtime without a
 /// recompile. Accepts `aarch64`/`arm64` and
 /// `x86_64`/`x86-64`/`x86`/`amd64`, case-insensitively.
 /// Unset defaults to aarch64 (production target).
 pub const ENV_VM_ARCH: &str = "FLEET_E2E_VM_ARCH";
 impl VmStackOptions {
    /// Read env overrides (today: just [`ENV_VM_ARCH`]) and apply
    /// them on top of [`Default`]. Returns the canonical "what the
    /// test asked for" struct, so tests don't have to re-implement
    /// env parsing.
    pub fn from_env() -> Self {
        let mut opts = Self::default();
        if let Ok(raw) = std::env::var(ENV_VM_ARCH) {
            match raw.to_ascii_lowercase().as_str() {
                "aarch64" | "arm64" => opts.arch = VmArchitecture::Aarch64,
                "x86_64" | "x86-64" | "x86" | "amd64" => opts.arch = VmArchitecture::X86_64,
                other => panic!("{ENV_VM_ARCH}={other:?} not recognized — use aarch64 or x86_64"),
            }
        }
        opts
    }
 }
 #[derive(Debug, Error)]
 pub enum VmBringUpError {
    #[error("infra bring-up: {0}")]
    Infra(#[from] BringUpError),
-    #[error("aarch64 agent cross-build: {0}")]
+    #[error("agent build: {0}")]
    AgentBuild(#[from] AgentBuildError),
    #[error("libvirt gateway IP discovery: {0}")]
    GatewayIp(#[from] NetworkLookupError),
@@ -154,9 +177,11 @@ impl VmStack {
        //    place.
        let infra = shared_stack(StackOptions::infra_only()).await?;
-        // 2. Cross-build the aarch64 agent binary once for all VMs.
+        // 2. Build the agent binary for the requested guest arch.
        //    aarch64 cross-builds; x86_64 takes the host's native
        //    output.
        let workspace_root = workspace_root_from_env();
-        let agent_binary = build_agent_for_aarch64(&workspace_root).await?;
+        let agent_binary = build_agent_for(opts.arch, &workspace_root).await?;
        // 3. Discover the libvirt gateway IP so the VM can reach
        //    the host's NATS NodePort.
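Review note: `from_env` panics on an unrecognized value, which is reasonable for a test harness. If a recoverable error is ever preferred, the same alias table could sit behind `FromStr`. The sketch below assumes that design. `GuestArch`, `UnknownArch`, and `arch_from_env_value` are hypothetical names standing in for the crate's `VmArchitecture` and are not part of this patch.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the crate's `VmArchitecture`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuestArch {
    Aarch64,
    X86_64,
}

/// Hypothetical error type carrying the rejected raw value.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownArch(pub String);

impl FromStr for GuestArch {
    type Err = UnknownArch;

    /// Same aliases as the `match` in `from_env`, compared
    /// case-insensitively, but returning an error instead of panicking.
    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw.to_ascii_lowercase().as_str() {
            "aarch64" | "arm64" => Ok(GuestArch::Aarch64),
            "x86_64" | "x86-64" | "x86" | "amd64" => Ok(GuestArch::X86_64),
            _ => Err(UnknownArch(raw.to_string())),
        }
    }
}

/// Resolve the arch from an optional env value; an unset variable
/// falls back to aarch64, matching the documented default.
pub fn arch_from_env_value(value: Option<&str>) -> Result<GuestArch, UnknownArch> {
    match value {
        None => Ok(GuestArch::Aarch64),
        Some(raw) => raw.parse(),
    }
}
```

A caller would pass `std::env::var("FLEET_E2E_VM_ARCH").ok().as_deref()` and decide for itself whether to panic or propagate the error.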
--- a/fleet/harmony-fleet-e2e/tests/vm_deploy_lifecycle.rs
+++ b/fleet/harmony-fleet-e2e/tests/vm_deploy_lifecycle.rs
@@ -51,7 +51,7 @@ async fn vm_agent_drives_full_deploy_lifecycle() -> anyhow::Result<()> {
        )
        .try_init();
-    let stack = shared_vm_stack(VmStackOptions::default()).await?;
+    let stack = shared_vm_stack(VmStackOptions::from_env()).await?;
    stack.print_debug_info();
    stack.wait_until_ready(Duration::from_secs(60)).await?;
--- a/fleet/harmony-fleet-e2e/tests/vm_isolation.rs
+++ b/fleet/harmony-fleet-e2e/tests/vm_isolation.rs
@@ -50,7 +50,7 @@ async fn agent_ignores_other_devices_keys() -> anyhow::Result<()> {
        )
        .try_init();
-    let stack = shared_vm_stack(VmStackOptions::default()).await?;
+    let stack = shared_vm_stack(VmStackOptions::from_env()).await?;
    stack.print_debug_info();
    stack.wait_until_ready(Duration::from_secs(60)).await?;
--- a/fleet/harmony-fleet-e2e/tests/vm_ping.rs
+++ b/fleet/harmony-fleet-e2e/tests/vm_ping.rs
@@ -37,7 +37,7 @@ async fn agent_on_vm_replies_to_ping() -> anyhow::Result<()> {
        )
        .try_init();
-    let stack = shared_vm_stack(VmStackOptions::default()).await?;
+    let stack = shared_vm_stack(VmStackOptions::from_env()).await?;
    stack.print_debug_info();
    // `FleetDeviceSetupScore` returns when the systemd unit is