Adds `examples/fleet_e2e_demo/` — composes fleet_auth_callout's
existing pieces (Zitadel + auth callout deploy) with per-device
machine-user provisioning (one ZitadelSetupScore call per VM) and
FleetDeviceSetupScore using FleetDeviceAuth::ZitadelJwt. The harness
expects pre-provisioned libvirt VMs (one per device) reachable via
`FLEET_E2E_VM_<i>_IP` env vars. Full VM provisioning via
ProvisionVmScore is a follow-up; deferring it keeps the harness
observable in pieces during tomorrow's cold-start debugging.
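
A minimal sketch of how the per-device IPs could be collected from the
`FLEET_E2E_VM_<i>_IP` variables. The helper names, the 0-based index,
and the `String` error type are assumptions for illustration, not the
harness's actual code:

```rust
use std::net::IpAddr;

/// Hypothetical helper: resolve `count` device IPs via `lookup`.
/// `lookup` stands in for `std::env::var` so the logic can be tested
/// without touching the process environment.
/// ASSUMPTION: devices are indexed from 0; the real harness may differ.
fn device_ips_from<F>(count: usize, lookup: F) -> Result<Vec<IpAddr>, String>
where
    F: Fn(&str) -> Option<String>,
{
    (0..count)
        .map(|i| {
            let key = format!("FLEET_E2E_VM_{i}_IP");
            // Fail early with the variable name so a missing VM is obvious.
            let raw = lookup(&key).ok_or_else(|| format!("{key} is not set"))?;
            raw.trim()
                .parse::<IpAddr>()
                .map_err(|e| format!("{key}={raw:?} is not a valid IP: {e}"))
        })
        .collect()
}

/// Hypothetical convenience wrapper that reads the real environment.
fn device_ips(count: usize) -> Result<Vec<IpAddr>, String> {
    device_ips_from(count, |key| std::env::var(key).ok())
}
```

Failing on the first missing or malformed variable keeps a
mis-provisioned VM from surfacing later as an opaque SSH error.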

The constituent helpers in `fleet_auth_callout::lib.rs` are now `pub`
instead of private (deploy_zitadel, wait_for_zitadel_ready,
ensure_issuer_seed, build_and_load_callout_image, etc.) so the new
harness composes them rather than re-implementing them.

`bring_up_full_stack`:
1. Ensure k3d cluster (re-uses fleet_auth_callout's create_k3d).
2. Deploy Zitadel + Postgres.
3. CoreDNS rewrite + wait for Zitadel HTTP + wait for the
   chart-provisioned `iam-admin-pat` secret. (The last wait is new
   and load-bearing: without it ZitadelSetupScore races the chart's
   setup job and fails on the first cold run.)
4. ZitadelSetupScore for project + API app + roles + admin
   machine-user (admin gets the fleet-admin role grant).
5. Issuer NKey from a persisted secret + NATS deploy with
   auth_callout block + callout pod.
6. For each device i: per-device ZitadelSetupScore (machine-user
   with `device` role grant), pull the JSON keyfile from cache,
   render the agent's TOML with the keyfile path. (The
   FleetDeviceSetupScore invocation is wired structurally; the
   SSH-and-apply step is gated behind the VM provisioning follow-up.)
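
The wait in step 3 amounts to polling a readiness check until a
deadline. A minimal synchronous sketch of that pattern follows;
`wait_until` and its signature are hypothetical, and the real harness
may well be async and query the Kubernetes API directly:

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: call `ready` every `interval` until it returns
/// true or `timeout` elapses. `what` is only used in the error message.
fn wait_until<F>(
    what: &str,
    timeout: Duration,
    interval: Duration,
    mut ready: F,
) -> Result<(), String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        // Check first so an already-present secret returns immediately.
        if ready() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(format!("timed out after {timeout:?} waiting for {what}"));
        }
        sleep(interval);
    }
}
```

For the `iam-admin-pat` case the `ready` closure would report whether
the secret exists yet, so ZitadelSetupScore only starts once the
chart's setup job has finished writing it.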

`HostsEntry` + `merge_hosts_file` added to FleetDeviceSetupScore so
VMs on a libvirt NAT can resolve `sso.fleet.local` to the host
gateway. Managed-block markers in /etc/hosts make the merge
idempotent across re-runs and let the block be removed when entries
are dropped from the score. Four new unit tests cover the merge
invariants (insert, replace, strip, byte-stable).
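
A minimal sketch of a managed-block merge with these four properties.
The marker strings, the `HostsEntry` fields, and the function signature
are assumptions for illustration; the real code in
FleetDeviceSetupScore may differ:

```rust
// Hypothetical marker strings; the real markers are not shown here.
const BEGIN_MARKER: &str = "# BEGIN fleet-managed";
const END_MARKER: &str = "# END fleet-managed";

/// Assumed shape; the real HostsEntry may carry different fields.
pub struct HostsEntry {
    pub ip: String,
    pub hostname: String,
}

/// Return `existing` with the managed block replaced by `entries`.
/// Lines outside the markers are preserved in order; an empty
/// `entries` slice removes the block entirely.
pub fn merge_hosts_file(existing: &str, entries: &[HostsEntry]) -> String {
    let mut out = String::new();
    let mut in_block = false;
    for line in existing.lines() {
        let trimmed = line.trim();
        if trimmed == BEGIN_MARKER {
            in_block = true; // drop the old block, markers included
        } else if trimmed == END_MARKER {
            in_block = false;
        } else if !in_block {
            out.push_str(line);
            out.push('\n');
        }
    }
    if !entries.is_empty() {
        // Re-emit the block at the end of the file.
        out.push_str(BEGIN_MARKER);
        out.push('\n');
        for e in entries {
            out.push_str(&e.ip);
            out.push(' ');
            out.push_str(&e.hostname);
            out.push('\n');
        }
        out.push_str(END_MARKER);
        out.push('\n');
    }
    out
}
```

Because the old block is always stripped before the new one is
written, running the merge on its own output with the same entries
returns the input unchanged, which is the byte-stable property.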

Test skeleton in `tests/e2e_walking_skeleton.rs`:
- `both_devices_heartbeat_within_60s`: implemented; reads from
  device-info KV via admin token.
- `admin_jwt_reads_any_device_subject`: implemented; subscribes
  to `device-state.>` as admin.
- `cross_device_isolation_enforced_in_vm`: `#[ignore]` pending
  per-device-key plumbing through E2eHandles.
- `agent_recovers_from_nats_pod_restart`: `#[ignore]` pending
  the NATS-pod-restart driver.

The two `#[ignore]`d tests cover the load-bearing reconnect and
isolation invariants. Wiring them is the morning-of-rehearsal
priority since those are the customer-facing claims.

Out of scope for this commit (called out in the roadmap doc):
- ProvisionVmScore integration (today the operator runs
  fleet_vm_setup out-of-band).
- Operator install via Helm (smoke-a4 runs the operator host-side;
  this harness inherits that pattern).
- Full SSH-based agent install via FleetDeviceSetupScore: the Score
  is built, but its invocation is gated.