feat/iot-arm-vm #269
Structural changes (the biggest items from the review):

- `HostConfigurationProvider` split into five narrower capabilities: `HostReachable`, `PackageInstaller`, `FileDelivery`, `UnixUserManager`, `SystemdManager`. Each implementation now only implements what it can actually deliver — a future cloud-init / ignition / podman-agent backend can pick a subset without inheriting systemd assumptions it can't honour. Added an umbrella trait `LinuxHostConfiguration` blanket-impl'd for any type that has all five, so Scores keep a single bound.
- New `VirtualMachineHost` capability in `domain/topology/`: `list_vms` / `ensure_vm` / `delete_vm` / `get_vm_info`, with a generic `VirtualMachineSpec` carrying a typed optional `VmFirstBootConfig` (hostname, admin user, authorized keys). The `KvmHost` trait and `KvmHostTopology` are deleted; `KvmVirtualMachineHost` is the concrete libvirt implementation. Cloud-init stays a KVM-impl detail — callers never see it.
- `KvmVmScore` + `CloudInitVmConfig` deleted; replaced by a generic `ProvisionVmScore` in `modules::iot::vm_score` bound to `T: VirtualMachineHost`. The Score itself has no knowledge of the hypervisor or its first-boot delivery mechanism.
- `IotDeviceSetupConfig.device_id` is now `harmony_types::id::Id` (timestamp-prefixed, sortable-by-creation, collision-safe).
- `ensure_ready` on `KvmVirtualMachineHost` is a Noop with a TODO pointing at ROADMAP/12-code-review-april-2026.md §12.1 (phased topology). Captures the concern about eagerly probing the hypervisor even when the current run doesn't need KVM.

Code quality fixes from the line-level comments:

- `render_toml` / `render_systemd_unit` / `render_user_data` rewritten as `format!` with raw-string templates (no more `push_str` chains).
- Every `Command::new(…).arg().arg().arg()` chain in the touched files converted to `.args([…])`.
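The umbrella-trait pattern described above can be sketched as follows. Trait names come from the PR; the method signatures and the `SshBackend` type are illustrative stand-ins, not the real API:

```rust
// Five narrow capability traits (methods here are simplified stand-ins).
trait HostReachable { fn is_reachable(&self) -> bool; }
trait PackageInstaller { fn install_package(&self, pkg: &str) -> bool; }
trait FileDelivery { fn deliver_file(&self, path: &str, content: &[u8]) -> bool; }
trait UnixUserManager { fn ensure_user(&self, name: &str) -> bool; }
trait SystemdManager { fn ensure_unit_active(&self, unit: &str) -> bool; }

// Umbrella trait: adds no methods of its own, just bundles the five bounds.
trait LinuxHostConfiguration:
    HostReachable + PackageInstaller + FileDelivery + UnixUserManager + SystemdManager
{
}

// Blanket impl: any type providing all five capabilities qualifies
// automatically; a backend implementing only a subset simply never gets it.
impl<T> LinuxHostConfiguration for T where
    T: HostReachable + PackageInstaller + FileDelivery + UnixUserManager + SystemdManager
{
}

// Hypothetical backend used to exercise the blanket impl.
struct SshBackend;
impl HostReachable for SshBackend { fn is_reachable(&self) -> bool { true } }
impl PackageInstaller for SshBackend { fn install_package(&self, _: &str) -> bool { true } }
impl FileDelivery for SshBackend { fn deliver_file(&self, _: &str, _: &[u8]) -> bool { true } }
impl UnixUserManager for SshBackend { fn ensure_user(&self, _: &str) -> bool { true } }
impl SystemdManager for SshBackend { fn ensure_unit_active(&self, _: &str) -> bool { true } }

// A "Score" keeps one bound instead of five.
fn run_score<T: LinuxHostConfiguration>(host: &T) -> bool {
    host.is_reachable() && host.ensure_unit_active("nats.service")
}

fn main() {
    println!("score ok: {}", run_score(&SshBackend));
}
```

The payoff is that adding a sixth capability later only widens the umbrella, not every Score's where-clause.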
- Ansible module args are now typed Rust structs (`AptArgs`, `AnsibleFileArgs`, `AnsibleUserArgs`, `AnsibleCopyArgs`, `AnsibleSystemdArgs`, `AnsibleCommandArgs`, `AnsibleStatArgs`) serialized via `serde_json::to_value`. No more `json!` macros with ad-hoc string keys.
- `ensure_linger`: no more shell sentinel. Uses `ansible.builtin.stat` on `/var/lib/systemd/linger/<user>` for the idempotent change-state check, then `ansible.builtin.command loginctl enable-linger` only on a miss. `loginctl` is required (not just `file state=touch`) because systemd-logind needs the dbus signal to actually start the user manager; a plain file touch doesn't wake it up and every subsequent `systemctl --user …` fails with "Failed to connect to bus". Documented in-place.
- `ensure_user_unit_active`: picks up the user's UID first via `ansible.builtin.command id -u <user>` and wraps the `systemctl --user enable --now <unit>` invocation in `env XDG_RUNTIME_DIR=/run/user/<UID>`. The systemd module's task-level `environment:` keyword isn't available in ad-hoc mode; this is the cleanest equivalent. Documented the inline-playbook path as a future option for when we get more task-level-`environment:` call sites.
- `ensure_package` comment clarified: distro dispatch is this function's job; Debian-family is the first concrete target and extending to RHEL/Fedora/Alpine is an implementation detail, not a capability change.
- Kubespray line removed.

Verified: from a primed `$HARMONY_DATA_DIR/iot/`, smoke-a3.sh still completes all 5 phases (bootstrap + provision + 9 setup changes + initial NATS status + power-cycle recovery).

Ansible's `command` module is a Python-wrapped SSH round trip with zero added value when the operation isn't built around Ansible's idempotency primitives. `russh` is already a workspace dep and gives us the exit code + stdout + stderr in a typed struct, with one round trip.
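A minimal sketch of the typed-args idea described above. The PR serializes the structs with serde's derive plus `serde_json::to_value`; this stand-in hand-rolls the key/value rendering so the sketch stays dependency-free, and the field names are assumptions rather than the real definitions:

```rust
// Typed args for ansible.builtin.apt; field names are illustrative.
struct AptArgs {
    name: String,
    state: String, // e.g. "present"
    update_cache: bool,
}

impl AptArgs {
    // Render as ad-hoc module `key=value` pairs. In the real code this is a
    // #[derive(Serialize)] struct fed to serde_json::to_value, which is what
    // eliminates the ad-hoc string keys of the old json! macros.
    fn to_module_args(&self) -> String {
        format!(
            "name={} state={} update_cache={}",
            self.name, self.state, self.update_cache
        )
    }
}

fn main() {
    let args = AptArgs {
        name: "curl".into(),
        state: "present".into(),
        update_cache: true,
    };
    println!("{}", args.to_module_args()); // prints "name=curl state=present update_cache=true"
}
```

The compiler now rejects a typo'd key at build time instead of Ansible rejecting it at run time.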
Moving the two call sites that were using `ansible.builtin.command` to russh directly:

- New `modules::linux::ssh_executor::ssh_exec(host, creds, cmd)` returning `SshCommandOutput { rc, stdout, stderr }`. Loads the private key via `russh::keys::load_secret_key`, authenticates, opens an exec channel, drains all `ChannelMsg` until the channel closes, returns the collected data. Draining past `Eof` matters: some sshd implementations emit `ExitStatus` *after* `Eof`, and an early break loses the rc.
- `ensure_linger`: `test -e /var/lib/systemd/linger/<user>` over russh for the check, then `sudo loginctl enable-linger <user>` only on a miss. Two SSH round trips, no Ansible. Same semantics as the previous `stat` + `command` pair but without the Python hop.
- `ensure_user_unit_active`: `id -u <user>` + `sudo -u <user> env XDG_RUNTIME_DIR=/run/user/<uid> systemctl --user enable --now <unit>`. This is the case that couldn't be done cleanly via ad-hoc `ansible.builtin.systemd` in the first place because task-level `environment:` isn't available in ad-hoc mode; russh makes it a one-liner.

Ansible still owns: `apt` (distro dispatch + cache), `user` (idempotent account management), `copy` (file delivery with content-diff change reporting), `file` (directory/mode), `systemd` (daemon-reload + enable + start as one atomic call). Those are where Ansible's value is real; `command` was a category error.

Verified: smoke-a3 PASS end-to-end — same 9-change initial setup, NATS status, and power-cycle recovery as before.

Adds the type-safe arch dimension for the aarch64-on-x86_64 emulation work to follow. No behaviour change: every existing call site gets `VmArchitecture::X86_64` via `Default`, and the XML renderer (unchanged in this commit) emits the same bytes it always did.

- `VmArchitecture { X86_64 (default), Aarch64 }` in `domain/topology/virtualization.rs`, with `as_str()` and `ubuntu_cloudimg_suffix()` helpers (Ubuntu uses `amd64`/`arm64` in filenames, not the `uname -m` spelling).
- `VirtualMachineSpec.architecture` + `#[serde(default)]` for on-disk compat.
- `VmConfig.architecture` + `VmConfig.firmware: Option<UefiFirmware>` in `modules/kvm/types.rs`. `UefiFirmware { code, vars }` is the typed pair libvirt's `<loader>` + `<nvram>` need for aarch64 guests; x86_64 leaves it `None`. `VmConfigBuilder::architecture()` / `firmware()` setters added.
- `KvmVirtualMachineHost::ensure_vm` threads the arch through to `VmConfig`; firmware wiring is commit 3.

Re-exported: `VmArchitecture`, `UefiFirmware` from `modules::kvm`. `VmArchitecture` is a type-alias re-export from `domain/topology` so the arch enum lives in one place.

Verified: cargo check clean, fmt clean, aarch64 cross-compile of harmony + iot crates still green.

aarch64 guests boot via UEFI — there is no SeaBIOS equivalent for the arm64 `virt` machine type. Libvirt needs two paths:

- CODE (read-only firmware image, shared across VMs)
- VARS (writable NVRAM, per-VM)

Every distro ships these under a different filename. New module `modules/kvm/firmware.rs`:

- `AarchFirmware { code, vars_template }` — typed pair.
- `discover_aarch64_firmware()` walks four known-path groups (Arch `edk2-armvirt`, Arch old naming, Debian/Ubuntu `qemu-efi-aarch64`, Fedora `edk2-aarch64`). The first pair where both files exist wins. A miss → `ExecutorError` carrying the per-distro `pacman`/`apt`/`dnf` install command + the full candidate list for diagnosis.
- `copy_vars_template_for_vm(fw, dest)` produces the per-VM NVRAM at `$pool/<vm>-VARS.fd` and chmods it 0644 so libvirt-qemu's dynamic-ownership chown on VM start works.

Wired into `KvmVirtualMachineHost::ensure_vm`: when `spec.architecture == Aarch64`, the topology runs firmware discovery + per-VM copy before composing the `VmConfig`, then hands the resolved `UefiFirmware` to the XML renderer (commit 2 already consumes it). The x86_64 path is unchanged.
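The first-pair-where-both-exist walk of `discover_aarch64_firmware` can be sketched as follows (simplified: candidate paths are passed in rather than hard-coded, and the miss case returns the candidate list instead of the real `ExecutorError` with install hints):

```rust
use std::path::PathBuf;

/// Typed CODE/VARS pair, mirroring the PR's `AarchFirmware`.
#[derive(Debug, PartialEq)]
pub struct AarchFirmware {
    pub code: PathBuf,
    pub vars_template: PathBuf,
}

/// Walk candidate (code, vars) pairs in priority order; the first pair where
/// BOTH files exist wins. A miss hands back the full candidate list so the
/// caller can build a per-distro diagnostic.
pub fn discover_firmware(
    candidates: &[(PathBuf, PathBuf)],
) -> Result<AarchFirmware, Vec<(PathBuf, PathBuf)>> {
    for (code, vars) in candidates {
        if code.is_file() && vars.is_file() {
            return Ok(AarchFirmware {
                code: code.clone(),
                vars_template: vars.clone(),
            });
        }
    }
    Err(candidates.to_vec())
}

fn main() {
    let dir = std::env::temp_dir().join("fw-demo");
    std::fs::create_dir_all(&dir).unwrap();
    let code = dir.join("QEMU_EFI.fd");
    let vars = dir.join("QEMU_VARS.fd");
    std::fs::write(&code, b"code").unwrap();
    std::fs::write(&vars, b"vars").unwrap();
    // First candidate pair is missing on purpose; the second is complete.
    let candidates = vec![
        (dir.join("missing-CODE.fd"), dir.join("missing-VARS.fd")),
        (code.clone(), vars.clone()),
    ];
    let fw = discover_firmware(&candidates).expect("pair should be found");
    println!("code={:?} vars={:?}", fw.code, fw.vars_template);
}
```

Requiring both files of a pair is the important invariant: a CODE without its matching VARS template would define a domain that cannot persist NVRAM.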
Firmware discovery is deliberately a runtime check with a clear error, not a preflight — this lets x86_64-only runs succeed on hosts without AAVMF installed. Commit 4 adds an arch-aware preflight that surfaces it upfront when a caller asks for aarch64.

Verified: 26/26 kvm::xml tests still green, cargo check clean, cargo fmt clean.

Wire the VmArchitecture story all the way to the user-facing entry points so an arm64 smoke run is a single env flip.

Example (`example_iot_vm_setup`):

* New `--arch {x86-64|aarch64}` flag (default x86-64) backed by a `CliArch` enum that converts cleanly to `VmArchitecture`.
* Preflight and cloud-image bootstrap now call the `_for_arch` variants, and the `VirtualMachineSpec.architecture` field gets the real value instead of `Default::default()`.

Smoke script (`iot/scripts/smoke-a3.sh`):

* Reads `ARCH=x86-64|aarch64` from env (default x86-64).
* When `ARCH=aarch64`, `rustup target add aarch64-unknown-linux-gnu` + `cargo build --target ...` produces an arm64 agent binary; otherwise the existing host-target build path is kept.
* Threads `--arch` to the example.
* Extends the phase-4 initial-status timeout (60s → 300s) and the phase-5 post-reboot wait (240s → 900s) under TCG, which runs 3-5× slower than native KVM.

New `smoke-a3-arm.sh` wrapper: exports `ARCH=aarch64` and a separate `VM_NAME` / NATS container name so an arm smoke run can coexist with an x86 one on the same host without stepping on libvirt state.

Topology side (`KvmVirtualMachineHost::ensure_vm`): the `wait_for_ip` timeout is now arch-derived — 300s for x86_64, 900s for aarch64 — because first-boot cloud-init under TCG routinely needs 8-12 min on a constrained worker.

The on-device agent builds `harmony` with `default-features = false, features = ["podman"]`, which does not pull in the `kvm` feature.
Cross-compiling iot-agent-v0 for `aarch64-unknown-linux-gnu` to put it on a Pi / arm64 VM currently fails with:

    error[E0433]: failed to resolve: could not find `kvm` in `modules`
      --> harmony/src/modules/iot/preflight.rs:18:21
          use crate::modules::kvm::firmware::discover_aarch64_firmware;

Gate the import and the `discover_aarch64_firmware()` call inside `check_iot_smoke_preflight_for_arch` behind `#[cfg(feature = "kvm")]`. Callers who build `harmony` without kvm (the agent) still get the `qemu-system-aarch64` PATH check — the firmware probe only matters to the host that will actually boot the VM, and that host always builds with `kvm` enabled anyway.

Verification: `cargo build --release --target aarch64-unknown-linux-gnu -p iot-agent-v0` now succeeds and produces a valid ELF aarch64 binary (~13 MB).

Current Arch edk2-armvirt ships the pair as

    /usr/share/edk2/aarch64/QEMU_EFI.fd
    /usr/share/edk2/aarch64/QEMU_VARS.fd

(plus a compatibility copy under `/usr/share/edk2-armvirt/aarch64/`). The previous CANDIDATES list looked for `QEMU_CODE.fd` and `vars-template-pflash.raw` — neither name matches the actual distro layout, so `discover_aarch64_firmware` reported "no firmware found" on a fully provisioned Arch host.

Add the `QEMU_EFI.fd` + `QEMU_VARS.fd` pair at both Arch paths at the top of the probe order; keep the older raw-pflash variant and the speculative CODE/VARS naming as later fallbacks. Sync the error message's "checked paths" hint with the new list so the diagnostic matches what's actually probed.

Verified against `/usr/share/edk2/aarch64/QEMU_{EFI,VARS}.fd` on this host — `discover_aarch64_firmware` now returns the pair and `cargo run -p example_iot_vm_setup -- --arch aarch64 --bootstrap-only` completes (downloads + sha256-verifies the 598 MB arm64 image and caches it under `$HARMONY_DATA_DIR/iot/cloud-images/`).

Three fixes landed during arm smoke debugging.
Each is a real correctness / perf issue that would bite anyone running aarch64 under TCG via libvirt, independent of any particular firmware.

**xml.rs — qemu:commandline overrides for -cpu and -accel**

`pauth-impdef=on` is a QEMU property of `-cpu max`, not a libvirt `<feature>` entry. Putting it under `<cpu><feature policy='require' name='pauth-impdef'/>` is rejected by libvirt with:

    error: unsupported configuration: unknown CPU feature: pauth-impdef

Route it instead via `<qemu:commandline>` (with the qemu namespace declared on `<domain>`). QEMU takes the LAST `-cpu` arg as authoritative, so libvirt's `-cpu max` followed by our `-cpu max,pauth-impdef=on` yields max + pauth-impdef.

The same mechanism forces MTTCG: despite docs claiming QEMU ≥ 9.1 defaults to `thread=multi` on aarch64, observation on QEMU 10.2 shows cross-arch `-accel tcg` runs single-threaded (`vcpu.1.time` stays at 0 forever). Appending `-accel tcg,thread=multi` creates a real per-vcpu thread and roughly halves cold-boot wall time.

Also added a `<rng model='virtio'>` device feeding host `/dev/urandom`. Without it, aarch64 cloud-init blocks for minutes on first-boot SSH host-key generation under TCG (the entropy pool never fills on its own). Cheap insurance on x86_64 too.

**topology.rs — 30-min wait_for_ip budget for aarch64**

Cold boot under TCG on an 8-core x86 host is 10-15 min even with virtio-rng + pauth-impdef + MTTCG. The previous 900s ceiling trips on healthy boots; 1800s covers slower CI workers.

**smoke-a3.sh — cleanup must pass --nvram**

`virsh undefine --remove-all-storage` refuses to remove an aarch64 domain without `--nvram`, because NVRAM files aren't considered "storage." Before this fix, a failed run left the domain definition behind with yesterday's XML — subsequent runs would replay the stale XML (`ensure_vm` is idempotent and doesn't redefine when the domain already exists), masking any XML change until a manual `virsh undefine` was issued.
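The xml.rs item above lands in the domain XML roughly as follows. A hedged sketch, not the renderer's exact output; surrounding elements are elided and only the qemu-namespace declaration, the rng device, and the commandline overrides are shown:

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <!-- os / cpu / disks etc. elided -->
  <devices>
    <!-- virtio-rng fed from the host so first-boot host-key generation
         does not stall on an empty entropy pool under TCG -->
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
    </rng>
  </devices>
  <qemu:commandline>
    <!-- QEMU takes the LAST -cpu as authoritative, overriding libvirt's own -->
    <qemu:arg value='-cpu'/>
    <qemu:arg value='max,pauth-impdef=on'/>
    <!-- force MTTCG: one host thread per vcpu -->
    <qemu:arg value='-accel'/>
    <qemu:arg value='tcg,thread=multi'/>
  </qemu:commandline>
</domain>
```

Note that without the `xmlns:qemu` declaration on `<domain>`, libvirt silently drops the `<qemu:commandline>` block on define.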
Also bump REBOOT_STEPS to match the new topology-side budget.

Verified: `cargo test -p harmony --lib kvm::xml` passes (26/26), including the 5 aarch64 assertions (namespace, cpu block, pflash wiring, qemu:commandline contents for both -cpu and -accel).

QEMU's `virt` machine hardwires pflash unit 0 as a CFI flash device of fixed size 64 MiB. When libvirt's `<loader type='pflash'>` points at a file smaller than that, qemu refuses to start:

    cfi.pflash01 device '/machine/virt.flash0' requires 67108864 bytes, block backend provides 3145728 bytes

Different distros ship the CODE firmware differently:

- Pre-padded (upstream QEMU `pc-bios/edk2-aarch64-code.fd`, Debian/Ubuntu `qemu-efi-aarch64`): the file is exactly 64 MiB, zero-padded at the tail. Works as-is with libvirt's pflash loader.
- Raw edk2 build output (Arch `edk2-aarch64 202508+`): the file is ~2-4 MiB, just the firmware volume without pflash padding. It has to be padded before libvirt accepts it.

Our discovery previously handed the discovered path straight to libvirt. That works on pre-padded distros and silently fails on raw-output distros.

Add `ensure_code_pflash_padded` in `modules/kvm/firmware.rs`:

- If the source is already 64 MiB, return the path unchanged — no copy, no bytes moved.
- If smaller, check a cache path (`pool_dir/aarch64-code-padded.fd`) for a correctly sized copy newer than the source and reuse it.
- Otherwise copy + `File::set_len(64 MiB)` (sparse zero pad, one syscall), chmod 0644, return the cached path.
- If larger than 64 MiB, error out — no amount of padding saves us.

`ensure_vm_firmware` in `topology.rs` now runs the discovered code through the padder before handing it to libvirt. One padded copy per pool, reused across every aarch64 VM on that pool.
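The padding step above can be sketched as follows. Simplified on purpose: the mtime-based cache reuse and the 0644 chmod from the PR are omitted, and the paths are illustrative:

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// QEMU's arm64 `virt` machine expects pflash unit 0 to be exactly 64 MiB.
const PFLASH_SIZE: u64 = 64 * 1024 * 1024;

/// Return a path whose file is exactly PFLASH_SIZE bytes: the source itself
/// if already padded, otherwise a zero-padded copy at `cache`.
fn ensure_code_pflash_padded(src: &Path, cache: &Path) -> io::Result<PathBuf> {
    let len = fs::metadata(src)?.len();
    if len == PFLASH_SIZE {
        return Ok(src.to_path_buf()); // pre-padded distro build: no copy
    }
    if len > PFLASH_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "CODE image larger than the 64 MiB pflash: padding cannot help",
        ));
    }
    fs::copy(src, cache)?;
    // set_len extends with zeros (sparse on most filesystems): one syscall.
    OpenOptions::new().write(true).open(cache)?.set_len(PFLASH_SIZE)?;
    Ok(cache.to_path_buf())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("pflash-demo");
    fs::create_dir_all(&dir)?;
    let src = dir.join("raw-code.fd");
    fs::write(&src, vec![0xAAu8; 3 * 1024 * 1024])?; // ~3 MiB raw edk2 output
    let padded = ensure_code_pflash_padded(&src, &dir.join("code-padded.fd"))?;
    println!("padded size: {}", fs::metadata(&padded)?.len());
    Ok(())
}
```

Because `set_len` only touches metadata, the 64 MiB cache copy costs roughly the size of the original firmware volume on disk.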
Verification path: `cargo test -p harmony --lib kvm::` passes (26 tests — the XML suite is unchanged since this is runtime-only).

`wait_for_ip` returns as soon as libvirt sees a DHCP lease, but the guest may still be minutes away from accepting SSH connections — cloud-init is usually mid-firstboot (SSH host-key generation, runcmd, etc.). Any Score that SSHes in immediately after `ensure_vm` resolves races with sshd startup:

    ansible.builtin.ping failed against 192.168.122.11: UNREACHABLE!
    ssh: connect to host 192.168.122.11 port 22: Connection refused

This is painful on native KVM (seconds) and catastrophic under TCG (1-3 min between DHCP and sshd listening).

When `spec.first_boot.is_some()` — i.e. the caller asked us to run cloud-init and therefore almost certainly intends to SSH next — also block on `wait_for_tcp_port(ip, 22, budget)` before returning. The budget is reused from `wait_for_ip` (300 s x86_64 / 1800 s aarch64) because if cloud-init takes that long to bring SSH up, something is broken that a longer wait wouldn't fix. `wait_for_tcp_port` uses 1 s backoff polling with a 5 s per-attempt TCP connect timeout, so a silently dropped SYN doesn't burn half the budget on a single hung syscall.

Cases without `first_boot` (caller bringing their own pre-baked image and not expecting SSH) get the old behavior: return as soon as DHCP resolves.

Review comment on hunk `@@ -0,0 +81,4 @@` (context: `.cloned() }` / `/// Back-compat shim — returns the x86_64 image. Prefer`):

> This should leverage the harmony_assets crate; 90% of the code here is duplication.
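The `wait_for_tcp_port` polling described in the commit message above can be sketched as follows (a minimal stand-in: the real budget comes from the arch-derived `wait_for_ip` timeout, and the local listener in `main` stands in for the guest's sshd):

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::{Duration, Instant};

/// Poll until `addr` accepts a TCP connection or `budget` runs out.
/// 1 s backoff between attempts; a 5 s per-attempt connect timeout means a
/// silently dropped SYN cannot hang a single attempt for the whole budget.
fn wait_for_tcp_port(addr: SocketAddr, budget: Duration) -> bool {
    let deadline = Instant::now() + budget;
    loop {
        if TcpStream::connect_timeout(&addr, Duration::from_secs(5)).is_ok() {
            return true; // something is listening (sshd, in the PR's case)
        }
        if Instant::now() >= deadline {
            return false; // budget exhausted: caller surfaces an error
        }
        std::thread::sleep(Duration::from_secs(1));
    }
}

fn main() {
    // A listener on an ephemeral local port plays the role of the guest.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    println!("port up: {}", wait_for_tcp_port(addr, Duration::from_secs(5)));
}
```

A refused connection returns from `connect_timeout` almost immediately, so the effective poll rate stays near the 1 s backoff in the common case.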
Review comment on hunk `@@ -0,0 +242,4 @@` (context: `info!("generating ed25519 ssh keypair at {priv_path:?} (one-time)");` / `let status = Command::new("ssh-keygen").arg("-t")`):

> Never chain `.arg().arg().arg().arg()` calls — use `.args([...])` instead.
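The requested refactor can be sketched as follows. The exact `ssh-keygen` flags here are assumptions (the diff context only shows the first `.arg("-t")`); the point is that both forms build the identical argv, and the array form reads as one unit:

```rust
use std::process::Command;

// Chained form flagged in the review.
fn keygen_chained(priv_path: &str) -> Command {
    let mut cmd = Command::new("ssh-keygen");
    cmd.arg("-t").arg("ed25519").arg("-N").arg("").arg("-f").arg(priv_path);
    cmd
}

// Preferred form: one .args([...]) call.
fn keygen_args(priv_path: &str) -> Command {
    let mut cmd = Command::new("ssh-keygen");
    cmd.args(["-t", "ed25519", "-N", "", "-f", priv_path]);
    cmd
}

fn main() {
    // Compare the built argv without actually running ssh-keygen.
    let a: Vec<_> = keygen_chained("id_ed25519").get_args().map(|s| s.to_owned()).collect();
    let b: Vec<_> = keygen_args("id_ed25519").get_args().map(|s| s.to_owned()).collect();
    println!("identical argv: {}", a == b); // prints "identical argv: true"
}
```

`Command::get_args()` (stable since Rust 1.57) makes this kind of refactor checkable in a unit test without spawning the process.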