feat/iot-operator-helm-chart #272

Closed
johnride wants to merge 2 commits from feat/iot-operator-helm-chart into feat/iot-walking-skeleton
Owner
No description provided.
johnride added 23 commits 2026-04-22 13:46:29 +00:00

v0 walking skeleton is substantially done (CRD → operator → NATS KV
→ on-device agent → podman reconcile; VM-as-device for x86_64 and
aarch64 via TCG; power-cycle resilience; operator install via Score
instead of yaml/kubectl). Time to switch the `ROADMAP/iot_platform`
folder from "plan to build the skeleton" to "plan to build on top of
the skeleton."

- **NEW** `ROADMAP/iot_platform/v0_1_plan.md` — the authoritative
  forward plan. Five chapters in execution order:
    1. Hands-on end-to-end demo the user can drive by hand
       (imminent, fully detailed: composed smoke, typed-Rust CR
       applier, natsbox command menu, in-cluster NATS).
    2. Status reflect-back + inventory (enrich `AgentStatus`,
       operator aggregates into `.status.aggregate`).
    3. Helm chart packaging (ArgoCD deferred — user's clusters have
       it already, bringing it into the smoke adds no validation
       value).
    4. Zitadel + OpenBao + per-device auth.
    5. Frontend (web / CLI / TUI — deferred).

  Chapters 2-5 are sketched; they expand to their own docs as each
  becomes the active chapter.

- **EDIT** `ROADMAP/iot_platform/v0_walking_skeleton.md` — add a
  SHIPPED banner at the top pointing at v0_1_plan.md. Keep the
  707-line design diary intact as archaeology; don't rewrite
  history.

- Incorporates the post-v0 architectural principles that emerged
  from review (no yaml in framework paths, minimal ad-hoc
  topologies, cross-boundary types in harmony-reconciler-contracts,
  verify before blaming upstream).

Roadmap §12.6 ("topology proliferation") is partially resolved by
extracting the ad-hoc InstallTopology from iot-operator-v0/install.rs
into harmony as a reusable shared type, now that a second consumer
(NatsBasicScore, landing next) makes the extraction genuinely
load-bearing rather than speculative.

What's new:

- harmony/src/modules/k8s/bare_topology.rs — K8sBareTopology carries
  one K8sClient, implements K8sclient + Topology (noop ensure_ready).
  Constructors: from_client(name, client) for callers building their
  own client, from_kubeconfig(name) for callers reading the standard
  KUBECONFIG chain.
- modules::k8s::K8sBareTopology re-export.
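
A minimal sketch of the shape described above; the `Topology` trait
signature, error type, and field layout here are stand-ins, not
harmony's in-tree definitions:

    // Sketch only. The essence of K8sBareTopology: a Topology impl
    // whose readiness check is a deliberate no-op, because the cluster
    // it points at already exists.
    type K8sClient = (); // placeholder for harmony's real client type

    trait Topology {
        fn ensure_ready(&self) -> Result<(), String>;
    }

    pub struct K8sBareTopology {
        name: String,
        client: K8sClient, // the one client this topology carries
    }

    impl Topology for K8sBareTopology {
        fn ensure_ready(&self) -> Result<(), String> {
            Ok(()) // nothing to provision: the caller's cluster is the cluster
        }
    }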

What's gone:

- iot-operator-v0/src/install.rs: the ~30-line InstallTopology struct
  + its async_trait-decorated impls. The crate also drops async-trait
  and harmony-k8s as direct deps (neither is used now that the
  topology is shared).
- Long "architectural smell" comment from install.rs — the smell is
  fixed; the explanation belongs at the shared type now (with the
  history captured in its module doc).

Behavior-preserving. cargo check --all-targets --all-features clean.
smoke-a1 wire path unchanged.

Compounding-value move: every future Score that needs "apply a
typed resource against an existing cluster" consumes K8sBareTopology
instead of inventing its own Topology impl. That's the pattern v0
Harmony's design is meant to encourage.

Harmony's existing NATS story starts at `NatsK8sScore`, which is
designed for production multi-site superclusters: TLS-fronted
gateways, cert-manager-minted certs, ingress + Route, helm chart
with gateway merge blocks, NatsAdmin secret prompts. All of that is
overhead for a local smoke or a single-site decentralized deployment
that just needs a live JetStream server.

Add `NatsBasicScore` beside it. Deliberately minimal:
  - Single replica
  - Official `nats:*-alpine` image via typed k8s_openapi Deployment
  - JetStream (-js) on by default, toggle via builder setter
  - Namespace created if missing
  - Service: ClusterIP by default, or NodePort via
    `.node_port(port)` for off-cluster clients (e.g. a libvirt VM
    connecting through the host's loadbalancer port)

Trait bounds are just `Topology + K8sclient` — no `HelmCommand`,
no `TlsRouter`, no `Nats` capability. Composes cleanly with
`K8sBareTopology` (added in the previous commit) so consumers can
`score.create_interpret().execute(&inventory, &topology)` against
any cluster `KUBECONFIG` points at.

Constructed via a small builder:

    NatsBasicScore::new("iot-nats", "iot-system")
        .node_port(4222)
        .jetstream(true)

Under the hood the interpret runs three `K8sResourceScore`s in
sequence (namespace → deployment → service). No new machinery —
just composition of existing primitives.
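
In sketch form, assuming a `K8sResourceScore::new` constructor (the
real wrapper may differ) and the `create_interpret().execute(...)`
shape shown above:

    // Sketch of the interpret body: three typed resource applies run
    // back to back, so a namespace failure surfaces before the
    // deployment or service is ever attempted.
    K8sResourceScore::new(namespace).create_interpret()
        .execute(&inventory, &topology).await?;
    K8sResourceScore::new(deployment).create_interpret()
        .execute(&inventory, &topology).await?;
    K8sResourceScore::new(service).create_interpret()
        .execute(&inventory, &topology).await?;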

Deliberately NOT in scope for this Score:
  - TLS / PKI — use NatsK8sScore when you need those
  - Gateways / supercluster — use NatsSuperclusterScore
  - Auth (user/password or JWT) — add a ConfigMap mount when
    the Chapter 4 auth work lands

Tests (4, all passing): default is ClusterIP; node_port() flips
Service to NodePort with the right nodePort field; jetstream() toggle
controls the `-js` arg.

Part of the "compound framework value" mindset: every future Score
that wants a local NATS now points at this one type instead of
inventing its own yaml.

Replaces what would otherwise be a yaml fixture for the hands-on
demo. The CRD is already fully typed (DeploymentSpec + ScorePayload
+ PodmanV0Score + Rollout), so the applier uses those types
directly, constructs the CR via kube::Api, and either applies it
server-side or prints the JSON for `kubectl apply -f -`.

CLI:

  iot_apply_deployment \
      --namespace iot-demo \
      --name hello-world \
      --target-device iot-smoke-vm \
      --image docker.io/library/nginx:latest \
      --port 8080:80                       # apply
  iot_apply_deployment --image nginx:1.26  # upgrade (same name, new img)
  iot_apply_deployment --delete            # tear down
  iot_apply_deployment --print ...         # JSON to stdout → kubectl -f -

Uses server-side apply (PatchParams::apply().force()) so repeated
invocations patch the existing CR cleanly — the upgrade path the
demo exercises.
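
A kube-rs sketch of that apply path (the field-manager name and
error plumbing are illustrative; `Deployment` here is the typed CRD
struct, not apps/v1):

    use iot_operator_v0::crd::Deployment;
    use kube::api::{Api, Patch, PatchParams};
    use kube::{Client, ResourceExt};

    async fn apply_cr(cr: &Deployment, namespace: &str) -> anyhow::Result<()> {
        let client = Client::try_default().await?;
        let api: Api<Deployment> = Api::namespaced(client, namespace);
        // Server-side apply + force: repeated invocations patch the
        // existing CR instead of conflicting with older field managers.
        let params = PatchParams::apply("iot-apply-deployment").force();
        api.patch(&cr.name_any(), &params, &Patch::Apply(cr)).await?;
        Ok(())
    }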

To expose the CRD types to an external consumer, iot-operator-v0
gains a thin `src/lib.rs` that re-exports the `crd` module. The
binary target now imports from the library (`use iot_operator_v0::crd;`)
instead of declaring its own `mod crd;` — avoids compiling the
types twice.

No change in operator runtime behavior.

Part of the ROADMAP/iot_platform/v0_1_plan.md Chapter 1 work.

Small CLI that installs a single-node NATS server into the cluster
KUBECONFIG points at, using harmony's `NatsBasicScore` composed
against `K8sBareTopology`.

This is the glue between `smoke-a4.sh` and the framework Score:

    cargo run -q -p example_iot_nats_install -- \
        --namespace iot-system \
        --name iot-nats \
        --node-port 4222

Defaults cover the demo exactly: iot-system namespace, NodePort 4222
so the libvirt VM agent can reach NATS through the k3d loadbalancer
port mapping.

No reinvented topology, no hand-rolled yaml, no helm shell-out. The
actual work (Namespace + Deployment + Service with the right
selector/ports/probes) lives inside `NatsBasicScore::Interpret` in
harmony where it can be reused by any future consumer.

Part of ROADMAP/iot_platform/v0_1_plan.md Chapter 1.

Composed demo that brings up operator + in-cluster NATS + ARM (or
x86) VM agent, then either hands the full stack off to the user
with a command menu (default) or drives an apply + upgrade + delete
regression loop (`--auto`).

Phases:
  1. k3d cluster with NATS port exposed via `-p 4222:4222@loadbalancer`.
  2. NATS in-cluster via the new `example_iot_nats_install` binary
     → `NatsBasicScore` → typed k8s_openapi Namespace + Deployment +
     NodePort Service.
  3. CRD install via `iot-operator-v0 install` (Score-based, no yaml).
  4. Operator spawned host-side, connects to nats://localhost:4222.
  5. VM provisioned via `example_iot_vm_setup` (reused from smoke-a3);
     agent inside the VM connects to nats://<libvirt-gateway>:4222.
  6. Sanity: NATS pod Running, agent heartbeat
     `status.<device>` present in `agent-status` bucket.
  7a. DEFAULT: print a command menu (kubectl watch, typed Rust
      applier, ssh/console, natsbox one-liners, curl) and block on
      Ctrl-C with a cleanup trap tearing everything down.
  7b. `--auto`: apply nginx:latest, wait for container on the VM,
      curl, upgrade to nginx:1.26, assert container id CHANGED,
      curl, delete, assert container gone.

Prereqs documented at the top of the script. Handles both x86-64
(native KVM) and aarch64 (TCG emulation) via `ARCH=` env.

Design notes captured in ROADMAP/iot_platform/v0_1_plan.md. Uses
every piece landed in this branch so far: K8sBareTopology,
NatsBasicScore, the typed CR applier, the Score-based CRD install.

Previous commit landed the script without the +x bit (a chmod
between write and commit was swallowed). Fix with git
update-index --chmod=+x so the file is executable on checkout.

Kubernetes NodePort Services must use a port in the apiserver's
configured nodeport range (default 30000-32767). NatsBasicScore's
first cut accepted any port via `.node_port(port)`, which was fine
for strict use of the capital-N NodePort Service type, but made
the demo's "use NATS client port 4222 directly from the host"
story awkward.

Replace the `node_port: Option<i32>` field with a proper
`NatsServiceType` enum (ClusterIP | NodePort(i32) | LoadBalancer).
Three builder methods — one per variant. LoadBalancer is the right
idiom for the demo: k3d's built-in `klipper-lb` fronts
LoadBalancer Services on their `port` (not their nodePort), so
`k3d cluster create -p 4222:4222@loadbalancer` delivers external
traffic straight to the Service's client port. No nodeport range
juggling.
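
The enum's shape, sketched (doc comments compress the semantics
described above):

    pub enum NatsServiceType {
        /// In-cluster access only (the default).
        ClusterIP,
        /// Pinned nodePort; must sit in the apiserver's nodeport range.
        NodePort(i32),
        /// External LB assigns; k3d's klipper-lb fronts the Service's
        /// client `port` directly, so no nodeport range juggling.
        LoadBalancer,
    }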

Signatures:

    NatsBasicScore::new(name, namespace)   // ClusterIP default
        .node_port(30422)                   // NodePort(30422)
        .load_balancer()                    // LoadBalancer
        .jetstream(true)
        .image("docker.io/library/nats:2.10-alpine")

Tests: 5 pass. New assertion: `load_balancer()` produces a Service
with type LoadBalancer and no pinned nodePort (apiserver assigns).

Consumers:
- `example_iot_nats_install` gets a `--expose {cluster-ip | node-port
   | load-balancer}` flag (default `load-balancer` since that's what
  the demo wants). The legacy `--node-port N` flag survives as the
  NodePort port value.
- `smoke-a4.sh` asks for `--expose load-balancer`, matching its
  `-p 4222:4222@loadbalancer` k3d port mapping.

Ubuntu 24.04 `useradd --system` does not allocate `/etc/subuid` +
`/etc/subgid` ranges. Rootless podman silently fails on image-layer
unpack:

    potentially insufficient UIDs or GIDs available in user namespace
    (requested 0:42 for /etc/gshadow): ... lchown /etc/gshadow:
    invalid argument

`smoke-a1.sh` didn't hit this because it runs the agent on the
*host* user, which has subuid/subgid populated by default. `smoke-a4.sh`
drives a podman pull inside the VM — the FIRST time we actually
exercise rootless-podman-on-a-fresh-system, and the failure surfaces
immediately.

The fix belongs in harmony, not in ad-hoc cloud-init scripts. Add
`UnixUserManager::ensure_subordinate_ids` alongside the existing
`ensure_user` + `ensure_linger` methods:

- `domain/topology/host_configuration.rs`: new trait method. Doc
  explains why every rootless-container-runtime consumer needs it.
- `modules/linux/ansible_configurator.rs`: impl follows `ensure_linger`'s
  pattern — a grep probe on /etc/subuid+/etc/subgid, then a single
  `usermod --add-subuids 100000-165535 --add-subgids 100000-165535`
  only when missing. Idempotent, no-ops on re-run.
- `modules/linux/topology.rs`: forwarder for `LinuxHostTopology`.
- `modules/iot/setup_score.rs`: call the new method right after
  `ensure_linger` in `IotDeviceSetupScore`. Any future consumer that
  runs rootless podman reaches for the same primitive.
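
For illustration, the probe-then-apply pattern those bullets
describe, sketched as plain process calls (the real implementation
goes through the Ansible configurator, not std::process::Command):

    use std::process::Command;

    fn ensure_subordinate_ids(user: &str) -> std::io::Result<()> {
        // Probe: does the user already have a subordinate id range?
        // (The real probe checks /etc/subuid and /etc/subgid separately.)
        let probe = Command::new("grep")
            .arg("-q")
            .arg(format!("^{user}:"))
            .arg("/etc/subuid")
            .arg("/etc/subgid")
            .status()?;
        if probe.success() {
            return Ok(()); // idempotent: no-op on re-run
        }
        // Apply: allocate the conventional 64k range exactly once.
        Command::new("usermod")
            .args([
                "--add-subuids", "100000-165535",
                "--add-subgids", "100000-165535",
                user,
            ])
            .status()?;
        Ok(())
    }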

Verified: `cargo check --all-features` clean. End-to-end smoke-a4
regression pending (re-running after this commit).

The agent runs rootless podman as the `iot-agent` user (system
user, created by IotDeviceSetupScore). Each user has their own
podman state tree under ~/.local/share/containers. The smoke
was running `podman ps` as `iot-admin` (the ssh login user),
so it saw an empty store even when the agent had happily created
the nginx container — leading to a spurious "container never
appeared" failure despite the reconciler reporting SUCCESS.

Fix: go through `sudo su - iot-agent -c` with
`XDG_RUNTIME_DIR=/run/user/$(id -u)` so the command runs in
the right user session. Update the hand-off command menu with the
equivalent one-liner so the user can inspect the fleet's actual
container state without tripping over the same gotcha.

Smoke-a4 PASSes end-to-end on x86_64:
  - CRD apply → container materializes
  - Upgrade via new image → container id changes (not patched)
  - Delete → container removed

With the previous commit (ensure_subordinate_ids), this closes
Chapter 1 of ROADMAP/iot_platform/v0_1_plan.md: the full v0 loop
works, hands-on driven by kubectl / a typed Rust binary / natsbox.

Initial 180 s wait assumed native-KVM x86 speed. Under aarch64 TCG
the same nginx:latest pull (~250 MB image + layered userns unpack)
takes 4-8 min observed; 180 s was catching post-heartbeat reconcile
mid-pull and reporting FAIL.

Bump `CONTAINER_WAIT_STEPS` per arch:
  - x86 KVM: 90 iterations × 2 s = 180 s (unchanged)
  - aarch64 TCG: 450 × 2 s = 900 s (15 min)

Apply to both the 'first-boot container' and 'upgrade container id
change' loops.

Docker Hub's unauthenticated rate limit (100 pulls per 6h per IP,
counted per-manifest-query) is the most reliable way for a CI-style
smoke loop to produce false negatives. The NATS pod failing with
'429 Too Many Requests' after a handful of runs today was that —
not a real regression.

Fix inside the smoke: before running the install Score, sideload the
NATS image into the k3d cluster via a podman→docker→k3d bridge:

  - If the image isn't already in docker's store:
      - If it's not in podman's store either, podman pull (this is
        the one-time hit we can't avoid).
      - podman save → docker load.
  - k3d image import into the cluster's containerd.

Steady-state this is a few-hundred-ms operation (no Hub calls, no
registry traffic). Require docker in the preflight list since we
depend on it for the cross-runtime bridge.

Also bump the Available-wait from 60 s to 120 s — the post-import
pod spin-up is fast but the scheduler + loadbalancer update take
longer than I initially budgeted.

VM-side nginx pulls are still at Hub's mercy; addressing that
requires either (a) docker login before the smoke, (b) an
authenticated registry mirror, or (c) arch-specific image
pre-seeding into the VM. All Chapter-2+ follow-ups.

Chapter 2 groundwork. The on-wire AgentStatus the agent publishes
every 30 s was only carrying device_id + status + timestamp — not
enough for the operator to answer "how are my deployments doing."
Enrich it so the operator can aggregate into a useful
DeploymentStatus.aggregate subtree on the CR (second commit).

**harmony-reconciler-contracts/src/status.rs**

- `AgentStatus.deployments: BTreeMap<String, DeploymentPhase>` —
  keyed by deployment name (CR's metadata.name). Each phase carries
  `{ phase: Running|Failed|Pending, last_event_at, last_error }`.
- `AgentStatus.recent_events: Vec<EventEntry>` — ring buffer of the
  most recent reconcile events on this device. Each entry is
  `{ at, severity: Info|Warn|Error, message, deployment: Option }`.
  Bounded agent-side to keep JetStream per-message size sane.
- `AgentStatus.inventory: Option<InventorySnapshot>` — hostname,
  arch, os, kernel, cpu_cores, memory_mb, agent_version. Published
  once on startup.
- All three new fields are `#[serde(default)]` — mixed-fleet upgrades
  don't break: an old agent's payload deserializes into the new
  struct (deployments empty, events empty, inventory None); a new
  agent's payload deserializes into an old operator just losing the
  fields.
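
A hedged sketch of the enriched wire struct (field names follow this
message; exact types and derives in harmony-reconciler-contracts may
differ, and timestamps are shown as epoch seconds for brevity):

    use serde::{Deserialize, Serialize};
    use std::collections::BTreeMap;

    #[derive(Serialize, Deserialize)]
    pub enum Phase { Running, Failed, Pending }

    #[derive(Serialize, Deserialize)]
    pub enum Severity { Info, Warn, Error }

    #[derive(Serialize, Deserialize)]
    pub struct DeploymentPhase {
        pub phase: Phase,
        pub last_event_at: u64,
        pub last_error: Option<String>,
    }

    #[derive(Serialize, Deserialize)]
    pub struct EventEntry {
        pub at: u64,
        pub severity: Severity,
        pub message: String,
        pub deployment: Option<String>,
    }

    #[derive(Serialize, Deserialize)]
    pub struct InventorySnapshot {
        pub hostname: String,
        pub arch: String,
        pub os: String,
        pub kernel: String,
        pub cpu_cores: u32,
        pub memory_mb: u64,
        pub agent_version: String,
    }

    #[derive(Serialize, Deserialize)]
    pub struct AgentStatus {
        pub device_id: String,
        pub status: String,    // pre-existing coarse status
        pub timestamp: u64,    // pre-existing heartbeat time
        // The three Chapter-2 fields: serde(default) is what keeps an
        // old agent's payload parsing into the new struct.
        #[serde(default)]
        pub deployments: BTreeMap<String, DeploymentPhase>,
        #[serde(default)]
        pub recent_events: Vec<EventEntry>,
        #[serde(default)]
        pub inventory: Option<InventorySnapshot>,
    }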

New tests (kept forward-compat front and center):
  - `minimal_status_roundtrip` — empty maps / None
  - `enriched_status_roundtrip` — full population
  - `old_wire_format_parses_into_enriched_struct` — pre-Chapter-2
    payload must still parse (the upgrade guarantee)
  - `wire_keys_present` — literal wire-format pins for smoke greps

**iot-agent-v0**

Reconciler gains a `StatusState { deployments, recent_events }` side
map with a bounded ring buffer (`EVENT_RING_CAP = 32`). Every code
path that changes deployment state now also records phase + event:

  - `apply()`: Pending → Running on success, Failed + error event on
    failure.
  - `remove()`: drops phase, emits "deployment deleted" info event.
  - `tick()` (periodic reconcile): keeps phase at Running on noop;
    flips to Failed + event on error (deliberately no event on
    successful no-change ticks — 30 s cadence would drown the ring).
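
The bounded ring backing `recent_events` is the standard VecDeque
pattern (a sketch reusing `EventEntry` from the contract sketch
above; the in-tree structure may differ):

    use std::collections::VecDeque;

    const EVENT_RING_CAP: usize = 32;

    fn push_event(ring: &mut VecDeque<EventEntry>, event: EventEntry) {
        if ring.len() == EVENT_RING_CAP {
            ring.pop_front(); // evict the oldest entry
        }
        ring.push_back(event);
    }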

New helper `deployment_from_key(key)` unwraps `<device>.<deployment>`
into just the deployment name. `short(s)` truncates error strings to
512 chars so the payload stays well under NATS JetStream limits.
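
Both helpers are small enough to sketch in full (signatures inferred
from the description; the real ones may differ):

    /// "<device>.<deployment>" → "<deployment>". Splits at the first
    /// dot, assuming device ids themselves carry none.
    fn deployment_from_key(key: &str) -> Option<&str> {
        key.split_once('.').map(|(_device, deployment)| deployment)
    }

    /// Truncate error strings so the status payload stays well under
    /// NATS JetStream per-message limits.
    fn short(s: &str) -> String {
        s.chars().take(512).collect()
    }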

`report_status()` in main.rs now snapshots the reconciler's status
state on every heartbeat and publishes the full enriched payload
alongside a startup-captured InventorySnapshot. Inventory reads
`/proc/sys/kernel/osrelease` + `/proc/meminfo` + `std::env::consts::ARCH`
with graceful fallbacks — no new sys-info crate dep.

Verified: `cargo test -p harmony-reconciler-contracts --lib` 7/7 green
(5 new). Operator consumption of the new fields lands in the next
commit.

The operator watches the `agent-status` bucket, keeps a per-device
snapshot in memory, and folds it into each Deployment CR's
`.status.aggregate` subtree every 5 seconds. The answer to the user's
stated requirement — "CRD .status reflect-back: per-device
succeeded/failed counts + recent log lines" — now lives in the CR
itself, observable via `kubectl get -o jsonpath` or any UI that
speaks k8s status subresources.

**Shape (in iot/iot-operator-v0/src/crd.rs)**

  DeploymentStatus {
    observed_score_string,   // unchanged; controller change-detect
    aggregate: Option<{
      succeeded: u32,        // devices with Phase::Running
      failed: u32,           // devices with Phase::Failed
      pending: u32,          // devices with Phase::Pending or
                             // reported-but-no-phase-entry-yet
      unreported: u32,       // target devices that never heartbeated
      last_error: Option<{   // most recent failing device + short msg
        device_id, message, at
      }>,
      recent_events: Vec<{   // last-N events across the fleet, newest first
        at, severity, device_id, message, deployment
      }>,
      last_heartbeat_at,     // freshness signal for the whole fleet
    }>
  }

**New module** `iot/iot-operator-v0/src/aggregate.rs`

  - `watch_status_bucket`: subscribes to `status.>` on the
    agent-status bucket, maintains a `BTreeMap<device_id, AgentStatus>`
    in memory. Malformed payloads + malformed keys log-and-skip; the
    snapshot map is always the latest good shape.
  - `aggregate_loop`: 5 s ticker. Per tick: list Deployment CRs,
    clone the snapshot (no lock held across network calls), compute
    each CR's aggregate, JSON-Merge-Patch `.status.aggregate`. Merge
    patch composes cleanly with the controller's
    `observedScoreString` patch — neither clobbers the other.
  - `compute_aggregate` pure fn: classification logic is in one
    place, four unit tests pin its behaviour (counts + unreported,
    reported-but-no-phase-entry = pending, event filter matches
    deployment name only, status-key parser).
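
The classification core, sketched against the contract shapes from
the previous commit (function name and exact return shape are
assumptions):

    fn classify(
        targets: &[String],
        snapshot: &BTreeMap<String, AgentStatus>,
        deployment: &str,
    ) -> (u32, u32, u32, u32) {
        let (mut succeeded, mut failed, mut pending, mut unreported) = (0, 0, 0, 0);
        for device in targets {
            match snapshot.get(device) {
                // target device that never heartbeated
                None => unreported += 1,
                Some(status) => match status.deployments.get(deployment) {
                    // reported, but no phase entry for this deployment yet
                    None => pending += 1,
                    Some(d) => match d.phase {
                        Phase::Running => succeeded += 1,
                        Phase::Failed => failed += 1,
                        Phase::Pending => pending += 1,
                    },
                },
            }
        }
        (succeeded, failed, pending, unreported)
    }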

**Operator wiring** (`main.rs`)

  `run()` now opens *both* KV buckets at startup, spawns the
  controller and the aggregator concurrently via
  `tokio::select!`. Either returning an error tears the process
  down — kube-rs's Controller already absorbs transient reconcile
  errors internally, so anything escaping is genuinely fatal.
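
Shape of the wiring, sketched (task names are illustrative):

    // Inside the operator's async run(): either arm returning Err
    // completes the select!, the `?` propagates, the process exits.
    tokio::select! {
        res = controller_task => res?,   // CR reconcile loop
        res = aggregator_task => res?,   // 5 s status folder
    }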

**Controller tweak**

  The apply path's `patch_status` was rebuilding the whole
  `DeploymentStatus` struct, which would clobber the aggregator's
  writes. Switched to raw JSON-Merge-Patch for the
  `observedScoreString` field only. Behaviour preserved, aggregate
  subtree left intact.
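
Sketch of the narrowed patch (kube-rs; variable names illustrative):

    use kube::api::{Patch, PatchParams};
    use serde_json::json;

    // Only the one field travels; merge-patch semantics leave the
    // aggregator-owned .status.aggregate subtree untouched.
    let body = json!({ "status": { "observedScoreString": observed } });
    api.patch_status(&name, &PatchParams::default(), &Patch::Merge(&body))
        .await?;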

**Smoke assertion** (smoke-a4.sh --auto)

  After apply + curl succeeds, the --auto path now asserts
  `kubectl get deployment.iot.nationtech.io ... -o
  jsonpath='{.status.aggregate.succeeded}'` reaches 1 within
  60 s. Proves the full agent → status bucket → operator aggregate →
  CRD status loop, end to end.

Verified locally: `cargo test -p iot-operator-v0 --lib` 4/4 green,
`cargo check --all-targets --all-features` clean.

Two changes that compose into one win: the smoke no longer needs a
functional Docker Hub to exercise the agent → podman → container
loop.

**harmony/src/modules/podman/topology.rs — IfNotPresent for image pull**

`PodmanTopology::ensure_service_running` was calling `podman pull`
on every reconcile, even when the image was already in the local
store. For a long-lived device agent reconciling against a public
registry, that's a guaranteed rate-limit collision: Docker Hub caps
unauthenticated pulls at 100 manifests per 6 h per IP, and an agent
ticking every 30 s chews through that allowance in a day.

Change the pull path to check the local store first:

    if images.get(image).exists().await? { return Ok(()); }
    // else: pull

Matches Kubernetes' `imagePullPolicy: IfNotPresent` semantics.
Correct default for the IoT platform: upgrades change the image
STRING (tag or digest), so they still hit the pull branch —
"use local if available, pull the new thing if the reference changed."

**iot/scripts/smoke-a4.sh — tarball sideload in place of registry**

An earlier iteration of this smoke stood up a local `registry:2`
container and pushed tagged images into it. That pattern itself
needs to pull `registry:2` from Docker Hub — cute demo, still
Hub-dependent. Gone now.

New phase 4.5 / 5c pair:

  4.5: podman save the cached `nginx:alpine` under two local tags
       (`localdev/nginx:v1`, `localdev/nginx:v2`) into a tarball on
       the host.
  5c:  scp the tarball to the VM, `podman load` it into the
       iot-agent user's rootless store.

Paired with the new IfNotPresent semantics, the agent's reconcile
sees both images already present and never touches a registry. The
upgrade test still works because `v1` and `v2` are distinct tag
strings → spec drift → container id changes.

Dropped the `docker` preflight (no more k3d-side registry transfer)
and the `LOCAL_REGISTRY_*` env vars.

Verified end-to-end: x86 smoke-a4 --auto PASS.
  - apply v1 → container up → curl 200
  - .status.aggregate.succeeded = 1 (Chapter 2 aggregator working)
  - apply v2 → container id changes (upgrade confirmed)
  - delete → container removed

Aarch64 run next.

Running smoke-a4 with `ARCH=aarch64` after an `ARCH=x86-64` run
rebinds the local `nginx:alpine` tag to arm64 (or vice versa),
silently breaking the other arch's next run. Fail fast if the
cached image arch doesn't match the smoke's ARCH, with the exact
command to fix it (`podman pull --platform=linux/<arch> ...`).

`podman save -m` produces an OCI multi-image archive format that
older podman versions in the Ubuntu 24.04 cloud image cannot load:

  Error: payload does not match any of the supported image formats:
   * oci-archive: loading index: ...index.json: no such file or directory

Downgrade to the single-image docker-archive format (default for
`podman save`): save the source image once, load once in the VM,
then `podman tag` twice to expose it under `localdev/nginx:v1` and
`:v2`. Same bits on disk, two distinct tag references, so the
upgrade test still sees a container-id change when the Score
flips from v1 to v2.

kubectl wait --for=Available reports on pod readiness, but k3d's
klipper-lb takes a few more seconds to wire the host loadbalancer
port to Service endpoints. Without this extra wait the operator
races the routing and dies with 'expected INFO, got nothing.'

qemu-img create with no trailing size inherits the backing
image's virtual size. The Ubuntu cloud image ships with ~2 GiB
of root, which fills up as soon as we sideload a container
tarball in the smoke. Pass disk_size_gb through to qemu-img and
rely on cloud-initramfs-growroot (already in the base) to grow
the partition on first boot. example_iot_vm_setup defaults to
16 GiB.

Chapter 1 + Chapter 2 are both green end-to-end on x86_64 and
aarch64. Chapter 3 (helm packaging) is next. Design sketches kept
as the historical record — the running code is the source of
truth for 'how'.

push_str("…") → push('…'), and drop redundant .trim() before
.split_whitespace() in /proc/meminfo parsing.

feat(iot-operator): helm chart + gen-chart-crd subcommand
All checks were successful
Run Check Script / check (pull_request) Successful in 2m44s
99e661ce4d

Chapter 3 scaffolding. Chart layout mirrors the CloudNativePG
convention after reviewing the CRD-in-chart vs CRD-as-hook
tradeoff: CRDs live inside templates/ (so helm upgrade re-applies
schema changes) with helm.sh/resource-policy: keep so
helm uninstall never deletes them. Chart publication target is
hub.nationtech.io.

CRD yaml is generated at chart-release time by a new
`iot-operator-v0 gen-chart-crd` subcommand reading
Deployment::crd() — the runtime install path remains the typed
Score; only the chart deliverable uses generated yaml. Wrapped
with the helm conditional + annotations by templates/crds.yaml
via .Files.Get so the generated yaml stays pure.
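
The subcommand's core is plausibly a few lines (a sketch:
`CustomResourceExt::crd()` is kube's derive-provided schema accessor;
output handling here is illustrative):

    use iot_operator_v0::crd::Deployment;
    use kube::CustomResourceExt;

    fn gen_chart_crd() -> anyhow::Result<()> {
        // Same typed schema the runtime Score installs, rendered once
        // as yaml purely for the chart deliverable.
        println!("{}", serde_yaml::to_string(&Deployment::crd())?);
        Ok(())
    }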

Install / upgrade / uninstall-preserves-CRD validated against a
scratch k3d cluster; the operator pod naturally stays pending
because the hub.nationtech.io image hasn't been published yet.
johnride reviewed 2026-04-22 15:10:10 +00:00
johnride left a comment
Author
Owner
1. The entire chart thing has to be rewritten in rust.
@@ -268,6 +268,38 @@ impl AnsibleHostConfigurator {
Ok(ChangeReport::CHANGED)
}
pub async fn ensure_subordinate_ids(
Author
Owner

This is potentially dangerous: what if we have multiple users on the same host? This is a tricky bit. Why do we need to know in advance what subuids podman expects? It feels like something that should be deferred to runtime/pod inspection once it is up. But I am lacking context to really understand the why here.
@@ -0,0 +11,4 @@
//! caller wants off-cluster access).
//!
//! What it deliberately does **not** do:
//! - No helm. The official `nats/nats` chart is ~2k lines of yaml
Author
Owner

I strongly disagree with that. Not using helm here is bad but not because we love helm, much more so because we're introducing another unrelated method of doing the same thing instead of improving the tooling and robustness of what we already have. I think this whole nats_score_basic should be coupled in one way or another with the other nats scores we have.

For instance create a basic one that is very flexible and low level and create multiple high level ones on top of it specialized for the various use cases we have.

@@ -0,0 +1,17 @@
apiVersion: v2
Author
Owner
No yaml. Use template hydration as specified in ADR 018. https://git.nationtech.io/NationTech/harmony/src/branch/master/docs/adr/018-Template-Hydration-For-Workload-Deployment.md
@@ -0,0 +1,141 @@
apiVersion: apiextensions.k8s.io/v1
Author
Owner

don't use yaml. Use rust structs and apply them directly. Even helm generation is fully hydrated, we only use helm as a packaging and versioning tool, no configuration.

@@ -0,0 +1,17 @@
#!/usr/bin/env bash
Author
Owner

No bash script to generate yaml, that is a crime against harmony.

@@ -0,0 +1,58 @@
{{/*
Author
Owner

Avoid that. Use askama templates when we need them.

@@ -0,0 +1,21 @@
{{- if .Values.rbac.create -}}
Author
Owner

No yaml, a clusterrole is a fully typed rust struct with kube-rs, much more robust than typo-magnet templates.

@@ -0,0 +1,58 @@
{
Author
Owner

No values, we use full hydration. This will be handled by the rust binary generating a fully hydrated template, not typo magnets.

johnride reviewed 2026-04-22 15:35:36 +00:00
johnride left a comment
Author
Owner

We have to be careful with the aggregation architecture and data model, this is what makes or breaks performance of this kind of tool at scale, which is what makes or breaks how much users love using it.

@@ -0,0 +1,352 @@
//! Agent-status → CR-status aggregator.
Author
Owner

I feel like there is a scalability issue here. Computing the aggregate on the operator's side for every device does not work with millions of devices. Then again, each device has limited compute capacity. But I do think that the rpi target is powerful enough for that. So each device would be writing multiple keys for itself that the operator could scrape. The devices can update their "last_events" key by themselves, same goes for current_state and device_info, etc. I feel like all the logs could go on the wire but probably not on jetstream kv, just regular at-least-once nats channels. It would be great if we could buffer the last 10000 lines to access them at any time. That could be a feature we implement where, when we query logs for a device, the device sends the last 10k lines and streams until we're done.

Then we should use something similar to what databases do when keeping an atomic counter for the number of devices in each state. The logic is simple: each device can only be in one state at a time. When transitioning from healthy to upgrading to failed, the device updates its status, which automatically increments/decrements the appropriate counters. This is something the aggregator can do by watching for status change events, but it still has the problem that on startup it has to read all statuses to compute the current counters.

There must be something available in our current architecture that does that well? I am sure I've seen some databases great at doing just that but can't remember which. Cassandra? PostgreSQL? Any SQL database?

@@ -0,0 +1,524 @@
#!/usr/bin/env bash
Author
Owner

Looks reasonable, but would be much better as an easily runnable rust example.

johnride added 1 commit 2026-04-22 15:36:37 +00:00
feat(iot): label-selector targeting (replace target_devices with targetSelector)
All checks were successful
Run Check Script / check (pull_request) Successful in 2m23s
92150da12a

DeploymentSpec.target_devices (flat string list) is gone. In its
place, DeploymentSpec.target_selector is a minimal
LabelSelector-shaped struct (matchLabels only for now, matchExpressions
deferred until there's a real need). Devices publish a labels map
in every AgentStatus heartbeat; operator resolves the selector
against the current fleet snapshot on each reconcile + aggregator
tick.

No legacy shim — the CRD is v1alpha1 and not yet deployed in the wild.
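
A sketch of the selector shape and its match rule (struct and method
names are assumptions; matchLabels only, per above):

    use serde::{Deserialize, Serialize};
    use std::collections::BTreeMap;

    #[derive(Serialize, Deserialize, Default)]
    #[serde(rename_all = "camelCase")]
    pub struct TargetSelector {
        #[serde(default)]
        pub match_labels: BTreeMap<String, String>,
    }

    impl TargetSelector {
        /// A device matches iff every selector pair appears verbatim in
        /// the labels the device publishes with each heartbeat (k8s
        /// convention: an empty selector matches every reporting device).
        pub fn matches(&self, device_labels: &BTreeMap<String, String>) -> bool {
            self.match_labels
                .iter()
                .all(|(k, v)| device_labels.get(k) == Some(v))
        }
    }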

Aggregator consequences:
  - controller and aggregator now share a StatusSnapshots map so
    selector resolution sees the same data on both sides.
  - unreported is dropped: a device that has never heartbeated is
    invisible to the selector machinery, so the field no longer
    has clean semantics. "device went dark" can come back as a
    staleness metric later if needed.
  - controller's MissingTargets error is gone: zero matches is a
    legitimate state (devices may not have joined yet). The
    controller logs and fast-requeues (15s/30s) so a just-joining
    device picks the deployment up without needing a
    cross-task subscription.

Agent + setup Score:
  - Agent config grows a [labels] section (BTreeMap); the flat
    [agent].group field is gone. group becomes just one label.
  - IotDeviceSetupConfig takes a BTreeMap<String, String> instead
    of a String group. TOML render iterates the BTreeMap (ordered)
    so idempotent change detection still works cleanly.

CLI-facing:
  - example_iot_apply_deployment: --target-device -> --to, accepts
    comma-separated key=value pairs.
  - example_iot_vm_setup: --group -> --labels, same grammar.
  - smoke-a4.sh: VM publishes group=$GROUP,device=$DEVICE_ID;
    deploys target --to device=$DEVICE_ID so single-device smoke
    behavior is preserved while exercising the selector path.

CRD regenerated via chart/regen-crd.sh. 7 contract tests + 6
operator tests pass.
Author
Owner

Superseded by #275 and #276

johnride closed this pull request 2026-04-25 13:55:02 +00:00