feat/deploy_fleet_server_side #282

Closed
johnride wants to merge 0 commits from feat/deploy_fleet_server_side into feat/iot-walking-skeleton
Owner
No description provided.
johnride added 11 commits 2026-05-05 14:05:16 +00:00
The operator Dockerfile previously copied a host-built binary into
archlinux:base — archlinux was a glibc-ABI workaround for that
host-build path. Convert to a two-stage build (rust:1.94-slim →
debian:bookworm-slim) so cargo runs inside the image. load-test.sh
loses its host cargo build + staging-context trick and now points
podman at the workspace root with -f. Add build_docker.sh as the
local Harbor entry point (DOCKER_TAG, PUSH overrides).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors .gitea/workflows/harmony_composer.yaml: on push to master (or
manual dispatch), build the multi-stage Dockerfile and push
hub.nationtech.io/harmony/harmony-fleet-operator:latest. No buildx
caching yet — TODO comment in the workflow tracks it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure rustfmt wrapping on long lines that pre-dated this branch — surfaced
when running `cargo fmt --check` as part of unrelated work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapses the load-test harness's chart-gen + helm-install dance into
first-class Harmony Scores. Customer-facing path:

  let score = FleetServerScore::new(nats, operator);
  score.create_interpret().execute(&Inventory::empty(), &topology).await?;

FleetOperatorScore renders the operator chart (CRDs + RBAC + ServiceAccount
+ Deployment) into a tempdir and delegates to HelmChartScore. FleetServerScore
composes it with NatsBasicScore via fail-fast `?` chaining; Zitadel + Argo
hang off the same chain when their Scores land.
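
The fail-fast composition described above can be sketched as follows. This is a simplified, synchronous illustration, not Harmony's API: `Step`, `FakeStep`, `Outcome` and `run_all` are hypothetical stand-ins, and the real Scores are async and take an inventory and a topology.

```rust
// Hypothetical stand-ins for illustration only; none of these are Harmony types.
#[derive(Debug, PartialEq)]
struct Outcome {
    details: Vec<String>,
}

trait Step {
    fn run(&self) -> Result<Outcome, String>;
}

struct FakeStep {
    name: &'static str,
    ok: bool,
}

impl Step for FakeStep {
    fn run(&self) -> Result<Outcome, String> {
        if self.ok {
            Ok(Outcome {
                details: vec![format!("{} installed", self.name)],
            })
        } else {
            Err(format!("{} failed", self.name))
        }
    }
}

/// Runs the steps in order. The `?` returns the first error immediately,
/// so later steps are never attempted after a failure.
fn run_all(steps: &[&dyn Step]) -> Result<Outcome, String> {
    let mut details = Vec::new();
    for step in steps {
        let outcome = step.run()?;
        details.extend(outcome.details);
    }
    Ok(Outcome { details })
}
```

Adding another install to the chain (for example an identity provider) would then mean appending one more step; a failure in an earlier step still stops the run before it.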

Structural change: CRD type definitions and chart-builder moved from
fleet/harmony-fleet-operator/src/{crd,chart}.rs into
harmony/src/modules/fleet/operator/. Harmony can't depend on the operator
crate (cycle), so the score-side code lives in harmony and the operator
binary imports the types right back via
`harmony::modules::fleet::operator::*`. Considered keeping CRDs in the
operator crate with the score either there or in a sibling crate, but
putting customer-facing scores in harmony/src/modules/fleet/ matches the
existing convention (FleetDeviceSetupScore, ProvisionVmScore) and keeps
the CRDs reachable from future harmony scores (e.g. an inventory aggregator
reading Device CRs) without dragging in the operator binary.

The operator's `chart` subcommand stays as a developer convenience
(routes through harmony::modules::fleet::operator::build_chart) so
`cargo run -p harmony-fleet-operator -- chart` still produces an
identical chart on disk for inspection. Existing examples
(fleet_load_test, harmony_apply_deployment) updated to import CRD types
from harmony directly.

load-test.sh phase 3c collapses to a single
`cargo run -p example_fleet_server_install` invocation; phase 2b's NATS
install still runs separately so the host-side NATS reachability probe
sits where it always did. Idempotency: re-running short-circuits via
HelmChartScore::find_installed_release on both inner installs.

Verified: cargo fmt --check, cargo clippy, cargo test all pass; the
4 fleet operator unit tests (2 migrated from operator crate, 2 new on
FleetOperatorScore defaults/builders) pass under `cargo test -p harmony`;
operator chart subcommand produces an identical chart structure
post-refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two scripts for running the new install Score against a local cluster:

- examples/fleet_server_install/run.sh — generic, cwd-independent
  passthrough around `cargo run -p example_fleet_server_install`.
- fleet/scripts/run_server_install.sh — opinionated k3d test harness:
  creates `fleet-server-test` cluster if absent (with NATS port 4222
  mapped through klipper-lb), builds the operator image via
  build_docker.sh, sideloads it, runs the Score, and leaves the
  cluster up. Prints teardown + redeploy commands at the end. Header
  documents the helm-idempotency limitation: a rebuilt image won't
  redeploy on a second run unless `helm uninstall` is invoked first
  (HelmChartScore short-circuits on chart_version match). Proper fix
  is deferred — content-hash chart_version or a force_upgrade flag.
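
One shape the deferred content-hash fix could take is sketched below. It is only an illustration: `fnv1a64` and `content_versioned` are hypothetical names, FNV-1a is used here purely because it needs no dependencies (a real implementation would more likely use a cryptographic digest), and it assumes the installed-release check compares `chart_version` strings, as the limitation above suggests.

```rust
/// FNV-1a 64-bit hash, written out by hand so the result is stable across
/// runs and toolchains (unlike std's DefaultHasher, which makes no such promise).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: derives a chart version that changes whenever the
/// image reference or the rendered chart content changes.
/// The hash is appended as semver build metadata: "<base>+<16 hex digits>".
fn content_versioned(base_version: &str, image_ref: &str, rendered_chart: &str) -> String {
    let mut input = Vec::new();
    input.extend_from_slice(image_ref.as_bytes());
    input.push(0); // separator so ("ab", "c") and ("a", "bc") hash differently
    input.extend_from_slice(rendered_chart.as_bytes());
    format!("{}+{:016x}", base_version, fnv1a64(&input))
}
```

For this to catch a rebuilt image, `image_ref` would need to be something that changes on rebuild, such as an image digest; a mutable tag like `:latest` produces the same string and therefore the same version.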

Dockerfile glibc pin: builder pinned to `rust:1.94-slim-bookworm`.
Unsuffixed `rust:slim` follows Debian's latest stable (trixie =
glibc 2.41), so binaries built there fail to start on the
`debian:bookworm-slim` runtime (glibc 2.36) with "GLIBC_2.39 not
found". Surfaced when running the new scripts end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch example_fleet_server_install from a manual `create_interpret().
execute()` + `println!` to `harmony_cli::run`, which wires up the
framework's standard logger + reporter — emoji-tagged per-Score
progress lines and an end-of-run summary listing each Score's
`Outcome.details`. Mirrors the okd_add_node example's pattern.

For events to fire on the inner Scores, FleetServerScore now calls
`Score::interpret` (not `create_interpret().execute`) on
NatsBasicScore + FleetOperatorScore. Same change inside
FleetOperatorScore for its inner HelmChartScore.

Outcome.details populated:
- FleetOperatorScore: image, namespace, release_name, NATS URL.
- FleetServerScore: in-cluster NATS URL, kubectl pointer to the
  operator deployment, kubectl tip for verifying CRDs.

Progress logs added inside FleetOperatorScore between the chart-
render and helm-install phases (`info!`).

FleetOperatorScore fields are now `pub` so callers can read them
post-construction (FleetServerScore needs `operator.namespace` for
its summary). Builder methods unchanged; both styles coexist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When `--score-only` is set, skips cluster create + operator image build
+ k3d sideload and just refreshes the kubeconfig and runs the Score against the already-
bootstrapped cluster. Shaves the slow rebuild + sideload off the dev
loop when iterating on Score-side code with the operator binary
unchanged.

Errors out cleanly if --score-only is passed but the cluster is
missing (instead of letting cargo trip on a missing kube context).
Unknown flags also fail fast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FleetServerScore gains `pub identity: Option<ZitadelScore>` and a
conditional `.interpret()` call after the operator install. Trait
bounds widen from `Topology + HelmCommand` to
`Topology + HelmCommand + K8sclient + PostgreSQL` to satisfy the
ZitadelScore impl — both inner Scores need the wider topology even
when identity is None (Rust trait bounds are static).
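
The point about static trait bounds can be illustrated with a minimal sketch. All names here (`ServerInstall`, `Identity`, `FullTopology`, `HelmOnlyTopology`, and the empty marker traits) are hypothetical stand-ins rather than the real Harmony types, and the method is synchronous for brevity.

```rust
// Hypothetical marker traits standing in for topology capabilities.
trait HelmCommand {}
trait PostgreSQL {}

// Stand-in for the optional identity Score.
struct Identity;

struct ServerInstall {
    identity: Option<Identity>,
}

impl ServerInstall {
    // The bound `HelmCommand + PostgreSQL` is checked at compile time for
    // every caller, whether or not `identity` is Some at run time.
    fn run<T: HelmCommand + PostgreSQL>(&self, _topology: &T) -> Vec<&'static str> {
        let mut steps = vec!["nats", "operator"];
        if self.identity.is_some() {
            steps.push("zitadel");
        }
        steps
    }
}

struct FullTopology;
impl HelmCommand for FullTopology {}
impl PostgreSQL for FullTopology {}

struct HelmOnlyTopology;
impl HelmCommand for HelmOnlyTopology {}
// `ServerInstall { identity: None }.run(&HelmOnlyTopology)` would not compile:
// HelmOnlyTopology does not implement PostgreSQL, even though the identity
// step would never execute.
```

This is why the example crate has to move to a topology that provides every capability any inner Score might need, even for runs that skip the identity step.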

Example crate consequences:
- Switched topology from K8sBareTopology to K8sAnywhereTopology
  (provides PostgreSQL via CNPG). `ensure_ready` now installs
  cert-manager as a side effect — Zitadel's prod ingress needs it
  anyway, and it's harmless on k3d.
- New CLI flags: --zitadel-host (Option<String>; omitted = no Zitadel),
  --zitadel-version, --zitadel-insecure. Dev-friendly defaults: hosts
  ending in .localhost / .test default to external_secure=false.
- Outcome details now include the Zitadel URL when identity is set.
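
A minimal sketch of the dev-friendly default described in the flags bullet above. The function names are hypothetical, and the way `--zitadel-insecure` overrides the host-based default is an assumption rather than something the commit states.

```rust
/// Host-based default for `external_secure`: hosts ending in `.localhost`
/// or `.test` are treated as local development hosts and default to false.
fn default_external_secure(host: &str) -> bool {
    let host = host.to_ascii_lowercase();
    !(host.ends_with(".localhost") || host.ends_with(".test"))
}

/// Assumed precedence: an explicit insecure flag always wins; otherwise the
/// host-based default applies.
fn resolve_external_secure(host: &str, insecure_flag: bool) -> bool {
    if insecure_flag {
        false
    } else {
        default_external_secure(host)
    }
}
```

Under these assumptions, `zitadel.localhost` resolves to a plain HTTP setup without any extra flag, while a public host stays secure unless the flag is passed.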

Auxiliary:
- Added env.sh next to the example, mirroring okd_add_node's pattern
  (KUBECONFIG / RUST_LOG / sqlite secret store paths, with optional
  ZITADEL_HOST documented).
- run_server_install.sh now reads ZITADEL_HOST / ZITADEL_VERSION env
  and passes them through. Trailing banner conditionally prints the
  Zitadel `helm uninstall` command alongside the operator one.

Out of scope: load-test.sh drives the same example crate and may
need a topology audit after this change. Flagged for follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flip the polarity of the Zitadel knobs in run_server_install.sh: the
Score is now installed on every run, and `NO_ZITADEL=1` is the
explicit skip. Defaults: ZITADEL_HOST=zitadel.localhost (HTTP ingress
auto-selected by the example crate's `.localhost` rule). ZITADEL_VERSION
stays optional (empty = inherit the example's clap default).

Updates env.sh to document the new polarity (NO_ZITADEL as the opt-out,
ZITADEL_HOST/VERSION as overrides on top of the defaults).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
run_server_install.sh now unconditionally sources
examples/fleet_server_install/env.sh after computing REPO_ROOT, so
the example's env knobs (KUBECONFIG, RUST_LOG, NO_ZITADEL,
ZITADEL_HOST, …) are picked up without the user having to source
manually before invoking the script. The script's `${VAR:-default}`
block only fills in values env.sh leaves unset.
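
The precedence described above (values from env.sh win, script defaults fill the gaps) can be modelled as below. This is an illustrative Rust rendering of the shell `${VAR:-default}` expansion, not code from the repository; `or_default` is a hypothetical name, and a `HashMap` stands in for the process environment.

```rust
use std::collections::HashMap;

/// Mirrors shell `${VAR:-default}`: the default is used when the variable
/// is unset OR set to the empty string; any non-empty value is kept.
fn or_default(env: &HashMap<String, String>, key: &str, default: &str) -> String {
    match env.get(key) {
        Some(value) if !value.is_empty() => value.clone(),
        _ => default.to_string(),
    }
}
```

Note that the `:-` form treats an empty value the same as an unset one, so exporting a variable as an empty string in env.sh does not suppress the script's default.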

env.sh keeps a (commented-out) KUBECONFIG hint and the new optional
Zitadel knobs documented post-source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: add deploy-apache.sh example script
All checks were successful
Run Check Script / check (pull_request) Successful in 2m52s
e5caaba1e4
johnride reviewed 2026-05-05 14:15:36 +00:00
@@ -0,0 +16,4 @@
//! operator Deployment into a helm chart the cluster runs itself.
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};
Author
Owner

This is wrong: the CRD is a core data structure of the operator, and it is not harmony's responsibility to manage data structures internal to the operator.

This brings to light a refactoring that has been becoming clear: all the modules we have should be in external crates. We could have a harmony-modules crate of officially supported modules. But in the case of the operator, the CRD would not even belong there; it belongs inside the operator crate itself. Then it can easily be passed to harmony as a k8sresourcescore or similar.

johnride added 1 commit 2026-05-05 14:32:58 +00:00
Merge branch 'feat/iot-walking-skeleton' into feat/deploy_fleet_server_side
Some checks failed
Run Check Script / check (pull_request) Failing after 59s
22eed9b533
johnride closed this pull request 2026-05-05 14:42:50 +00:00

Pull request closed

No Reviewers
1 Participants
Reference: NationTech/harmony#282