ci/fleet-argo-cd #301

Closed
johnride wants to merge 11 commits from ci/fleet-argo-cd into master

11 Commits

Author SHA1 Message Date
1349739ed7 docs(adr): draft 012-1 — release architecture mechanism
All checks were successful
Run Check Script / check (pull_request) Successful in 2m22s
Clarification + concrete mechanism for ADR-012 (Project Delivery
Automation). Pulls together two prior attempts (modules/application
RustWebapp + the per-component harmony-fleet-release binary) and
proposes a unified shape: release is a Score driven by Topology
capabilities (ContainerBuilder, OciRegistry, HelmRegistry,
ContinuousDelivery), composed alongside DeployScore /
MonitoringScore into the opinionated pipeline ADR-012 specified.

Kept in drafts as 012-1 rather than as a fresh ADR-025: this
implements ADR-012's intent and does not compete with it. Open
questions are deliberately left open for further 012-N follow-ups.
2026-05-27 22:25:38 -04:00
80b3cf1c31 ci(fleet-operator): extract version resolution to a shared bash script
All checks were successful
Run Check Script / check (pull_request) Successful in 2m22s
Workflow yaml had a 12-line inline bash block computing the release
version from `GITHUB_REF_NAME` or `inputs.version`. Moves the
logic into `.gitea/scripts/resolve-release-version.sh` so the
workflow yaml is back to one-line invocations and the resolver is
reusable by sibling component workflows (agent, callout).

This is the interim form. The real fix is a harmony Rust binary
that understands git refs directly — see PR thread on framework-
level build/package/release ownership.
2026-05-27 21:38:05 -04:00
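As a rough illustration of the resolution step this commit describes, the sketch below derives a version from a tag ref name or from a manual input. It is not the contents of `resolve-release-version.sh`. The function name, the `tag_prefix` parameter, and the choice to let the manual input take precedence are all assumptions made for the example.

```rust
/// Hypothetical helper (not from the repository): resolve a release version
/// from a ref name such as `harmony-fleet-operator-v0.1.0`, or from a
/// manually supplied input.
/// Assumption: a non-empty manual input wins over the ref name.
fn resolve_release_version(
    ref_name: &str,
    tag_prefix: &str,
    manual_input: Option<&str>,
) -> Result<String, String> {
    // Prefer the manual input when one was actually provided.
    if let Some(v) = manual_input.map(str::trim).filter(|v| !v.is_empty()) {
        return Ok(v.to_string());
    }
    // Otherwise the ref name must look like `<tag_prefix><version>`.
    match ref_name.strip_prefix(tag_prefix) {
        Some(v) if !v.is_empty() => Ok(v.to_string()),
        _ => Err(format!(
            "ref `{}` does not match `{}<version>`",
            ref_name, tag_prefix
        )),
    }
}
```

Called with `("harmony-fleet-operator-v0.1.0", "harmony-fleet-operator-v", None)`, this returns `Ok("0.1.0")`. A ref that does not carry the prefix, with no manual input, yields an error instead of a guessed version.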
f4fd5d312a docs(agents): make engineering guidelines mandatory reading
All checks were successful
Run Check Script / check (pull_request) Successful in 2m26s
Adds `.agents/skills/guidelines.md` (Karpathy-style behavioural
rules — Think Before Coding, Simplicity First, Surgical Changes,
Goal-Driven Execution) and links it from CLAUDE.md / AGENTS.md as
required reading before any code change. Encodes the bar that past
reviews set: no wrapper modules over existing ones, no
per-component CLI scaffolding, no bash inlined in CI yaml, and low
comment density.
2026-05-27 21:22:37 -04:00
0992183438 feat(fleet): Argo-based CD path for the fleet operator
Adds an `operator_application()` helper that builds an
`ArgoApplication` targeting
`oci://<registry>/<project>/harmony-fleet-operator:<chart_version>`.
main.rs composes it into `ArgoHelmScore { argo_apps: vec![...] }` —
no new score types, no wrapper module.

CLI gains `--use-argo` to swap the operator path from direct-helm
to Argo, plus `--operator-chart-version` as the CD knob CI re-runs
to deploy / upgrade / roll back.

Splits `ArgoApplication.destination_namespace` from
`metadata.namespace` so the Application CR can live in `argocd`
while syncing chart resources into `fleet-system` — `to_yaml`
previously collapsed them, breaking any cross-namespace deploy.
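
The namespace split can be pictured with a trimmed-down stand-in for the real type. `AppSketch` and its field names are hypothetical, and the rendered YAML is deliberately partial (a real Application also needs a source, a project, and so on). The only point is that the Application's own namespace and the sync destination are rendered from two separate fields.

```rust
/// Hypothetical, trimmed-down stand-in for `ArgoApplication`.
struct AppSketch {
    name: String,
    /// Namespace where the Application CR itself lives (e.g. `argocd`).
    metadata_namespace: String,
    /// Namespace the chart resources are synced into (e.g. `fleet-system`).
    destination_namespace: String,
}

impl AppSketch {
    /// Renders only the namespace-related parts of an Application manifest.
    fn to_yaml(&self) -> String {
        [
            "apiVersion: argoproj.io/v1alpha1".to_string(),
            "kind: Application".to_string(),
            "metadata:".to_string(),
            format!("  name: {}", self.name),
            format!("  namespace: {}", self.metadata_namespace),
            "spec:".to_string(),
            "  destination:".to_string(),
            format!("    namespace: {}", self.destination_namespace),
        ]
        .join("\n")
    }
}
```

With `metadata_namespace` set to `argocd` and `destination_namespace` set to `fleet-system`, the two values end up under `metadata` and `spec.destination` respectively instead of being collapsed into one.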

NATS and the agent stay on the direct path in v1. Also flattens
`harmony_cli::Args` into the binary's CliConfig so that `--yes` is
picked up in a single argv parse.

Tested on a fresh k3d cluster against the published
`hub.nationtech.io/harmony/harmony-fleet-operator:0.0.1`.
2026-05-27 21:22:30 -04:00
a7c5318060 refactor(argocd): parameterize ArgoHelmScore for non-OKD clusters
Add `ingress_class_name: Option<String>` so callers can pin the class
(OKD `openshift-default`, vanilla `nginx`/`traefik`) or fall through
to the cluster's default IngressClass.

Gate the global + redis `runAsUser: null` on the `openshift` flag —
on vanilla k8s the chart's own `runAsNonRoot: true` combined with no
UID makes redis CrashLoop with "image will run as root". Single
pre-existing caller in packaging_deployment.rs preserved on the OKD
path.
2026-05-27 21:22:19 -04:00
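The two knobs from this commit can be summarised in a small sketch. `ChartOverrides` and `overrides` are hypothetical names and do not reflect the real `ArgoHelmScore` fields beyond `ingress_class_name` and the `openshift` flag mentioned above. The actual chart value keys are intentionally left out.

```rust
/// Hypothetical summary of what a caller's settings resolve to.
#[derive(Debug, PartialEq)]
struct ChartOverrides {
    /// `Some(class)` pins the IngressClass; `None` falls through to the
    /// cluster's default IngressClass.
    ingress_class_name: Option<String>,
    /// Whether `runAsUser: null` is emitted for the global and redis
    /// security contexts.
    null_run_as_user: bool,
}

fn overrides(openshift: bool, ingress_class_name: Option<&str>) -> ChartOverrides {
    ChartOverrides {
        ingress_class_name: ingress_class_name.map(str::to_string),
        // The null UID only makes sense where the platform assigns UIDs
        // itself, so it is gated on the openshift flag.
        null_run_as_user: openshift,
    }
}
```

On the OKD path a caller would pass `(true, Some("openshift-default"))`. On vanilla k8s, `(false, None)` leaves both the UID and the ingress class to the chart and cluster defaults.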
81bf8a9257 refactor: app-scoped release binary harmony-fleet-release
All checks were successful
Run Check Script / check (pull_request) Successful in 2m41s
Release harmony-fleet-operator (image + chart) / release (push) Successful in 4m38s
Renames `harmony-fleet-operator-release` to `harmony-fleet-release`
and adds a `--component <operator|agent|nats-callout>` flag. The
binary's surface is now app-scoped (one binary per app, all fleet
components behind a flag) rather than component-scoped (one binary
per component), matching ADR-023's "deploy binary statically lists
its supported components" guidance.

Why: adding the agent and nats-callout release pipelines later
would otherwise mean three near-identical binaries with copy-pasted
docker/helm orchestration. Folding them under one binary keeps the
shared 90% in one place and reduces each new component to:

  - a new `Component` enum variant
  - a `Component::spec` arm naming the Dockerfile + image
  - a `hydrate_chart` arm pointing at the component's `build_chart`

`agent` and `nats-callout` variants exist today but bail with an
actionable error pointing at the roadmap; this keeps `--help`
honest about what's coming without lying about what works.
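
A minimal sketch of that shape, assuming a `Component` enum parsed from the `--component` value and a `spec` method that returns an error for the variants that are not wired up yet. The `ComponentSpec` struct, the `Result` return type, the error wording, and the Dockerfile placeholder are assumptions and are not copied from the binary.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Component {
    Operator,
    Agent,
    NatsCallout,
}

/// Hypothetical per-component build inputs.
struct ComponentSpec {
    dockerfile: &'static str,
    image: &'static str,
}

impl FromStr for Component {
    type Err = String;

    /// Maps the `--component` value onto a variant.
    fn from_str(s: &str) -> Result<Self, String> {
        match s {
            "operator" => Ok(Component::Operator),
            "agent" => Ok(Component::Agent),
            "nats-callout" => Ok(Component::NatsCallout),
            other => Err(format!("unknown component `{}`", other)),
        }
    }
}

impl Component {
    /// One arm per component; unfinished components fail loudly.
    fn spec(self) -> Result<ComponentSpec, String> {
        match self {
            Component::Operator => Ok(ComponentSpec {
                dockerfile: "<path/to/operator/Dockerfile>", // placeholder path
                image: "harmony-fleet-operator",
            }),
            Component::Agent | Component::NatsCallout => Err(format!(
                "{:?} release is not implemented yet; see the roadmap",
                self
            )),
        }
    }
}
```

Adding a component then means adding a variant, a `from_str` arm, and a `spec` arm, while the shared docker/helm orchestration stays untouched.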

The per-component `release.sh` wrapper pattern stays: each
component's script (today `fleet/harmony-fleet-operator/release.sh`,
tomorrow agent's and callout's) is a 1-line wrapper that pre-fills
`--component`. This lets a tag like `harmony-fleet-operator-v0.1.0`
route to the right component via the existing CI workflow without
the operator having to remember a flag.

File renamed via `git mv` so blame history is preserved. Verified
on k3d with `--component operator --no-push`: produces the same
image + chart pair as before.
2026-05-26 16:21:50 -04:00
b29d466240 refactor: use tracing instead of eprintln in operator-release
All checks were successful
Run Check Script / check (pull_request) Successful in 2m26s
eprintln! was the wrong tool for a long-running CLI invoked from
shell scripts and CI. The rest of the codebase (operator,
fleet-e2e harness, etc.) uses tracing::info!, and a release tool
should honor RUST_LOG.

Initializes a stderr tracing subscriber defaulting to info so the
progress lines show up without configuration, while still letting
operators silence (`RUST_LOG=warn`) or expand (`RUST_LOG=debug`)
the output. The final summary uses structured fields (image=,
chart=) so downstream tools can grep for them deterministically.
2026-05-26 15:39:42 -04:00
87e142c73d style: cargo fmt the operator-release binary
All checks were successful
Run Check Script / check (pull_request) Successful in 2m23s
Collapses a chained `current_dir()?.join(..)` per rustfmt's
preferred layout. Caught by ./build/check.sh's fmt step; no
behavior change.
2026-05-26 15:16:14 -04:00
cc41f190d2 refactor: chart is now namespace-neutral; add dashboard roadmap
Some checks failed
Run Check Script / check (pull_request) Failing after 59s
The k3d smoke-test surfaced that the operator chart baked
`fleet-system` into every namespaced manifest (Deployment,
ServiceAccount, Secret) and into the ClusterRoleBinding subject.
Installing into any other namespace failed with a helm
release-namespace mismatch.

Fixed by making the chart genuinely namespace-neutral:

- Removed `namespace` from `ChartOptions` entirely.
- `service_account()` and `operator_deployment(opts)` no longer
  set `metadata.namespace`; helm assigns the release namespace at
  install time, and the direct-apply path injects the namespace
  through `K8sResourceScore::single(.., Some(ns))`.
- `operator_secret(opts)` likewise drops `metadata.namespace`; the
  Secret is applied with an explicit namespace by its caller.
- `cluster_role_binding(subject_namespace)` keeps a namespace
  argument because the CRB subject must point at a concrete
  namespace; the chart path passes the literal helm template
  `{{ .Release.Namespace }}` so helm substitutes the release
  namespace at install time. The direct-apply path passes the
  real namespace string.
- `FleetOperatorScore::new()` defaults its own `namespace` field
  (no longer sourced from `ChartOptions::default()`); the chart
  itself carries no namespace default at all.
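
The ClusterRoleBinding point is the least obvious of these, so here is a small sketch of it. `crb_subject_yaml` is a hypothetical stand-in that renders only the subject block, and the service account name is a placeholder. What it shows is that the same function serves both paths: the chart path passes the literal helm template, the direct-apply path passes a concrete namespace.

```rust
/// Literal helm template passed on the chart path; helm substitutes it at
/// install time.
const HELM_RELEASE_NAMESPACE: &str = "{{ .Release.Namespace }}";

/// Hypothetical stand-in for the subject part of `cluster_role_binding`.
fn crb_subject_yaml(service_account: &str, subject_namespace: &str) -> String {
    [
        "subjects:".to_string(),
        "  - kind: ServiceAccount".to_string(),
        format!("    name: {}", service_account),
        format!("    namespace: {}", subject_namespace),
    ]
    .join("\n")
}
```

Calling it with `HELM_RELEASE_NAMESPACE` leaves the template text in the manifest untouched for helm to fill in, whereas calling it with `"my-fleet"` produces a ready-to-apply subject.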

Verified on k3d by installing the released chart into a
deliberately non-default namespace (`my-fleet`): all resources
land in `my-fleet`, ClusterRoleBinding subject resolves to
`my-fleet`, operator pod runs.

Also adds `ROADMAP/fleet_platform/dashboard_ingress.md` capturing
the three-step dependency chain (build with web-frontend feature →
implement real FleetService → add Service + Ingress to chart) that
the k3d test surfaced when looking for the dashboard. Unnumbered
file per project convention; numbered ones are versioned
milestones.
2026-05-26 14:58:50 -04:00
7c1ef14429 fix: chart version + add --no-push for local smoke-tests
Two findings from the k3d smoke-test of the release pipeline:

1. `HelmChart::new(name, version)` is misleading — the second arg sets
   appVersion only, while the chart-level `version` is a separate pub
   field defaulting to "0.1.0". The first run produced
   `harmony-fleet-operator-0.1.0.tgz` instead of
   `harmony-fleet-operator-<tag>.tgz`. Set both fields to the released
   tag so one tag → one image + one chart at one matching version.

2. Add `--no-push` to the release binary so the same code path that
   CI exercises is usable locally for k3d smoke-tests without pushing
   to Harbor. The packaged chart tgz is copied into the caller's CWD
   so it survives the binary's tempdir cleanup, and the binary prints
   both artifact paths at the end.
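
The first finding is easy to reproduce with a toy model. `ChartSketch` below is a hypothetical mirror of the behaviour described in item 1, not the real `HelmChart` type, and `for_release` and `package_name` are names invented for the example.

```rust
/// Hypothetical mirror of the described behaviour; not the real `HelmChart`.
struct ChartSketch {
    name: String,
    /// Chart-level version; defaults to "0.1.0" whatever the constructor gets.
    version: String,
    /// Set from the constructor's second argument.
    app_version: String,
}

impl ChartSketch {
    /// Mirrors the misleading constructor: the second argument is appVersion only.
    fn new(name: &str, app_version: &str) -> Self {
        ChartSketch {
            name: name.to_string(),
            version: "0.1.0".to_string(),
            app_version: app_version.to_string(),
        }
    }

    /// The fix in spirit: set both fields to the released tag.
    fn for_release(name: &str, tag: &str) -> Self {
        let mut chart = Self::new(name, tag);
        chart.version = tag.to_string();
        chart
    }

    /// File name of the packaged chart, `<name>-<version>.tgz`.
    fn package_name(&self) -> String {
        format!("{}-{}.tgz", self.name, self.version)
    }
}
```

With `new("harmony-fleet-operator", "0.0.1")` the package name still carries `0.1.0`. With `for_release` the package name and `app_version` both follow the tag.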

Verified end-to-end on k3d: helm install brings up CRDs, RBAC, and
the operator Deployment; the operator pod reaches Running 1/1 and
starts retrying NATS connect (expected — no NATS deployed in this
smoke-test).
2026-05-26 14:15:43 -04:00
7ab5fd7041 ci: per-tag release pipeline for harmony-fleet-operator (image + chart)
One git tag `harmony-fleet-operator-v*` now produces both the
container image and a hydrated helm chart at the same version,
pushed to hub.nationtech.io. release.sh is a 5-line wrapper around a
new `harmony-fleet-operator-release` binary in harmony-fleet-deploy
that orchestrates docker build/push, chart hydration via the
existing `build_chart()`, and `helm package`/`helm push`. CI is
reduced to a thin trigger calling the same script developers run
locally.

- chart.rs: ChartOptions gains an optional chart_version (None
  preserves the previous CARGO_PKG_VERSION behavior).
- operator_release.rs: new binary.
- release.sh: thin wrapper.
- .gitea/workflows/harmony-fleet-operator.yaml: rewritten to fire
  on `harmony-fleet-operator-v*` tags (and workflow_dispatch with a
  manual version input).
2026-05-26 13:59:29 -04:00
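The optional `chart_version` from the `chart.rs` bullet can be sketched as a simple fallback. `ChartOptionsSketch` and `effective_chart_version` are hypothetical names. The package version is passed in as a parameter here so the snippet compiles outside cargo, while in a real crate it would presumably come from `env!("CARGO_PKG_VERSION")`.

```rust
/// Hypothetical stand-in for the relevant part of `ChartOptions`.
struct ChartOptionsSketch {
    /// `None` keeps the previous behaviour (the crate's package version).
    chart_version: Option<String>,
}

/// Picks the explicit chart version if set, else the package version.
fn effective_chart_version(opts: &ChartOptionsSketch, package_version: &str) -> String {
    opts.chart_version
        .clone()
        .unwrap_or_else(|| package_version.to_string())
}
```

A release run would set `chart_version` to the tag's version, and any existing caller that leaves it as `None` keeps getting the package version as before.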