harmony/ROADMAP/fleet_platform/dashboard_ingress.md
Jean-Gabriel Gill-Couture cc41f190d2
refactor: chart is now namespace-neutral; add dashboard roadmap
The k3d smoke-test surfaced that the operator chart baked
`fleet-system` into every namespaced manifest (Deployment,
ServiceAccount, Secret) and into the ClusterRoleBinding subject.
Installing into any other namespace failed with a helm
release-namespace mismatch.

Fixed by making the chart genuinely namespace-neutral:

- Removed `namespace` from `ChartOptions` entirely.
- `service_account()` and `operator_deployment(opts)` no longer
  set `metadata.namespace`; helm assigns the release namespace at
  install time, and the direct-apply path injects the namespace
  through `K8sResourceScore::single(.., Some(ns))`.
- `operator_secret(opts)` likewise drops `metadata.namespace`; the
  Secret is applied with an explicit namespace by its caller.
- `cluster_role_binding(subject_namespace)` keeps a namespace
  argument because the CRB subject must point at a concrete
  namespace; the chart path passes the literal helm template
  `{{ .Release.Namespace }}` so helm substitutes the release
  namespace at install time. The direct-apply path passes the
  real namespace string.
- `FleetOperatorScore::new()` defaults its own `namespace` field
  (no longer sourced from `ChartOptions::default()`); the chart
  itself carries no namespace default at all.

Verified on k3d by installing the released chart into a
deliberately non-default namespace (`my-fleet`): all resources
land in `my-fleet`, ClusterRoleBinding subject resolves to
`my-fleet`, operator pod runs.

Also adds `ROADMAP/fleet_platform/dashboard_ingress.md` capturing
the three-step dependency chain (build with web-frontend feature →
implement real FleetService → add Service + Ingress to chart) that
the k3d test surfaced when looking for the dashboard. Unnumbered
file per project convention; numbered ones are versioned
milestones.
2026-05-26 14:58:50 -04:00


# Fleet operator dashboard — make shippable and expose via Ingress
## Context
The operator binary has a server-side dashboard (axum + Maud + HTMX
under `fleet/harmony-fleet-operator/src/frontend/`), but it is **not
shippable today**. The k3d smoke-test of the release pipeline made
this concrete: the chart correctly omits any `Service` or `Ingress`
because there is no production-ready dashboard endpoint to point them
at. Three blockers, in order of dependency.
## Work to be done
### 1. Build the production image with the dashboard included
- [ ] Update `fleet/harmony-fleet-operator/Dockerfile` to build with
`--features web-frontend` (currently
`cargo build --release --locked -p harmony-fleet-operator`,
no features).
- [ ] Confirm Tailwind CSS is embedded at build time inside the
builder stage. The crate doc says the CSS is embedded when
`tailwindcss` is on PATH at build time, otherwise the bundle is
empty and `--css-from` must be passed at runtime. Decide: ship
with embedded CSS (install `tailwindcss` in the builder stage)
or document the empty-bundle path.
- [ ] Confirm the build still satisfies the cross-compile gating
added in PR #291 (`ci: fix Windows cross-compile by gating
unix-only harmony code`) — the `web-frontend` feature must not
pull in unix-only code on Windows targets if Windows is still a
CI target.
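
The embedded-versus-empty CSS decision above depends on whether `tailwindcss` resolves on PATH inside the builder stage. A minimal sketch of that probe follows, assuming a build script would do something similar; `find_in_path`, `CssBundle`, and `decide_css_bundle` are hypothetical names and are not taken from the crate.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Search each directory listed in `path_var` for a regular file named `bin`
/// and return the first match. The executable bit is not checked, and on
/// Windows the binary would carry an `.exe` suffix (not handled here).
fn find_in_path(path_var: &OsStr, bin: &str) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(bin))
        .find(|candidate| candidate.is_file())
}

/// Hypothetical outcome of the build-time probe.
#[derive(Debug, PartialEq)]
enum CssBundle {
    /// `tailwindcss` was found; CSS can be generated and embedded.
    Embedded(PathBuf),
    /// No `tailwindcss`; the bundle is empty and `--css-from` is needed at runtime.
    Empty,
}

/// Decide which bundle the build would produce for a given PATH value.
fn decide_css_bundle(path_var: &OsStr) -> CssBundle {
    match find_in_path(path_var, "tailwindcss") {
        Some(path) => CssBundle::Embedded(path),
        None => CssBundle::Empty,
    }
}
```

Taking the PATH value as a parameter instead of reading the environment directly keeps the probe testable; a build script would pass `env::var_os("PATH")` when it is set.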
### 2. Replace the mock-only `serve-web` with a real implementation
- [ ] Implement `FleetService` against the real NATS + Kubernetes
backend. The operator currently uses
`MockFleetService::default()` and bails at `main.rs:125` when
`--mock` is not passed, with the message `"serve-web without --mock
is not implemented yet (real FleetService impl pending)"`.
- [ ] Decide the runtime topology: do the controller and the web
server share a Pod and a process? Two containers in one Pod?
Two separate Deployments? Current code suggests "same process,
different subcommand"; the chart will need to be updated
whichever way it goes.
- [ ] Wire the Zitadel auth env vars (`FLEET_AUTH_*` from `dev.sh`)
through the chart's Pod env. These are
operator-environment-specific (like the existing
`FLEET_OPERATOR_CREDENTIALS_TOML` Secret) and should likely
stay out of the redistributable chart, mounted by the deploy
pipeline.
- [ ] Decide on the `FLEET_OPERATOR_COOKIE_KEY_B64` lifecycle:
operator-generated on first boot? Deploy-time secret? Document.
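
The seam where a real backend would replace the mock can be sketched as below. This is an illustration only: the `backend_name` method and the `select_service` function are hypothetical, and the actual `FleetService` trait surface is not shown in this roadmap. Only the error string is taken from the source.

```rust
/// Hypothetical stand-in for the crate's `FleetService` trait.
/// The real trait's methods are not described here; `backend_name`
/// exists only to make the sketch observable.
trait FleetService {
    fn backend_name(&self) -> &'static str;
}

/// Stand-in for the existing mock implementation.
#[derive(Default)]
struct MockFleetService;

impl FleetService for MockFleetService {
    fn backend_name(&self) -> &'static str {
        "mock"
    }
}

/// Pick the service implementation for `serve-web`.
/// Today only the mock branch exists; the `else` branch is where a
/// NATS + Kubernetes backed implementation would be constructed.
fn select_service(mock: bool) -> Result<Box<dyn FleetService>, String> {
    if mock {
        Ok(Box::new(MockFleetService::default()))
    } else {
        Err(
            "serve-web without --mock is not implemented yet (real FleetService impl pending)"
                .to_string(),
        )
    }
}
```

Returning a boxed trait object keeps the web server code independent of which backend is active, so the topology decision (same process or separate Deployment) does not change the handler code.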
### 3. Expose the dashboard via Service + Ingress in the chart
- [ ] Add a `Service` resource to `chart.rs` (ClusterIP, target port
18080 to match the default `serve-web --addr`).
- [ ] Add an `Ingress` resource. Open questions:
  - Ingress class: assume `traefik` (k3d default)? Make it
    configurable via `ChartOptions`?
  - Host: configurable via `ChartOptions` (e.g.,
    `fleet.my-cluster.example.com`); no sensible default.
  - TLS: cert-manager `ClusterIssuer` reference, or expect TLS to be
    terminated upstream? Probably a `ChartOptions.tls_issuer:
    Option<String>` knob, where `None` means "no TLS section on the
    Ingress."
- [ ] Decide whether the Ingress is in scope for the chart at all,
or whether it should live in a separate `*-ingress` chart that
the deploy layer composes. The first path is simpler;
the second matches "small composable Scores" from ADR-023.
- [ ] Smoke-test on k3d: install the chart, `curl` the dashboard
through the k3d LoadBalancer, confirm HTTP 200 and the page
renders.
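
One way the Ingress knobs discussed above could fit together is sketched here, assuming the options end up as plain fields. `IngressOptions`, `ingress_class`, `host`, and `render_ingress` are hypothetical names; only `tls_issuer: Option<String>` comes from the open questions. The sketch renders YAML text to stay self-contained and does not reflect how `chart.rs` builds its manifests.

```rust
/// Hypothetical Ingress knobs. Only `tls_issuer` is named in the roadmap;
/// `ingress_class` and `host` are illustrative field names.
struct IngressOptions {
    ingress_class: String,
    host: String,
    tls_issuer: Option<String>,
}

/// Render a minimal `networking.k8s.io/v1` Ingress as YAML text.
/// No `metadata.namespace` is set, in keeping with a namespace-neutral chart.
fn render_ingress(opts: &IngressOptions, service: &str, port: u16) -> String {
    let mut y = String::new();
    y.push_str("apiVersion: networking.k8s.io/v1\n");
    y.push_str("kind: Ingress\n");
    y.push_str("metadata:\n");
    y.push_str(&format!("  name: {}\n", service));
    // `Some(issuer)` adds the cert-manager annotation; `None` skips it.
    if let Some(issuer) = &opts.tls_issuer {
        y.push_str("  annotations:\n");
        y.push_str(&format!("    cert-manager.io/cluster-issuer: {}\n", issuer));
    }
    y.push_str("spec:\n");
    y.push_str(&format!("  ingressClassName: {}\n", opts.ingress_class));
    // `None` means no TLS section on the Ingress at all.
    if opts.tls_issuer.is_some() {
        y.push_str("  tls:\n");
        y.push_str("    - hosts:\n");
        y.push_str(&format!("        - {}\n", opts.host));
        y.push_str(&format!("      secretName: {}-tls\n", service));
    }
    y.push_str("  rules:\n");
    y.push_str(&format!("    - host: {}\n", opts.host));
    y.push_str("      http:\n");
    y.push_str("        paths:\n");
    y.push_str("          - path: /\n");
    y.push_str("            pathType: Prefix\n");
    y.push_str("            backend:\n");
    y.push_str("              service:\n");
    y.push_str(&format!("                name: {}\n", service));
    y.push_str("                port:\n");
    y.push_str(&format!("                  number: {}\n", port));
    y
}
```

Calling `render_ingress` with `tls_issuer: None`, a Service name, and port 18080 yields an Ingress with no `tls:` block and no cert-manager annotation, which matches the "TLS terminated upstream" option.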
## Out of scope here
- Decisions about who hosts the dashboard's auth (Zitadel-only or
multi-IdP) — that's a product question, not a chart question.
- Operator HA. The current chart is `replicas: 1`. Multi-replica
needs leader election in the controller, which is its own work.
- Dashboard observability (metrics endpoint, structured access
logs) — fold in when adding the Service.
## Why this lives in its own roadmap
These three items are dependency-chained (1 → 2 → 3) and each is
non-trivial. Bundling them with the CI release pipeline would couple
unrelated risks and make the PR un-reviewable. Keeping this file
unnumbered (per
[`ROADMAP/fleet_platform/v0_1_plan.md`](v0_1_plan.md) and
[`v0_2_plan.md`](v0_2_plan.md) — numbered files are versioned
milestones) signals that this is a free-floating workstream that
slots into whichever milestone picks it up.