The k3d smoke-test surfaced that the operator chart baked
`fleet-system` into every namespaced manifest (Deployment,
ServiceAccount, Secret) and into the ClusterRoleBinding subject.
Installing into any other namespace failed with a helm
release-namespace mismatch error.
Fixed by making the chart genuinely namespace-neutral:
- Removed `namespace` from `ChartOptions` entirely.
- `service_account()` and `operator_deployment(opts)` no longer
set `metadata.namespace`; helm assigns the release namespace at
install time, and the direct-apply path injects the namespace
through `K8sResourceScore::single(.., Some(ns))`.
- `operator_secret(opts)` likewise drops `metadata.namespace`; the
Secret is applied with an explicit namespace by its caller.
- `cluster_role_binding(subject_namespace)` keeps a namespace
argument because the CRB subject must point at a concrete
namespace; the chart path passes the literal helm template
`{{ .Release.Namespace }}` so helm substitutes the release
namespace at install time. The direct-apply path passes the
real namespace string.
- `FleetOperatorScore::new()` defaults its own `namespace` field
(no longer sourced from `ChartOptions::default()`); the chart
itself carries no namespace default at all.
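
The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual `chart.rs` code: the `Metadata` and `Subject` structs, the `with_namespace` helper (a stand-in for the namespace injection done by the direct-apply path), and the `fleet-operator` resource name are all assumptions made for the example.

```rust
/// Literal Helm template; helm substitutes the release namespace at install time.
pub const HELM_RELEASE_NAMESPACE: &str = "{{ .Release.Namespace }}";

/// Hypothetical, simplified stand-in for namespaced object metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub name: String,
    /// `None` means the manifest carries no namespace of its own.
    pub namespace: Option<String>,
}

/// Hypothetical, simplified stand-in for a ClusterRoleBinding subject.
#[derive(Debug, Clone, PartialEq)]
pub struct Subject {
    pub kind: String,
    pub name: String,
    pub namespace: String,
}

/// Namespaced manifests leave the namespace unset ("fleet-operator" is an assumed name).
pub fn service_account_metadata() -> Metadata {
    Metadata {
        name: "fleet-operator".to_string(),
        namespace: None,
    }
}

/// The subject always needs a concrete value, so the caller supplies it.
pub fn cluster_role_binding_subject(subject_namespace: &str) -> Subject {
    Subject {
        kind: "ServiceAccount".to_string(),
        name: "fleet-operator".to_string(),
        namespace: subject_namespace.to_string(),
    }
}

/// Stand-in for the direct-apply path, which injects the namespace explicitly.
pub fn with_namespace(mut meta: Metadata, ns: &str) -> Metadata {
    meta.namespace = Some(ns.to_string());
    meta
}
```

In this sketch the chart path would call `cluster_role_binding_subject(HELM_RELEASE_NAMESPACE)`, while the direct-apply path would call `cluster_role_binding_subject("my-fleet")` and `with_namespace(service_account_metadata(), "my-fleet")`.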
Verified on k3d by installing the released chart into a
deliberately non-default namespace (`my-fleet`): all resources
land in `my-fleet`, ClusterRoleBinding subject resolves to
`my-fleet`, operator pod runs.
Also adds `ROADMAP/fleet_platform/dashboard_ingress.md` capturing
the three-step dependency chain (build with web-frontend feature →
implement real FleetService → add Service + Ingress to chart) that
the k3d test surfaced when looking for the dashboard. Unnumbered
file per project convention; numbered ones are versioned
milestones.
Fleet operator dashboard — make shippable and expose via Ingress
Context
The operator binary has a server-side dashboard (axum + Maud + HTMX
under `fleet/harmony-fleet-operator/src/frontend/`), but it is not
shippable today. The k3d smoke-test of the release pipeline made
this concrete: the chart correctly omits any Service or Ingress
because there is no production-ready dashboard endpoint to point them
at. Three blockers, in order of dependency.
Work to be done
1. Build the production image with the dashboard included
- Update `fleet/harmony-fleet-operator/Dockerfile` to build with `--features web-frontend` (currently `cargo build --release --locked -p harmony-fleet-operator`, no features).
- Confirm Tailwind CSS is embedded at build time inside the builder stage. The crate doc says the CSS is embedded when `tailwindcss` is on PATH at build time; otherwise the bundle is empty and `--css-from` must be passed at runtime. Decide: ship with embedded CSS (install `tailwindcss` in the builder stage) or document the empty-bundle path.
- Confirm the build still satisfies the cross-compile gating added in PR #291 (`ci: fix Windows cross-compile by gating unix-only harmony code`): the `web-frontend` feature must not pull in unix-only code on Windows targets if Windows is still a CI target.
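
As a reminder of what this kind of gating usually looks like, here is a minimal sketch. It does not reproduce what PR #291 actually gates; `path_byte_len` is a hypothetical helper chosen only because its unix variant needs a unix-only standard-library trait.

```rust
use std::ffi::OsStr;

/// Unix variant: relies on a trait that only exists on unix targets.
#[cfg(unix)]
pub fn path_byte_len(s: &OsStr) -> usize {
    // This import would fail to compile on Windows if it were not gated.
    use std::os::unix::ffi::OsStrExt;
    s.as_bytes().len()
}

/// Fallback for every other target, using only portable APIs.
#[cfg(not(unix))]
pub fn path_byte_len(s: &OsStr) -> usize {
    s.to_string_lossy().len()
}
```

Any code that the `web-frontend` feature enables would need the same treatment wherever it touches unix-only APIs, for example by combining the conditions as `#[cfg(all(feature = "web-frontend", unix))]`.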
2. Replace the mock-only serve-web with a real implementation
- Implement `FleetService` against the real NATS + Kubernetes backend (the operator currently uses `MockFleetService::default()` and bails when `--mock` is not passed: `main.rs:125`, `"serve-web without --mock is not implemented yet (real FleetService impl pending)"`).
- Decide the runtime topology: do the controller and the web server share a Pod and a process? Two containers in one Pod? Two separate Deployments? Current code suggests "same process, different subcommand"; the chart will need to be updated whichever way it goes.
- Wire the Zitadel auth env vars (`FLEET_AUTH_*` from `dev.sh`) through the chart's Pod env. These are operator-environment-specific (like the existing `FLEET_OPERATOR_CREDENTIALS_TOML` Secret) and should likely stay out of the redistributable chart, mounted by the deploy pipeline.
- Decide on the `FLEET_OPERATOR_COOKIE_KEY_B64` lifecycle: operator-generated on first boot? Deploy-time secret? Document.
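
The current mock-only behaviour, and the place where a real implementation would slot in, can be sketched as below. The trait body is an assumption: the roadmap does not show the real `FleetService` methods, so `device_count` and `select_service` are hypothetical names used only to make the example compile.

```rust
/// Hypothetical, minimal shape of the service trait; the real trait has its own methods.
pub trait FleetService {
    fn device_count(&self) -> usize;
}

/// Simplified stand-in for the in-memory mock the operator uses today.
#[derive(Default)]
pub struct MockFleetService {
    devices: Vec<String>,
}

impl FleetService for MockFleetService {
    fn device_count(&self) -> usize {
        self.devices.len()
    }
}

/// Picks the service backing `serve-web`, mirroring the current bail-out.
pub fn select_service(mock: bool) -> Result<Box<dyn FleetService>, String> {
    if mock {
        Ok(Box::new(MockFleetService::default()))
    } else {
        // A NATS + Kubernetes backed implementation would be constructed here.
        Err("serve-web without --mock is not implemented yet (real FleetService impl pending)"
            .to_string())
    }
}
```

Keeping the choice behind a trait object means the web handlers would not need to change when the real backend lands; only the `else` branch would.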
3. Expose the dashboard via Service + Ingress in the chart
- Add a `Service` resource to `chart.rs` (ClusterIP, target port 18080 to match the default `serve-web --addr`).
- Add an `Ingress` resource. Open questions:
  - Ingress class: assume `traefik` (k3d default)? Make it configurable via `ChartOptions`?
  - Host: configurable via `ChartOptions` (e.g., `fleet.my-cluster.example.com`); no sensible default.
  - TLS: cert-manager `ClusterIssuer` reference, or expect TLS to be terminated upstream? Probably a `ChartOptions.tls_issuer: Option<String>` knob; `None` means "no TLS section on the Ingress."
- Decide whether the Ingress is in scope for the chart at all, or whether it should live in a separate `*-ingress` chart that the deploy layer composes. The first path is simpler; the second matches "small composable Scores" from ADR-023.
- Smoke-test on k3d: install the chart, `curl` the dashboard through the k3d LoadBalancer, confirm HTTP 200 and the page renders.
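
One possible shape for the Ingress knobs is sketched below, assuming the in-chart option is chosen. Only `tls_issuer: Option<String>` comes from the notes above; the `IngressOptions` struct, its other fields, the `render_ingress` function, and the `fleet-operator` / `fleet-operator-tls` names are hypothetical. The sketch renders plain YAML text to stay dependency-free, which is not necessarily how `chart.rs` builds manifests.

```rust
/// Default `serve-web` port named in the roadmap.
pub const DASHBOARD_PORT: u16 = 18080;

/// Hypothetical ingress knobs; these could equally live on `ChartOptions`.
#[derive(Debug, Clone)]
pub struct IngressOptions {
    /// Ingress class, e.g. "traefik" on k3d.
    pub class_name: String,
    /// Public host name; there is no sensible default.
    pub host: String,
    /// cert-manager ClusterIssuer; `None` means no TLS section on the Ingress.
    pub tls_issuer: Option<String>,
}

/// Renders a namespace-neutral Ingress manifest as YAML text.
pub fn render_ingress(opts: &IngressOptions) -> String {
    let mut y = String::new();
    y.push_str("apiVersion: networking.k8s.io/v1\nkind: Ingress\n");
    y.push_str("metadata:\n  name: fleet-operator\n");
    if let Some(issuer) = &opts.tls_issuer {
        y.push_str("  annotations:\n");
        y.push_str(&format!("    cert-manager.io/cluster-issuer: {}\n", issuer));
    }
    y.push_str(&format!("spec:\n  ingressClassName: {}\n", opts.class_name));
    if opts.tls_issuer.is_some() {
        // The TLS block is emitted only when an issuer is configured.
        y.push_str(&format!("  tls:\n    - hosts: [{}]\n", opts.host));
        y.push_str("      secretName: fleet-operator-tls\n");
    }
    y.push_str(&format!("  rules:\n    - host: {}\n", opts.host));
    y.push_str("      http:\n        paths:\n          - path: /\n");
    y.push_str("            pathType: Prefix\n            backend:\n");
    y.push_str("              service:\n                name: fleet-operator\n");
    y.push_str(&format!(
        "                port:\n                  number: {}\n",
        DASHBOARD_PORT
    ));
    y
}
```

The manifest deliberately sets no `metadata.namespace`, so helm would assign the release namespace exactly as it does for the other namespaced resources.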
Out of scope here
- Decisions about who hosts the dashboard's auth (Zitadel-only or multi-IdP): that's a product question, not a chart question.
- Operator HA. The current chart is `replicas: 1`. Multi-replica needs leader election in the controller, which is its own work.
- Dashboard observability (metrics endpoint, structured access logs): fold in when adding the Service.
Why this lives in its own roadmap
These three items are dependency-chained (1 → 2 → 3) and each is
non-trivial. Bundling them with the CI release pipeline would couple
unrelated risks and make the PR un-reviewable. Keeping this file
unnumbered (per `ROADMAP/fleet_platform/v0_1_plan.md` and
`v0_2_plan.md`; numbered files are versioned
milestones) signals that this is a free-floating workstream that
slots into whichever milestone picks it up.