harmony/docs/adr/drafts/012-2-mvp-harmony-apply.md
Jean-Gabriel Gill-Couture 9a4d3915d4
feat(fleet): harmony apply — deploy the published operator chart (minimal)
FleetOperatorScore gains an Option<PublishedChart> field: None renders +
installs the chart from local source (dev/e2e, unchanged); Some installs
the published oci://<registry>/<project>/harmony-fleet-operator:<version>
chart via the same HelmChartScore (the CD "harmony apply" path).
`--operator-chart-version` selects it.

Operator chart is namespace-neutral + carries an explicit chart_version,
so one published artifact installs into any namespace at a pinned tag.
Documents the decision in ADR-012-2.

Slim variant of feat/fleet-harmony-apply: a plain Option field + inline
branch instead of an OperatorChartSource enum + builder + chart_ref()
method + unit tests.
2026-05-28 19:37:08 -04:00
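
The Option<PublishedChart> branch described in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code in the commit: the field names of `PublishedChart`, the `chart_location` function, and the local path argument are hypothetical, and the commit itself describes an inline branch rather than a separate function.

```rust
/// Hypothetical shape of the published-chart coordinates. The field names
/// are assumptions for illustration, not taken from the Harmony source.
#[derive(Debug, Clone, PartialEq)]
pub struct PublishedChart {
    pub registry: String,
    pub project: String,
    pub version: String,
}

/// Chart name as it appears in the commit message.
const OPERATOR_CHART_NAME: &str = "harmony-fleet-operator";

/// Picks the chart location: a local path when no published chart is given
/// (dev/e2e), otherwise the OCI reference at the pinned version.
pub fn chart_location(published: Option<&PublishedChart>, local_path: &str) -> String {
    match published {
        // None: render and install from local source.
        None => local_path.to_string(),
        // Some: install the published artifact.
        Some(c) => format!(
            "oci://{}/{}/{}:{}",
            c.registry, c.project, OPERATOR_CHART_NAME, c.version
        ),
    }
}
```

It is written as a free function only so that it stands alone; the same two-arm match fits inline wherever the score builds its Helm install.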


ADR: Continuous Delivery via harmony apply

Status

Proposal

Context

NationTech needs a continuous-delivery mechanism for tenant applications on multi-tenant Kubernetes clusters, including production OKD and edge Pico deployments. Existing Harmony tooling already builds container images and publishes hydrated Helm charts. The missing step is applying those charts to target clusters.

Constraints shaping the decision:

  • Tenants are isolated by namespace or namespace-prefix group (e.g., tenanta-*), each mapped 1:1 to a Zitadel organization.
  • Apps may span multiple clusters per tenant.
  • Tenants deploy via Harmony CLI and CI/CD, not interactively.
  • Pico deployments run on resource-constrained hardware (three M920q nodes) where per-tenant CD overhead is prohibitive.
  • Harmony is the platform product; the customer experience should be coherent and Harmony-owned, not bolted together from foreign components.
  • A tenant-facing UI is desirable but not blocking. The fleet operator dashboard will eventually fill this gap.
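
The namespace-prefix rule in the first constraint can be made concrete with a small check. This is a minimal sketch that assumes a tenant owns either a namespace equal to its identifier or any namespace of the form `<tenant>-<suffix>`; the function name and the exact matching rule are hypothetical, since the tenant model is still an open question.

```rust
/// Returns true when `namespace` falls inside the tenant's scope: either the
/// tenant's own namespace or one that starts with `<tenant>-` followed by a
/// non-empty suffix. The rule itself is an assumption for illustration.
pub fn namespace_in_tenant_scope(tenant: &str, namespace: &str) -> bool {
    if tenant.is_empty() {
        return false;
    }
    namespace == tenant
        || namespace
            .strip_prefix(tenant)
            // Require the dash so that "tenantab-prod" does not match "tenanta".
            .is_some_and(|rest| rest.len() > 1 && rest.starts_with('-'))
}
```

Requiring the dash separator keeps one tenant's identifier from accidentally matching another tenant whose name merely starts with the same characters.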

Decision

Extend Harmony with a harmony apply capability that performs helm install / helm upgrade on already-hydrated Helm charts, in the same spirit as the existing helm publish step.

Applying charts, credential handling, and tenant scoping all live inside Harmony. No external GitOps controller is introduced.
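
As a rough illustration of the install-or-upgrade step, the sketch below builds the argument list for `helm upgrade --install`, which installs the release when it is absent and upgrades it otherwise. It assumes the Helm CLI is invoked as a subprocess; this ADR does not settle whether Harmony shells out or links a library, and the function name and parameters are hypothetical.

```rust
/// Builds the arguments for `helm upgrade --install`.
/// `version` is passed through `--version` when a pinned chart version is
/// wanted, for example with an `oci://` chart reference.
pub fn helm_apply_args(
    release: &str,
    chart: &str,
    namespace: &str,
    version: Option<&str>,
) -> Vec<String> {
    let mut args = vec![
        "upgrade".to_string(),
        "--install".to_string(),
        release.to_string(),
        chart.to_string(),
        "--namespace".to_string(),
        namespace.to_string(),
    ];
    if let Some(v) = version {
        args.push("--version".to_string());
        args.push(v.to_string());
    }
    args
}
```

The resulting vector can be handed to `std::process::Command::new("helm").args(...)`. Building the arguments separately from running the process keeps this part testable without a cluster.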

Alternatives Considered

  • Argo CD, instance-per-tenant via argocd-operator. Strong credential-layer isolation and a usable UI, but per-tenant resource overhead is incompatible with Pico, and it introduces a foreign component inside Harmony's product surface. Rejected.
  • Argo CD, single shared instance with AppProjects. Lower overhead, but isolation is policy-based rather than credential-based, and the operational burden of running Argo across all environments does not pay for itself given tenants don't deploy through the UI. Rejected.
  • Flux CD, single instance, SA-per-tenant. Strong credential-layer isolation, low overhead, good fit for CLI/CI-CD-driven deploys. Rejected because it still introduces a foreign component to operate, monitor, and version-manage across the fleet, and it provides no tenant-facing UI either. Its only advantage over harmony apply is feature maturity, which does not justify the integration cost for the scope we actually need.
  • Argo CD hub + argocd-agent across clusters. Solves multi-cluster credential sprawl elegantly but does not solve the foreign-component or Pico-overhead problems. Rejected.

Rationale

  • Harmony already owns adjacent concerns: chart hydration, image publishing, cluster provisioning, networking, storage. Extending it to apply the charts it produces closes a natural seam.
  • helm install/upgrade against a hydrated chart is a bounded problem with a mature library (Helm SDK). The scope we need is small relative to Argo's or Flux's full feature surface.
  • Resource cost is effectively zero on top of Harmony's existing footprint. Pico becomes viable without compromise.
  • Credential isolation is handled by impersonating per-tenant ServiceAccounts when applying, identical in mechanism to Flux's model.
  • Avoids committing to a third-party CD tool whose lifecycle, support window, and upgrade cadence we would otherwise need to track across every cluster.
  • Keeps Harmony as the single product surface customers interact with.
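
The impersonation point above can be illustrated with the identity string Kubernetes uses for ServiceAccounts, `system:serviceaccount:<namespace>:<name>`, passed to Helm through its `--kube-as-user` flag. This is a sketch under the assumption that Harmony drives the Helm CLI; the function names and the example ServiceAccount name are hypothetical.

```rust
/// Kubernetes username for a ServiceAccount, as used for impersonation.
pub fn service_account_username(namespace: &str, name: &str) -> String {
    format!("system:serviceaccount:{}:{}", namespace, name)
}

/// Helm flags that make the apply run as the tenant's ServiceAccount
/// instead of the caller's own identity.
pub fn impersonation_flags(namespace: &str, service_account: &str) -> Vec<String> {
    vec![
        "--kube-as-user".to_string(),
        service_account_username(namespace, service_account),
    ]
}
```

Because the apply runs with the ServiceAccount's permissions, a chart that tries to create resources outside the tenant's namespaces is rejected by RBAC on the API server rather than by a check in Harmony. The caller's own credentials still need permission to impersonate that ServiceAccount.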

Scope

In scope

  • harmony apply performing helm install / helm upgrade on hydrated charts produced by the existing pipeline.
  • Per-tenant ServiceAccount impersonation, with RBAC scoped to the tenant's namespace(s).
  • Multi-cluster targeting per tenant via Harmony's existing cluster registry.
  • Status reporting (success, failure, diff summary) surfaced through Harmony's existing output channels.
  • Idempotent re-apply and rollback via Helm's native release history.
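
One possible shape for the multi-cluster status report is sketched below. The status-reporting format is still an open question in this ADR, so every name here (`ApplyOutcome`, `ClusterResult`, `summarize`) and the exit-code convention are hypothetical.

```rust
/// Result of applying a chart to one cluster (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum ApplyOutcome {
    Succeeded,
    Failed(String),
}

/// Pairs a target cluster with the outcome of its apply.
#[derive(Debug, Clone, PartialEq)]
pub struct ClusterResult {
    pub cluster: String,
    pub outcome: ApplyOutcome,
}

/// Folds per-cluster results into a one-line summary and a process exit
/// code: 0 when every cluster succeeded, 1 otherwise.
pub fn summarize(results: &[ClusterResult]) -> (String, i32) {
    let failed: Vec<&str> = results
        .iter()
        .filter(|r| matches!(r.outcome, ApplyOutcome::Failed(_)))
        .map(|r| r.cluster.as_str())
        .collect();
    let succeeded = results.len() - failed.len();
    let mut summary = format!("{}/{} clusters applied", succeeded, results.len());
    if !failed.is_empty() {
        summary.push_str(&format!("; failed: {}", failed.join(", ")));
    }
    let exit_code = if failed.is_empty() { 0 } else { 1 };
    (summary, exit_code)
}
```

A non-zero exit code on any failure lets a CI/CD job fail without parsing the output, while the summary line stays readable in a terminal.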

Out of scope

  • A tenant-facing web UI for application status, log viewing, scaling, or restart. Deferred to the fleet operator dashboard.
  • Continuous reconciliation / drift detection. Apply is invoked explicitly by CLI or CI/CD; we are not running a controller loop.
  • Sync waves, pre/post hooks beyond what Helm natively provides, and other Argo-specific CD semantics.
  • Image-based auto-update workflows.
  • GitOps-style pull from a tenant Git repository. Tenants push through Harmony; Harmony applies.

Demo Strategy

Until the fleet operator dashboard ships, demos use the OKD web console for real-time pod status, logs, scaling, and restart visualization. harmony apply output in the terminal demonstrates the deploy itself. No production Argo instance is provisioned for demo aesthetics.

Consequences

Positive

  • Single coherent product surface; no foreign CD component to operate.
  • Works identically on production OKD and on Pico without per-environment compromise.
  • Effectively no resource overhead beyond Harmony's existing footprint.
  • Credential-layer tenant isolation via ServiceAccount impersonation.
  • Composes cleanly with the existing helm publish step: same chart, same hydration, next logical action.
  • Removes external version-management and lifecycle dependencies (Argo support window, operator upgrades, agent compatibility).

Negative / Tradeoffs

  • No tenant-facing UI in the near term. Mitigated by OKD console for technical users and by the future fleet operator dashboard.
  • No continuous drift detection. Acceptable because deploys are explicit and tenants do not have direct cluster write access outside Harmony.
  • We carry the maintenance burden of our own apply logic, including error handling, status surfacing, and edge cases that Argo/Flux have already encountered. Bounded by the small feature surface we need.
  • If requirements later expand to drift detection, sync waves, or rich UI workflows, we may need to revisit. This is acceptable; the decision is reversible.

Open Questions

  • Define the precise tenant model in Harmony that generates the per-tenant ServiceAccount, the RBAC bindings across tenanta-* namespaces, and the target-cluster credentials.
  • Decide where Helm release history lives and how rollback is exposed through the CLI.
  • Determine the status-reporting format for CI/CD consumption.

Revisit Criteria

Reconsider this decision if any of the following become true:

  • Tenants demand a self-service UI before the fleet operator dashboard is available.
  • Drift detection becomes a contractual or compliance requirement.
  • Harmony's apply logic accumulates enough complexity that adopting Flux or Argo would meaningfully reduce maintenance burden.