NationTech/harmony

Jean-Gabriel Gill-Couture 89e5e104dc

feat(fleet): unify deploy config, switch CLI to tracing, fix OCI chart name collision

fleet-deploy:
- Rename harmony-fleet-release binary to harmony-fleet-publish
- Route all deploy settings through ConfigClient (env → OpenBao → prompt)
  instead of bespoke flags; seed FleetDeploySecrets via OpenBao
- Rename HARMONY_SECRET_NAMESPACE to HARMONY_CONFIG_NAMESPACE
- Append -chart to the Helm chart artifact name so it no longer collides
  with the Docker image in Harbor (application/vnd.cncf.helm.config.v1+json)

harmony_cli:
- Switch from log to tracing for structured output
- Defer topology prep so --list and declined runs are no-ops
- Drop ANSI colour codes around log emojis
- Init cli logger in fleet deploy binary

openbao:
- Scope unseal-keys cache file per instance
- Example gains setup capability and updated README

roadmap:
- Add unified CLI design document (ROADMAP/13-unified-cli.md)
- Update v0.3 fleet platform plan

Squashed commit of the following:

commit 36d9d9aaec
Merge: 12c8d9cf e7148aa8
Author: johnride <jg@nationtech.io>
Date:   Mon Jun 1 15:42:56 2026 +0000

    Merge pull request 'fix: fleet operator chart name was conflicting with the container name. Append -chart to the chart name' (#317) from fix/fleet-operator-chart-name into chore/rename-release-to-publish

    Reviewed-on: #317

commit e7148aa85f
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Mon Jun 1 11:35:15 2026 -0400

    fix: fleet operator chart name was conflicting with the container name. Append -chart to the chart name

commit 12c8d9cfa0
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Mon Jun 1 11:12:23 2026 -0400

    feat: Init cli logger in fleet deploy

commit edb62668b6
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 12:56:36 2026 -0400

    doc: Roadmap entry for cli design and implementation

commit f2ecccb4ab
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 12:32:19 2026 -0400

    refactor(fleet-deploy): rename harmony-fleet-release to harmony-fleet-publish

    Deploy/publish wording is more intuitive than deploy/release.

commit 2e9052b217
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 10:12:54 2026 -0400

    fix(openbao): remove extra blank line in example

    Pre-existing formatting issue caught by cargo fmt --check.

commit f7299ebe2b
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 09:13:39 2026 -0400

    refactor(fleet-deploy): rename HARMONY_SECRET_NAMESPACE to HARMONY_CONFIG_NAMESPACE

    The env var name was a misnomer — ConfigClient resolves both config and
    secrets, not just secrets. The struct field was already config_namespace.
    Legacy SecretManager keeps the old var; this forces migration to
    ConfigClient for new code.

commit d39aa15152
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 09:06:20 2026 -0400

    feat: fleet deploy uses configuration from configclient for all settings, update the 0_3 plan

commit 57d056fced
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 11:07:03 2026 -0400

    fix(openbao): scope unseal-keys cache file per instance

    The root token + unseal keys were written to a single fixed
    `~/.local/share/harmony/openbao/unseal-keys.json`, so deploying a second
    OpenBao instance (different namespace/release) overwrote the first's keys —
    after which the first could never be unsealed. Key the file by
    namespace+release (`unseal-keys-<ns>-<release>.json`); `cached_root_token`
    now takes the `OpenbaoInstance` to read the right one.

commit 44aa83199a
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 11:05:30 2026 -0400

    fix(harmony_cli): drop ANSI colour codes around log emojis

    `console::style(emoji).green()/.yellow()/.red()/.blue()` embedded raw ANSI
    escapes in the message string. `console` force-emits them off its own TTY
    detection, which disagrees with the tracing writer, so they leaked as literal
    `\x1b[..m` garbage around the emoji. Emit plain emojis — the glyph already
    conveys status and the tracing fmt layer still colours the level.

commit 4fef957edb
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 08:40:54 2026 -0400

    feat: Example openbao now can do openbao setup and better readme

commit af3205d353
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 05:55:49 2026 -0400

    refactor(harmony_cli): defer topology prep so --list/declined runs are no-ops

    `Maestro::initialize` (hence `topology.ensure_ready()`) ran before `init`'s
    `--list` / confirmation short-circuits, so merely listing a binary's scores —
    or declining to run them — still prepared the topology (cert-manager install,
    etc.). Build the maestro unprepared and call `prepare_topology()` only once we
    commit to interpreting. Expose `Maestro::prepare_topology`; add tests proving
    `--list` skips prep while the run path triggers it.

commit 199e285e52
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 05:04:34 2026 -0400

    feat: Use tracing instead of logger in harmony_cli and work on fleet_staging_install refactor to use harmony_cli properly, still some more work to do

commit fac83d853d
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Fri May 29 22:39:39 2026 -0400

    refactor(fleet-staging): use tracing instead of println for output

    Swap env_logger for tracing_subscriber (its fmt bridges the framework's
    log:: deploy-progress output) and route the install banner + step logs
    through tracing::info! — no raw println.

commit 0400e9d454
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Fri May 29 20:25:22 2026 -0400

    feat(fleet-staging): add OpenBao + seed FleetDeploySecrets; route operator creds through the deploy crate

    fleet_staging_install now deploys OpenBao (co-located in fleet-staging,
    cert-manager TLS at secrets-stg.<base>), configures it (fleet-deployer
    read policy), and seeds the operator's FleetDeploySecrets so the operator
    can be upgraded alone via 'harmony-fleet-deploy --from-tag'. Behavior of
    the existing bring-up is unchanged.

    Credential-TOML construction moved out of the example into
    OperatorCredentials::zitadel_jwt (deploy crate) so all callers share it.
    New openbao::cached_root_token() lets the seed reuse the root token setup
    already cached. Seeding mirrors the harmony_sso port-forward pattern.

2026-06-01 11:51:11 -04:00

Fleet Platform v0.3 — Staging to production-ready

Written 2026-05-31. Picks up after OpenBao, Zitadel, NATS, the auth callout, and the operator are deployed and functional on staging (the deployed versions are 2-3 weeks old).

Current state

OpenBao running at secrets-stg.cb1.nationtech.io
Zitadel running at sso-stg.cb1.nationtech.io
NATS + auth callout deployed in fleet-staging namespace
Operator deployed (older version, 2-3 weeks old)
Config-driven OpenBao installer (examples/openbao)
harmony-fleet-deploy binary reads FleetDeployConfig + FleetDeploySecrets from OpenBao

Immediate next steps

1. Provision operator credentials in OpenBao

Fetch existing creds from the running cluster:

oc -n fleet-staging get secret harmony-fleet-operator-secrets -o jsonpath='{.data.credentials\.toml}' | base64 -d

Seed into OpenBao at secret/data/fleet-staging/FleetDeploySecrets:

export VAULT_ADDR=https://secrets-stg.cb1.nationtech.io
export VAULT_TOKEN=<root token>
oc -n fleet-staging get secret harmony-fleet-operator-secrets -o jsonpath='{.data.credentials\.toml}' | base64 -d \
  | jq -Rs '{value: ({operator_credentials_toml: .} | tojson)}' \
  | bao kv put secret/fleet-staging/FleetDeploySecrets -

Verify the secret is readable: bao kv get secret/fleet-staging/FleetDeploySecrets
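The jq filter above double-encodes the credentials: the stored `value` field is itself a JSON string that contains a JSON object. A minimal sketch of the same shape in Rust, using only std, may make that easier to see. The function names `json_string` and `seed_payload` are hypothetical, and the escaping is meant to be JSON-equivalent to jq's output, not byte-identical.

```rust
/// Encode `s` as a JSON string literal, surrounding quotes included.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters need a \uXXXX escape.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build the document piped into `bao kv put ... -`:
/// {"value": "<JSON text of {"operator_credentials_toml": "<toml>"}>"}
fn seed_payload(credentials_toml: &str) -> String {
    // Inner object, serialized first (the `tojson` step in the jq filter).
    let inner = format!(
        "{{\"operator_credentials_toml\":{}}}",
        json_string(credentials_toml)
    );
    // Outer object stores that serialized text as a plain string.
    format!("{{\"value\":{}}}", json_string(&inner))
}
```

A reader of the secret therefore has to parse `value` a second time to get at `operator_credentials_toml`.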

2. Private repo deploy script

Create .envrc with minimal env:

export OPENBAO_URL=https://secrets-stg.cb1.nationtech.io
export HARMONY_CONFIG_NAMESPACE=fleet-staging
# export OPENBAO_TOKEN=<root token for now; SSO later>

Write the deploy invocation (a shell script, or just a direct harmony-fleet-deploy call):
```
harmony-fleet-deploy --from-tag harmony-fleet-operator-vX.Y.Z --yes
```
Commit .envrc + script to private repo (shared with teammates)
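Since the deploy depends on the two exported variables, a preflight check can fail fast before anything talks to the cluster. The sketch below is an illustration only: `REQUIRED_VARS` and `missing_vars` are hypothetical names, not part of harmony-fleet-deploy.

```rust
use std::collections::HashMap;

/// Variables the .envrc above is expected to export.
const REQUIRED_VARS: [&str; 2] = ["OPENBAO_URL", "HARMONY_CONFIG_NAMESPACE"];

/// Return the required variables that are absent or blank in `env`.
fn missing_vars(env: &HashMap<String, String>) -> Vec<&'static str> {
    REQUIRED_VARS
        .iter()
        .copied()
        .filter(|name| env.get(*name).map_or(true, |v| v.trim().is_empty()))
        .collect()
}
```

In a real binary the map would come from `std::env::vars().collect()`, and a non-empty result would abort the run with the list of missing names.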

3. Execute operator upgrade

Run the deploy script from the private repo
Verify operator pod starts and connects to NATS
Verify operator reconciles existing CRs (check logs)
Confirm no regression in existing fleet functionality

4. Operator UI ingress (trivial)

Expose operator UI with TLS ingress on fleet-stg.<base_domain>
Verify the UI loads and serves the SPA
Confirm no auth gate yet (SSO is next)

5. SSO login flow

Wire operator UI to Zitadel SSO at sso-stg.<base_domain>
Test login/logout flow end-to-end
Verify session persistence across page reloads
Confirm RBAC: only authorized Zitadel users can access the UI

6. Real data in UI

Replace mock device list with live device-info KV data
Replace mock deployment list with live Deployment CR data
Wire per-device drilldown to real DeviceInfo + last-heartbeat + agent version
NATS tail panel: SSE stream of device-info and device-state updates (plain text)
Verify data refreshes without manual reload

Configuration model

Environment (minimal, committed in private repo)

OPENBAO_URL=https://secrets-stg.cb1.nationtech.io
HARMONY_CONFIG_NAMESPACE=fleet-staging
# SSO auth or root token (SSO is the goal)
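The commit notes describe settings resolving in the order env → OpenBao → prompt, which is why the environment can stay this small. A sketch of that fallback chain follows. `resolve_setting` and its closure parameters are hypothetical stand-ins, not the ConfigClient API.

```rust
/// Try each source in priority order and return the first value found.
/// The closures stand in for: process environment, OpenBao, interactive prompt.
fn resolve_setting<E, S, P>(
    key: &str,
    from_env: E,
    from_store: S,
    from_prompt: P,
) -> Option<String>
where
    E: Fn(&str) -> Option<String>,
    S: Fn(&str) -> Option<String>,
    P: Fn(&str) -> Option<String>,
{
    // `or_else` is lazy: later sources are consulted only on a miss,
    // so the prompt never fires when env or the store has the value.
    from_env(key)
        .or_else(|| from_store(key))
        .or_else(|| from_prompt(key))
}
```

The lazy chaining matters for unattended runs: with every setting present in OpenBao, the prompt closure is never called.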

OpenBao (read via ConfigClient)

FleetDeployConfig (k8s namespaces, NATS URL, chart coords) at secret/data/fleet-staging/FleetDeployConfig
FleetDeploySecrets (operator creds) at secret/data/fleet-staging/FleetDeploySecrets
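The paths above include a `data/` segment, while the `bao kv` commands in step 1 omit it. Assuming the `secret` mount is KV version 2 (which the `data/` segment suggests), the two forms name the same entry: the HTTP API path carries `data/` after the mount and the `bao kv` CLI adds it implicitly. The helper names below are hypothetical.

```rust
/// HTTP API path for a KV v2 entry: the mount is followed by `data/`.
fn kv2_api_path(mount: &str, namespace: &str, name: &str) -> String {
    format!("{mount}/data/{namespace}/{name}")
}

/// Path as typed on the `bao kv` command line, where `data/` is implied.
fn kv2_cli_path(mount: &str, namespace: &str, name: &str) -> String {
    format!("{mount}/{namespace}/{name}")
}
```

Here `namespace` is the value of HARMONY_CONFIG_NAMESPACE and `name` is the config struct name, as in the two entries listed above.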

Missing features (post-UI)

Auth & credentials

Per-device OpenBao policies (templated policies, one role per device type)
Device identity claim in JWT (Zitadel client_id with device- prefix)
OpenBao JWT auth role granularity (extend OpenbaoJwtAuth to list of roles)
Move k8s namespaces + chart coords into ConfigClient config struct (env = only identifier + auth)

Operator capabilities

Agent upgrade path (ADR-022 exists; implementation pending)
Device enrollment flow (operator-facing runbook)
Revoke device / rotate key operations
Fleet-wide rollout strategies (canary, %-based) on top of agent-upgrade primitive

Observability

Operator logs every CR it acquires (verify output reads well)
NATS debugging one-liners in hand-off menu
Journald log streaming (currently only .status.aggregate.lastError)
Metrics dashboard (deferred until >100 devices)

Quality & hardening

Agent config-driven labels ([labels] in agent toml → DeviceInfo)
matchExpressions in selectors (currently matchLabels only)
Device.status.conditions populated from heartbeat staleness
Operator graceful degradation on bad device_id (log + skip, don't restart-loop)
Persist nats_auth_pass and issuer NKey via harmony_secret (regenerate-every-run footgun)
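For the matchExpressions item, the target semantics are those of Kubernetes label selectors: every matchLabels pair must be equal and every expression must hold. The sketch below shows that evaluation under assumed types. `Selector`, `Requirement`, and `Operator` are hypothetical and do not reflect the operator's actual CRD structs.

```rust
use std::collections::BTreeMap;

/// Set-based operators as defined for Kubernetes label selectors.
enum Operator {
    In,
    NotIn,
    Exists,
    DoesNotExist,
}

struct Requirement {
    key: String,
    operator: Operator,
    values: Vec<String>,
}

struct Selector {
    match_labels: BTreeMap<String, String>,
    match_expressions: Vec<Requirement>,
}

/// Evaluate one expression against a device's labels.
fn requirement_matches(req: &Requirement, labels: &BTreeMap<String, String>) -> bool {
    let value = labels.get(&req.key);
    match req.operator {
        Operator::In => value.map_or(false, |v| req.values.contains(v)),
        // A missing key also satisfies NotIn, as in Kubernetes.
        Operator::NotIn => value.map_or(true, |v| !req.values.contains(v)),
        Operator::Exists => value.is_some(),
        Operator::DoesNotExist => value.is_none(),
    }
}

/// All matchLabels pairs and all matchExpressions must hold (logical AND).
fn selector_matches(sel: &Selector, labels: &BTreeMap<String, String>) -> bool {
    sel.match_labels.iter().all(|(k, v)| labels.get(k) == Some(v))
        && sel
            .match_expressions
            .iter()
            .all(|req| requirement_matches(req, labels))
}
```

An empty selector matches every device under this logic, which mirrors the Kubernetes convention and is worth deciding on explicitly before implementation.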

Refactors (deferred, non-blocking)

Decompose FleetServerScore into independent, ConfigClient-glued Scores
Move harmony/modules/fleet/ → fleet/harmony-fleet/ (ADR-021 pending)
Delete examples/fleet_staging_deploy (superseded by fleet_staging_install)
Drop K8sAnywhereTopology for ad-hoc Score execution; introduce K8sBareTopology

Principles (carried forward)

No yaml in framework code paths
Scores describe desired state; topologies expose capabilities
Cross-boundary wire types in harmony-reconciler-contracts
Never ship untested code
Prove claims about upstream before blaming upstream
Design the brick before moving the brick
