fleet-deploy:
- Rename harmony-fleet-release binary to harmony-fleet-publish
- Route all deploy settings through ConfigClient (env → OpenBao → prompt) instead of bespoke flags; seed FleetDeploySecrets via OpenBao
- Rename HARMONY_SECRET_NAMESPACE to HARMONY_CONFIG_NAMESPACE
- Append -chart to the Helm chart artifact name so it no longer collides with the Docker image in Harbor (application/vnd.cncf.helm.config.v1+json)

harmony_cli:
- Switch from log to tracing for structured output
- Defer topology prep so --list and declined runs are no-ops
- Drop ANSI colour codes around log emojis
- Init cli logger in fleet deploy binary

openbao:
- Scope unseal-keys cache file per instance
- Example gains setup capability and updated README

roadmap:
- Add unified CLI design document (ROADMAP/13-unified-cli.md)
- Update v0.3 fleet platform plan

Squashed commit of the following:

commit 36d9d9aaec
Merge: 12c8d9cf e7148aa8
Author: johnride <jg@nationtech.io>
Date:   Mon Jun 1 15:42:56 2026 +0000

    Merge pull request 'fix: fleet operator chart name was conflicting with the container name. Append -chart to the chart name' (#317) from fix/fleet-operator-chart-name into chore/rename-release-to-publish

    Reviewed-on: #317

commit e7148aa85f
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Mon Jun 1 11:35:15 2026 -0400

    fix: fleet operator chart name was conflicting with the container name. Append -chart to the chart name

commit 12c8d9cfa0
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Mon Jun 1 11:12:23 2026 -0400

    feat: Init cli logger in fleet deploy

commit edb62668b6
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 12:56:36 2026 -0400

    doc: Roadmap entry for cli design and implementation

commit f2ecccb4ab
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 12:32:19 2026 -0400

    refactor(fleet-deploy): rename harmony-fleet-release to harmony-fleet-publish

    Deploy/publish wording is more intuitive than deploy/release.

commit 2e9052b217
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 10:12:54 2026 -0400

    fix(openbao): remove extra blank line in example

    Pre-existing formatting issue caught by cargo fmt --check.

commit f7299ebe2b
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 09:13:39 2026 -0400

    refactor(fleet-deploy): rename HARMONY_SECRET_NAMESPACE to HARMONY_CONFIG_NAMESPACE

    The env var name was a misnomer: ConfigClient resolves both config and
    secrets, not just secrets. The struct field was already config_namespace.
    Legacy SecretManager keeps the old var; this forces migration to
    ConfigClient for new code.

commit d39aa15152
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sun May 31 09:06:20 2026 -0400

    feat: fleet deploy uses configuration from configclient for all settings, update the 0_3 plan

commit 57d056fced
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 11:07:03 2026 -0400

    fix(openbao): scope unseal-keys cache file per instance

    The root token + unseal keys were written to a single fixed
    `~/.local/share/harmony/openbao/unseal-keys.json`, so deploying a second
    OpenBao instance (different namespace/release) overwrote the first's keys,
    after which the first could never be unsealed. Key the file by
    namespace+release (`unseal-keys-<ns>-<release>.json`); `cached_root_token`
    now takes the `OpenbaoInstance` to read the right one.

commit 44aa83199a
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 11:05:30 2026 -0400

    fix(harmony_cli): drop ANSI colour codes around log emojis

    `console::style(emoji).green()/.yellow()/.red()/.blue()` embedded raw ANSI
    escapes in the message string. `console` force-emits them off its own TTY
    detection, which disagrees with the tracing writer, so they leaked as
    literal `\x1b[..m` garbage around the emoji. Emit plain emojis: the glyph
    already conveys status and the tracing fmt layer still colours the level.

commit 4fef957edb
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 08:40:54 2026 -0400

    feat: Example openbao now can do openbao setup and better readme

commit af3205d353
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 05:55:49 2026 -0400

    refactor(harmony_cli): defer topology prep so --list/declined runs are no-ops

    `Maestro::initialize` (hence `topology.ensure_ready()`) ran before `init`'s
    `--list` / confirmation short-circuits, so merely listing a binary's scores
    (or declining to run them) still prepared the topology (cert-manager
    install, etc.). Build the maestro unprepared and call `prepare_topology()`
    only once we commit to interpreting. Expose `Maestro::prepare_topology`;
    add tests proving `--list` skips prep while the run path triggers it.

commit 199e285e52
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Sat May 30 05:04:34 2026 -0400

    feat: Use tracing instead of logger in harmon_cli and work on fleet_staging_install refactor to use harmony_cli properly, still some more work to do

commit fac83d853d
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Fri May 29 22:39:39 2026 -0400

    refactor(fleet-staging): use tracing instead of println for output

    Swap env_logger for tracing_subscriber (its fmt bridges the framework's
    log:: deploy-progress output) and route the install banner + step logs
    through tracing::info!, with no raw println.

commit 0400e9d454
Author: Jean-Gabriel Gill-Couture <jg@nationtech.io>
Date:   Fri May 29 20:25:22 2026 -0400

    feat(fleet-staging): add OpenBao + seed FleetDeploySecrets; route operator creds through the deploy crate

    fleet_staging_install now deploys OpenBao (co-located in fleet-staging,
    cert-manager TLS at secrets-stg.<base>), configures it (fleet-deployer read
    policy), and seeds the operator's FleetDeploySecrets so the operator can be
    upgraded alone via 'harmony-fleet-deploy --from-tag'. Behavior of the
    existing bring-up is unchanged.

    Credential-TOML construction moved out of the example into
    OperatorCredentials::zitadel_jwt (deploy crate) so all callers share it.
    New openbao::cached_root_token() lets the seed reuse the root token setup
    already cached. Seeding mirrors the harmony_sso port-forward pattern.
135 lines
5.4 KiB
Markdown
# Fleet Platform v0.3 — Staging to production-ready

Written 2026-05-31. Picks up after OpenBao + Zitadel + NATS + callout + operator are deployed and functional on staging (the deployed versions are 2-3 weeks old).

## Current state

- [x] OpenBao running at `secrets-stg.cb1.nationtech.io`
- [x] Zitadel running at `sso-stg.cb1.nationtech.io`
- [x] NATS + auth callout deployed in `fleet-staging` namespace
- [x] Operator deployed (older version, 2-3 weeks old)
- [x] Config-driven OpenBao installer (`examples/openbao`)
- [x] `harmony-fleet-deploy` binary reads `FleetDeployConfig` + `FleetDeploySecrets` from OpenBao

## Immediate next steps

### 1. Provision operator credentials in OpenBao

- [ ] Fetch existing creds from the running cluster:
```bash
# Dump the operator's current credentials.toml from the cluster secret
oc -n fleet-staging get secret harmony-fleet-operator-secrets -o jsonpath='{.data.credentials\.toml}' | base64 -d
```
- [ ] Seed into OpenBao at `secret/data/fleet-staging/FleetDeploySecrets`:
```bash
export VAULT_ADDR=https://secrets-stg.cb1.nationtech.io
export VAULT_TOKEN='<root token>'  # placeholder: substitute the real token
# Encode {operator_credentials_toml: <toml>} as a JSON string under "value",
# then write it to OpenBao from stdin
oc -n fleet-staging get secret harmony-fleet-operator-secrets -o jsonpath='{.data.credentials\.toml}' | base64 -d \
  | jq -Rs '{value: ({operator_credentials_toml: .} | tojson)}' \
  | bao kv put secret/fleet-staging/FleetDeploySecrets -
```
- [ ] Verify the secret is readable: `bao kv get secret/fleet-staging/FleetDeploySecrets`
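
The checklist names the path `secret/data/fleet-staging/FleetDeploySecrets`, while the `bao kv` commands use `secret/fleet-staging/FleetDeploySecrets`. Assuming `secret` is a KV version 2 mount (the plan does not state the engine version), both refer to the same entry: the KV v2 HTTP API places `data/` right after the mount name, and the `bao kv` CLI adds it automatically. The helper `kv2_api_path` below is a hypothetical illustration of that mapping, not part of any tool:

```bash
# Hypothetical helper: map a KV v2 CLI path to its HTTP API path.
# Assumes a KV version 2 mount, where the API inserts "data/" after the mount name.
# Arguments: <mount> <key path>
kv2_api_path() {
  printf '%s/data/%s\n' "$1" "$2"
}

kv2_api_path secret fleet-staging/FleetDeploySecrets
# prints: secret/data/fleet-staging/FleetDeploySecrets
```
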

### 2. Private repo deploy script

- [ ] Create `.envrc` with minimal env:
```bash
export OPENBAO_URL=https://secrets-stg.cb1.nationtech.io
export HARMONY_CONFIG_NAMESPACE=fleet-staging
# export OPENBAO_TOKEN=<root token for now; SSO later>
```
- [ ] Write deploy invocation (shell script or just `harmony-fleet-deploy` call):
```bash
harmony-fleet-deploy --from-tag harmony-fleet-operator-vX.Y.Z --yes
```
- [ ] Commit `.envrc` + script to private repo (shared with teammates)
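
A minimal sketch of what the deploy script could look like, assuming it only needs to check the two `.envrc` variables before calling the binary. The function names `require_env` and `deploy_command` are hypothetical; only the `harmony-fleet-deploy --from-tag ... --yes` invocation comes from the plan. `deploy_command` prints the command rather than running it, so the sketch stays side-effect free:

```bash
# Hypothetical deploy wrapper sketch; function names are illustrative only.

# Return non-zero (and report on stderr) if any named variable is unset or empty.
require_env() {
  rc=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required env var: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Print the deploy command for a given release tag (dry run; does not execute it).
deploy_command() {
  printf 'harmony-fleet-deploy --from-tag %s --yes\n' "$1"
}
```

A real script would likely end with something like `require_env OPENBAO_URL HARMONY_CONFIG_NAMESPACE && harmony-fleet-deploy --from-tag "$1" --yes`, so a missing variable fails fast instead of reaching the deploy step.
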

### 3. Execute operator upgrade

- [ ] Run the deploy script from the private repo
- [ ] Verify operator pod starts and connects to NATS
- [ ] Verify operator reconciles existing CRs (check logs)
- [ ] Confirm no regression in existing fleet functionality
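
The verification items above tend to need a short wait after the upgrade, since the new pod takes time to start. One possible approach is a small retry helper wrapped around whatever check command is used. `wait_until` below is a hypothetical helper, and the plan does not prescribe any particular check:

```bash
# Hypothetical helper: retry a check until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep when another attempt is still coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For instance, `wait_until 30 2 oc -n fleet-staging rollout status deployment/<operator-deployment> --timeout=10s` would poll the rollout for roughly a minute; the deployment name is left as a placeholder because the plan does not give it.
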

### 4. Operator UI ingress (trivial)

- [ ] Expose operator UI with TLS ingress on `fleet-stg.<base_domain>`
- [ ] Verify the UI loads and serves the SPA
- [ ] Confirm no auth gate yet (SSO is next)

### 5. SSO login flow

- [ ] Wire operator UI to Zitadel SSO at `sso-stg.<base_domain>`
- [ ] Test login/logout flow end-to-end
- [ ] Verify session persistence across page reloads
- [ ] Confirm RBAC: only authorized Zitadel users can access the UI

### 6. Real data in UI

- [ ] Replace mock device list with live `device-info` KV data
- [ ] Replace mock deployment list with live `Deployment` CR data
- [ ] Wire per-device drilldown to real `DeviceInfo` + last-heartbeat + agent version
- [ ] NATS tail panel: SSE stream of `device-info` and `device-state` updates (plain text)
- [ ] Verify data refreshes without manual reload

## Configuration model

### Environment (minimal, committed in private repo)

```bash
OPENBAO_URL=https://secrets-stg.cb1.nationtech.io
HARMONY_CONFIG_NAMESPACE=fleet-staging
# SSO auth or root token (SSO is the goal)
```

### OpenBao (read via ConfigClient)

- `FleetDeployConfig` (k8s namespaces, NATS URL, chart coords) at `secret/data/fleet-staging/FleetDeployConfig`
- `FleetDeploySecrets` (operator creds) at `secret/data/fleet-staging/FleetDeploySecrets`
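
Both entries follow the pattern `secret/data/<namespace>/<StructName>`, with the namespace coming from `HARMONY_CONFIG_NAMESPACE`. The sketch below illustrates that path derivation together with an environment-first lookup, matching the "env → OpenBao → prompt" order described for `ConfigClient`. It is an illustration only: `config_path`, `resolve_setting`, and `lookup_openbao` are hypothetical names, and the real `ConfigClient` API is not shown in this plan.

```bash
# Hypothetical sketch; the real ConfigClient API is not shown in this plan.

# Build the KV v2 API path for a config struct. Arguments: <namespace> <StructName>
config_path() {
  printf 'secret/data/%s/%s\n' "$1" "$2"
}

# Stand-in for an OpenBao read. It always reports "not found" here,
# so the environment-first order is easy to observe.
lookup_openbao() {
  return 1
}

# Resolve a setting: environment first, then the OpenBao stand-in.
# Returns non-zero when neither has a value (a caller could prompt at that point).
resolve_setting() {
  eval "value=\${$1:-}"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
    return 0
  fi
  lookup_openbao "$1"
}
```

With `HARMONY_CONFIG_NAMESPACE=fleet-staging` set, `config_path "$(resolve_setting HARMONY_CONFIG_NAMESPACE)" FleetDeployConfig` yields the first path listed above.
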

## Missing features (post-UI)

### Auth & credentials

- [ ] Per-device OpenBao policies (templated policies, one role per device type)
- [ ] Device identity claim in JWT (Zitadel `client_id` with `device-` prefix)
- [ ] OpenBao JWT auth role granularity (extend `OpenbaoJwtAuth` to list of roles)
- [x] Move k8s namespaces + chart coords into `ConfigClient` config struct (env = only identifier + auth)

### Operator capabilities

- [ ] Agent upgrade path (ADR-022 exists; implementation pending)
- [ ] Device enrollment flow (operator-facing runbook)
- [ ] Revoke device / rotate key operations
- [ ] Fleet-wide rollout strategies (canary, %-based) on top of agent-upgrade primitive
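
For the %-based rollout item, the core arithmetic is choosing how many devices belong to a wave. The sketch below shows one possible rule, rounding up so that any non-zero percentage of a non-empty fleet selects at least one device. Both `rollout_batch_size` and the rounding choice are assumptions; the plan does not specify a strategy.

```bash
# Hypothetical helper: number of devices in a percentage-based rollout wave.
# Arguments: <total devices> <percent, integer 0-100>
# Integer ceiling of total * percent / 100.
rollout_batch_size() {
  echo $(( ($1 * $2 + 99) / 100 ))
}

rollout_batch_size 250 10   # prints: 25
rollout_batch_size 3 10     # prints: 1
```
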

### Observability

- [ ] Operator logs every CR it acquires (verify output reads well)
- [ ] NATS debugging one-liners in hand-off menu
- [ ] Journald log streaming (currently only `.status.aggregate.lastError`)
- [ ] Metrics dashboard (deferred until >100 devices)

### Quality & hardening

- [ ] Agent config-driven labels (`[labels]` in agent TOML → DeviceInfo)
- [ ] `matchExpressions` in selectors (currently `matchLabels` only)
- [ ] `Device.status.conditions` populated from heartbeat staleness
- [ ] Operator graceful degradation on bad device_id (log + skip, don't restart-loop)
- [ ] Persist `nats_auth_pass` and issuer NKey via `harmony_secret` (regenerate-every-run footgun)
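
The heartbeat-staleness item comes down to comparing the age of the last heartbeat against a threshold. A minimal sketch of such a rule follows, assuming epoch-second timestamps. The function name `heartbeat_status`, the `Fresh`/`Stale` labels, and the threshold semantics are all hypothetical; the plan does not define the actual condition types.

```bash
# Hypothetical staleness rule: a heartbeat older than max_age seconds is stale.
# Arguments: <last_heartbeat_epoch> <now_epoch> <max_age_seconds>
heartbeat_status() {
  age=$(( $2 - $1 ))
  if [ "$age" -gt "$3" ]; then
    echo Stale
  else
    echo Fresh
  fi
}

heartbeat_status 1000 1030 60   # prints: Fresh
heartbeat_status 1000 1100 60   # prints: Stale
```
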

### Refactors (deferred, non-blocking)

- [ ] Decompose `FleetServerScore` into independent, ConfigClient-glued Scores
- [ ] Move `harmony/modules/fleet/` → `fleet/harmony-fleet/` (ADR-021 pending)
- [ ] Delete `examples/fleet_staging_deploy` (superseded by `fleet_staging_install`)
- [ ] Drop `K8sAnywhereTopology` for ad-hoc Score execution; introduce `K8sBareTopology`

## Principles (carried forward)

- No YAML in framework code paths
- Scores describe desired state; topologies expose capabilities
- Cross-boundary wire types in `harmony-reconciler-contracts`
- Never ship untested code
- Prove claims about upstream before blaming upstream
- Design the brick before moving the brick