harmony/ROADMAP/09-sso-config-hardening.md
Jean-Gabriel Gill-Couture c687d4e6b3 docs: add Phase 9 (SSO + Config Hardening) to roadmap
New roadmap phase covering the hardening path for the SSO config
management stack: builder pattern for OpenbaoSecretStore, ZitadelScore
PG readiness fix, CoreDNSRewriteScore, integration tests, and future
capability traits.

Updates current state to reflect implemented Zitadel OIDC integration
and harmony_sso example.
2026-03-30 07:37:24 -04:00

Phase 9: SSO + Config System Hardening

Goal

Make the Zitadel + OpenBao SSO config management stack production-ready, well-tested, and reusable across deployments. The harmony_sso example demonstrates the full loop: deploy infrastructure, authenticate via SSO, store and retrieve config -- all in one cargo run.

Current State (as of feat/opnsense-codegen)

The SSO example works end-to-end:

  • k3d cluster + OpenBao + Zitadel deployed via Scores
  • OpenbaoSetupScore: init, unseal, policies, userpass, JWT auth
  • ZitadelSetupScore: project + device-code app provisioning via Management API (PAT auth)
  • JWT exchange: Zitadel id_token → OpenBao client token via /v1/auth/jwt/login
  • Device flow triggers in terminal, user logs in via browser, config stored in OpenBao KV v2
  • CoreDNS patched for in-cluster hostname resolution (K3sFamily only)
  • Discovery cache invalidation after CRD installation
  • Session caching with TTL
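
As a rough illustration of the JWT exchange step above, the sketch below builds the URL and JSON body for a login against the JWT auth mount. It only assembles the request. The helper names (`JwtLoginRequest`, `jwt_login_request`, `json_escape`) are hypothetical and not taken from the harmony code base, and the use of the `role` and `jwt` body fields (with the client token returned under `auth.client_token`) follows the Vault-compatible JWT auth API as an assumption.

```rust
/// Hypothetical container for the pieces of a JWT login request.
pub struct JwtLoginRequest {
    pub url: String,
    pub body: String,
}

/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the POST target and body for `/v1/auth/<mount>/login`.
/// `mount` is the auth mount name (assumed to be "jwt" in the example).
pub fn jwt_login_request(
    base_url: &str,
    mount: &str,
    role: &str,
    id_token: &str,
) -> JwtLoginRequest {
    let url = format!("{}/v1/auth/{}/login", base_url.trim_end_matches('/'), mount);
    let body = format!(
        r#"{{"role":"{}","jwt":"{}"}}"#,
        json_escape(role),
        json_escape(id_token)
    );
    JwtLoginRequest { url, body }
}
```

A caller would POST `body` to `url` with any HTTP client and read the client token from the response.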

What's solid

  • Score composition: 4 Scores orchestrate the full stack in ~280 lines
  • Config trait: a clean Serialize + Deserialize + JsonSchema surface; the developer never touches OpenBao or Zitadel directly
  • Auth chain transparency: token → cached session → OIDC device flow → userpass; the first method that works is used without caller involvement
  • Idempotency: all Scores safe to re-run, cached sessions skip login
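
The auth chain in the list above can be pictured as an ordered fallback. The following is a minimal sketch, not the actual harmony_secret implementation: `AuthMethod`, `AUTH_ORDER`, and `authenticate` are hypothetical names, and the per-method logic is abstracted into a closure so that only the ordering is shown.

```rust
/// Hypothetical enumeration of the auth methods, in the order they are tried.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMethod {
    Token,
    CachedSession,
    OidcDeviceFlow,
    Userpass,
}

pub const AUTH_ORDER: [AuthMethod; 4] = [
    AuthMethod::Token,
    AuthMethod::CachedSession,
    AuthMethod::OidcDeviceFlow,
    AuthMethod::Userpass,
];

/// Tries each method in order and returns the first one that yields a
/// client token, together with that token. `attempt` stands in for the
/// real per-method logic (env lookup, cache read, device flow, prompt).
pub fn authenticate<F>(mut attempt: F) -> Option<(AuthMethod, String)>
where
    F: FnMut(AuthMethod) -> Option<String>,
{
    AUTH_ORDER
        .iter()
        .find_map(|&method| attempt(method).map(|token| (method, token)))
}
```

With this shape a cached session short-circuits the device flow, which matches the idempotency note: re-runs skip the login when a valid session exists.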

What needs work

See tasks below.

Tasks

9.1 Builder pattern for OpenbaoSecretStore — HIGH

Problem: OpenbaoSecretStore::new() has 11 positional arguments. Adding JWT params made it worse. Callers pass None, None, None, None for unused options.

Fix: Replace with a builder:

```rust
let store = OpenbaoSecretStore::builder()
    .url("http://127.0.0.1:8200")
    .kv_mount("secret")
    .skip_tls(true)
    .zitadel_sso("http://sso.harmony.local:8080", "client-id-123")
    .jwt_auth("harmony-developer", "jwt")
    .build()
    .await?;
```

Impact: All callers updated (lib.rs, openbao_chain example, harmony_sso example). Breaking API change.

Files: harmony_secret/src/store/openbao.rs, all callers
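
One possible shape for the builder is sketched below. It is synchronous and only validates and collects options, whereas the proposed `build()` is async and presumably connects to OpenBao. The type names `StoreConfig` and `StoreConfigBuilder`, the default mount `"secret"`, and the reading of `jwt_auth` arguments as (role, mount) are assumptions made for illustration.

```rust
/// Hypothetical resolved configuration produced by the builder.
#[derive(Debug, Clone, PartialEq)]
pub struct StoreConfig {
    pub url: String,
    pub kv_mount: String,
    pub skip_tls: bool,
    pub zitadel_sso: Option<(String, String)>, // (issuer URL, client id)
    pub jwt_auth: Option<(String, String)>,    // (role, auth mount), assumed order
}

/// Hypothetical builder: every optional setting defaults to "unset",
/// so callers no longer pass rows of `None`.
#[derive(Debug, Default)]
pub struct StoreConfigBuilder {
    url: Option<String>,
    kv_mount: Option<String>,
    skip_tls: bool,
    zitadel_sso: Option<(String, String)>,
    jwt_auth: Option<(String, String)>,
}

impl StoreConfigBuilder {
    pub fn url(mut self, url: impl Into<String>) -> Self {
        self.url = Some(url.into());
        self
    }
    pub fn kv_mount(mut self, mount: impl Into<String>) -> Self {
        self.kv_mount = Some(mount.into());
        self
    }
    pub fn skip_tls(mut self, skip: bool) -> Self {
        self.skip_tls = skip;
        self
    }
    pub fn zitadel_sso(mut self, issuer: impl Into<String>, client_id: impl Into<String>) -> Self {
        self.zitadel_sso = Some((issuer.into(), client_id.into()));
        self
    }
    pub fn jwt_auth(mut self, role: impl Into<String>, mount: impl Into<String>) -> Self {
        self.jwt_auth = Some((role.into(), mount.into()));
        self
    }
    /// Fails if the only mandatory field (the URL) is missing.
    pub fn build(self) -> Result<StoreConfig, String> {
        let url = self.url.ok_or_else(|| "url is required".to_string())?;
        Ok(StoreConfig {
            url,
            kv_mount: self.kv_mount.unwrap_or_else(|| "secret".to_string()),
            skip_tls: self.skip_tls,
            zitadel_sso: self.zitadel_sso,
            jwt_auth: self.jwt_auth,
        })
    }
}
```

Consuming `self` in each setter keeps the call chain terse and makes a half-built configuration impossible to reuse by accident.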

9.2 Fix ZitadelScore PG readiness — HIGH

Problem: ZitadelScore calls topology.get_endpoint() immediately after deploying the CNPG Cluster CR. The PG -rw service takes 15-30s to appear. This forces a retry loop in the caller (the example).

Fix: Add a wait loop inside ZitadelScore's interpret, after topology.deploy(&pg_config), that polls K8sClient::get_resource::<Service>() until the -rw service exists, and only then calls get_endpoint().

Impact: Eliminates the retry wrapper in the harmony_sso example and any other Zitadel consumer.

Files: harmony/src/modules/zitadel/mod.rs
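
The wait loop could follow a pattern like the one below. This is a synchronous, standard-library-only sketch: `wait_until` is a hypothetical helper, the readiness check is a closure standing in for the `get_resource::<Service>()` lookup, and the real implementation would presumably be async and sleep without blocking a thread.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `ready` every `interval` until it returns true or `timeout`
/// elapses. Returns the number of attempts made on success.
pub fn wait_until<F>(mut ready: F, interval: Duration, timeout: Duration) -> Result<u32, String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    let mut attempts = 0u32;
    loop {
        attempts += 1;
        if ready() {
            return Ok(attempts);
        }
        if Instant::now() >= deadline {
            return Err(format!("not ready after {} attempts", attempts));
        }
        sleep(interval);
    }
}
```

Given the 15-30s the -rw service takes to appear, an interval of a couple of seconds and a timeout of a minute or more would be a plausible starting point. Those values are a guess and would need tuning.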

9.3 CoreDNSRewriteScore — MEDIUM

Problem: CoreDNS patching logic lives in the harmony_sso example. It's a general pattern: any service with ingress-based Host routing needs in-cluster DNS resolution.

Fix: Extract into harmony/src/modules/k8s/coredns.rs as a proper Score:

```rust
pub struct CoreDNSRewriteScore {
    pub rewrites: Vec<(String, String)>,  // (hostname, service FQDN)
}
impl<T: Topology + K8sclient> Score<T> for CoreDNSRewriteScore { ... }
```

K3sFamily only. No-op on OpenShift. Idempotent.

Files: harmony/src/modules/k8s/coredns.rs (new), harmony/src/modules/k8s/mod.rs
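
The idempotent part of the Score could be a pure function over the Corefile text, which also makes it easy to unit test without a cluster. The sketch below assumes the rewrites are expressed with the CoreDNS `rewrite name` directive and inserted at the top of the first server block. The function names and the service FQDN used in examples are hypothetical, and the existing logic in the harmony_sso example may work differently.

```rust
/// Formats one rewrite as a Corefile line (exact-match `rewrite name`).
pub fn rewrite_line(host: &str, target: &str) -> String {
    format!("    rewrite name {} {}", host, target)
}

/// Returns the Corefile with any missing rewrite lines inserted right
/// after the first line that opens a server block. Lines that are
/// already present are left alone, so applying it twice is a no-op.
pub fn patch_corefile(corefile: &str, rewrites: &[(String, String)]) -> Result<String, String> {
    let existing: Vec<&str> = corefile.lines().map(str::trim).collect();
    let missing: Vec<String> = rewrites
        .iter()
        .map(|(host, target)| rewrite_line(host, target))
        .filter(|line| !existing.contains(&line.trim()))
        .collect();
    if missing.is_empty() {
        return Ok(corefile.to_string());
    }

    let mut out: Vec<String> = Vec::new();
    let mut inserted = false;
    for line in corefile.lines() {
        out.push(line.to_string());
        if !inserted && line.trim_end().ends_with('{') {
            out.extend(missing.iter().cloned());
            inserted = true;
        }
    }
    if !inserted {
        return Err("no server block found in Corefile".to_string());
    }

    let mut patched = out.join("\n");
    if corefile.ends_with('\n') {
        patched.push('\n'); // preserve the trailing newline
    }
    Ok(patched)
}
```

The Score would then read the Corefile from the CoreDNS ConfigMap, run it through this function, and write it back only when the text changed.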

9.4 Integration tests for Scores — MEDIUM

Problem: Zero tests for OpenbaoSetupScore, ZitadelSetupScore, CoreDNSRewriteScore. The Scores are testable against a running k3d cluster.

Fix: Add #[ignore] integration tests that require a running cluster:

  • test_openbao_setup_score: deploy OpenBao + run setup, verify KV works
  • test_zitadel_setup_score: deploy Zitadel + run setup, verify project/app exist
  • test_config_round_trip: store + retrieve config via SSO-authenticated OpenBao

Run with cargo test -- --ignored after deploying the example.

Files: harmony/tests/integration/ (new directory)

9.5 Remove resolve() DNS hack — LOW

Problem: ZitadelOidcAuth::http_client() hardcodes resolve(host, 127.0.0.1:port). This only works for local k3d development.

Fix: Make it configurable. Add an optional resolve_to: Option<SocketAddr> field to ZitadelOidcAuth. The example passes Some(127.0.0.1:8080) for k3d; production passes None (uses real DNS). Or better: detect whether the host resolves and only apply the override if it doesn't.

Files: harmony_secret/src/store/zitadel.rs
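
The two ideas in the fix (an explicit `resolve_to` and "only override when the host does not resolve") can be combined in a small decision function. The sketch below is one way to do that under stated assumptions: `effective_override` and `system_resolves` are hypothetical names, and the resolver is injected as a closure so the decision logic can be tested without touching real DNS.

```rust
use std::net::{SocketAddr, ToSocketAddrs};

/// Decides which address override, if any, the HTTP client should use.
/// The override applies only when one is configured AND the host does
/// not resolve on its own; production (`resolve_to = None`) never overrides.
pub fn effective_override<F>(
    host: &str,
    resolve_to: Option<SocketAddr>,
    host_resolves: F,
) -> Option<SocketAddr>
where
    F: Fn(&str) -> bool,
{
    match resolve_to {
        Some(addr) if !host_resolves(host) => Some(addr),
        _ => None,
    }
}

/// Resolver backed by the system's DNS configuration, suitable for
/// passing as `host_resolves` outside of tests.
pub fn system_resolves(host: &str, port: u16) -> bool {
    (host, port)
        .to_socket_addrs()
        .map(|mut addrs| addrs.next().is_some())
        .unwrap_or(false)
}
```

If this returns `Some(addr)`, the client builder would register the override for that host (for reqwest, via `ClientBuilder::resolve`); otherwise the client is built with default resolution.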

9.6 Typed Zitadel API client — LOW

Problem: ZitadelSetupScore uses hand-written JSON with string parsing for Management API calls. No type safety on request/response.

Fix: Create typed request/response structs for the Management API v1 endpoints used (projects, apps, users). Use serde for serialization. This doesn't need to be a full API client -- just the endpoints we use.

Files: harmony/src/modules/zitadel/api.rs (new)

9.7 Capability traits for secret vault + identity — FUTURE

Problem: OpenbaoScore and ZitadelScore are tool-specific. No capability abstraction for "I need a secret vault" or "I need an identity provider".

Fix: Design SecretVault and IdentityProvider capability traits on topologies. This is a significant architectural decision that needs an ADR.

Blocked by: Real-world use of a second implementation (e.g., HashiCorp Vault, Keycloak) to validate the abstraction boundary.

9.8 Auto-unseal for OpenBao — FUTURE

Problem: OpenBao comes up sealed after every pod restart and must be unsealed again. OpenbaoSetupScore can do this, but only when the Score is re-run.

Fix: Configure Transit auto-unseal (using a second OpenBao/Vault instance) or cloud KMS auto-unseal. This is an operational concern that should be configurable in OpenbaoSetupScore.

Relationship to Other Phases

  • Phase 1 (config crate): SSO flow builds directly on harmony_config + StoreSource<OpenbaoSecretStore>. Phase 1 task 1.4 is now complete via the harmony_sso example.
  • Phase 2 (migrate to harmony_config): The 19 SecretManager call sites should migrate to ConfigManager with the OpenbaoSecretStore backend. The SSO flow validates this pattern works.
  • Phase 5 (E2E tests): The harmony_sso example is a candidate for the first E2E test -- it deploys k3d, exercises multiple Scores, and verifies config storage.