New roadmap phase covering the hardening path for the SSO config management stack: builder pattern for OpenbaoSecretStore, ZitadelScore PG readiness fix, CoreDNSRewriteScore, integration tests, and future capability traits. Updates current state to reflect implemented Zitadel OIDC integration and harmony_sso example.
# Phase 9: SSO + Config System Hardening

## Goal

Make the Zitadel + OpenBao SSO config management stack production-ready, well-tested, and reusable across deployments. The `harmony_sso` example demonstrates the full loop: deploy infrastructure, authenticate via SSO, store and retrieve config -- all in one `cargo run`.

## Current State (as of `feat/opnsense-codegen`)

The SSO example works end-to-end:
- k3d cluster + OpenBao + Zitadel deployed via Scores
- `OpenbaoSetupScore`: init, unseal, policies, userpass, JWT auth
- `ZitadelSetupScore`: project + device-code app provisioning via Management API (PAT auth)
- JWT exchange: Zitadel id_token → OpenBao client token via `/v1/auth/jwt/login`
- Device flow triggers in the terminal, the user logs in via browser, and config is stored in OpenBao KV v2
- CoreDNS patched for in-cluster hostname resolution (K3sFamily only)
- Discovery cache invalidation after CRD installation
- Session caching with TTL

### What's solid

- **Score composition**: 4 Scores orchestrate the full stack in ~280 lines
- **Config trait**: clean `Serialize + Deserialize + JsonSchema`; the developer never deals with OpenBao or Zitadel directly
- **Auth chain transparency**: token → cached → OIDC device flow → userpass; the chain picks a working method without caller involvement
- **Idempotency**: all Scores are safe to re-run, and cached sessions skip login
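
The auth chain in the list above (token → cached → OIDC device flow → userpass) is a first-success fallback. A minimal, std-only sketch of that shape follows. The names `Attempt`, `AuthMethod`, and `authenticate` are hypothetical and are not taken from `harmony_secret`; the real chain is presumably async and talks to OpenBao and Zitadel.

```rust
/// Hypothetical outcome of one authentication attempt.
#[derive(Debug, Clone, PartialEq)]
pub enum Attempt {
    /// The method produced a usable client token.
    Token(String),
    /// The method is not configured or did not succeed; try the next one.
    Skip,
}

/// One link in the chain: a label plus a closure that tries to authenticate.
pub struct AuthMethod<'a> {
    pub name: &'static str,
    pub attempt: Box<dyn Fn() -> Attempt + 'a>,
}

/// Walks the methods in order and returns the first token obtained,
/// together with the name of the method that produced it.
pub fn authenticate(chain: &[AuthMethod<'_>]) -> Option<(&'static str, String)> {
    for method in chain {
        if let Attempt::Token(token) = (method.attempt)() {
            return Some((method.name, token));
        }
    }
    // Every method was skipped: the caller has no way to authenticate.
    None
}
```

Ordering the chain from cheapest (an existing token) to most intrusive (an interactive login) is what lets a cached session skip the browser step.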

### What needs work

See tasks below.

## Tasks

### 9.1 Builder pattern for `OpenbaoSecretStore` — HIGH

**Problem**: `OpenbaoSecretStore::new()` has 11 positional arguments. Adding JWT params made it worse. Callers pass `None, None, None, None` for unused options.

**Fix**: Replace with a builder:
```rust
OpenbaoSecretStore::builder()
    .url("http://127.0.0.1:8200")
    .kv_mount("secret")
    .skip_tls(true)
    .zitadel_sso("http://sso.harmony.local:8080", "client-id-123")
    .jwt_auth("harmony-developer", "jwt")
    .build()
    .await?
```
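
One way the builder behind such a call chain could be shaped is sketched below. It is a simplified, synchronous stand-in: `OpenbaoConfig` and `OpenbaoConfigBuilder` are hypothetical names, the `"secret"` default for the KV mount is an assumption, and so is the reading of the two `jwt_auth` arguments as role and auth mount. The real `build()` is async and would also connect and authenticate.

```rust
/// Hypothetical, trimmed-down stand-in for the store's configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct OpenbaoConfig {
    pub url: String,
    pub kv_mount: String,
    pub skip_tls: bool,
    pub zitadel_sso: Option<(String, String)>, // (SSO URL, client id)
    pub jwt_auth: Option<(String, String)>,    // (role, auth mount): assumed meaning
}

/// Collects optional settings by name instead of by position.
#[derive(Debug, Default)]
pub struct OpenbaoConfigBuilder {
    url: Option<String>,
    kv_mount: Option<String>,
    skip_tls: bool,
    zitadel_sso: Option<(String, String)>,
    jwt_auth: Option<(String, String)>,
}

impl OpenbaoConfigBuilder {
    pub fn url(mut self, url: &str) -> Self {
        self.url = Some(url.to_string());
        self
    }
    pub fn kv_mount(mut self, mount: &str) -> Self {
        self.kv_mount = Some(mount.to_string());
        self
    }
    pub fn skip_tls(mut self, skip: bool) -> Self {
        self.skip_tls = skip;
        self
    }
    pub fn zitadel_sso(mut self, sso_url: &str, client_id: &str) -> Self {
        self.zitadel_sso = Some((sso_url.to_string(), client_id.to_string()));
        self
    }
    pub fn jwt_auth(mut self, role: &str, mount: &str) -> Self {
        self.jwt_auth = Some((role.to_string(), mount.to_string()));
        self
    }
    /// Validates required fields and fills defaults for the rest.
    pub fn build(self) -> Result<OpenbaoConfig, String> {
        let url = self.url.ok_or_else(|| "url is required".to_string())?;
        Ok(OpenbaoConfig {
            url,
            // Assumed default; the real store may choose differently.
            kv_mount: self.kv_mount.unwrap_or_else(|| "secret".to_string()),
            skip_tls: self.skip_tls,
            zitadel_sso: self.zitadel_sso,
            jwt_auth: self.jwt_auth,
        })
    }
}
```

Unused options are simply never mentioned by the caller, which removes the `None, None, None, None` runs, and a missing required field surfaces as an error from `build()` rather than as a silently misplaced argument.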

**Impact**: All callers need updating (lib.rs, openbao_chain example, harmony_sso example). This is a breaking API change.

**Files**: `harmony_secret/src/store/openbao.rs`, all callers

### 9.2 Fix ZitadelScore PG readiness — HIGH

**Problem**: `ZitadelScore` calls `topology.get_endpoint()` immediately after deploying the CNPG Cluster CR. The PG `-rw` service takes 15-30s to appear. This forces a retry loop in the caller (the example).

**Fix**: Add a wait loop inside `ZitadelScore`'s interpret, after `topology.deploy(&pg_config)`, that polls with `K8sClient::get_resource::<Service>()` until the `-rw` service exists, and only then calls `get_endpoint()`.
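
The wait loop described in the fix could take roughly the following shape. This is a synchronous, std-only sketch: `wait_for` is a hypothetical helper, and the `probe` closure stands in for the `K8sClient::get_resource::<Service>()` lookup. The real code would be async and would sleep with the runtime's timer instead of blocking the thread.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `probe` until it yields a value or `timeout` elapses.
/// The probe always runs at least once, even with a zero timeout.
pub fn wait_for<T>(
    mut probe: impl FnMut() -> Option<T>,
    interval: Duration,
    timeout: Duration,
) -> Result<T, String> {
    let deadline = Instant::now() + timeout;
    loop {
        // In the Score, this is where the `-rw` Service lookup would go.
        if let Some(value) = probe() {
            return Ok(value);
        }
        if Instant::now() >= deadline {
            return Err(format!("resource not ready after {:?}", timeout));
        }
        sleep(interval);
    }
}
```

Since the service is reported to take 15-30s to appear, the timeout would need to sit comfortably above that range; the exact interval and timeout values are left open here.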

**Impact**: Eliminates the retry wrapper in the harmony_sso example and any other Zitadel consumer.

**Files**: `harmony/src/modules/zitadel/mod.rs`

### 9.3 `CoreDNSRewriteScore` — MEDIUM

**Problem**: CoreDNS patching logic lives in the harmony_sso example. It's a general pattern: any service with ingress-based Host routing needs in-cluster DNS resolution.

**Fix**: Extract into `harmony/src/modules/k8s/coredns.rs` as a proper Score:
```rust
pub struct CoreDNSRewriteScore {
    pub rewrites: Vec<(String, String)>, // (hostname, service FQDN)
}
impl<T: Topology + K8sclient> Score<T> for CoreDNSRewriteScore { ... }
```

K3sFamily only. No-op on OpenShift. Idempotent.
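
The idempotency requirement can be illustrated on the Corefile text itself. The sketch below assumes the rewrites are expressed with the CoreDNS `rewrite name exact FROM TO` rule and are inserted into the first server block. `apply_rewrites` is a hypothetical helper, not the logic currently in the example, and it leaves out reading and writing the CoreDNS ConfigMap.

```rust
/// Returns the Corefile with one `rewrite name exact` rule per
/// (hostname, service FQDN) pair, skipping rules that are already present.
pub fn apply_rewrites(corefile: &str, rewrites: &[(String, String)]) -> String {
    let mut lines: Vec<String> = corefile.lines().map(str::to_string).collect();

    // Locate the opening line of the first server block, e.g. `.:53 {`.
    let Some(block_start) = lines.iter().position(|l| l.trim_end().ends_with('{')) else {
        // No server block found: leave the input untouched.
        return corefile.to_string();
    };

    let mut insert_at = block_start + 1;
    for (hostname, target) in rewrites {
        let rule = format!("rewrite name exact {hostname} {target}");
        // Idempotency: a rule that already exists is not added again.
        if lines.iter().any(|l| l.trim() == rule) {
            continue;
        }
        lines.insert(insert_at, format!("    {rule}"));
        insert_at += 1;
    }

    let mut out = lines.join("\n");
    if corefile.ends_with('\n') {
        out.push('\n');
    }
    out
}
```

Applying the function to its own output returns the same text, which is the property a re-run of the Score relies on.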

**Files**: `harmony/src/modules/k8s/coredns.rs` (new), `harmony/src/modules/k8s/mod.rs`

### 9.4 Integration tests for Scores — MEDIUM

**Problem**: There are no tests for `OpenbaoSetupScore`, `ZitadelSetupScore`, or `CoreDNSRewriteScore`, even though the Scores are testable against a running k3d cluster.

**Fix**: Add `#[ignore]` integration tests that require a running cluster:
- `test_openbao_setup_score`: deploy OpenBao + run setup, verify KV works
- `test_zitadel_setup_score`: deploy Zitadel + run setup, verify project/app exist
- `test_config_round_trip`: store + retrieve config via SSO-authenticated OpenBao

Run with `cargo test -- --ignored` after deploying the example.

**Files**: `harmony/tests/integration/` (new directory)

### 9.5 Remove `resolve()` DNS hack — LOW

**Problem**: `ZitadelOidcAuth::http_client()` hardcodes `resolve(host, 127.0.0.1:port)`. This only works for local k3d development.

**Fix**: Make it configurable. Add an optional `resolve_to: Option<SocketAddr>` field to `ZitadelOidcAuth`. The example passes `Some(127.0.0.1:8080)` for k3d; production passes `None` (uses real DNS). Or better: detect whether the host resolves and only apply the override if it doesn't.
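
The "only override when the host does not resolve" variant can be sketched with the standard library alone. Both function names below are hypothetical, and wiring the result into the HTTP client's `resolve(...)` call is left out.

```rust
use std::net::{SocketAddr, ToSocketAddrs};

/// Returns true if the system resolver yields at least one address for `host`.
/// IP literals count as resolvable without any DNS lookup.
pub fn host_resolves(host: &str, port: u16) -> bool {
    (host, port)
        .to_socket_addrs()
        .map(|mut addrs| addrs.next().is_some())
        .unwrap_or(false)
}

/// Decides which override, if any, the HTTP client should apply:
/// real DNS wins whenever it works, and `resolve_to` is only a fallback.
pub fn effective_override(
    resolves: bool,
    resolve_to: Option<SocketAddr>,
) -> Option<SocketAddr> {
    if resolves {
        None
    } else {
        resolve_to
    }
}
```

A caller would compute `effective_override(host_resolves(host, port), resolve_to)` once when building the client. With this rule, production deployments can leave `resolve_to` set without it taking effect, at the cost of one extra lookup at startup.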

**Files**: `harmony_secret/src/store/zitadel.rs`

### 9.6 Typed Zitadel API client — LOW

**Problem**: `ZitadelSetupScore` uses hand-written JSON with string parsing for Management API calls. No type safety on request/response.

**Fix**: Create typed request/response structs for the Management API v1 endpoints used (projects, apps, users). Use `serde` for serialization. This doesn't need to be a full API client -- just the endpoints we use.

**Files**: `harmony/src/modules/zitadel/api.rs` (new)

### 9.7 Capability traits for secret vault + identity — FUTURE

**Problem**: `OpenbaoScore` and `ZitadelScore` are tool-specific. No capability abstraction for "I need a secret vault" or "I need an identity provider".

**Fix**: Design `SecretVault` and `IdentityProvider` capability traits on topologies. This is a significant architectural decision that needs an ADR.

**Blocked by**: Real-world use of a second implementation (e.g., HashiCorp Vault, Keycloak) to validate the abstraction boundary.
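
Purely to make the idea concrete, and without anticipating the ADR, capability traits of this kind might look like the sketch below. Only the trait names come from the task; every method, the in-memory implementation, and the `round_trip` helper are hypothetical placeholders.

```rust
use std::collections::HashMap;

/// Hypothetical capability: "I need a secret vault".
pub trait SecretVault {
    fn put_secret(&mut self, path: &str, value: &str) -> Result<(), String>;
    fn get_secret(&self, path: &str) -> Result<Option<String>, String>;
}

/// Hypothetical capability: "I need an identity provider".
pub trait IdentityProvider {
    /// URL that clients use to discover the provider's OIDC endpoints.
    fn issuer_url(&self) -> String;
}

/// In-memory stand-in, playing the role of a second implementation.
#[derive(Debug, Default)]
pub struct InMemoryVault {
    secrets: HashMap<String, String>,
}

impl SecretVault for InMemoryVault {
    fn put_secret(&mut self, path: &str, value: &str) -> Result<(), String> {
        self.secrets.insert(path.to_string(), value.to_string());
        Ok(())
    }
    fn get_secret(&self, path: &str) -> Result<Option<String>, String> {
        Ok(self.secrets.get(path).cloned())
    }
}

/// Code written against the capability rather than against OpenBao:
/// stores a value and reports whether reading it back returns the same value.
pub fn round_trip<V: SecretVault>(
    vault: &mut V,
    path: &str,
    value: &str,
) -> Result<bool, String> {
    vault.put_secret(path, value)?;
    Ok(vault.get_secret(path)?.as_deref() == Some(value))
}
```

Whether methods this narrow are the right boundary is exactly what a second real implementation would have to confirm.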
### 9.8 Auto-unseal for OpenBao — FUTURE

**Problem**: Every pod restart requires a manual unseal. `OpenbaoSetupScore` handles this, but only when the Score is re-run.

**Fix**: Configure Transit auto-unseal (using a second OpenBao/Vault instance) or cloud KMS auto-unseal. This is an operational concern that should be configurable in `OpenbaoSetupScore`.

## Relationship to Other Phases

- **Phase 1** (config crate): SSO flow builds directly on `harmony_config` + `StoreSource<OpenbaoSecretStore>`. Phase 1 task 1.4 is now **complete** via the harmony_sso example.
- **Phase 2** (migrate to harmony_config): The 19 `SecretManager` call sites should migrate to `ConfigManager` with the OpenbaoSecretStore backend. The SSO flow validates that this pattern works.
- **Phase 5** (E2E tests): The harmony_sso example is a candidate for the first E2E test -- it deploys k3d, exercises multiple Scores, and verifies config storage.