NationTech/harmony

feat/ceph-score #297

Open

stremblay wants to merge 19 commits from feat/ceph-score into master

Author	SHA1	Message	Date
Sylvain Tremblay	14786fc03e	fix(k8s+storage/ceph): force-conflicts on Rook CR applies for re-run safety Server-Side Apply rejects re-applies of a resource whose fields another field manager has taken ownership of. Rook is exactly such a manager: after reconciling a CephCluster, it claims ownership of fields like .spec.mgr.modules (the operator dynamically toggles modules) and likely .spec.storage.* (discovered nodes), .spec.dashboard.* (port assignments), etc. Re-running the example against an existing cluster therefore failed: ApiError: Apply failed with 1 conflict: conflict with "rook" using ceph.rook.io/v1: .spec.mgr.modules The kube-rs apply flow used by K8sResourceScore was hardcoding `PatchParams::apply(FIELD_MANAGER)` without `.force`. apply_dynamic_many already supports force_conflicts but the typed path didn't expose it. Changes: - K8sResourceScore gains a `force_conflicts: bool` field (default false, so all existing call sites keep their semantics) plus a chainable builder `with_force_conflicts(true)`. When set, execute() round-trips each typed resource through serde_json to DynamicObject and routes via apply_dynamic_many with force=true. - RookCephClusterScore opts in via with_force_conflicts(true) on every Rook CR apply (CephCluster, CephBlockPool, CephFilesystem, CephObjectStore, CephObjectStoreUser). The toolbox Deployment and auto-generated StorageClasses keep the default (no force) — they're only managed by Harmony, no other field manager to conflict with. For declarative IaC this is the correct semantic: Harmony's declared state is authoritative; any operator-side mutations to fields we set get overridden on the next reconcile. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:47:19 -04:00
Sylvain Tremblay	2ffa717a64	fix(storage/ceph): ship rook-ceph-tools as a typed Deployment, not via Helm The previous fix passed --set toolbox.enabled=true to the rook-ceph operator chart. That was wrong: in Rook v1.19 the toolbox is no longer part of the operator chart — it's a standalone manifest at deploy/examples/toolbox.yaml. The Helm value was silently ignored, so the rook-ceph-tools Deployment was never created. Symptom on a real install: cluster came up healthy (mons + mgrs + OSDs all Running) but `oc -n rook-ceph get deploy/rook-ceph-tools` returned NotFound, and RookCephClusterScore's wait_for_toolbox_ready timed out after 10 minutes. This commit: - Adds a new toolbox.rs module that ports the canonical rook/rook@v1.19.5 toolbox.yaml verbatim to a typed k8s_openapi::Deployment, including the inline toolbox.sh bash script (~50 lines) that re-renders /etc/ceph/ceph.conf when mon endpoints change. - Sources the container image from the CephCluster spec's cephVersion.image so the toolbox stays in lockstep with the cluster's Ceph version automatically — no second pin to keep in sync. - Has RookCephClusterScore apply the typed Deployment via K8sResourceScore::single immediately after applying CephCluster, then waits for it to be Ready as before. - Removes the now-dead enable_toolbox field and toolbox.enabled Helm value from RookCephOperatorScore, plus the misleading doc claim that the chart deploys the toolbox. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:26:54 -04:00
Sylvain Tremblay	28bcedf9d0	fix(storage/ceph): pin Rook chart + Ceph image to canonical 1.19.5 pairing The previous defaults shipped an unpinned Helm chart_version and a guessed Ceph image tag ("v19.2.3" — no build suffix, no source-of-truth reference). Both are unacceptable for a production install path. Replaced with the official pairing per the Rook 1.19 documentation: - RookCephOperatorScore::default_okd().chart_version = Some("v1.19.5") Latest stable release of the rook-ceph Helm chart at https://charts.rook.io/release as of 2026-05; verified via the chart repo's index.yaml. - CephVersionSpec::default().image = "quay.io/ceph/ceph:v19.2.3-20250717" The full version+build tag the Rook 1.19 upgrade docs explicitly recommend for production at https://rook.io/docs/rook/v1.19/Upgrade/ceph-upgrade/, with the date-stamped suffix that pins an immutable container image. Pinning here means heterogeneous-daemon-version scenarios are impossible by construction, and upgrades become a deliberate change to this code rather than an unobservable container pull side-effect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:59:23 -04:00
Sylvain Tremblay	e3e07e739d	fix(storage/ceph): wait for operator + CRD discovery before applying CephCluster helm install returns once chart resources are created — not once the operator Deployment is Ready, and not once the API server's discovery cache has picked up the Rook CRDs. The kube-rs client that K8sAnywhereTopology hands out is shared and OnceCell-initialized, so its own discovery cache was populated before the chart added any ceph.rook.io/v1 resources. Applying CephCluster immediately after the operator install therefore tended to fail with "Cannot resolve GVK ceph.rook.io/v1/CephCluster". This is the same race CNPG handles at postgresql/score_k8s.rs:180-203 via wait_until_deployment_ready + wait_for_crd + invalidate_discovery. RookCephClusterScore now does the same dance, at the top of execute(), before any CR apply: 1. wait_until_deployment_ready("rook-ceph-operator", 300s) 2. wait_for_crd("cephclusters.ceph.rook.io", 60s) 3. invalidate_discovery() 4. Apply CephCluster 5. (existing) wait for toolbox ready 6. (existing) wait for HEALTH_OK 7. Apply pools / fs / object stores / users The subsequent pool/fs/object-store/user CRD applies happen many minutes later (after HEALTH_OK), by which point the discovery cache has long since refreshed — no per-apply invalidation needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:55:20 -04:00
Sylvain Tremblay	a661b1d078	fix(examples): drop unneeded env vars from install_rook_ceph/env.sh Two vars in the previous env.sh were either dead code or actively broken for this example's flow: HARMONY_PROFILE Only read by K8sAnywhereTopology::current_target(), which is only invoked by Scores implementing MultiTargetTopology (ntfy, application packaging, app monitoring). None of the Scores in install_rook_ceph require that trait, so the value is never read and the panic case is never reached. Removed. HARMONY_USE_SYSTEM_KUBECONFIG=true Setting this to true is actively worse: try_get_or_install_k8s_client hits `todo!()` at k8s_anywhere.rs:900 when this branch is taken. The correct way to point at an existing kubeconfig is the standard KUBECONFIG env var (or the default $HOME/.kube/config fallback in get_kube_config_path()). Removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:49:16 -04:00
Sylvain Tremblay	f748e6ba48	feat(examples): add env.sh for install_rook_ceph (existing-cluster mode) The example is meant to target a real pico OKD / external cluster, not a throwaway local K3D. Default `K8sAnywhereConfig::from_env()` reads HARMONY_USE_LOCAL_K3D and treats unset/true as "spin up a local K3D" — which is the wrong behavior here. env.sh sets: - HARMONY_USE_LOCAL_K3D=false force external cluster - HARMONY_PROFILE=staging required when use_local_k3d=false (current_target() panics otherwise) - HARMONY_USE_SYSTEM_KUBECONFIG=true use $HOME/.kube/config - HARMONY_SECRET_NAMESPACE/STORE/DATABASE_URL per-example state - RUST_LOG=harmony=debug to see the wait-loop progress Leaves KUBECONFIG and HARMONY_K8S_CONTEXT commented-out as overrides the user can uncomment when their kubeconfig isn't in the default location. Usage: source env.sh && cargo run -p example-install-rook-ceph Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:45:32 -04:00
Sylvain Tremblay	914d398245	chore(examples): adopt 'harmony-s3' as default S3 user name in install_rook_ceph Rename the CephObjectStoreUser const from "harmony-default-user" to the shorter "harmony-s3" that the user prefers as the default. Affects the auto-generated credentials Secret name, which becomes `rook-ceph-object-user-ceph-objectstore-harmony-s3`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:45:13 -04:00
Sylvain Tremblay	94614f874c	fix(storage/ceph): wait for toolbox + HEALTH_OK in RookCephClusterScore K8sResourceScore returns once the API server has accepted a CR — not once the operator has reconciled it. So previously, RookCephClusterScore's interpret() returned in ~5 seconds while the actual cluster was still 2-15 minutes from being usable, causing the immediately-following CephVerifyClusterHealth to fail with "rook-ceph-tools not found" or HEALTH_WARN ≠ HEALTH_OK on a single-shot check. After applying the CephCluster CR, the Score now waits for: 1. The rook-ceph-tools Deployment to have ≥1 ready replica (10 min timeout). Gating exec on this is mandatory because exec_app_capture_output panics (`.expect("No matching pod")`) if called when no toolbox pod exists yet. 2. `ceph health` to return HEALTH_OK (20 min timeout). Fresh clusters sit in HEALTH_WARN for a few minutes while mons reach quorum, mgrs come up, and OSDs bootstrap their PGs. The wait logs every status transition so the user can tell what's happening. Only after both waits succeed does the Score apply the dependent CRs (block pools, filesystems, object stores, users). Failing fast at the cluster stage is better than applying CRs the operator can never reconcile. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:43:18 -04:00
Sylvain Tremblay	b1bbc78331	fix(storage/ceph): enable rook-ceph-tools by default in RookCephOperatorScore The Rook operator Helm chart ships toolbox.enabled=false by default, so the rook-ceph-tools Deployment is never created. That breaks two downstream consumers: - CephVerifyClusterHealth, which looks up the Deployment and execs `ceph health` inside it - RookCephClusterScore's new post-apply readiness wait (next commit), which polls the same path Add an `enable_toolbox: bool` field on RookCephOperatorScore (default true via both default_okd() and default_k8s()) that sets the Helm value `toolbox.enabled` to the requested string. Users who genuinely don't want the toolbox can opt out, but the typical Harmony flow needs it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:42:57 -04:00
Sylvain Tremblay	db398f0dc9	feat(examples): apply pico-OKD OSD tuning via cephConfig in install_rook_ceph Wires four conservative OSD config keys into the cluster spec so that on a small pico OKD, recovery storms don't starve client I/O: osd_max_backfills=1, osd_recovery_max_active=1, osd_recovery_op_priority=1, osd_mclock_profile=high_client_ops Leaves a commented-out alternative for disabling mclock entirely (osd_op_queue=wpq) if the mclock scheduler turns out to be the culprit. These are NOT defaults baked into RookCephClusterScore — they live in the example because they're specific to the pico use case. Adjust or remove per your hardware. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:27:36 -04:00
Sylvain Tremblay	fc37b08a6f	feat(storage/ceph): typed cephConfig field + set_config builder on CephClusterSpec Adds the missing surface to drive Rook's declarative centralized config without dropping out to imperative `ceph config set` calls in the toolbox. The new `ceph_config: Option<BTreeMap<String, BTreeMap<String, String>>>` field on CephClusterSpec mirrors the Rook v1.18 `spec.cephConfig` shape: outer key is the Ceph "WHO" target ("global", "osd.", "mon.", "mgr.", "client.rgw.<store>", "osd.0", ...), inner is `option-name -> value`. All values are strings per Rook (Ceph parses them). Rook applies these after MONs reach quorum and re-applies on drift. `CephClusterSpec::set_config(who, key, value)` is a chainable helper that lazily allocates the maps so callers can write `.set_config("osd.", "osd_max_backfills", "1")` instead of building the nested BTreeMaps by hand. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:27:26 -04:00
Sylvain Tremblay	521781798c	feat(examples): create default S3 user in install_rook_ceph The example now provisions a CephObjectStoreUser named "harmony-default-user" attached to the ceph-objectstore. After the run, the user's S3 credentials are available in the rook-ceph-object-user-ceph-objectstore-harmony-default-user Secret in the rook-ceph namespace — no manual radosgw-admin or YAML steps required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:20:26 -04:00
Sylvain Tremblay	56bf08b3bc	feat(storage/ceph): typed CephObjectStoreUser CRD + apply via cluster Score Adds the missing piece for actually using the S3 endpoint Rook stands up: a typed CephObjectStoreUser with full spec coverage (capabilities, quotas, cluster_namespace) and a convenience `for_store` constructor. Rook materializes the user's S3 access credentials into a Secret named `rook-ceph-object-user-<store>-<user>` with base64-encoded `AccessKey` / `SecretKey` data keys. The new `credentials_secret_name()` helper returns that name programmatically so callers don't have to assemble the string. Threads through RookCephClusterScore as a new `object_store_users: Vec<CephObjectStoreUser>` field, applied as step 5 of the CR sequence (after the object stores they depend on). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:20:19 -04:00
Sylvain Tremblay	96ebbd5f3e	chore(examples): comment out S3 Ingress in install_rook_ceph Keep the edge-TLS Ingress block in the source as a reference but disable it by default — running the example shouldn't require a pre-provisioned TLS Secret or an ingress controller. Uncomment the import lines, the const declarations, the Ingress construction, and the K8sResourceScore entry in the `scores` vec to re-enable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:08:41 -04:00
Sylvain Tremblay	9b0481ad08	feat(examples): expose S3 endpoint via edge-TLS Ingress in install_rook_ceph Adds a Kubernetes Ingress in front of the RGW Service (port 8080) with TLS terminated at the edge — the backend RGW stays on HTTP, only reachable intra-cluster. Hostname and TLS Secret name are configurable via the S3_HOSTNAME and S3_TLS_SECRET consts at the top of main.rs. The TLS Secret must exist in the rook-ceph namespace before running the example (e.g. created by cert-manager or a manual `kubectl create secret tls`). The example does not create it — cert material can't be shipped in a repo. Built as a raw k8s_openapi::Ingress applied via K8sResourceScore::single because harmony's K8sIngressScore currently emits HTTP-only Ingresses and doesn't expose a TLS field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:00:43 -04:00
Sylvain Tremblay	d7315812ca	feat(examples): deploy CephObjectStore (S3) in install_rook_ceph Extends the example to stand up a 2-instance RGW gateway alongside the block pool. The CephObjectStore CR uses the default replicated metadata and data pools (size=3) and Rook's port 8080 to dodge OKD's <1024 bind restriction. The operator-created Service exposes the S3 endpoint at rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:8080. Adds k8s-openapi to the example's deps for ObjectMeta — needed now that the example builds a CR directly instead of relying solely on default_okd(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:55:12 -04:00
Sylvain Tremblay	8a3a6e7107	feat(examples): add install_rook_ceph end-to-end example Minimal example wiring the new RookCephOperatorScore + RookCephClusterScore against K8sAnywhereTopology, then chaining the existing CephVerifyClusterHealth Day-2 Score to close the install→verify loop. Picked up automatically by the examples/* workspace wildcard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 09:23:43 -04:00
Sylvain Tremblay	2fec852294	feat(storage/ceph): add RookCephOperator + RookCephCluster install Scores Closes the install gap on the Ceph module: previously the storage/ceph/ module only contained Day-2 Scores (CephVerifyClusterHealth, CephRemoveOsd) which assumed a Rook-Ceph cluster already existed in the rook-ceph namespace. There was no Score that actually installed one. Adds two Scores, mirroring the CNPG split-architecture: - RookCephOperatorScore wraps HelmChartScore against the upstream charts.rook.io/release repo. Rook's docs explicitly recommend Helm on OpenShift because the chart auto-creates the SecurityContextConstraints OKD requires. default_okd() sets hostpathRequiresPrivileged=true (mandatory under OKD's SELinux restricted policy) and enables both the RBD and CephFS CSI drivers. - RookCephClusterScore applies the typed CephCluster, CephBlockPool, CephFilesystem, and CephObjectStore CRs added in the previous commit via K8sResourceScore::single, plus auto-generates the matching rook-ceph-block-* (RBD) and rook-cephfs-* (CephFS) StorageClass resources. default_okd() ships a 3-mon / 2-mgr / SSL-dashboard / size=3-replicated topology with useAllNodes+useAllDevices. ODF (ocs.openshift.io/StorageCluster) was rejected: it requires the registry.redhat.io pull-secret and isn't supported on OKD. The CRDs themselves are installed transparently by the operator's Helm chart — re-typing those in Rust would mean maintaining two sources of truth for the same OpenAPI schema, so we don't. Type-safety lives at the user-facing layer (CephCluster/Block/FS/Object specs), which is where the value sits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 09:23:31 -04:00
Sylvain Tremblay	52c258704e	feat(storage/ceph): add typed Rook-Ceph v1 CRDs Introduce kube::CustomResource-derived Rust types for the four ceph.rook.io/v1 kinds we'll need to drive a Ceph install end-to-end: CephCluster, CephBlockPool, CephFilesystem, CephObjectStore. Shared spec primitives (PoolSpec, ReplicatedSpec, ErasureCodedSpec, FailureDomain, MetadataServerSpec, PlacementSpec, VolumeClaimTemplate) live in crd/shared.rs. These are stand-alone data types with no Score impl yet — follow-up commits will add the operator-install and cluster-apply Scores that consume them. Mirrors the typed-CRD pattern from postgresql/cnpg/crd.rs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 09:22:48 -04:00
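
Commit fc37b08a6f describes `CephClusterSpec::set_config(who, key, value)` as a chainable helper that lazily allocates the nested maps behind `ceph_config`. The following is a minimal, std-only sketch of that idea, not the code from the PR: the struct here is a hypothetical stand-in that models only the `ceph_config` field, and taking `self` by value (rather than `&mut self`) is an assumption made so the calls can be chained.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the PR's CephClusterSpec; only the
/// `ceph_config` field from commit fc37b08a6f is modeled.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CephClusterSpec {
    /// Outer key: Ceph "WHO" target ("global", "osd.", ...).
    /// Inner map: option-name -> value, always a string.
    pub ceph_config: Option<BTreeMap<String, BTreeMap<String, String>>>,
}

impl CephClusterSpec {
    /// Chainable helper: allocates the outer and inner maps on first use.
    /// Consuming `self` is an assumption; the real signature may differ.
    pub fn set_config(mut self, who: &str, key: &str, value: &str) -> Self {
        self.ceph_config
            .get_or_insert_with(BTreeMap::new)
            .entry(who.to_string())
            .or_default()
            .insert(key.to_string(), value.to_string());
        self
    }
}

fn main() {
    // Two of the OSD keys named in commit db398f0dc9.
    let spec = CephClusterSpec::default()
        .set_config("osd.", "osd_max_backfills", "1")
        .set_config("osd.", "osd_recovery_max_active", "1");

    let cfg = spec.ceph_config.expect("maps are allocated by set_config");
    for (who, options) in &cfg {
        for (key, value) in options {
            println!("{who} {key}={value}");
        }
    }
}
```

Because both maps are `BTreeMap`s, iteration order is deterministic, which would keep a serialized `spec.cephConfig` stable between runs.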
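
Commit 56bf08b3bc states that Rook writes a user's S3 credentials into a Secret named `rook-ceph-object-user-<store>-<user>`, and that `credentials_secret_name()` returns that name. A rough sketch of the naming rule follows. `ObjectStoreUserRef` and its fields are hypothetical and much smaller than the PR's typed `CephObjectStoreUser`; only the name format is taken from the commit messages.

```rust
/// Hypothetical, trimmed-down stand-in for the PR's CephObjectStoreUser.
/// It keeps only the two values that feed into the Secret name.
pub struct ObjectStoreUserRef {
    pub store: String,
    pub user: String,
}

impl ObjectStoreUserRef {
    /// Mirrors the `for_store` convenience constructor named in the commit;
    /// the argument order here is an assumption.
    pub fn for_store(store: &str, user: &str) -> Self {
        Self {
            store: store.to_string(),
            user: user.to_string(),
        }
    }

    /// Name of the Secret Rook creates for this user:
    /// `rook-ceph-object-user-<store>-<user>`.
    pub fn credentials_secret_name(&self) -> String {
        format!("rook-ceph-object-user-{}-{}", self.store, self.user)
    }
}

fn main() {
    let user = ObjectStoreUserRef::for_store("ceph-objectstore", "harmony-s3");
    println!("{}", user.credentials_secret_name());
}
```

For the store and user names used by the example, this yields `rook-ceph-object-user-ceph-objectstore-harmony-s3`, the Secret name quoted in commit 914d398245.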
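
Commit 94614f874c adds a wait that polls `ceph health` until it reports HEALTH_OK, with a timeout, and logs every status transition. The loop below sketches that pattern using only the standard library. `wait_for_status`, its parameters, and the closure-based probe are all hypothetical; in the PR the probe would presumably exec into the toolbox pod, which is not modeled here.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `probe` until it returns `want` or `timeout` elapses.
/// Logs only when the observed status changes. Hypothetical helper,
/// not the function used in the PR.
pub fn wait_for_status<F>(
    mut probe: F,
    want: &str,
    timeout: Duration,
    interval: Duration,
) -> Result<Duration, String>
where
    F: FnMut() -> Result<String, String>,
{
    let start = Instant::now();
    let mut last: Option<String> = None;
    loop {
        match probe() {
            Ok(status) => {
                if last.as_deref() != Some(status.as_str()) {
                    let previous = last.as_deref().unwrap_or("<none>");
                    eprintln!("status transition: {previous} -> {status}");
                    last = Some(status.clone());
                }
                if status == want {
                    return Ok(start.elapsed());
                }
            }
            // A failed probe is treated as transient and retried.
            Err(e) => eprintln!("probe failed, retrying: {e}"),
        }
        if start.elapsed() >= timeout {
            return Err(format!(
                "timed out after {timeout:?}; last status: {last:?}"
            ));
        }
        sleep(interval);
    }
}

fn main() {
    // Simulated probe: two HEALTH_WARN readings, then HEALTH_OK.
    let mut readings = vec!["HEALTH_WARN", "HEALTH_WARN", "HEALTH_OK"].into_iter();
    let result = wait_for_status(
        || Ok(readings.next().unwrap_or("HEALTH_OK").to_string()),
        "HEALTH_OK",
        Duration::from_secs(5),
        Duration::from_millis(10),
    );
    println!("{result:?}");
}
```

Checking the deadline after each probe, rather than before it, guarantees at least one attempt even with a very short timeout.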