feat/cluster-dashboard #250

Merged
stremblay merged 19 commits from feat/cluster-dashboard into master 2026-04-16 19:48:22 +00:00

Add a ClusterDashboardsScore that deploys a complete Grafana observability stack — operator, instance, Prometheus datasource, and 8 pre-built OKD/k8s monitoring dashboards — as a single score. Includes infrastructure improvements to make Grafana installation portable across OpenShift and plain Kubernetes.

What's included

Cluster dashboards (ClusterDashboardsScore)

  • 8 pre-built dashboards: cluster overview, node health, workloads health, networking, storage, etcd, control plane, alerts & events
  • Typed Grafana/GrafanaDashboard/GrafanaDatasource CRs (no more raw YAML strings)
  • Auto-detection of cluster type: OpenShift → Route, plain k8s → Ingress
  • Prometheus datasource wired with bearer-token auth via ServiceAccount
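The Route-versus-Ingress choice above can be sketched roughly as follows. This is an illustration only: `Distribution`, `Exposure`, and `choose_exposure` are hypothetical names, not types from this PR, and the real code reads the distribution from `K8sClient::get_k8s_distribution` and the hostname from `K8sClient::get_domain`.

```rust
/// Hypothetical stand-in for whatever the cluster detection reports.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Distribution {
    OpenShift,
    PlainKubernetes,
}

/// Hypothetical description of how the Grafana instance gets exposed.
#[derive(Debug, PartialEq, Eq)]
enum Exposure {
    /// OpenShift: the operator creates a Route from `.spec.route`.
    Route,
    /// Plain k8s: `.spec.ingress` with an explicit hostname.
    Ingress { host: String },
}

/// Pick the exposure mechanism for the detected distribution.
/// `host` is only used on the Ingress path.
fn choose_exposure(dist: Distribution, host: &str) -> Exposure {
    match dist {
        Distribution::OpenShift => Exposure::Route,
        Distribution::PlainKubernetes => Exposure::Ingress {
            host: host.to_string(),
        },
    }
}
```

Keeping the decision in one pure function makes it testable without a live cluster; only the detection and hostname lookup need the client.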

Grafana operator install (GrafanaOperatorScore)

  • Composite score: installs the operator via Helm, auto-detects OpenShift and applies the Route RBAC the upstream chart doesn't include
  • Pinnable chart version via chart_version parameter

Helm improvements (HelmChartInterpret)

  • Pre-flight version check: if the chart is already installed at the requested version, warn and skip; if it is installed at a different version, fail with a clear error rather than silently upgrading; if it is not installed, install as before
  • Makes every HelmChartScore user idempotent on re-run
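A minimal sketch of the three-way pre-flight decision, assuming the installed chart version has already been looked up. `PreflightAction` and `preflight` are hypothetical names for illustration; the actual `HelmChartInterpret` code may be structured differently.

```rust
/// Outcome of the pre-flight check that runs before a Helm install.
#[derive(Debug, PartialEq, Eq)]
enum PreflightAction {
    /// No existing release: install as before.
    Install,
    /// Release already at the requested version: warn and skip.
    Skip,
    /// Release at another version: refuse instead of upgrading silently.
    Refuse { installed: String, requested: String },
}

/// Hypothetical helper. `installed` is the chart version of the existing
/// release, or `None` when the release is not installed.
fn preflight(installed: Option<&str>, requested: &str) -> PreflightAction {
    match installed {
        None => PreflightAction::Install,
        Some(v) if v == requested => PreflightAction::Skip,
        Some(v) => PreflightAction::Refuse {
            installed: v.to_string(),
            requested: requested.to_string(),
        },
    }
}
```

Returning an enum rather than acting directly keeps the logging and error reporting in the caller, which is one way to make the behavior easy to unit-test.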

CRD audit (crd_grafana.rs)

  • Rewrote GrafanaSpec to match the real upstream grafanas.grafana.integreatly.org/v1beta1 schema (verified against the live CRD)
  • Dropped fields that don't exist upstream (admin_user/admin_password at top-level, persistence, resources, typed GrafanaConfig)
  • Modeled .spec.config as BTreeMap<String, BTreeMap<String, String>> (matching the upstream map-of-maps for grafana.ini)
  • Added typed deployment, route, and ingress sub-trees
  • Flagged rhob_grafana.rs as a stale duplicate
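To illustrate why a map-of-maps fits `.spec.config`, here is a small standard-library sketch. The `GrafanaIni` alias and the `set` and `render_ini` helpers are hypothetical and exist only for this example; the PR itself only states the `BTreeMap<String, BTreeMap<String, String>>` shape.

```rust
use std::collections::BTreeMap;

/// Section name -> key -> value, mirroring the grafana.ini layout.
type GrafanaIni = BTreeMap<String, BTreeMap<String, String>>;

/// Hypothetical helper: set one key inside one section,
/// creating the section if it does not exist yet.
fn set(config: &mut GrafanaIni, section: &str, key: &str, value: &str) {
    config
        .entry(section.to_string())
        .or_default()
        .insert(key.to_string(), value.to_string());
}

/// Hypothetical helper: render the map as ini text, for illustration only.
/// BTreeMap iteration is sorted, so the output is deterministic.
fn render_ini(config: &GrafanaIni) -> String {
    let mut out = String::new();
    for (section, entries) in config {
        out.push_str(&format!("[{section}]\n"));
        for (key, value) in entries {
            out.push_str(&format!("{key} = {value}\n"));
        }
    }
    out
}
```

Because section names are plain strings, arbitrary sections such as `auth.anonymous` need no dedicated struct field, which a fully typed config would require.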

Domain resolution (K8sClient::get_domain)

  • Moved ingress hostname resolution from K8sAnywhereTopology to K8sClient so both K8sAnywhereTopology and HAClusterTopology can resolve hostnames without extra trait bounds

Examples

  • examples/grafana/ — standalone Grafana operator install with pinned chart version
  • examples/cluster_dashboards/ — deploys ClusterDashboardsScore with default settings
johnride added 10 commits 2026-03-16 02:17:12 +00:00
stremblay added 1 commit 2026-04-03 18:09:15 +00:00
feat: add json versions of dashboards
Some checks failed
Run Check Script / check (pull_request) Failing after 30s
b1ff4e4a0f
stremblay added 1 commit 2026-04-15 16:46:45 +00:00
feat(grafana): standalone example + OCP-aware operator install
Some checks failed
Run Check Script / check (pull_request) Failing after 27s
7f25007c3b
- Add `examples/grafana` demonstrating a standalone Grafana install via
  the existing helm-chart score, with chart version pinned and an env.sh
- Add `chart_version` parameter to `grafana_helm_chart_score`
- Introduce `GrafanaOperatorScore`: composite score that auto-detects
  OpenShift (via `K8sClient::get_k8s_distribution`) and applies the
  `route.openshift.io` ClusterRole/Binding the operator needs before
  running the helm install
- Make `HelmChartInterpret` idempotent on re-run: if the release is
  already at the requested chart version, warn and skip; if at a
  different version, refuse rather than silently upgrade
- Hand Route ownership to grafana-operator in `ClusterDashboardsScore`
  by adding `.spec.route` to the Grafana CR and dropping the manual
  Route creation
- Rename the Grafana-to-Prometheus ServiceAccount from
  `cluster-grafana-sa` to `grafana-prometheus-datasource-sa` to avoid
  a collision with the SA grafana-operator creates for the Grafana pod
stremblay added 1 commit 2026-04-15 17:48:41 +00:00
refactor(grafana): use typed CRs, audit crd_grafana against upstream
Some checks failed
Run Check Script / check (pull_request) Failing after 21s
8ad447c42a
- Rewrite `GrafanaSpec` to match the real
  `grafanas.grafana.integreatly.org/v1beta1` schema (verified against
  the live CRD). Drop non-existent fields (top-level `admin_user`/
  `admin_password`, `ingress{enabled,hosts}`, `persistence`, top-level
  `resources`). Model `.spec.config` as nested `BTreeMap` to preserve
  arbitrary `grafana.ini` sections (including `auth.anonymous`). Add
  narrow `deployment` and `route` sub-trees.
- Replace raw-YAML + `apply_dynamic` in `cluster_dashboards` with
  typed CRs applied via `K8sResourceScore::single`. Migrate the two
  all-`None` `GrafanaSpec` call sites to `GrafanaSpec::default()`.
- Flag `rhob_grafana.rs` as a stale duplicate with a module-level
  warning; keep it in place since its only caller passes all `None`.
stremblay added 1 commit 2026-04-16 11:41:26 +00:00
feat(grafana): auto-detect cluster type and expose Grafana via Route or Ingress
Some checks failed
Run Check Script / check (pull_request) Failing after 11s
dd587d856d
- Move domain-resolution logic from `K8sAnywhereTopology::get_domain`
  to `K8sClient::get_domain()` in `harmony-k8s` so both
  `K8sAnywhereTopology` and `HAClusterTopology` can resolve hostnames
  without requiring the `Ingress` trait bound.
- `ClusterDashboardsScore::create_grafana` now auto-detects the k8s
  distribution: OpenShift → `.spec.route` on the Grafana CR (as
  before), non-OpenShift → `.spec.ingress` with hostname from
  `K8sClient::get_domain`.
- Add typed `GrafanaIngress` sub-tree to `crd_grafana.rs` matching the
  upstream CRD's standard k8s IngressSpec shape.
- `K8sAnywhereTopology::get_domain` retains its k3d shortcut, then
  delegates to the client for all other cases.
stremblay added 1 commit 2026-04-16 18:55:47 +00:00
fix: formatting
Some checks failed
Run Check Script / check (pull_request) Failing after 12s
4985109ef3
stremblay added 1 commit 2026-04-16 18:59:23 +00:00
move some files around...
Some checks failed
Run Check Script / check (pull_request) Failing after 10s
a935dba8e8
stremblay added 2 commits 2026-04-16 19:23:40 +00:00
fix: workspace issues in Cargo.toml files
Some checks failed
Run Check Script / check (pull_request) Failing after 31s
bcacd3b51f
stremblay added 1 commit 2026-04-16 19:38:50 +00:00
fix: adding missing file
All checks were successful
Run Check Script / check (pull_request) Successful in 2m29s
b87b8a7667
stremblay merged commit bbf0881932 into master 2026-04-16 19:48:22 +00:00
stremblay deleted branch feat/cluster-dashboard 2026-04-16 19:48:25 +00:00
Reference: NationTech/harmony#250