harmony/iobench/PREFLIGHT_CHECKS.md
2026-05-03 09:47:21 -04:00

Redpanda iobench preflight checklist

Cluster inspection performed: 2026-05-03

Context: maintenance window, other workloads turned off, Ceph pool idle at ~0 IOPS.


Cluster topology

  • PASS — 3 worker nodes available (wk0, wk1, wk2). Matches --replicas 3 default.
  • PASS — All 3 workers are Ready, no DiskPressure / MemoryPressure / PIDPressure.
  • PASS — Control plane nodes (cp3, cp4, cp5) have NoSchedule taint. iobench pods cannot land there.
  • PASS — All nodes carry kubernetes.io/hostname label. Anti-affinity topology key will work.
  • PASS — Worker nodes are untainted and have no scheduling restrictions.
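The topology checks above could be repeated mechanically against the parsed output of `kubectl get nodes -o json`. The following is a minimal sketch, not the tool's own implementation: the function name `schedulable_workers` is hypothetical, and only the field names (`spec.taints`, `status.conditions`, `metadata.labels`) come from the Kubernetes Node API.

```python
PRESSURE_TYPES = ("DiskPressure", "MemoryPressure", "PIDPressure")
HOSTNAME_LABEL = "kubernetes.io/hostname"


def schedulable_workers(node_list):
    """Return names of nodes that pass the topology checks.

    node_list is the dict obtained by parsing `kubectl get nodes -o json`.
    A node passes if it has no NoSchedule taint, reports Ready=True,
    reports no pressure condition, and carries the hostname label.
    """
    passing = []
    for node in node_list.get("items", []):
        meta = node.get("metadata", {})
        taints = node.get("spec", {}).get("taints") or []
        if any(t.get("effect") == "NoSchedule" for t in taints):
            continue  # e.g. control plane nodes
        conditions = {
            c["type"]: c["status"]
            for c in node.get("status", {}).get("conditions", [])
        }
        if conditions.get("Ready") != "True":
            continue
        if any(conditions.get(p) == "True" for p in PRESSURE_TYPES):
            continue
        if HOSTNAME_LABEL not in meta.get("labels", {}):
            continue  # anti-affinity topology key would not resolve
        passing.append(meta.get("name"))
    return passing
```

With the cluster described here, the expectation would be that this returns exactly the three worker names, matching the `--replicas 3` default.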

Storage

  • PASS — StorageClass ceph-block exists, volumeBindingMode: Immediate, reclaimPolicy: Delete.
  • PASS — Ceph pool ceph-blockpool replication: size=3, min_size=2.
  • PASS — 3 OSDs all Running (osd-0, osd-1, osd-2).
  • PASS — Raw capacity: 503 GiB available. 3 x 50 GiB PVCs = 150 GiB usable, which is 450 GiB raw at 3x replication (size=3). Leaves 53 GiB raw headroom (~10.5%). Tight but sufficient for a benchmark run.
  • PASS — Largest single-workload disk footprint: throughput with numjobs=4 x size=10G = 40 GiB per PVC. Fits in 50 GiB PVC with 10 GiB headroom for logs and filesystem overhead.
  • PASS — Pool is idle ("nothing is going on"), OSD commit/apply latency showing 0 ms. Confirms maintenance window baseline.
  • PASS — All 3 disks are actually Intel SSDs (misidentified as HDD by Ceph due to HP RAID controller passthrough). No mixed-media concern.
  • PASS (non-blocking) — Ceph health is HEALTH_WARN: too many PGs per OSD (265 > max 250). This is a pre-existing tuning issue unrelated to the benchmark and does not affect the correctness of results. Note: a data-integrity warning (e.g. HEALTH_WARN: degraded PGs) would be blocking, and the benchmark must not proceed in that case.
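The capacity arithmetic in the Storage checks can be reproduced with a short calculation. This is an illustrative sketch only: the function name `raw_capacity_plan` is hypothetical, it treats the 503 GiB figure as the whole raw budget, and the `NEARFULL_RATIO` of 0.85 is Ceph's default nearfull threshold, assumed (not verified) to be unchanged on this cluster.

```python
NEARFULL_RATIO = 0.85  # Ceph default; assumed unchanged on this cluster


def raw_capacity_plan(raw_available_gib, pvc_count, per_pvc_gib, replication):
    """Compute raw consumption and headroom for replicated PVC data."""
    usable = pvc_count * per_pvc_gib
    raw_needed = usable * replication
    headroom = raw_available_gib - raw_needed
    return {
        "usable_gib": usable,
        "raw_needed_gib": raw_needed,
        "headroom_gib": headroom,
        "headroom_pct": round(100 * headroom / raw_available_gib, 1),
        "fill_ratio": round(raw_needed / raw_available_gib, 3),
    }


# Worst case: every PVC filled to its full 50 GiB.
provisioned = raw_capacity_plan(503, pvc_count=3, per_pvc_gib=50, replication=3)

# Largest workload footprint from the checklist: 40 GiB written per PVC.
written = raw_capacity_plan(503, pvc_count=3, per_pvc_gib=40, replication=3)

for label, plan in (("provisioned", provisioned), ("written", written)):
    over = plan["fill_ratio"] > NEARFULL_RATIO
    print(label, plan, "exceeds nearfull" if over else "below nearfull")
```

Under these assumptions the fully provisioned case sits above the default nearfull ratio, while the 40 GiB per PVC that the largest workload writes stays below it. That is one way to read "tight but sufficient".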

Namespace and resource constraints

  • PASS — default namespace exists.
  • PASS — No ResourceQuota or LimitRange in default namespace. Pod creation won't be blocked.
  • PASS — No leftover iobench-redpanda StatefulSet or PVCs in default namespace. Clean slate.

Container image

  • PASS — juicedata/fio:latest pulls successfully and runs on this cluster.
  • PASS — fio version 3.18 confirmed. Supports fdatasync=1, log_avg_msec, write_lat_log, write_iops_log, --output-format=json.
  • PASS — Image contains sh, date (needed for wall-clock barrier), and tar (needed for kubectl cp).

Clock synchronization (barrier correctness)

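  • PASS — Node heartbeat timestamps are within ~5s of each other. The 60-second barrier (start_at = now_epoch() + 60) provides ample margin. A pod misses the barrier only if its clock skew, plus any delay before it begins waiting, exceeds 60s. Smaller skew does not break the barrier, but it shifts that pod's start time by the same amount.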
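The barrier logic described above can be sketched as follows. This is an assumption-laden illustration rather than the tool's code: only `start_at` and `now_epoch()` appear in the checklist, while `plan_start`, `seconds_until`, `barrier_missed`, and `true_start_time` are hypothetical helper names, and the model assumes each pod sleeps until its own local clock reaches `start_at`.

```python
import time

BARRIER_DELAY_S = 60


def now_epoch():
    """Current wall-clock time as whole seconds since the Unix epoch."""
    return int(time.time())


def plan_start(now, delay_s=BARRIER_DELAY_S):
    """Controller side: choose a shared start time in the future."""
    return now + delay_s


def seconds_until(start_at, local_now):
    """Pod side: how long to sleep. A negative value means start_at has passed."""
    return start_at - local_now


def barrier_missed(start_at, local_now):
    """True if the pod only began waiting after start_at on its own clock."""
    return local_now > start_at


def true_start_time(start_at, skew_s):
    """Reference-clock time at which a pod fires if its clock runs skew_s ahead."""
    return start_at - skew_s


# Example: controller reads t=1000; a pod whose clock is 5s ahead
# begins waiting 20s later on the reference clock (local clock reads 1025).
start_at = plan_start(1000)
pod_local_now = 1000 + 20 + 5
print(seconds_until(start_at, pod_local_now), barrier_missed(start_at, pod_local_now))
print(true_start_time(start_at, skew_s=5))
```

In this model a 5s skew does not cause a missed barrier, but the skewed pod fires 5s before the others on the reference clock. That bounds how tightly parallel-mode results can be aligned across pods.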

Code safety review

  • PASS — All fio workloads use direct=1 (bypass page cache). Benchmark measures Ceph, not RAM.
  • PASS — All writes target /data/iobench_testfile on the mounted PVC filesystem, not a raw block device. No risk of corrupting the node OS disk.
  • PASS — Pods run with the default security context (not privileged, no hostPath, no hostNetwork). Blast radius is limited to the PVCs.
  • PASS — reclaimPolicy: Delete means that deleting a PVC also removes its PV and the backing RBD image. No storage leak after undeploy.
  • PASS — delete_resources targets only statefulset,pvc with label app=iobench-redpanda. Cannot accidentally delete unrelated resources.
  • PASS — Parallel mode prints a clear warning before starting.
  • PASS — --keep-deployment flag available for debugging without re-provisioning.
  • PASS — Workloads run sequentially within each mode (throughput, then fsync_hot_path, then selftest_512k, then selftest_4k_qd1). Disk usage doesn't accumulate across workloads since they reuse the same filename.

Resource consumption during run

  • PASS — Workers have ample CPU headroom (3-6% current usage). fio with 4 jobs + iodepth 16 is not CPU-intensive on modern hardware.
  • PASS — Workers have ample memory headroom (5-10% current usage). fio with direct=1 uses minimal RAM.
  • PASS — No CPU/memory resource limits or requests set on the fio container. This is intentional — resource limits would throttle the benchmark and distort results. Acceptable during a maintenance window with other workloads off.

Risks acknowledged

  • Ceph pool impact: Parallel mode will saturate the Ceph pool. This is the point of the test. Confirmed other workloads are off and pool is at ~0 IOPS.
  • Capacity headroom is tight: 53 GiB raw remaining after PVC provisioning (~10.5%). If the cluster has background operations that consume space (e.g., snapshots, compaction), this could trigger a HEALTH_ERR: full condition. Mitigated by: maintenance window, no other workloads, and reclaimPolicy: Delete ensuring cleanup.
  • --wait=false on delete: delete_resources uses --wait=false, so undeploy returns before the PVCs are fully reclaimed. This is fine because Ceph deletes RBD images asynchronously. However, when re-running immediately after undeploy, PVCs from the previous run may still be terminating. Mitigated by: the deploy step uses kubectl apply, which is idempotent.
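For the immediate re-run case in the last risk, one option is to wait explicitly for the old PVCs to disappear before deploying again. The sketch below only builds the command lines and does not run them. The helper names `delete_cmd` and `wait_pvcs_gone_cmd` are hypothetical, and the delete command is a reconstruction from the description above rather than the tool's actual invocation. `kubectl wait --for=delete` is a real subcommand, though its behaviour when no resources match the selector varies between kubectl versions.

```python
import shlex

LABEL_SELECTOR = "app=iobench-redpanda"


def delete_cmd(namespace="default"):
    """Non-blocking delete of the benchmark's StatefulSet and PVCs by label."""
    return [
        "kubectl", "delete", "statefulset,pvc",
        "-n", namespace,
        "-l", LABEL_SELECTOR,
        "--wait=false",
    ]


def wait_pvcs_gone_cmd(namespace="default", timeout_s=120):
    """Block until the labelled PVCs are deleted, or the timeout expires."""
    return [
        "kubectl", "wait", "--for=delete", "pvc",
        "-n", namespace,
        "-l", LABEL_SELECTOR,
        f"--timeout={timeout_s}s",
    ]


# Print the commands for review; pass the lists to subprocess.run to execute.
print(shlex.join(delete_cmd()))
print(shlex.join(wait_pvcs_gone_cmd()))
```

Keeping the commands as argument lists avoids shell quoting problems and makes the label selector, the only thing scoping the delete, easy to audit in one place.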

Verdict

All checks pass. Safe to proceed with iobench redpanda --storage-class ceph-block during the current maintenance window.