# harmony-node-readiness-endpoint

**A lightweight, standalone Rust service for Kubernetes node health checking.**

Designed for **bare-metal Kubernetes clusters** with external load balancers (HAProxy, OPNsense, F5, etc.).

Exposes a simple HTTP endpoint (`/health`) on each node:

- **200 OK**: node is healthy and ready to receive traffic
- **503 Service Unavailable**: node should be removed from the load balancer pool
- **500 Internal Server Error**: misconfiguration (e.g. `NODE_NAME` not set)

This project is **not dependent on Harmony**, but is commonly used as part of Harmony bare-metal Kubernetes deployments.

## Why this project exists

In bare-metal environments, external load balancers often rely on pod-level or router-level checks that can lag behind the authoritative Kubernetes `Node.status.conditions[Ready]`.
This service exposes that node condition directly on each node, so the load balancer can react quickly to the authoritative state.

## Available checks

| Check name          | Description                                                 | Status            |
|---------------------|-------------------------------------------------------------|-------------------|
| `node_ready`        | Queries `Node.status.conditions[Ready]` via Kubernetes API  | Implemented       |
| `okd_router_1936`   | Probes OpenShift router `/healthz/ready` on port 1936       | Implemented       |
| `filesystem_ro`     | Detects read-only mounts via `/proc/mounts`                 | To be implemented |
| `kubelet`           | Local probe to kubelet `/healthz` (port 10248)              | To be implemented |
| `container_runtime` | Socket check + runtime status                               | To be implemented |
| `disk_pressure`     | Threshold checks on key filesystems                         | To be implemented |
| `network`           | DNS resolution + gateway connectivity                       | To be implemented |
| `custom_conditions` | Reacts to extra conditions (NPD, etc.)                      | To be implemented |

All checks are combined with logical **AND**: any single failure results in 503.

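The AND combination described above can be sketched in a few lines of Rust. This is only an illustration under assumptions: the `CheckResult` type and the `overall_status` function are hypothetical names, not taken from the service's source.

```rust
/// Outcome of a single check. Field names follow the JSON response format;
/// the type itself is hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct CheckResult {
    pub name: String,
    pub passed: bool,
    pub reason: Option<String>,
}

/// Combine results with logical AND and map them to an HTTP status code:
/// 200 when every check passed, 503 as soon as any check failed.
pub fn overall_status(results: &[CheckResult]) -> u16 {
    if results.iter().all(|r| r.passed) {
        200
    } else {
        503
    }
}
```

With this sketch, a passing `node_ready` combined with a failing `okd_router_1936` yields 503. Note that `all` on an empty slice is vacuously true, so how an empty check list should be treated is a decision this sketch does not settle.
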
## Behavior

### `node_ready` check: fail-open design

The `node_ready` check queries the Kubernetes API server to read `Node.status.conditions[Ready]`.
Because this service runs on the node it is checking, there are scenarios where the API server is temporarily
unreachable (e.g. during a control-plane restart). To avoid incorrectly draining a healthy node in such cases,
the check is **fail-open**: it passes (reports ready) whenever the Kubernetes API is unavailable.

| Situation                                          | Result               | HTTP status |
|----------------------------------------------------|----------------------|-------------|
| `Node.status.conditions[Ready] == True`            | Pass                 | 200         |
| `Node.status.conditions[Ready] == False`           | Fail                 | 503         |
| `Ready` condition absent                           | Fail                 | 503         |
| API server unreachable or timed out (1 s timeout)  | Pass (assumes ready) | 200         |
| Kubernetes client initialization failed            | Pass (assumes ready) | 200         |
| `NODE_NAME` env var not set                        | Hard error           | 500         |

A warning is logged whenever the API is unavailable and the check falls back to assuming ready.

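The decision table above maps directly onto a `match`. The following is a minimal sketch, assuming a hypothetical `NodeReadyObservation` enum that names each row of the table; the real service may structure this differently.

```rust
/// One variant per row of the `node_ready` decision table.
/// The enum and its variant names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NodeReadyObservation {
    ReadyTrue,
    ReadyFalse,
    ReadyAbsent,
    /// API server unreachable or timed out.
    ApiUnavailable,
    /// Kubernetes client could not be initialized.
    ClientInitFailed,
    /// `NODE_NAME` environment variable not set.
    NodeNameMissing,
}

/// HTTP status the endpoint would return when `node_ready` is the only check.
pub fn node_ready_http_status(obs: NodeReadyObservation) -> u16 {
    use NodeReadyObservation::*;
    match obs {
        ReadyTrue => 200,
        ReadyFalse | ReadyAbsent => 503,
        // Fail-open: an unavailable API must not drain a healthy node.
        ApiUnavailable | ClientInitFailed => 200,
        // Misconfiguration is reported as a hard error.
        NodeNameMissing => 500,
    }
}
```

Keeping the fail-open rows in a single arm makes the trade-off explicit: a node that is truly unhealthy during an API outage stays in the pool until the API is reachable again.
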
### `okd_router_1936` check

Sends `GET http://127.0.0.1:1936/healthz/ready` with a 5-second timeout.
Returns pass on any 2xx response, fail otherwise.

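To illustrate the pass/fail rule, here is a standard-library-only sketch of such a probe. It is not the service's implementation (which likely uses an HTTP client crate); the function names `parse_status_code`, `is_pass`, and `probe` are hypothetical, and the timeout here applies to each socket operation rather than to the request as a whole.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Extract the status code from an HTTP/1.x response head,
/// e.g. "HTTP/1.1 200 OK" yields Some(200).
pub fn parse_status_code(response: &str) -> Option<u16> {
    response.lines().next()?.split_whitespace().nth(1)?.parse().ok()
}

/// Pass on any 2xx status, fail on everything else (including no response).
pub fn is_pass(status: Option<u16>) -> bool {
    matches!(status, Some(200..=299))
}

/// Probe `path` on `addr`. Any connection, I/O, or parse error counts as a failure.
pub fn probe(addr: SocketAddr, path: &str, timeout: Duration) -> bool {
    let Ok(mut stream) = TcpStream::connect_timeout(&addr, timeout) else {
        return false;
    };
    if stream.set_read_timeout(Some(timeout)).is_err()
        || stream.set_write_timeout(Some(timeout)).is_err()
    {
        return false;
    }
    let request = format!("GET {path} HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    if stream.write_all(request.as_bytes()).is_err() {
        return false;
    }
    // Simplification: assume the status line arrives in the first read.
    let mut buf = [0u8; 128];
    let Ok(n) = stream.read(&mut buf) else {
        return false;
    };
    is_pass(parse_status_code(&String::from_utf8_lossy(&buf[..n])))
}
```

A call matching the check described above would be `probe("127.0.0.1:1936".parse().unwrap(), "/healthz/ready", Duration::from_secs(5))`.
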
### Unknown check names

Requesting an unknown check name (e.g. `check=bogus`) results in that check returning `passed: false`
with reason `"Unknown check: bogus"`, and the overall response is 503.

## How it works

### Node name discovery

The service reads the `NODE_NAME` environment variable, which must be injected via the Kubernetes Downward API:

```yaml
env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```

### Kubernetes API authentication

- Uses standard **in-cluster configuration**, so no external credentials are needed.
- The ServiceAccount token and CA certificate are automatically mounted at `/var/run/secrets/kubernetes.io/serviceaccount/`.
- Requires only minimal RBAC: `get` and `list` on the `nodes` resource (see `deploy/resources.yaml`).
- Connect and write timeouts are set to **1 second** to keep checks fast.

## Deploy

All supporting Kubernetes resources (Namespace, ServiceAccount, ClusterRole, ClusterRoleBinding, and an OpenShift SCC RoleBinding for `hostnetwork`) are in a single file, `deploy/resources.yaml`. The DaemonSet is applied separately from `deploy/daemonset.yaml`.

```bash
kubectl apply -f deploy/resources.yaml
kubectl apply -f deploy/daemonset.yaml
```

The DaemonSet uses `hostNetwork: true` and `hostPort: 25001`, so the endpoint is reachable directly on the node's IP at port 25001.
It tolerates all taints, so the pod runs even on nodes marked unschedulable.

### Configure your external load balancer

**Example for HAProxy / OPNsense:**
- Check type: **HTTP**
- URI: `/health`
- Port: `25001` (configurable via `LISTEN_PORT` env var)
- Interval: 5–10 s
- Rise: 2
- Fall: 3
- Expect: `2xx`

## Endpoint usage

### Query parameter

Use the `check` query parameter to select which checks to run (comma-separated).
When omitted, only `node_ready` runs.

| Request                                        | Checks run                         |
|------------------------------------------------|------------------------------------|
| `GET /health`                                  | `node_ready`                       |
| `GET /health?check=okd_router_1936`            | `okd_router_1936` only             |
| `GET /health?check=node_ready,okd_router_1936` | `node_ready` and `okd_router_1936` |

> **Note:** specifying `check=` replaces the default. Include `node_ready` explicitly if you need it alongside other checks.

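The selection rule, together with the unknown-check behavior described earlier, can be sketched as follows. The names `IMPLEMENTED_CHECKS`, `selected_checks`, and `unknown_check_reason` are hypothetical. How the service treats whitespace or empty entries in the list is not stated here, so the sketch does not special-case them.

```rust
/// Checks the README lists as implemented.
const IMPLEMENTED_CHECKS: [&str; 2] = ["node_ready", "okd_router_1936"];

/// Resolve the raw `check` query value into the list of check names to run.
/// `None` means the parameter was omitted, in which case only `node_ready` runs.
/// A supplied value replaces the default instead of extending it.
pub fn selected_checks(check_param: Option<&str>) -> Vec<String> {
    match check_param {
        None => vec!["node_ready".to_string()],
        Some(raw) => raw.split(',').map(str::to_string).collect(),
    }
}

/// Return the failure reason for a name that is not a known check, if any.
pub fn unknown_check_reason(name: &str) -> Option<String> {
    if IMPLEMENTED_CHECKS.contains(&name) {
        None
    } else {
        Some(format!("Unknown check: {name}"))
    }
}
```

For example, `selected_checks(Some("okd_router_1936"))` returns only that one check, which is why `node_ready` has to be listed explicitly when both are wanted.
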
### Response format

```json
{
  "status": "ready" | "not-ready",
  "checks": [
    {
      "name": "<check-name>",
      "passed": true | false,
      "reason": "<failure reason, omitted on success>",
      "duration_ms": 42
    }
  ],
  "total_duration_ms": 42
}
```

**Healthy node (default)**
```http
HTTP/1.1 200 OK

{
  "status": "ready",
  "checks": [{ "name": "node_ready", "passed": true, "duration_ms": 42 }],
  "total_duration_ms": 42
}
```

**Unhealthy node**
```http
HTTP/1.1 503 Service Unavailable

{
  "status": "not-ready",
  "checks": [
    { "name": "node_ready", "passed": false, "reason": "KubeletNotReady", "duration_ms": 35 }
  ],
  "total_duration_ms": 35
}
```

**API server unreachable (fail-open)**
```http
HTTP/1.1 200 OK

{
  "status": "ready",
  "checks": [{ "name": "node_ready", "passed": true, "duration_ms": 1001 }],
  "total_duration_ms": 1001
}
```
*(A warning is logged: `Kubernetes API appears to be down … Assuming node is ready.`)*

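The response shape above, including the omission of `reason` on success, can be reproduced with a small standard-library-only sketch. A real implementation would more likely derive this with a serialization crate; the `CheckResult` type and the `render_*` functions below are hypothetical, and the total duration is passed in because how it is measured is not specified here.

```rust
/// One entry of the `checks` array. The type is hypothetical.
pub struct CheckResult {
    pub name: String,
    pub passed: bool,
    pub reason: Option<String>,
    pub duration_ms: u64,
}

/// Escape a string for use inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one check; the `reason` key is left out when no reason is set.
fn render_check(c: &CheckResult) -> String {
    let reason = match &c.reason {
        Some(r) => format!(r#","reason":"{}""#, escape_json(r)),
        None => String::new(),
    };
    format!(
        r#"{{"name":"{}","passed":{}{},"duration_ms":{}}}"#,
        escape_json(&c.name),
        c.passed,
        reason,
        c.duration_ms
    )
}

/// Render the whole body: "ready" only when every check passed.
pub fn render_response(checks: &[CheckResult], total_duration_ms: u64) -> String {
    let status = if checks.iter().all(|c| c.passed) { "ready" } else { "not-ready" };
    let rendered: Vec<String> = checks.iter().map(render_check).collect();
    format!(
        r#"{{"status":"{}","checks":[{}],"total_duration_ms":{}}}"#,
        status,
        rendered.join(","),
        total_duration_ms
    )
}
```

The output is compact JSON without the whitespace used in the examples above; the keys and values are otherwise the same.
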
## Configuration

| Env var       | Default  | Description                          |
|---------------|----------|--------------------------------------|
| `NODE_NAME`   | required | Node name, injected via Downward API |
| `LISTEN_PORT` | `25001`  | TCP port the HTTP server binds to    |
| `RUST_LOG`    | unset    | Log level (e.g. `info`, `debug`)     |

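As an illustration of the defaults in the table, the sketch below resolves `NODE_NAME` and `LISTEN_PORT` from the environment. The `Config` and `ConfigError` types and both functions are hypothetical. Rejecting an unparsable port is an assumption, and the actual service may instead resolve `NODE_NAME` per request, given that a missing value surfaces as an HTTP 500.

```rust
/// Ways the configuration can be invalid in this sketch.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingNodeName,
    /// Assumption: a non-numeric or out-of-range port is rejected.
    InvalidPort(String),
}

/// Resolved settings. `RUST_LOG` is left to the logging setup and not modeled here.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub node_name: String,
    pub listen_port: u16,
}

/// Pure helper so the defaulting logic can be exercised without touching the environment.
pub fn build_config(
    node_name: Option<String>,
    listen_port: Option<String>,
) -> Result<Config, ConfigError> {
    let node_name = node_name.ok_or(ConfigError::MissingNodeName)?;
    let listen_port = match listen_port {
        // Default from the configuration table.
        None => 25001,
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|_| ConfigError::InvalidPort(raw))?,
    };
    Ok(Config { node_name, listen_port })
}

/// Read the two variables from the process environment.
pub fn config_from_env() -> Result<Config, ConfigError> {
    build_config(
        std::env::var("NODE_NAME").ok(),
        std::env::var("LISTEN_PORT").ok(),
    )
}
```

Splitting the pure `build_config` from `config_from_env` keeps the defaulting rules testable without mutating process-wide environment variables.
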
## Development

```bash
# Run locally
NODE_NAME=my-test-node cargo run

# Run tests
cargo test
```

---

*Minimal, auditable, and built for production bare-metal Kubernetes environments.*