docs/guides/writing-a-score.md:
- Add Design Principles section: capabilities are industry concepts not tools, Scores encapsulate operational complexity, idempotency rules, no execution order dependencies

CLAUDE.md:
- Add Capability and Score Design Rules section with the swap test: if swapping the underlying tool breaks Scores, the capability boundary is wrong
147 lines
8.9 KiB
Markdown
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build & Test Commands

```bash
# Full CI check (check + fmt + clippy + test)
./build/check.sh

# Individual commands
cargo check --all-targets --all-features --keep-going
cargo fmt --check     # Check formatting
cargo clippy          # Lint
cargo test            # Run all tests

# Run a single test
cargo test -p <crate_name> <test_name>

# Run a specific example
cargo run -p <example_crate_name>

# Build the mdbook documentation
mdbook build
```

## What Harmony Is

Harmony is the orchestration framework powering NationTech's vision of **decentralized micro datacenters** — small computing clusters deployed in homes, offices, and community spaces instead of hyperscaler facilities. The goal: make computing cleaner, more resilient, locally beneficial, and resistant to centralized points of failure (including geopolitical threats).

Harmony exists because existing IaC tools (Terraform, Ansible, Helm) are trapped in a **YAML mud pit**: static configuration files validated only at runtime, fragmented across tools, with errors surfacing at 3 AM instead of at compile time. Harmony replaces this entire class of tools with a single Rust codebase where **the compiler catches infrastructure misconfigurations before anything is deployed**.

This is not a wrapper around existing tools. It is a paradigm shift: infrastructure-as-real-code with compile-time safety guarantees that no YAML/HCL/DSL-based tool can provide.

## The Score-Topology-Interpret Pattern

This is the core design pattern. Understand it before touching the codebase.

**Score** — declarative desired state. A Rust struct generic over `T: Topology` that describes *what* you want (e.g., "a PostgreSQL cluster", "DNS records for these hosts"). Scores are serializable, cloneable, idempotent.

**Topology** — infrastructure capabilities. Represents *where* things run and *what the environment can do*. Exposes capabilities as traits (`DnsServer`, `K8sclient`, `HelmCommand`, `LoadBalancer`, `Firewall`, etc.). Examples: `K8sAnywhereTopology` (local K3D or any K8s cluster), `HAClusterTopology` (bare-metal HA with redundant firewalls/switches).

**Interpret** — execution glue. Translates a Score into concrete operations against a Topology's capabilities. Returns an `Outcome` (SUCCESS, NOOP, FAILURE, RUNNING, QUEUED, BLOCKED).

**The key insight — compile-time safety through trait bounds:**

```rust
impl<T: Topology + DnsServer + DhcpServer> Score<T> for DnsScore { ... }
```

The compiler rejects any attempt to use `DnsScore` with a Topology that doesn't implement `DnsServer` and `DhcpServer`. Invalid infrastructure configurations become compilation errors, not runtime surprises.
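
The trait-bound idea can be tried in isolation with a minimal, synchronous sketch. Everything here is a simplified stand-in: the method names (`register_host`, `reserve_lease`, `apply`) and the `LabTopology` / `BareTopology` types are hypothetical, and Harmony's real traits are richer than this.

```rust
// Hypothetical, simplified stand-ins for Harmony's traits.
pub trait Topology {
    fn name(&self) -> &str;
}

pub trait DnsServer {
    fn register_host(&mut self, host: &str, ip: &str);
}

pub trait DhcpServer {
    fn reserve_lease(&mut self, mac: &str, ip: &str);
}

/// Desired state that can be applied to a topology `T`.
pub trait Score<T: Topology> {
    fn apply(&self, topology: &mut T) -> String;
}

pub struct DnsScore {
    pub host: String,
    pub ip: String,
    pub mac: String,
}

// The bounds are the safety check: `T` must offer both capabilities.
impl<T: Topology + DnsServer + DhcpServer> Score<T> for DnsScore {
    fn apply(&self, topology: &mut T) -> String {
        topology.reserve_lease(&self.mac, &self.ip);
        topology.register_host(&self.host, &self.ip);
        format!("{} configured on {}", self.host, topology.name())
    }
}

/// Toy topology that provides both capabilities.
#[derive(Default)]
pub struct LabTopology {
    pub dns: Vec<(String, String)>,
    pub leases: Vec<(String, String)>,
}

impl Topology for LabTopology {
    fn name(&self) -> &str {
        "lab"
    }
}

impl DnsServer for LabTopology {
    fn register_host(&mut self, host: &str, ip: &str) {
        self.dns.push((host.to_string(), ip.to_string()));
    }
}

impl DhcpServer for LabTopology {
    fn reserve_lease(&mut self, mac: &str, ip: &str) {
        self.leases.push((mac.to_string(), ip.to_string()));
    }
}

/// Toy topology with neither DNS nor DHCP.
pub struct BareTopology;

impl Topology for BareTopology {
    fn name(&self) -> &str {
        "bare"
    }
}

// Calling `score.apply(&mut BareTopology)` on a `DnsScore` does not compile:
// the bound `BareTopology: DnsServer` is not satisfied.
```

Applying the score to `LabTopology` compiles and runs; the same call against `BareTopology` is rejected by the compiler before anything could be deployed.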

**Higher-order topologies** compose transparently:

- `FailoverTopology<T>` — primary/replica orchestration
- `DecentralizedTopology<T>` — multi-site coordination

If `T: PostgreSQL`, then `FailoverTopology<T>: PostgreSQL` automatically via blanket impls. Zero boilerplate.
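
A rough sketch of how such a blanket impl might look, under stated assumptions: the `deploy_cluster` method, the `SiteTopology` type, and the `primary` / `replica` fields are invented for illustration and are not taken from the Harmony codebase.

```rust
/// Hypothetical capability trait; the real `PostgreSQL` trait is richer.
pub trait PostgreSQL {
    /// Deploys a cluster and returns the identifiers of what was deployed.
    fn deploy_cluster(&mut self, name: &str) -> Vec<String>;
}

/// Toy single-site topology that knows how to deploy PostgreSQL.
pub struct SiteTopology {
    pub site: String,
}

impl PostgreSQL for SiteTopology {
    fn deploy_cluster(&mut self, name: &str) -> Vec<String> {
        vec![format!("{name}@{}", self.site)]
    }
}

/// Higher-order topology wrapping a primary and a replica of the same kind.
pub struct FailoverTopology<T> {
    pub primary: T,
    pub replica: T,
}

// Blanket impl: any `T: PostgreSQL` makes `FailoverTopology<T>: PostgreSQL`.
impl<T: PostgreSQL> PostgreSQL for FailoverTopology<T> {
    fn deploy_cluster(&mut self, name: &str) -> Vec<String> {
        let mut deployed = self.primary.deploy_cluster(name);
        deployed.extend(self.replica.deploy_cluster(name));
        deployed
    }
}

/// Written once against the capability; works for plain and wrapped topologies.
pub fn deploy_orders_db<T: PostgreSQL>(topology: &mut T) -> Vec<String> {
    topology.deploy_cluster("orders")
}
```

Because the impl is generic over `T`, wrapping also nests: `FailoverTopology<FailoverTopology<SiteTopology>>` satisfies `PostgreSQL` without any extra code.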

## Architecture (Hexagonal)

```
harmony/src/
├── domain/                 # Core domain — the heart of the framework
│   ├── score.rs            # Score trait (desired state)
│   ├── topology/           # Topology trait + implementations
│   ├── interpret/          # Interpret trait + InterpretName enum (25+ variants)
│   ├── inventory/          # Physical infrastructure metadata (hosts, switches, mgmt interfaces)
│   ├── executors/          # Executor trait definitions
│   └── maestro/            # Orchestration engine (registers scores, manages topology state, executes)
├── infra/                  # Infrastructure adapters (driven ports)
│   ├── opnsense/           # OPNsense firewall adapter
│   ├── brocade.rs          # Brocade switch adapter
│   ├── kube.rs             # Kubernetes executor
│   └── sqlx.rs             # Database executor
└── modules/                # Concrete deployment modules (23+)
    ├── k8s/                # Kubernetes (namespaces, deployments, ingress)
    ├── postgresql/         # CloudNativePG clusters + multi-site failover
    ├── okd/                # OpenShift bare-metal from scratch
    ├── helm/               # Helm chart inflation → vanilla K8s YAML
    ├── opnsense/           # OPNsense (DHCP, DNS, etc.)
    ├── monitoring/         # Prometheus, Alertmanager, Grafana
    ├── kvm/                # KVM virtual machine management
    ├── network/            # Network services (iPXE, TFTP, bonds)
    └── ...
```

Domain types to know: `Inventory` (read-only physical infra context), `Maestro<T>` (orchestrator — calls `topology.ensure_ready()` then executes scores), `Outcome` / `InterpretError` (execution results).
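
The `Maestro<T>` flow described above (prepare the topology, then execute the registered Scores) can be approximated with a small synchronous sketch. The method names `register`, `run`, and `interpret`, the flat `Outcome` enum, and the toy `LocalTopology` / `NamespaceScore` types are assumptions made for illustration; the real types likely differ in shape.

```rust
/// Outcome states named in the text; payloads are omitted in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Noop,
    Failure,
    Running,
    Queued,
    Blocked,
}

#[derive(Debug, PartialEq, Eq)]
pub struct InterpretError(pub String);

pub trait Topology {
    /// Prepares the environment; must succeed before any Score runs.
    fn ensure_ready(&mut self) -> Result<Outcome, InterpretError>;
}

pub trait Score<T: Topology> {
    fn name(&self) -> String;
    fn interpret(&self, topology: &mut T) -> Result<Outcome, InterpretError>;
}

pub struct Maestro<T: Topology> {
    topology: T,
    scores: Vec<Box<dyn Score<T>>>,
}

impl<T: Topology> Maestro<T> {
    pub fn new(topology: T) -> Self {
        Self { topology, scores: Vec::new() }
    }

    pub fn register(&mut self, score: Box<dyn Score<T>>) {
        self.scores.push(score);
    }

    /// Calls `ensure_ready()` once, then interprets each registered Score.
    pub fn run(&mut self) -> Result<Vec<(String, Outcome)>, InterpretError> {
        self.topology.ensure_ready()?;
        let mut results = Vec::new();
        for score in &self.scores {
            let outcome = score.interpret(&mut self.topology)?;
            results.push((score.name(), outcome));
        }
        Ok(results)
    }
}

/// Toy topology that tracks readiness and a list of namespaces.
#[derive(Default)]
pub struct LocalTopology {
    pub ready: bool,
    pub namespaces: Vec<String>,
}

impl Topology for LocalTopology {
    fn ensure_ready(&mut self) -> Result<Outcome, InterpretError> {
        if self.ready {
            return Ok(Outcome::Noop);
        }
        self.ready = true;
        Ok(Outcome::Success)
    }
}

/// Toy Score: make sure a namespace exists.
pub struct NamespaceScore(pub String);

impl Score<LocalTopology> for NamespaceScore {
    fn name(&self) -> String {
        format!("namespace/{}", self.0)
    }

    fn interpret(&self, topology: &mut LocalTopology) -> Result<Outcome, InterpretError> {
        if !topology.ready {
            return Err(InterpretError("topology not ready".to_string()));
        }
        if topology.namespaces.contains(&self.0) {
            return Ok(Outcome::Noop);
        }
        topology.namespaces.push(self.0.clone());
        Ok(Outcome::Success)
    }
}
```

Running the same Maestro twice reports `Success` for the first pass and `Noop` for the second, which is the behaviour the idempotency rule later in this file asks for.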

## Key Crates

| Crate | Purpose |
|---|---|
| `harmony` | Core framework: domain, infra adapters, deployment modules |
| `harmony_cli` | CLI + optional TUI (`--features tui`) |
| `harmony_config` | Unified config+secret management (env → SQLite → OpenBao → interactive prompt) |
| `harmony_secret` / `harmony_secret_derive` | Secret backends (LocalFile, OpenBao, Infisical) |
| `harmony_execution` | Execution engine |
| `harmony_agent` / `harmony_inventory_agent` | Persistent agent framework (NATS JetStream mesh), hardware discovery |
| `harmony_assets` | Asset management (URLs, local cache, S3) |
| `harmony_composer` | Infrastructure composition tool |
| `harmony-k8s` | Kubernetes utilities |
| `k3d` | Local K3D cluster management |
| `brocade` | Brocade network switch integration |
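
The resolution chain listed for `harmony_config` (env → SQLite → OpenBao → interactive prompt) is essentially a first-match lookup over ordered sources. The sketch below illustrates only that ordering idea: `ConfigSource`, `EnvSource`, `MapSource`, and `resolve` are hypothetical names, the persistent stores are replaced by in-memory maps, and the interactive prompt is left out. It is not the actual `harmony_config` API.

```rust
use std::collections::HashMap;

/// Hypothetical source abstraction for this sketch.
pub trait ConfigSource {
    fn label(&self) -> &str;
    fn get(&self, key: &str) -> Option<String>;
}

/// Reads from process environment variables.
pub struct EnvSource;

impl ConfigSource for EnvSource {
    fn label(&self) -> &str {
        "env"
    }

    fn get(&self, key: &str) -> Option<String> {
        std::env::var(key).ok()
    }
}

/// In-memory stand-in for a persistent store such as SQLite or OpenBao.
pub struct MapSource {
    label: String,
    values: HashMap<String, String>,
}

impl MapSource {
    pub fn new(label: &str, pairs: &[(&str, &str)]) -> Self {
        Self {
            label: label.to_string(),
            values: pairs
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect(),
        }
    }
}

impl ConfigSource for MapSource {
    fn label(&self) -> &str {
        &self.label
    }

    fn get(&self, key: &str) -> Option<String> {
        self.values.get(key).cloned()
    }
}

/// Walks the sources in order and returns the first hit with its origin label.
pub fn resolve(sources: &[Box<dyn ConfigSource>], key: &str) -> Option<(String, String)> {
    sources
        .iter()
        .find_map(|source| source.get(key).map(|value| (source.label().to_string(), value)))
}
```

Placing `EnvSource` first in the slice would let an environment variable override anything held in the later stores, matching the order given in the table.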

## OPNsense Crates

The `opnsense-codegen` and `opnsense-api` crates exist because OPNsense's automation ecosystem is poor — no typed API client exists. These are support crates, not the core of Harmony.

- `opnsense-codegen`: XML model files → IR → Rust structs with serde helpers for OPNsense wire format quirks (`opn_bool` for "0"/"1" strings, `opn_u16`/`opn_u32` for string-encoded numbers). Vendor sources are git submodules under `opnsense-codegen/vendor/`.
- `opnsense-api`: Hand-written `OpnsenseClient` + generated model types in `src/generated/`.
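
To make the wire format quirks concrete, here is a dependency-free sketch of the conversions such helpers need to perform. The real `opn_bool` / `opn_u16` helpers are serde (de)serializers; the function names below are hypothetical, and treating anything other than "0" or "1" as an error is an assumption of this sketch.

```rust
/// Parses OPNsense's string-encoded boolean: "1" is true, "0" is false.
pub fn parse_opn_bool(raw: &str) -> Result<bool, String> {
    match raw.trim() {
        "1" => Ok(true),
        "0" => Ok(false),
        other => Err(format!("expected \"0\" or \"1\", got {other:?}")),
    }
}

/// Encodes a boolean back into the "0"/"1" wire form.
pub fn format_opn_bool(value: bool) -> &'static str {
    if value {
        "1"
    } else {
        "0"
    }
}

/// Parses a string-encoded number such as "8080" into a `u16`.
pub fn parse_opn_u16(raw: &str) -> Result<u16, String> {
    raw.trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid u16 {raw:?}: {e}"))
}
```

Out-of-range input such as "70000" is rejected by `u16` parsing, so a malformed value fails at the boundary instead of reaching a typed struct.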

## Key Design Decisions (ADRs in docs/adr/)

- **ADR-001**: Rust chosen for type system, refactoring safety, and performance
- **ADR-002**: Hexagonal architecture — domain isolated from adapters
- **ADR-003**: Infrastructure abstractions at domain level, not provider level (no vendor lock-in)
- **ADR-005**: Custom Rust DSL over YAML/Score-spec — real language, Cargo deps, composable
- **ADR-007**: K3D as default runtime (K8s-certified, lightweight, cross-platform)
- **ADR-009**: Helm charts inflated to vanilla K8s YAML, then deployed via existing code paths
- **ADR-015**: Higher-order topologies via blanket trait impls (zero-cost composition)
- **ADR-016**: Agent-based architecture with NATS JetStream for real-time failover and distributed consensus
- **ADR-020**: Unified config+secret management — Rust struct is the schema, resolution chain: env → store → prompt

## Capability and Score Design Rules

**Capabilities are industry concepts, not tools.** A capability trait represents a standard infrastructure need (e.g., `DnsServer`, `LoadBalancer`, `Router`, `CertificateManagement`) that can be fulfilled by different products. OPNsense provides `DnsServer` today; CoreDNS or Route53 could provide it tomorrow. Scores must not break when the backend changes.

**Exception:** When the developer fundamentally needs to know the implementation. `PostgreSQL` is a capability (not `Database`) because the developer writes PostgreSQL-specific SQL and replication configs. Swapping to MariaDB would break the application, not just the infrastructure.

**Test:** If you could swap the underlying tool without rewriting any Score that uses the capability, the boundary is correct.
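
The swap test can be illustrated with two toy backends behind one capability trait. This is a sketch under assumptions: the trait methods (`upsert_a_record`, `lookup`), the backend structs, and `HostRecordScore` are invented for the example and do not talk to any real OPNsense or CoreDNS instance.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified capability trait.
pub trait DnsServer {
    fn upsert_a_record(&mut self, host: &str, ip: &str);
    fn lookup(&self, host: &str) -> Option<String>;
}

/// Toy stand-in for an OPNsense-backed implementation (host overrides map).
#[derive(Default)]
pub struct OpnsenseDns {
    overrides: HashMap<String, String>,
}

impl DnsServer for OpnsenseDns {
    fn upsert_a_record(&mut self, host: &str, ip: &str) {
        self.overrides.insert(host.to_string(), ip.to_string());
    }

    fn lookup(&self, host: &str) -> Option<String> {
        self.overrides.get(host).cloned()
    }
}

/// Toy stand-in for a CoreDNS-backed implementation (zone-like record list).
#[derive(Default)]
pub struct CoreDns {
    zone: Vec<(String, String)>,
}

impl DnsServer for CoreDns {
    fn upsert_a_record(&mut self, host: &str, ip: &str) {
        match self.zone.iter().position(|(h, _)| h.as_str() == host) {
            Some(i) => self.zone[i].1 = ip.to_string(),
            None => self.zone.push((host.to_string(), ip.to_string())),
        }
    }

    fn lookup(&self, host: &str) -> Option<String> {
        self.zone
            .iter()
            .find(|(h, _)| h.as_str() == host)
            .map(|(_, ip)| ip.clone())
    }
}

/// The Score only knows the capability, never the product behind it.
pub struct HostRecordScore {
    pub host: String,
    pub ip: String,
}

impl HostRecordScore {
    pub fn apply<D: DnsServer>(&self, dns: &mut D) {
        dns.upsert_a_record(&self.host, &self.ip);
    }
}
```

`HostRecordScore` is applied unchanged to either backend. If adding a third backend forced edits to the Score, that would suggest the trait is leaking tool-specific detail.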

**Don't name capabilities after tools.** `SecretVault` not `OpenbaoStore`. `IdentityProvider` not `ZitadelAuth`. Think: what is the core developer need that leads to using this tool?

**Scores encapsulate operational complexity.** Move procedural knowledge (init sequences, retry logic, distribution-specific config) into Scores. A high-level example should be ~15 lines, not ~400 lines of imperative orchestration.

**Scores must be idempotent.** Running twice = same result as once. Use create-or-update, handle "already exists" gracefully.
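
A minimal create-or-update sketch, assuming an in-memory `FakeCluster` in place of a real API and a two-variant `Outcome` for brevity (all names here are hypothetical):

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Noop,
}

/// In-memory stand-in for a cluster API: namespace name -> labels.
#[derive(Default)]
pub struct FakeCluster {
    pub namespaces: BTreeMap<String, BTreeMap<String, String>>,
}

pub struct NamespaceScore {
    pub name: String,
    pub labels: BTreeMap<String, String>,
}

impl NamespaceScore {
    /// Create-or-update: compare desired and actual state, write only on drift.
    pub fn apply(&self, cluster: &mut FakeCluster) -> Outcome {
        let up_to_date = cluster.namespaces.get(&self.name) == Some(&self.labels);
        if up_to_date {
            // "Already exists" with the desired content is not an error.
            return Outcome::Noop;
        }
        cluster
            .namespaces
            .insert(self.name.clone(), self.labels.clone());
        Outcome::Success
    }
}
```

The second run finds the desired state already in place and returns `Noop`; a change in the desired labels produces a single update rather than a duplicate resource.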

**Scores must not depend on execution order.** Declare capability requirements via trait bounds; don't assume another Score ran first. If Score B needs what Score A provides, Score B should declare that capability as a trait bound.
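
One way to read this rule: instead of assuming a database Score already ran, an application Score asks the topology for the capability and lets the capability guarantee the prerequisite. The sketch below assumes a hypothetical `ensure_database` method, a toy `DevTopology`, and a made-up connection URL.

```rust
/// Hypothetical capability: the topology can provide a PostgreSQL database.
pub trait PostgreSQL {
    /// Idempotently ensures the database exists and returns its connection URL.
    fn ensure_database(&mut self, name: &str) -> String;
}

/// Toy topology that creates databases on demand.
#[derive(Default)]
pub struct DevTopology {
    pub databases: Vec<String>,
}

impl PostgreSQL for DevTopology {
    fn ensure_database(&mut self, name: &str) -> String {
        if !self.databases.iter().any(|d| d.as_str() == name) {
            self.databases.push(name.to_string());
        }
        // Made-up URL, for illustration only.
        format!("postgres://dev.local/{name}")
    }
}

/// "Score B": needs a database but does not assume "Score A" ran first.
pub struct WebAppScore {
    pub db_name: String,
}

impl WebAppScore {
    // The trait bound states the requirement; ordering is not part of the contract.
    pub fn apply<T: PostgreSQL>(&self, topology: &mut T) -> String {
        let url = topology.ensure_database(&self.db_name);
        format!("app configured with {url}")
    }
}
```

Applied to a fresh `DevTopology`, the Score succeeds without any prior step, and applying it again leaves a single database in place.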

See `docs/guides/writing-a-score.md` for the full guide.

## Conventions

- **Rust edition 2024**, resolver v2
- **Conventional commits**: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- **Small PRs**: max ~200 lines (excluding generated code), single-purpose
- **License**: GNU AGPL v3
- **Quality bar**: This framework demands high-quality engineering. The type system is a feature, not a burden. Leverage it. Prefer compile-time guarantees over runtime checks. Abstractions should be domain-level, not provider-specific.