Files
harmony/docs/ARCHITECTURE.md
2026-05-20 12:03:19 -04:00

128 lines
5.6 KiB
Markdown

# Architecture
Starting point for a human-readable architecture overview of Harmony.
The `docs/` directory has multiple overlapping documents
(`concepts.md`, `architecture-challenges.md`, `cyborg-metaphor.md`,
the `concepts/` subdirectory, ADRs under `docs/adr/`, the in-repo
`CLAUDE.md`). Cohesion work is scheduled for a follow-up PR — this
file is the new front door and the placeholder that work will
build from.
## What Harmony is
An orchestration framework for **decentralized micro datacenters**:
small computing clusters deployed in homes, offices, and community
spaces instead of hyperscaler facilities. The framework's goal is
to make infrastructure-as-code **compile-time-safe** — invalid
configurations become Rust compile errors, not 3AM YAML
surprises.
Not a wrapper around existing tools. A single Rust codebase that
replaces Terraform/Ansible/Helm in its target domain by making the
Rust type system the configuration language.
## The framework primer
The Score-Topology-Interpret pattern, the hexagonal architecture,
the module layout, and the conventions are all documented in
`CLAUDE.md` at the repo root (also available as `AGENTS.md`). That
file is kept current as the canonical entry point. Read it first.
Key ADRs that lock the foundational decisions:
- **ADR-001** — Rust chosen for type system + refactoring safety.
- **ADR-002** — Hexagonal architecture; domain isolated from
adapters.
- **ADR-003** — Infrastructure abstractions at the domain level,
not the provider level (no vendor lock-in).
- **ADR-005** — Real Rust DSL over YAML/HCL.
- **ADR-009** — Helm charts inflate into vanilla K8s YAML and
flow through the Score pipeline.
- **ADR-015** — Higher-order topologies via blanket trait impls.
- **ADR-016** — Agent-based architecture with NATS JetStream for
the global mesh.
- **ADR-020** — Unified config + secret management.
- **ADR-023** — Deploy architecture: Scores everywhere (including
tests), per-app `*-deploy` crates, deploy blocks on smoke-test,
topologies are compile-time.
The full ADR set lives under `docs/adr/`.
## Why Harmony (the framework choice)
Three load-bearing reasons that shape every other decision:
1. **The compiler is the validator.** Existing IaC tools validate
at runtime, after a deploy has already been kicked off. Harmony
validates at `cargo check`. The cost of a bad configuration
drops from "1 AM page" to "red squiggle in your editor."
2. **Decentralized by design.** The target deployment surface is
thousands of small clusters in homes, offices, partner sites,
field-deployed devices — not three hyperscaler regions. The
framework's primitives reflect that: topologies are
parameterized over physical placement, the agent mesh is
NATS-based with strict-ordered supercluster semantics, and the
capability traits never assume a centralized control plane.
3. **The team is its own largest customer.** NationTech runs
multiple OKD clusters already and uses Harmony to manage them.
Every dogfooded primitive is a primitive that's been pressure-
tested against real operational pain before it ships to
external customers.
## Why custom over k3s + ArgoCD (the fleet-platform choice)
A specific instance of the framework-choice reasoning, decided
during the fleet platform v0 work:
- **End-customer engineers are mechanical / electrical /
chemical, not Kubernetes-literate.** A k3s device forces them
to learn `kubectl` / CRDs / CNI. A single Rust binary plus
`podman` is debuggable with `systemctl`, `journalctl`, `ps`
tools they already use daily.
- **The platform bet is strategic, not technical.** Building a
custom platform on the "no vendor lock-in, decentralized,
open-source" positioning differentiates NationTech as a platform
company; an ArgoCD-on-k3s integration positions it as an
integration shop on someone else's runtime.
- **NATS is a coordination fabric, not a queue.** Federation
across regions, strict ordering across the supercluster, and
the "operator in multiple clusters, deployments coming from
everywhere" topology all depend on this choice. ArgoCD doesn't
federate naturally; that's a fundamental shape problem, not a
feature gap.
- **Harmony's daemon-mode `Score::interpret()` is already
production**, running CNPG PostgreSQL failover today via
`harmony_agent`. The fleet agent is the same pattern at a
smaller scale.
## Decision hierarchy when contributing
When the framework is silent on a question, resolve in this
order:
1. **Does this preserve the compile-time-safety guarantee?** If
the answer involves "we'll validate it at runtime," reach for
a type instead.
2. **Does this preserve a capability boundary?** Capability traits
(`DnsServer`, `LoadBalancer`, `IdentityProvider`, …) are the
seam between domain and adapters. If unsure, favor the
boundary.
3. **Is this in the smallest possible PR?** Two ~200-line PRs
beat one 400-line PR. ADR-002 placement and convention rules
live in `CLAUDE.md`.
4. **Would this introduce a string where a type would do?** Pull
the type. The `ScoreEnvelope` mistake (a string-wrapped
discriminator that re-implemented `serde` tagged enums by
hand) is the canonical anti-pattern.
5. **Is this aligned with the existing module layout?** Use the
existing patterns (`*-deploy` crates per ADR-023,
`harmony/src/modules/<thing>/` for framework primitives).
Don't invent placement; ask if you can't fit the change into
the current shape.
If after all of the above the answer is still unclear, surface
the question in a small ADR draft under `docs/adr/drafts/`
rather than guessing in code.