Some checks failed
Run Check Script / check (pull_request) Failing after 1m52s
128 lines
5.6 KiB
Markdown
128 lines
5.6 KiB
Markdown
# Architecture
|
|
|
|
Starting point for a human-readable architecture overview of Harmony.
|
|
The `docs/` directory has multiple overlapping documents
|
|
(`concepts.md`, `architecture-challenges.md`, `cyborg-metaphor.md`,
|
|
the `concepts/` subdirectory, ADRs under `docs/adr/`, the in-repo
|
|
`CLAUDE.md`). Cohesion work is scheduled for a follow-up PR — this
|
|
file is the new front door and the placeholder that work will
|
|
build from.
|
|
|
|
## What Harmony is
|
|
|
|
An orchestration framework for **decentralized micro datacenters**:
|
|
small computing clusters deployed in homes, offices, and community
|
|
spaces instead of hyperscaler facilities. The framework's goal is
|
|
to make infrastructure-as-code **compile-time-safe** — invalid
|
|
configurations become Rust compile errors, not 3AM YAML
|
|
surprises.
|
|
|
|
Not a wrapper around existing tools. A single Rust codebase that
|
|
replaces Terraform/Ansible/Helm in its target domain by making the
|
|
Rust type system the configuration language.
|
|
|
|
## The framework primer
|
|
|
|
The Score-Topology-Interpret pattern, the hexagonal architecture,
|
|
the module layout, and the conventions are all documented in
|
|
`CLAUDE.md` at the repo root (also available as `AGENTS.md`). That
|
|
file is kept current as the canonical entry point. Read it first.
|
|
|
|
Key ADRs that lock the foundational decisions:
|
|
|
|
- **ADR-001** — Rust chosen for type system + refactoring safety.
|
|
- **ADR-002** — Hexagonal architecture; domain isolated from
|
|
adapters.
|
|
- **ADR-003** — Infrastructure abstractions at the domain level,
|
|
not the provider level (no vendor lock-in).
|
|
- **ADR-005** — Real Rust DSL over YAML/HCL.
|
|
- **ADR-009** — Helm charts inflate into vanilla K8s YAML and
|
|
flow through the Score pipeline.
|
|
- **ADR-015** — Higher-order topologies via blanket trait impls.
|
|
- **ADR-016** — Agent-based architecture with NATS JetStream for
|
|
the global mesh.
|
|
- **ADR-020** — Unified config + secret management.
|
|
- **ADR-023** — Deploy architecture: Scores everywhere (including
|
|
tests), per-app `*-deploy` crates, deploy blocks on smoke-test,
|
|
topologies are compile-time.
|
|
|
|
The full ADR set lives under `docs/adr/`.
|
|
|
|
## Why Harmony (the framework choice)
|
|
|
|
Three load-bearing reasons that shape every other decision:
|
|
|
|
1. **The compiler is the validator.** Existing IaC tools validate
|
|
at runtime, after a deploy has already been kicked off. Harmony
|
|
validates at `cargo check`. The cost of a bad configuration
|
|
drops from "1 AM page" to "red squiggle in your editor."
|
|
|
|
2. **Decentralized by design.** The target deployment surface is
|
|
thousands of small clusters in homes, offices, partner sites,
|
|
field-deployed devices — not three hyperscaler regions. The
|
|
framework's primitives reflect that: topologies are
|
|
parameterized over physical placement, the agent mesh is
|
|
NATS-based with strict-ordered supercluster semantics, and the
|
|
capability traits never assume a centralized control plane.
|
|
|
|
3. **The team is its own largest customer.** NationTech runs
|
|
multiple OKD clusters already and uses Harmony to manage them.
|
|
Every dogfooded primitive is a primitive that's been pressure-
|
|
tested against real operational pain before it ships to
|
|
external customers.
|
|
|
|
## Why custom over k3s + ArgoCD (the fleet-platform choice)
|
|
|
|
A specific instance of the framework-choice reasoning, decided
|
|
during the fleet platform v0 work:
|
|
|
|
- **End-customer engineers are mechanical / electrical /
|
|
chemical, not Kubernetes-literate.** A k3s device forces them
|
|
to learn `kubectl` / CRDs / CNI. A single Rust binary plus
|
|
`podman` is debuggable with `systemctl`, `journalctl`, `ps` —
|
|
tools they already use daily.
|
|
- **The platform bet is strategic, not technical.** Building a
|
|
custom platform on the "no vendor lock-in, decentralized,
|
|
open-source" positioning differentiates NationTech as a platform
|
|
company; an ArgoCD-on-k3s integration positions it as an
|
|
integration shop on someone else's runtime.
|
|
- **NATS is a coordination fabric, not a queue.** Federation
|
|
across regions, strict ordering across the supercluster, and
|
|
the "operator in multiple clusters, deployments coming from
|
|
everywhere" topology all depend on this choice. ArgoCD doesn't
|
|
federate naturally; that's a fundamental shape problem, not a
|
|
feature gap.
|
|
- **Harmony's daemon-mode `Score::interpret()` is already
|
|
production**, running CNPG PostgreSQL failover today via
|
|
`harmony_agent`. The fleet agent is the same pattern at a
|
|
smaller scale.
|
|
|
|
## Decision hierarchy when contributing
|
|
|
|
When the framework is silent on a question, resolve in this
|
|
order:
|
|
|
|
1. **Does this preserve the compile-time-safety guarantee?** If
|
|
the answer involves "we'll validate it at runtime," reach for
|
|
a type instead.
|
|
2. **Does this preserve a capability boundary?** Capability traits
|
|
(`DnsServer`, `LoadBalancer`, `IdentityProvider`, …) are the
|
|
seam between domain and adapters. If unsure, favor the
|
|
boundary.
|
|
3. **Is this in the smallest possible PR?** Two ~200-line PRs
|
|
beat one 400-line PR. ADR-002 placement and convention rules
|
|
live in `CLAUDE.md`.
|
|
4. **Would this introduce a string where a type would do?** Pull
|
|
the type. The `ScoreEnvelope` mistake (a string-wrapped
|
|
discriminator that re-implemented `serde` tagged enums by
|
|
hand) is the canonical anti-pattern.
|
|
5. **Is this aligned with the existing module layout?** Use the
|
|
existing patterns (`*-deploy` crates per ADR-023,
|
|
`harmony/src/modules/<thing>/` for framework primitives).
|
|
Don't invent placement; ask if you can't fit the change into
|
|
the current shape.
|
|
|
|
If after all of the above the answer is still unclear, surface
|
|
the question in a small ADR draft under `docs/adr/drafts/`
|
|
rather than guessing in code.
|