Compare commits

..

86 Commits

Author SHA1 Message Date
5abc1b217c wip fix load balancer idempotency
Some checks failed
Run Check Script / check (pull_request) Failing after 7s
2026-04-02 16:44:48 -04:00
a7d1abd0be wip fix load balancer idempotency
Some checks failed
Run Check Script / check (pull_request) Failing after 9s
2026-04-02 16:34:54 -04:00
92b46c5c08 fix: haproxy listens on 0.0.0.0 and opens WAN in the firewall for OKD deployment; also disables the HTTP redirect rule for the OPNsense web GUI, which stole HAProxy traffic
Some checks failed
Run Check Script / check (pull_request) Failing after 10s
2026-04-01 22:31:54 -04:00
d937813fd4 ignore stress test config and db files
Some checks failed
Run Check Script / check (pull_request) Failing after 10s
2026-04-01 21:12:08 -04:00
8b5ca51fba chore: Improve opnsense constructor signature
Some checks failed
Run Check Script / check (pull_request) Failing after 11s
2026-04-01 21:10:31 -04:00
3afaa38ba0 feat: Network stress test utility that randomly flaps switch ports and reboots opnsense firewalls while running iperf, reporting statistics and events in a simple, clean UI 2026-04-01 21:09:02 -04:00
1a0e754c7a chore: Note some problems, improve variable naming around opnsense automation
Some checks failed
Run Check Script / check (pull_request) Failing after 18s
2026-03-31 17:33:04 -04:00
0dc9b80010 chore: fix unused import and add TODO/doc comments from review
Some checks failed
Run Check Script / check (pull_request) Failing after 11s
- Remove unused `warn` import in pair integration example
- Add TODO comment for shared credentials limitation (ROADMAP/11)
- Add doc comments on DhcpServer::get_ip/get_host noting they return
  primary's address, not the CARP VIP
2026-03-31 13:17:44 -04:00
6554ac5341 docs: fix pair integration subnet in diagram, add to examples index
- Fixed network topology diagram in pair README: 192.168.10.x -> 192.168.1.x
  to match the actual code (OPNsense boots on .1 of 192.168.1.0/24)
- Added explanation of NIC juggling to the diagram section
- Updated single-VM "What's next" to link to pair example (was "in progress")
- Added opnsense_pair_integration to examples/README.md table and category
2026-03-31 12:29:35 -04:00
811c56086c fix(kvm): fix domiflist MAC parsing and pair test subnet
- Fixed VmInterface parsing: virsh domiflist has 5 columns (Interface,
  Type, Source, Model, MAC), not 4. MAC is at index 4, not 3.
- Changed pair integration subnet to 192.168.1.0/24 to match OPNsense's
  hard-coded default boot IP of .1.

Tested: full --full pair integration passes end-to-end with CARP VIP
configured on both firewalls (primary advskew=0, backup advskew=100).
2026-03-31 12:26:34 -04:00
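The column-count fix above can be sketched as follows. This is an illustrative stand-in, not the actual harmony `VmInterface` type: `virsh domiflist` prints five columns (Interface, Type, Source, Model, MAC), so the MAC lives at index 4.

```rust
// Hypothetical sketch of the 5-column `virsh domiflist` parsing fix.
// Columns: Interface, Type, Source, Model, MAC; MAC is at index 4, not 3.
#[derive(Debug, PartialEq)]
pub struct VmInterface {
    pub name: String,
    pub source: String,
    pub mac: String,
}

pub fn parse_domiflist(output: &str) -> Vec<VmInterface> {
    output
        .lines()
        .skip(2) // header row and separator line
        .filter_map(|line| {
            let cols: Vec<&str> = line.split_whitespace().collect();
            if cols.len() != 5 {
                return None; // blank trailing lines, malformed rows
            }
            Some(VmInterface {
                name: cols[0].to_string(),
                source: cols[2].to_string(),
                mac: cols[4].to_string(), // the fix: index 4, not 3
            })
        })
        .collect()
}
```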
34d02d7291 feat(opnsense): add firewall pair VM integration example
Boots two OPNsense VMs, bootstraps both with NIC juggling to handle
the .1 IP conflict, then applies FirewallPairTopology with CarpVipScore.

The bootstrap sequence:
1. Boot both VMs on shared LAN bridge
2. Disable backup's LAN NIC
3. Bootstrap primary on .1, change IP to .2
4. Swap NICs (disable primary, enable backup)
5. Bootstrap backup on .1, change IP to .3
6. Re-enable all NICs
7. Apply pair scores (CARP VIP, VLANs, firewall rules)
8. Verify via API on both firewalls

Supports --full flag for single-shot CI execution.
2026-03-31 12:07:40 -04:00
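The NIC-juggling portion of the sequence above (steps 2 through 6) can be modeled as an ordered plan. This is a minimal sketch with assumed names, not the actual example code; it only captures the link toggles and the two bootstraps that resolve the shared .1 conflict.

```rust
// Illustrative plan for bootstrapping two VMs that both boot on the same
// default IP (192.168.1.1). `Step` and `pair_bootstrap_plan` are
// hypothetical names, not the harmony API.
#[derive(Debug, PartialEq)]
pub enum Step {
    SetLink { vm: &'static str, up: bool },
    Bootstrap { vm: &'static str, boot_ip: &'static str, final_ip: &'static str },
}

pub fn pair_bootstrap_plan() -> Vec<Step> {
    vec![
        // Only one VM may own 192.168.1.1 at a time, so the backup's
        // LAN NIC goes down first.
        Step::SetLink { vm: "backup", up: false },
        Step::Bootstrap { vm: "primary", boot_ip: "192.168.1.1", final_ip: "192.168.1.2" },
        // Swap NICs: primary down, backup up.
        Step::SetLink { vm: "primary", up: false },
        Step::SetLink { vm: "backup", up: true },
        Step::Bootstrap { vm: "backup", boot_ip: "192.168.1.1", final_ip: "192.168.1.3" },
        // Both firewalls now hold distinct IPs, so all links come back up.
        Step::SetLink { vm: "primary", up: true },
    ]
}
```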
73785e7336 feat(kvm): add NIC link control for VM interface management
Adds set_interface_link() and list_interfaces() to KvmExecutor,
enabling programmatic up/down control of VM network interfaces by
MAC address.

This is essential for bootstrapping multiple VMs that boot with the
same default IP (e.g., OPNsense on 192.168.1.1) — disable all LAN
NICs, then enable and bootstrap one at a time.

Uses virsh domif-setlink and domiflist under the hood. Tested against
a live KVM VM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 12:02:09 -04:00
8abcb68865 docs: update OPNsense VM integration for fully automated bootstrap
Major rewrite of OPNsense documentation to reflect the new unattended
workflow — no manual browser interaction required.

- Rewrote examples/opnsense_vm_integration/README.md: highlights --full
  CI mode, documents OPNsenseBootstrap automated steps, lists system
  requirements by distro
- Rewrote docs/use-cases/opnsense-vm-integration.md: removed manual
  Step 3 (SSH/webgui), added Phase 2 bootstrap description, updated
  architecture diagram with OPNsenseBootstrap layer
- Added OPNsense VM Integration to docs/README.md (was missing)
- Added OPNsense VM Integration to docs/use-cases/README.md (was missing)
- Added opnsense_vm_integration to examples/README.md quick reference
  table and Infrastructure category (was missing, marked as recommended)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 10:16:42 -04:00
35aab0ecfb fix(opnsense): fix bootstrap webgui port change and add SSH diagnostics
Fixes:
- CSRF token parser now extracts <input> tags individually instead of
  parsing whole lines, fixing the bug where <form name="iform"> on the
  same line as the CSRF hidden input caused the wrong name to be extracted
- extract_selected_option() for <select> dropdowns (webguiproto,
  ssl-certref) which extract_input_value() couldn't handle
- After webgui port change, explicitly restart lighttpd via SSH
  (configctl webgui restart) as a safety net — the PHP configd call
  can fail if lighttpd dies before executing it

Adds:
- diagnose_via_ssh() reports webgui config, listening ports, lighttpd
  process status, and configctl status — invaluable for troubleshooting
- Diagnostic output is shown automatically when wait_for_ready() fails

Tested: full --boot + integration test passes end-to-end with zero
manual interaction on a fresh OPNsense 26.1 VM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 08:43:44 -04:00
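The CSRF parser fix can be sketched as scanning each `<input ...>` tag individually rather than whole lines. The function and helper below are assumptions for illustration, not the actual OPNsenseBootstrap code; the point is that a `<form name="iform">` sharing a line with the hidden input no longer shadows the token's name.

```rust
// Sketch: extract (name, value) of the first hidden <input> by walking
// input tags one at a time. `extract_csrf_token` and `attr` are
// hypothetical names.
pub fn extract_csrf_token(html: &str) -> Option<(String, String)> {
    let mut rest = html;
    while let Some(start) = rest.find("<input") {
        let tag_end = rest[start..].find('>')? + start;
        let tag = &rest[start..=tag_end];
        if tag.contains(r#"type="hidden""#) {
            // Attributes are read from this tag only, so the enclosing
            // form's name="iform" cannot be picked up by mistake.
            let name = attr(tag, "name")?;
            let value = attr(tag, "value")?;
            return Some((name, value));
        }
        rest = &rest[tag_end + 1..];
    }
    None
}

fn attr(tag: &str, key: &str) -> Option<String> {
    let needle = format!(r#"{key}=""#);
    let start = tag.find(&needle)? + needle.len();
    let end = tag[start..].find('"')? + start;
    Some(tag[start..end].to_string())
}
```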
ddab4d27eb feat(opnsense): integrate OPNsenseBootstrap into VM integration example
Replaces the manual browser steps (wizard, SSH, webgui port) with
automated OPNsenseBootstrap calls. Adds --full flag for CI-friendly
single-shot boot + test.

Working: login, wizard abort, SSH enable with root+password auth.
In progress: webgui port change (lighttpd falls back to port 80 —
needs fix for <select> dropdown extraction and CSRF token refresh).

Also adds:
- diagnose_via_ssh() for troubleshooting webgui status
- restart_webgui_via_ssh() safety net after port changes
- CSRF parser fix for same-line form+input HTML (real OPNsense layout)
- cookie_store(true) for reliable session management

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:59:04 -04:00
79d8aa39fc feat(opnsense): add OPNsenseBootstrap for unattended first-boot setup
Automates OPNsense initial setup via HTTP session authentication,
eliminating manual browser interaction. The module:

- Logs in with username/password (handles CSRF token extraction)
- Aborts the initial setup wizard via /api/core/initial_setup/abort
- Enables SSH with root login and password auth
- Changes the web GUI port (fire-and-forget, handles server restart)
- Provides wait_for_ready() polling helper

Uses reqwest with cookie jar for session management. No browser or
external dependencies needed — pure Rust HTTP client approach.

Includes unit tests for CSRF token extraction and HTML parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 06:09:30 -04:00
5e8e63ade7 test(opnsense): add unit tests for FirewallPairTopology
Tests cover:
- ensure_ready outcome merging (both Success)
- CarpVipScore applies VIPs to both firewalls with correct advskew
- CarpVipScore custom backup_advskew is respected
- CarpVipScore defaults backup_advskew to 100 when unset
- VlanScore uniform delegation applies to both firewalls

Uses httptest mock HTTP servers to intercept OPNsense API calls
without requiring real firewall devices. Adds httptest dev-dependency
to harmony crate and a #[cfg(test)] from_config constructor on
OPNSenseFirewall for test-friendly instantiation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 05:27:24 -04:00
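The advskew behavior those tests pin down can be summarized in a few lines. Field and method names here are assumptions sketched from the commit message, not the real CarpVipScore definition: the primary always advertises with advskew 0, and the backup uses the configured value, defaulting to 100 when unset.

```rust
// Minimal sketch of the advskew rules under test (names assumed).
pub struct CarpVipScore {
    pub backup_advskew: Option<u8>,
}

impl CarpVipScore {
    pub fn advskew_for(&self, is_primary: bool) -> u8 {
        if is_primary {
            0 // primary always wins CARP elections
        } else {
            self.backup_advskew.unwrap_or(100) // default backup skew
        }
    }
}
```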
cb2a650d8b feat(opnsense): add FirewallPairTopology for HA firewall pair management
Introduces a higher-order topology that wraps two OPNSenseFirewall
instances (primary + backup) and orchestrates score application across
both. CARP VIPs get differentiated advskew values (primary=0,
backup=configurable) while all other scores apply identically to both
firewalls.

Includes CarpVipScore, DhcpServer delegation, pair Score impls for all
existing OPNsense scores, and opnsense_from_config() factory method.

Also adds ROADMAP entries for generic firewall trait (10), delegation
macro, integration tests, and named config instances (11).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 05:12:03 -04:00
466a8aafd1 feat(postgresql): add wait_for_ready option to PostgreSQLConfig
Some checks failed
Run Check Script / check (pull_request) Failing after 12s
Add wait_for_ready field (default: true) to PostgreSQLConfig. When
enabled, K8sPostgreSQLInterpret waits for the cluster's -rw service
to exist after applying the Cluster CR, ensuring callers like
get_endpoint() succeed immediately.

This eliminates the retry loop in the harmony_sso example's
deploy_zitadel() -- ZitadelScore now deploys in a single pass because
the PG service is guaranteed to exist before Zitadel's Helm chart
init job tries to connect.

The deploy_zitadel function shrinks from a 5-attempt retry loop to a
simple score.interpret() call.
2026-03-30 08:45:49 -04:00
fabec7ac11 refactor: extract CoreDNSRewriteScore from harmony_sso example
Some checks failed
Run Check Script / check (pull_request) Failing after 11m35s
Move CoreDNS rewrite logic into a reusable Score at
harmony/src/modules/k8s/coredns.rs. The Score patches CoreDNS on
K3sFamily clusters to add name rewrite rules (e.g., mapping
sso.harmony.local to the in-cluster service FQDN).

K3sFamily/Default only, no-op on OpenShift. Idempotent.

The harmony_sso example now uses CoreDNSRewriteScore.interpret()
instead of an inline function.
2026-03-30 07:56:44 -04:00
3e2b8423e8 chore: clean up clippy warnings in zitadel and openbao modules
- Remove unused serde default functions in ZitadelSetupScore
- Replace redundant closures with function references (InterpretError::new)
- Allow dead_code on AppSearchEntry.id (needed for deserialization)
- Fix empty line after doc comment in ZitadelScore
- Remove unneeded return statement in generate_secure_password
2026-03-30 07:46:47 -04:00
c687d4e6b3 docs: add Phase 9 (SSO + Config Hardening) to roadmap
New roadmap phase covering the hardening path for the SSO config
management stack: builder pattern for OpenbaoSecretStore, ZitadelScore
PG readiness fix, CoreDNSRewriteScore, integration tests, and future
capability traits.

Updates current state to reflect implemented Zitadel OIDC integration
and harmony_sso example.
2026-03-30 07:37:24 -04:00
cd48675027 chore: cargo fmt 2026-03-29 09:01:00 -04:00
8cb59cc029 fix: SSO end-to-end fixes for device flow
- OpenbaoSetupScore: verify vault init state before trusting cached
  keys (handles cluster recreation with stale local keys file)
- ZitadelSetupScore: trim PAT whitespace (K8s secret had trailing
  newline that corrupted the Authorization header)
- ZitadelOidcAuth: resolve SSO hostname to 127.0.0.1 via reqwest
  resolve() so device flow works without /etc/hosts entries
- Fix OIDC discovery URL to include port (Zitadel issuer is
  http://sso.harmony.local:8080, not http://sso.harmony.local)

The full SSO flow now works end-to-end: deploy, provision identity,
configure JWT auth, trigger device flow. User sees verification URL
and code in the terminal.
2026-03-29 08:54:28 -04:00
772fcad3d7 refactor(harmony-sso): full SSO flow as default deployment
The example now deploys the complete SSO stack and uses it:

Phase 1: Deploy OpenBao + basic setup (init, unseal, policies, users)
Phase 2: CoreDNS patch + Deploy Zitadel + ZitadelSetupScore (creates
  project + device-code app) + OpenBao JWT auth (with real client_id)
Phase 3: Store config via SSO-authenticated OpenBao (triggers device
  flow on first run, uses cached session on re-run)

Removed --demo and --sso-demo flags. The default run IS the demo.
Kept --skip-zitadel and --cleanup.

On re-run: all deployments are idempotent, cached OIDC session is
reused, config is loaded from OpenBao without login prompt.
2026-03-29 08:37:25 -04:00
80e512caf7 feat(harmony-secret): implement JWT exchange for Zitadel OIDC -> OpenBao
Fix the core SSO authentication flow: instead of storing the Zitadel
access_token as the OpenBao token (which OpenBao doesn't recognize),
exchange the id_token with OpenBao's JWT auth method via
POST /v1/auth/{mount}/login to get a real OpenBao client token.

Changes:
- ZitadelOidcAuth: add openbao_url, jwt_auth_mount, jwt_role fields
- New exchange_jwt_for_openbao_token() method using reqwest (vaultrs
  0.7.4 has no JWT auth module)
- process_token_response() now exchanges id_token when openbao_url is
  set, falls back to access_token for backward compat
- OpenbaoSecretStore::new() accepts optional jwt_role + jwt_auth_mount
- All callers updated (lib.rs, openbao_chain example, harmony_sso)

This implements ADR 020-1 Step 6 (OpenBao JWT exchange).
2026-03-29 08:35:43 -04:00
d0b7c03e12 feat(zitadel): add ZitadelSetupScore for identity provisioning
New Score that provisions identity resources in a deployed Zitadel
instance via the Management API v1:
- Create projects
- Create OIDC applications (device-code grant for CLI/headless)
- Machine user provisioning (stubbed for future iteration)

Authenticates using the admin PAT from the iam-admin-pat K8s secret
(provisioned automatically by the Zitadel Helm chart). No password
extraction or deprecated grant types needed.

All operations are idempotent: checks for existing resources before
creating. Results cached at ~/.local/share/harmony/zitadel/client-config.json.

This is the "day two" counterpart to ZitadelScore, enabling enterprise
automation of identity management (users, machines, applications, groups).
2026-03-29 08:31:49 -04:00
4a66880a84 fix(harmony-k8s): make API discovery cache invalidatable
Replace OnceCell<Discovery> with RwLock<Option<Arc<Discovery>>> so the
cache can be cleared after installing CRDs or operators that register
new API groups.

Add invalidate_discovery() method. Call it in ensure_cnpg_operator()
after confirming the Cluster CRD is registered, so the subsequent
apply() call sees the new CRD without needing a fresh client.

This eliminates the "Cannot resolve GVK" retry loop -- PostgreSQL
Cluster resources now apply on the first attempt after CNPG operator
installation.
2026-03-29 07:30:33 -04:00
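The cache-shape change can be sketched generically. `Discovery` is stood in for by a type parameter here, and the struct below is an illustration of the pattern, not the harmony-k8s implementation: a `OnceCell` can never be cleared, while `RwLock<Option<Arc<T>>>` supports invalidation after new API groups register.

```rust
use std::sync::{Arc, RwLock};

// Sketch of an invalidatable lazy cache (names hypothetical).
pub struct InvalidatableCache<T> {
    slot: RwLock<Option<Arc<T>>>,
}

impl<T> InvalidatableCache<T> {
    pub fn new() -> Self {
        Self { slot: RwLock::new(None) }
    }

    // Return the cached value, computing it with `init` on a miss.
    pub fn get_or_init(&self, init: impl FnOnce() -> T) -> Arc<T> {
        if let Some(v) = self.slot.read().unwrap().as_ref() {
            return Arc::clone(v);
        }
        let v = Arc::new(init());
        *self.slot.write().unwrap() = Some(Arc::clone(&v));
        v
    }

    // Clear the cache so the next access rebuilds it, e.g. after a CRD
    // or operator install registers new API groups.
    pub fn invalidate(&self) {
        *self.slot.write().unwrap() = None;
    }
}
```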
ec1bdbab73 feat(harmony-sso): add CoreDNS rewrite for in-cluster hostname resolution
Patch CoreDNS on K3sFamily to add rewrite rules that map external
hostnames (sso.harmony.local, bao.harmony.local) to cluster service
FQDNs. This allows OpenBao's JWT auth to fetch Zitadel's JWKS from
inside the cluster, where Zitadel validates Host headers against its
ExternalDomain.

Uses apply_dynamic with force_conflicts since the CoreDNS ConfigMap
is owned by the k3d deployer. Restarts CoreDNS pods after patching.
No-op on non-K3sFamily distributions (OpenShift, etc.).

Idempotent: skips patching if rewrite rules already present.
2026-03-29 07:22:54 -04:00
09b704e9cf fix(postgresql): wait for CNPG CRD registration after operator install
The CNPG operator deployment being ready does not guarantee that the
Cluster CRD is registered in the API server's discovery cache. This
caused intermittent "Cannot resolve GVK: postgresql.cnpg.io/v1/Cluster"
errors when applying PostgreSQL Cluster resources immediately after
operator installation.

Add wait_for_crd() to harmony-k8s that polls has_crd() until the CRD
appears (2s interval, 60s timeout). Call it in ensure_cnpg_operator()
after the deployment readiness check.

This eliminates the need for retry loops in callers like harmony_sso.
2026-03-29 07:11:34 -04:00
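The polling shape behind wait_for_crd() can be sketched synchronously. The real method is async and polls has_crd(); this standalone version only shows the retry-until-deadline structure with the same 2s-interval/60s-timeout idea.

```rust
use std::time::{Duration, Instant};

// Synchronous sketch of a poll-until-ready helper (name assumed):
// retry `check` at a fixed interval until it succeeds or the timeout
// elapses.
pub fn wait_until(
    mut check: impl FnMut() -> bool,
    interval: Duration,
    timeout: Duration,
) -> Result<(), String> {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err("timed out waiting for condition".to_string());
        }
        std::thread::sleep(interval);
    }
}
```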
8e3e935459 refactor(harmony-sso): use OpenbaoSetupScore instead of imperative orchestration
Replace ~200 lines of manual init/unseal/configure/jwt-auth code with
a single OpenbaoSetupScore invocation. The deployment path is now:

1. OpenbaoScore (Helm deploy)
2. OpenbaoSetupScore (init, unseal, policies, users, JWT auth)
3. ZitadelScore (CNPG + Helm, with retry)

The example main.rs goes from ~800 lines to ~370 lines. The removed
imperative logic now lives in the reusable OpenbaoSetupScore which can
be tested against any topology.
2026-03-29 06:45:36 -04:00
c388d5234f feat(openbao): add OpenbaoSetupScore for post-deployment lifecycle
New Score that handles the operational complexity of making a deployed
OpenBao instance operational:
- Init (operator init) with local key storage (~/.local/share/harmony/openbao/)
- Unseal (3 of 5 keys)
- Enable KV v2 secrets engine
- Create configurable policies (HCL)
- Enable userpass auth and create users
- Optional JWT auth configuration for OIDC integration

All steps are idempotent. Requires T: Topology + K8sClient.

This encapsulates the tribal knowledge of OpenBao lifecycle management
into a compiled, type-checked Score that can be tested against any
topology (k3d, OpenShift, kubeadm, bare metal).
2026-03-28 23:51:57 -04:00
d9d5ea718f docs: add Score design principles and capability architecture rules
docs/guides/writing-a-score.md:
- Add Design Principles section: capabilities are industry concepts not
  tools, Scores encapsulate operational complexity, idempotency rules,
  no execution order dependencies

CLAUDE.md:
- Add Capability and Score Design Rules section with the swap test:
  if swapping the underlying tool breaks Scores, the capability
  boundary is wrong
2026-03-28 23:48:12 -04:00
5415452f15 refactor(harmony-sso): replace kubectl with typed K8s APIs, add Zitadel deployment
Replace all Command::new("kubectl") calls with harmony-k8s K8sClient
methods:
- wait_for_pod_ready() instead of kubectl get pod jsonpath
- exec_pod_capture_output() for OpenBao init/unseal/configure
- delete_resource<MutatingWebhookConfiguration>() for webhook cleanup
- port_forward() instead of kubectl port-forward subprocess

Thread K3d and K8sClient through all functions instead of
reconstructing context strings. Consolidate path helpers into
harmony_data_dir().

Add Zitadel deployment via ZitadelScore with retry logic for CNPG CRD
registration race and PostgreSQL cluster readiness timing.

Add CLI flags: --demo, --sso-demo, --skip-zitadel, --cleanup.
Add --demo mode: ConfigManager with EnvSource + StoreSource<OpenbaoSecretStore>.
Configure OpenBao with harmony-dev policy, userpass auth, and JWT auth.
2026-03-28 23:48:00 -04:00
b05a341a80 feat(harmony-k8s, k3d): add exec_pod, delete_resource, port_forward, and k3d getters
harmony-k8s:
- exec_pod() and exec_pod_capture_output(): exec commands in pods by
  name (not just label), with proper stdout/stderr capture
- delete_resource<K>(): generic typed delete using ScopeResolver,
  idempotent (404 = Ok)
- port_forward(): native port forwarding via kube-rs Portforwarder +
  tokio TcpListener, replacing kubectl subprocess. Returns
  PortForwardHandle that auto-aborts on drop.

k3d:
- base_dir(), cluster_name(), context_name() public getters

Also adds tokio "net" feature to workspace for TcpListener.
2026-03-28 23:47:42 -04:00
d0252bf1dc wip: harmony_sso example deploying zitadel and openbao seems to be working for config backend!
Some checks failed
Run Check Script / check (pull_request) Failing after 15s
2026-03-28 18:20:01 -04:00
f33d730645 fix(opnsense): improve idempotency in VIP, LAGG, and firewall modules
VIP: Fix subnet matching from starts_with() to exact equality. Previously
"192.168.1.10" would wrongly match a request for "192.168.1.100".

LAGG: Add config diff detection when updating existing LAGGs. Logs a
warning with previous config when protocol, description, or MTU differs
from desired state.

Firewall: Detect duplicate rules with same description and warn. When
multiple rules share a description, updates the first one and logs a
warning suggesting unique descriptions.

7 new tests proving:
- VIP exact subnet match (rejects prefix match, finds exact, mode check)
- Firewall create/update/duplicate/different-description scenarios
2026-03-28 13:48:29 -04:00
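The VIP matching bug is easy to reproduce in isolation. `VipEntry` and the two lookup functions below are illustrative stand-ins, not the opnsense-config types; they show why the prefix match wrongly treated "192.168.1.100" as already configured when "192.168.1.10" existed.

```rust
// Sketch of the idempotency bug: starts_with() vs exact equality.
pub struct VipEntry {
    pub subnet: String,
}

// Old behavior: prefix match, so ".10" claims a request for ".100".
pub fn find_vip_buggy<'a>(vips: &'a [VipEntry], wanted: &str) -> Option<&'a VipEntry> {
    vips.iter().find(|v| wanted.starts_with(&v.subnet))
}

// Fixed behavior: exact equality only.
pub fn find_vip_fixed<'a>(vips: &'a [VipEntry], wanted: &str) -> Option<&'a VipEntry> {
    vips.iter().find(|v| v.subnet == wanted)
}
```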
6040e2394e add claude.md
Some checks failed
Run Check Script / check (pull_request) Failing after 16s
2026-03-28 13:33:04 -04:00
a7f9b1037a refactor: push harmony_types enums all the way down to opnsense-api
Some checks failed
Run Check Script / check (pull_request) Failing after 19s
Move vendor-neutral IaC enums to harmony_types::firewall. Add From impls
in opnsense-api::wire converting harmony_types to generated OPNsense
types. Add typed methods in opnsense-config that accept harmony_types
enums and handle wire conversion internally.

Score layer no longer builds serde_json::json!() bodies — it passes
harmony_types enums directly to opnsense-config typed methods:
  ensure_filter_rule(&FirewallAction, &Direction, &IpProtocol, ...)
  ensure_snat_rule_from(&IpProtocol, &NetworkProtocol, ...)
  ensure_dnat_rule(&IpProtocol, &NetworkProtocol, ...)
  ensure_vip_from(&VipMode, ...)
  ensure_lagg(..., &LaggProtocol, ...)

Type flow: harmony_types → Score → opnsense-config → From<> → generated → wire
No strings cross layer boundaries for typed fields.
2026-03-26 11:07:49 -04:00
b98b2aa3f7 refactor: move IaC enums to harmony_types, translate in opnsense-api
Move vendor-neutral firewall and network types (FirewallAction, Direction,
IpProtocol, NetworkProtocol, VipMode, LaggProtocol) from harmony Score
modules to harmony_types::firewall as industry-standard IaC types.

Display impls use human-readable names (IPv4, CARP, LACP) — not wire
format. OPNsense-specific wire translations live in opnsense-api::wire
via the ToOPNsenseValue trait ("inet", "carp", "lacp").

Dependency chain: harmony_types → opnsense-api → opnsense-config → harmony.
Users import types from harmony_types, translations happen transparently
in the infrastructure layer.

Includes 6 new tests verifying all wire value translations.
2026-03-26 10:11:53 -04:00
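The split between human-readable Display and wire-format translation can be sketched as below. The trait and enum names come from the commit; the variant lists are abridged, and the "ipalias"/"proxyarp" wire strings are assumptions (only "inet", "carp", and "lacp" are confirmed by the commit message).

```rust
use std::fmt;

// Sketch of the layering: Display is human-readable, wire values live
// behind a separate trait in the infrastructure layer.
pub enum IpProtocol { Inet, Inet6 }
pub enum VipMode { IpAlias, Carp, ProxyArp }

impl fmt::Display for IpProtocol {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IpProtocol::Inet => write!(f, "IPv4"),
            IpProtocol::Inet6 => write!(f, "IPv6"),
        }
    }
}

pub trait ToOPNsenseValue {
    fn to_opnsense_value(&self) -> &'static str;
}

impl ToOPNsenseValue for IpProtocol {
    fn to_opnsense_value(&self) -> &'static str {
        match self {
            IpProtocol::Inet => "inet",
            IpProtocol::Inet6 => "inet6",
        }
    }
}

impl ToOPNsenseValue for VipMode {
    fn to_opnsense_value(&self) -> &'static str {
        match self {
            VipMode::IpAlias => "ipalias",   // assumed wire value
            VipMode::Carp => "carp",
            VipMode::ProxyArp => "proxyarp", // assumed wire value
        }
    }
}
```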
1b86c895a5 refactor(opnsense): replace stringly-typed fields with enums across Scores
Some checks failed
Run Check Script / check (pull_request) Failing after 19s
Add shared enums for firewall, NAT, and LAGG Score definitions:
- FirewallAction (Pass, Block, Reject)
- Direction (In, Out)
- IpProtocol (Inet, Inet6) — shared across filter, SNAT, DNAT
- NetworkProtocol (Tcp, Udp, TcpUdp, Icmp, Any) — shared across all rule types
- LaggProtocol (Lacp, Failover, LoadBalance, RoundRobin, None)

Combined with the VipMode enum from the previous commit, all OPNsense
Score definitions now use proper types instead of raw strings. Typos in
mode/action/direction/protocol fields are now compile-time errors.
2026-03-26 00:06:40 -04:00
2a15a0d10b refactor(opnsense): use VipMode enum instead of string for VIP mode
Replace the stringly-typed mode field in VipDef with a VipMode enum
(IpAlias, Carp, ProxyArp). Prevents typos and makes the API discoverable
through IDE autocompletion. The as_api_str() method converts to the wire
format expected by OPNsense.
2026-03-25 23:58:01 -04:00
da90dc55ad chore: cargo fmt across workspace
Some checks failed
Run Check Script / check (pull_request) Failing after 19s
2026-03-25 23:20:57 -04:00
516626a0ce docs: add OPNsense VM integration tutorial and architecture challenges
New use-case tutorial walking newcomers through the full OPNsense VM
integration test: system setup, VM boot, SSH config, running all 11
Scores, and understanding the three-layer architecture.

Add architecture-challenges.md analyzing topology evolution during
deployment, runtime plan/validation phase, and TUI as primary interface.
2026-03-25 23:20:45 -04:00
6c664e9f34 docs(roadmap): add phases 7-8 for OPNsense and HA OKD production
Add Phase 7 (OPNsense & Bare-Metal Network Automation) tracking current
progress on OPNsense Scores, codegen, and Brocade integration. Details
the UpdateHostScore requirement and HostNetworkConfigurationScore rework
needed for LAGG LACP 802.3ad.

Add Phase 8 (HA OKD Production Deployment) describing the target
architecture with LAGG/CARP/multi-WAN/BINAT and validation checklist.

Update current state section to reflect opnsense-codegen branch progress.
2026-03-25 23:20:35 -04:00
082ea8a666 feat(harmony): add duration timing to Score::interpret
Every Score execution now logs its status and elapsed time after
completion. The timing is measured in Score::interpret (the central
execution path) so it applies to all Scores automatically.

Example output:
  [VlanScore] SUCCESS in 0.9s — Created 2 VLANs
  [DhcpScore] SUCCESS in 1.8s — Dhcp execution successful
  [LoadBalancerScore] FAILED after 45.3s — connection refused
2026-03-25 23:20:24 -04:00
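Measuring in one central place can be sketched as a timing wrapper. The function below is illustrative, not the actual harmony Score::interpret signature: wrap the execution once and every Score gets status-plus-elapsed logging for free.

```rust
use std::time::{Duration, Instant};

// Sketch of a central timing wrapper (names assumed, not harmony's API).
pub fn run_timed<T, E>(
    name: &str,
    run: impl FnOnce() -> Result<T, E>,
) -> (Result<T, E>, Duration) {
    let start = Instant::now();
    let outcome = run();
    let elapsed = start.elapsed();
    let status = if outcome.is_ok() { "SUCCESS in" } else { "FAILED after" };
    // Mirrors the example log format: [VlanScore] SUCCESS in 0.9s
    println!("[{name}] {status} {:.1}s", elapsed.as_secs_f64());
    (outcome, elapsed)
}
```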
d33125bba8 feat(okd): automate SCP uploads, implement wait_for_bootstrap_complete
Replace manual scp prompts in bootstrap_02 and ipxe with automated
StaticFilesHttpScore uploads. SCOS installer images and HTTP boot files
now upload via SFTP without operator intervention.

Implement wait_for_bootstrap_complete by shelling out to
openshift-install wait-for bootstrap-complete with stdout/stderr logging.
Previously this was a todo!() that would panic and crash mid-deployment.

Add [Stage 02/Bootstrap] prefixes to all bootstrap_02 log messages.
Improve bootstrap_okd_node outcome to include per-host details with
MAC addresses.
2026-03-25 23:20:16 -04:00
1f0a7ed5a5 feat(opnsense): implement Url::Url support in HTTP and TFTP infra
Replace todo!() in OPNSenseFirewall HTTP and TFTP serve_files with
download-then-upload logic. When a Url::Url is provided, download the
remote file to a temp directory via reqwest, then upload to OPNsense
via the existing SFTP path.

Enables StaticFilesHttpScore and TftpScore to serve files from remote
URLs (e.g. S3) in addition to local folders.
2026-03-25 23:20:07 -04:00
c24fa9315b feat(harmony_assets): S3 credentials, folder upload, 19 tests
Fix S3Store to actually wire access_key_id/secret_access_key from config
into the AWS SDK credential provider. Add force_path_style for custom
endpoints (Ceph, MinIO). Add store_folder() for recursive directory upload.

New CLI command: upload-folder with --public-read/private ACL, env var
fallback for credentials, content-type auto-detection, progress bar.

Fix single-file upload --public-read default (was always true, now false).

Add 19 tests: Asset path computation, LocalStore fetch/cache/404/checksum
with httptest mocks, S3 key extraction, URL generation for custom/AWS
endpoints.
2026-03-25 23:19:58 -04:00
7475e7b75e feat(opnsense): implement remove_static_mapping and list_static_mappings
Wire the existing dnsmasq remove_static_mapping through the OPNSenseFirewall
infra layer. Add list_static_mappings at both config and infra layers for
querying current DHCP host entries. Includes 6 new unit tests with httptest
mocks covering empty, single/multi-MAC, multiple hosts, and skip edge cases.

Foundation for the upcoming UpdateHostScore.
2026-03-25 23:19:47 -04:00
d75ebcbb74 feat(opnsense): VipScore, DnatScore, LaggScore tested with 4-NIC VM
Some checks failed
Run Check Script / check (pull_request) Failing after 16s
Add VIP (IP alias / CARP) and destination NAT (port forwarding) Scores.
Update VM to 4 NICs (LAN, WAN, LAGG member 1, LAGG member 2) so LAGG
can be tested with failover protocol on vtnet2+vtnet3.

All 11 Scores pass end-to-end against OPNsense VM:
- LoadBalancerScore, DhcpScore, TftpScore, NodeExporterScore
- VlanScore (2 VLANs on vtnet0)
- FirewallRuleScore (filter rule with gateway support)
- OutboundNatScore (SNAT), BinatScore (1:1 NAT)
- VipScore (IP alias on LAN)
- DnatScore (port forward 8443→192.168.1.50:443)
- LaggScore (failover LAGG on vtnet2+vtnet3)
2026-03-25 16:59:52 -04:00
cea008e9c9 feat(opnsense): FirewallRuleScore, OutboundNatScore, BinatScore
Add Scores for managing OPNsense new-generation firewall filter rules,
outbound NAT (SNAT), and 1:1 NAT (BINAT) via the REST API.

- opnsense-config: firewall.rs module with idempotent CRUD for filter
  rules, SNAT rules, and BINAT rules (match by description)
- harmony: FirewallRuleScore (with gateway support for multi-WAN),
  OutboundNatScore, BinatScore
- All 3 tested end-to-end against OPNsense VM, idempotent on re-run
- Integration test now exercises 8 Scores total
2026-03-25 16:18:25 -04:00
ac9320fca4 feat(opnsense-codegen): expand custom ArrayField subclasses into full structs
Fix codegen to handle FilterRuleField, SourceNatRuleField, and other
custom *Field types that extend ArrayField. When an XML element has
a custom type AND child elements with type attributes, recursively
parse children into struct fields instead of falling back to
Option<String> stubs.

Also fix hyphenated field names (state-policy → state_policy with
serde rename) and avoid enum name collisions by using the full struct
name as prefix for custom *Field enums.

Regenerated firewall_filter.rs: now has full FirewallFilterRulesRule
(60+ fields including action, direction, gateway, source/dest nets),
FirewallFilterSnatrulesRule, FirewallFilterNptRule,
FirewallFilterOnetooneRule.

New generated modules:
- vip.rs — Virtual IPs (CARP, IP aliases, ProxyARP)
- firewall_alias.rs — Firewall aliases (host, network, port, URL, GeoIP)
- firewall_dnat.rs — Destination NAT / port forwarding rules
2026-03-25 16:00:35 -04:00
2b4c9ac3fb feat(opnsense): VlanScore and LaggScore for network infrastructure
Add VLAN and LAGG management via the OPNsense REST API:

- opnsense-config: vlan.rs and lagg.rs modules with idempotent CRUD
- harmony: VlanScore and LaggScore with OPNSenseFirewall integration
- VlanScore tested end-to-end against OPNsense VM (2 VLANs on vtnet0)
- LaggScore implemented but not VM-testable (needs physical NICs)
- Handle OPNsense select widget fields in VLAN interface responses
- Use direct post_typed calls (addItem/setItem/delItem/reconfigure)
2026-03-25 14:39:30 -04:00
fe22c50122 feat(opnsense): end-to-end validation of all OPNsense Scores
Run LoadBalancerScore, DhcpScore, TftpScore, and NodeExporterScore
against a real OPNsense VM to prove the XML→API migration works.

- Add Router impl for OPNSenseFirewall (gateway + /24 CIDR)
- Fix TFTP/NodeExporter API controller paths (general, not settings)
- Fix TFTP/NodeExporter body wrapper key (general, not module name)
- Fix dnsmasq DHCP range API endpoint (Range, not DhcpRang)
- Fix dnsmasq deserialization for OPNsense select widgets and empty []
- Fix DhcpHostBindingInterpret error propagation (was todo!())
- Expand VM integration example with all 4 Scores + API verification
2026-03-25 14:04:44 -04:00
f8d1f858d0 feat(opnsense): configurable API port, move web GUI to 9443
Add Config::from_credentials_with_api_port() and
OPNSenseFirewall::with_api_port() so the API port is not hardcoded
to 443. This allows running HAProxy on standard ports without
conflicting with the OPNsense web UI.

The integration example now instructs users to change the web GUI
port to 9443 (System > Settings > Administration > TCP Port) as
part of the manual setup, alongside enabling SSH.

The --status command detects whether the API is on 443 or 9443
and advises accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:17:34 -04:00
8a435d2769 docs(opnsense-vm-integration): update README with current status
Document the full workflow, network architecture, manual SSH step,
Docker compatibility, known issues, and future improvements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:07:35 -04:00
095801ac4d fix(opnsense-vm-integration): handle firmware update before package install
When OPNsense is on a base version that needs updating before packages
can install, attempt a firmware update and retry. Use high ports
(16443/18443) for test HAProxy services to avoid conflicting with
the OPNsense web UI on port 443.

Known issue: firmware update on a fresh 26.1 nano image may need
a manual reboot cycle before packages install successfully.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:04:50 -04:00
777213288e fix(opnsense-config): use serde_json::Value for HAProxy config traversal
The hand-written HaproxyGetResponse structs used HashMap which fails
when OPNsense returns [] for empty collections. The generated types
in opnsense-api handle this via opn_map, but opnsense-config had
duplicated structs without that fix.

Replace all hand-written HAProxy response types with serde_json::Value
traversal. This avoids the duplication and handles the []/{} duality.

Also fix integration example:
- Use high ports (16443, 18443) to avoid conflicting with web UI on 443
- Skip package install if already installed
- Use harmony_cli::cli_logger::init() instead of env_logger (safe to
  call multiple times)
- Increase verification timeout to 60s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 11:42:35 -04:00
3fd333caa3 fix(opnsense-vm-integration): detect and fix Docker+libvirt FORWARD conflict
Docker sets iptables FORWARD policy to DROP, which blocks libvirt's
NAT networking (libvirt defaults to nftables which doesn't interact
with Docker's iptables chain).

Fix: setup-libvirt.sh now detects Docker and offers to switch libvirt
to the iptables firewall backend, so both sets of rules coexist.
The --check command warns about this mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 11:08:03 -04:00
c2d817180b refactor(opnsense-vm-integration): clean two-phase workflow
Restructure the example into two clear phases:

Phase 1 (--boot): creates KVM network + VM, waits for web UI,
prints instructions for enabling SSH via the OPNsense GUI.

Phase 2 (default run): checks SSH is reachable, creates API key,
installs HAProxy, runs LoadBalancerScore, verifies via API.

The config.xml injection sets vtnet0=LAN (192.168.1.1) and
vtnet1=WAN (DHCP). SSH must be enabled manually in the web UI
because OPNsense has no REST API for SSH management and the
config.xml injection doesn't reliably enable sshd.

Future: use a pre-customized OPNsense image on S3 for CI.

Also add show_ssh_config example to opnsense-api crate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 10:26:23 -04:00
31c3a52750 feat(opnsense): config.xml injection for nano image + dual NIC setup
Add opnsense::image module for customizing OPNsense nano disk images:
- find_config_offset(): scans raw image for config.xml location
- replace_config_xml(): overwrites config with null-padded replacement
- minimal_config_xml(): generates WAN+LAN config for virtio NICs
- Supports auto-scanning for unknown images

KVM improvements:
- disk_from_path(): attach existing disk images (not just new volumes)
- start_vm() now idempotent (skips if already running)
- cdrom uses SATA bus instead of IDE (q35 compatibility)

Integration example updates:
- LAN on 192.168.1.0/24 (matches OPNsense defaults, host reachable)
- WAN on libvirt default network (internet access)
- Config.xml injection replaces em0/em1 with vtnet0/vtnet1
- API key creation via PHP script (writes to file, avoids escaping)

Status: VM boots, web UI responds at 192.168.1.1, interfaces assigned.
Remaining: SSH enablement in config.xml, API key creation, WAN subnet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 09:30:36 -04:00
2e3af21b61 chore(opnsense-vm-integration): add setup-libvirt.sh script
Interactive script that installs packages, adds user to libvirt group,
starts libvirtd, and creates the default storage pool. Asks before
each step (or run with --yes for non-interactive).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:52:10 -04:00
bc1f8e8a9d feat(opnsense-vm-integration): add --check, --setup, --download subcommands
Add prerequisite checking (libvirtd, group membership, storage pool,
bunzip2) with clear error messages and fix suggestions.

Add --setup to print the exact sudo commands needed for initial setup.
Add --download to pre-fetch and decompress the OPNsense nano image.

Full flow: download image → create network with DHCP → boot VM →
discover IP via libvirt lease → wait for API → create API key via
SSH → install HAProxy + Caddy → run LoadBalancerScore → verify.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:45:41 -04:00
7eef3115e9 feat(kvm): add VM IP discovery, DHCP networks, and OPNsense integration example
KVM module enhancements:
- Add vm_ip() and wait_for_ip() to KvmExecutor using
  Domain::interface_addresses() for DHCP IP discovery
- Add DHCP range and static host entries to NetworkConfig/NetworkConfigBuilder
- Generate DHCP XML in network definitions for libvirt's built-in DHCP
- Export DhcpHost type

OPNsense VM integration example (opnsense-vm-integration):
- Boots OPNsense nano VM via KVM
- Discovers IP via libvirt DHCP lease query
- Creates API key via SSH
- Installs HAProxy + Caddy via firmware API
- Runs LoadBalancerScore (2 services: K8s API + HTTPS)
- Verifies HAProxy configuration via API

22 KVM unit tests pass (3 new DHCP tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:37:13 -04:00
d48200b3d5 docs(kvm): document XML template decision and upstream tracking
Explain why we use string templates for libvirt XML generation and
what the path to typed structs looks like. The best candidate is
libvirt-rust-xml (gen branch) which generates Rust structs from
libvirt's RelaxNG schemas via relaxng-gen, but it doesn't compile
yet (virtxml-domain has 6 errors as of baca481).

Also fix dead code in format_cdrom (redundant device_type branch).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:04:53 -04:00
b18c8d534a feat(kvm): add 17 unit tests and VM examples for all infrastructure patterns
Add comprehensive XML generation tests covering: multi-disk VMs,
multi-NIC configurations, MAC addresses, boot order, memory conversion,
sequential disk naming, custom storage pools, NAT/route/isolated
networks, volume sizing, builder defaults, q35 machine type, and
serial console.

Add kvm-vm-examples binary with 5 scenarios:
- alpine: minimal 512MB VM, fast boot for testing
- ubuntu: standard server with 25GB disk
- worker: multi-disk (60G OS + 2x100G Ceph OSD) for storage nodes
- gateway: dual-NIC (WAN NAT + LAN isolated) for firewall/router
- ha-cluster: full 7-VM deployment (gateway + 3 CP + 3 workers)

Each scenario has clean and status subcommands.

19 KVM unit tests pass (17 new + 2 existing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 21:16:08 -04:00
474e5a8dd2 test(opnsense-api): add 11 e2e tests against real OPNsense instance
Add integration tests that verify the full stack against a real OPNsense
VM. Tests are #[ignore]d by default — run with:

  OPNSENSE_TEST_URL=https://10.99.99.1/api \
  OPNSENSE_TEST_KEY=key OPNSENSE_TEST_SECRET=secret \
  cargo test -p opnsense-api --test e2e_test -- --ignored

Tests cover:
- Firmware: status, package list
- Dnsmasq: settings/get, CRUD host lifecycle, add_static_mapping via config
- HAProxy: settings/get, CRUD server, configure_service + idempotency
- VLAN, WireGuard, Firewall: settings/get

Each test cleans up after itself. Do NOT run against production.

Also make DhcpConfigDnsMasq::new and LoadBalancerConfig::new pub for
external test usage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 20:39:54 -04:00
dd92e15f96 test(opnsense-config): restore unit tests with httptest mocks
Add 14 unit tests covering the critical business logic:

Dnsmasq (11 tests):
- add_static_mapping: create new, update by IP, update by hostname,
  hostname/domain splitting, duplicate MAC handling
- Conflict detection: IP/hostname in different entries, multiple matches
- remove_static_mapping: partial remove, full delete, case insensitivity

Load balancer (3 tests):
- configure_service creates all components (healthcheck→server→backend→frontend)
- Idempotent replacement on same bind address (cascade delete then re-create)
- Isolation between services on different bind addresses

Tests use httptest to mock the OPNsense API — no VM or real firewall needed.
All 100 tests pass across the workspace (0 failures).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 20:33:31 -04:00
c608975d30 feat(opnsense-config): replace XML backend with REST API
Replace opnsense-config-xml dependency with opnsense-api. All
configuration CRUD now goes through the OPNsense REST API instead
of SSH + XML editing of /conf/config.xml.

Key changes:
- Config struct holds OpnsenseClient + SSH shell (for file ops only)
- Module handlers (dnsmasq, haproxy, caddy, tftp, node_exporter) are
  now API-backed with async methods
- apply()/save() are no-ops — each module calls reconfigure after mutations
- install_package uses firmware API with polling
- LoadBalancer uses new domain types (LbFrontend, LbBackend, LbServer,
  LbHealthCheck) instead of XML types, with UUID chaining via API
- Dnsmasq conflict detection logic preserved, adapted for API HashMap
- RwLock<Config> replaced with Arc<Config> — Config is now stateless

Benefits over XML approach:
- Per-module soft reload instead of "reload all services"
- Server-side validation of all changes
- No more hash-based race condition detection
- No more fragile XML schema coupling

SSH retained for: file uploads, PXE config writing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 19:35:40 -04:00
6c9472212c docs(opnsense-api): add README with example usage
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 19:07:32 -04:00
bc4dcdf942 feat(opnsense): upgrade to 26.1.5, handle array select widgets
- Pin vendor/core submodule to 26.1.5 tag (matches running firewall)
- Regenerate dnsmasq from model v1.0.9 (migrated during firmware upgrade)
- Handle array-style select widgets in enum deserialization: OPNsense
  sometimes returns [{value, selected}, ...] instead of {key: {value, selected}}
- Add firmware_upgrade and reboot examples for managing OPNsense updates
- All 7 modules validated against live OPNsense 26.1.5:
  dnsmasq, haproxy, caddy, vlan, lagg, wireguard, firewall

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 19:03:04 -04:00
8a7cbf4836 fix(opnsense-codegen): preserve unknown enum values with Other(String)
Replace lossy enum deserialization (unknown variants → None) with
Other(String) catch-all variant. This ensures unknown wire values
survive round-trips: reading an object and POSTing it back will not
silently destroy field values that the codegen doesn't recognize.

This is critical for data integrity — in a read-modify-write cycle,
dropping an unknown enum value would overwrite it with empty on the
next POST.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 18:17:32 -04:00
4af5e7ac19 feat(opnsense): generate types for all 7 modules with codegen fixes
Generate typed API models for HAProxy, Caddy, Firewall, VLAN, LAGG,
WireGuard (client/server/general), and regenerate Dnsmasq. All core
modules validated against a live OPNsense 26.1.2 instance.

Codegen improvements:
- Add --module-name and --api-key CLI flags for controlling output
  filenames and API response envelope keys
- Fix enum variant names starting with digits (prefix with V)
- Use value="" XML attribute for wire values instead of element names
- Handle unknown *Field types as opn_string (select widget safe)
- Forgiving enum deserialization (warn instead of error on unknown)
- Handle empty arrays in opn_string deserializer

Add per-module examples (list_haproxy, list_caddy, list_vlan, etc.)
and utility examples (raw_get, check_package, install_and_wait).
Extract shared client setup into examples/common/mod.rs.

Fix post_typed sending empty JSON body ({}) instead of no body,
which was causing 400 errors on firmware endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 18:11:02 -04:00
0dc2f94b06 feat(opnsense-api): add CRUD methods and common response types
Add entity-level CRUD operations (get_item, add_item, set_item,
del_item, search_items) and service management (reconfigure,
service_status) to OpnsenseClient. These map directly to OPNsense's
MVC controller patterns.

Add response module with UuidResponse, StatusResponse, and
SearchResponse<T> covering the standard OPNsense API response shapes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:35:40 -04:00
eff75f4118 misc: Add test dnsmasq end to end codegen 2026-03-24 15:29:01 -04:00
f28edb3134 feat(opnsense-codegen): codegen now works for dnsmasq end to end from the model to the api 2026-03-24 15:28:00 -04:00
88e6990051 feat(opnsense-api): examples to list packages and dnsmasq settings now working 2026-03-24 14:07:47 -04:00
8e9f8ce405 wip: opnsense-api crate to replace opnsense-config-xml 2026-03-24 13:26:36 -04:00
d87aa3c7e9 fix opnsense submodule url 2026-03-24 10:51:38 -04:00
90ec2b524a wip(codegen): generates ir and rust code successfully but not really tested yet 2026-03-24 10:23:52 -04:00
5572f98d5f wip(opnsense-codegen): Can now create IR that looks good from example, successfully parses real models too 2026-03-24 09:32:21 -04:00
8024e0d5c3 wip: opnsense codegen 2026-03-24 07:13:53 -04:00
238e7da175 feat: opnsense codegen basic example scaffolded, now we can start implementing real models 2026-03-23 23:27:40 -04:00
bf84bffd57 wip: config + secret merge with e2e sso examples incoming 2026-03-23 23:26:42 -04:00
d4613e42d3 wip: openbao + zitadel e2e setup and test for harmony_config 2026-03-22 21:27:06 -04:00
203 changed files with 54450 additions and 2396 deletions

.gitmodules (vendored)

@@ -1,3 +1,15 @@
[submodule "examples/try_rust_webapp/tryrust.org"]
path = examples/try_rust_webapp/tryrust.org
url = https://github.com/rust-dd/tryrust.org.git
[submodule "/home/jeangab/work/nationtech/harmony2/opnsense-codegen/vendor/core"]
path = /home/jeangab/work/nationtech/harmony2/opnsense-codegen/vendor/core
url = https://github.com/opnsense/core.git
[submodule "/home/jeangab/work/nationtech/harmony2/opnsense-codegen/vendor/plugins"]
path = /home/jeangab/work/nationtech/harmony2/opnsense-codegen/vendor/plugins
url = https://github.com/opnsense/plugins.git
[submodule "opnsense-codegen/vendor/core"]
path = opnsense-codegen/vendor/core
url = https://github.com/opnsense/core.git
[submodule "opnsense-codegen/vendor/plugins"]
path = opnsense-codegen/vendor/plugins
url = https://github.com/opnsense/plugins.git

CLAUDE.md

@@ -0,0 +1,146 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Build & Test Commands
```bash
# Full CI check (check + fmt + clippy + test)
./build/check.sh
# Individual commands
cargo check --all-targets --all-features --keep-going
cargo fmt --check # Check formatting
cargo clippy # Lint
cargo test # Run all tests
# Run a single test
cargo test -p <crate_name> <test_name>
# Run a specific example
cargo run -p <example_crate_name>
# Build the mdbook documentation
mdbook build
```
## What Harmony Is
Harmony is the orchestration framework powering NationTech's vision of **decentralized micro datacenters** — small computing clusters deployed in homes, offices, and community spaces instead of hyperscaler facilities. The goal: make computing cleaner, more resilient, locally beneficial, and resistant to centralized points of failure (including geopolitical threats).
Harmony exists because existing IaC tools (Terraform, Ansible, Helm) are trapped in a **YAML mud pit**: static configuration files validated only at runtime, fragmented across tools, with errors surfacing at 3 AM instead of at compile time. Harmony replaces this entire class of tools with a single Rust codebase where **the compiler catches infrastructure misconfigurations before anything is deployed**.
This is not a wrapper around existing tools. It is a paradigm shift: infrastructure-as-real-code with compile-time safety guarantees that no YAML/HCL/DSL-based tool can provide.
## The Score-Topology-Interpret Pattern
This is the core design pattern. Understand it before touching the codebase.
**Score** — declarative desired state. A Rust struct generic over `T: Topology` that describes *what* you want (e.g., "a PostgreSQL cluster", "DNS records for these hosts"). Scores are serializable, cloneable, idempotent.
**Topology** — infrastructure capabilities. Represents *where* things run and *what the environment can do*. Exposes capabilities as traits (`DnsServer`, `K8sclient`, `HelmCommand`, `LoadBalancer`, `Firewall`, etc.). Examples: `K8sAnywhereTopology` (local K3D or any K8s cluster), `HAClusterTopology` (bare-metal HA with redundant firewalls/switches).
**Interpret** — execution glue. Translates a Score into concrete operations against a Topology's capabilities. Returns an `Outcome` (SUCCESS, NOOP, FAILURE, RUNNING, QUEUED, BLOCKED).
**The key insight — compile-time safety through trait bounds:**
```rust
impl<T: Topology + DnsServer + DhcpServer> Score<T> for DnsScore { ... }
```
The compiler rejects any attempt to use `DnsScore` with a Topology that doesn't implement `DnsServer` and `DhcpServer`. Invalid infrastructure configurations become compilation errors, not runtime surprises.
**Higher-order topologies** compose transparently:
- `FailoverTopology<T>` — primary/replica orchestration
- `DecentralizedTopology<T>` — multi-site coordination
If `T: PostgreSQL`, then `FailoverTopology<T>: PostgreSQL` automatically via blanket impls. Zero boilerplate.
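The trait-bound mechanism can be sketched in a few lines. This is a simplified toy, not the real Harmony API — `LabTopology`, `apply_dns_score`, and the one-method traits are illustrative names standing in for the framework's `Topology`/`DnsServer` traits and `Score` impls:

```rust
// Toy capability traits (the real Harmony traits are richer and async).
trait Topology {
    fn name(&self) -> String;
}
trait DnsServer {
    fn register(&self, host: &str, ip: &str) -> String;
}

struct LabTopology;
impl Topology for LabTopology {
    fn name(&self) -> String {
        "lab".into()
    }
}
impl DnsServer for LabTopology {
    fn register(&self, host: &str, ip: &str) -> String {
        format!("{host} -> {ip}")
    }
}

// The "score" only compiles against topologies that provide DnsServer.
// Passing a topology without that capability is a compile error, not a
// runtime surprise.
fn apply_dns_score<T: Topology + DnsServer>(t: &T) -> String {
    t.register("gateway", "192.168.1.1")
}

fn main() {
    let out = apply_dns_score(&LabTopology);
    assert_eq!(out, "gateway -> 192.168.1.1");
}
```

A topology type that implements `Topology` but not `DnsServer` is rejected at the call site, which is the whole point: the set of trait bounds on a Score is its machine-checked capability contract.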
## Architecture (Hexagonal)
```
harmony/src/
├── domain/ # Core domain — the heart of the framework
│ ├── score.rs # Score trait (desired state)
│ ├── topology/ # Topology trait + implementations
│ ├── interpret/ # Interpret trait + InterpretName enum (25+ variants)
│ ├── inventory/ # Physical infrastructure metadata (hosts, switches, mgmt interfaces)
│ ├── executors/ # Executor trait definitions
│ └── maestro/ # Orchestration engine (registers scores, manages topology state, executes)
├── infra/ # Infrastructure adapters (driven ports)
│ ├── opnsense/ # OPNsense firewall adapter
│ ├── brocade.rs # Brocade switch adapter
│ ├── kube.rs # Kubernetes executor
│ └── sqlx.rs # Database executor
└── modules/ # Concrete deployment modules (23+)
├── k8s/ # Kubernetes (namespaces, deployments, ingress)
├── postgresql/ # CloudNativePG clusters + multi-site failover
├── okd/ # OpenShift bare-metal from scratch
├── helm/ # Helm chart inflation → vanilla K8s YAML
├── opnsense/ # OPNsense (DHCP, DNS, etc.)
├── monitoring/ # Prometheus, Alertmanager, Grafana
├── kvm/ # KVM virtual machine management
├── network/ # Network services (iPXE, TFTP, bonds)
└── ...
```
Domain types to know: `Inventory` (read-only physical infra context), `Maestro<T>` (orchestrator — calls `topology.ensure_ready()` then executes scores), `Outcome` / `InterpretError` (execution results).
## Key Crates
| Crate | Purpose |
|---|---|
| `harmony` | Core framework: domain, infra adapters, deployment modules |
| `harmony_cli` | CLI + optional TUI (`--features tui`) |
| `harmony_config` | Unified config+secret management (env → SQLite → OpenBao → interactive prompt) |
| `harmony_secret` / `harmony_secret_derive` | Secret backends (LocalFile, OpenBao, Infisical) |
| `harmony_execution` | Execution engine |
| `harmony_agent` / `harmony_inventory_agent` | Persistent agent framework (NATS JetStream mesh), hardware discovery |
| `harmony_assets` | Asset management (URLs, local cache, S3) |
| `harmony_composer` | Infrastructure composition tool |
| `harmony-k8s` | Kubernetes utilities |
| `k3d` | Local K3D cluster management |
| `brocade` | Brocade network switch integration |
## OPNsense Crates
The `opnsense-codegen` and `opnsense-api` crates exist because OPNsense's automation ecosystem is poor — no typed API client exists. These are support crates, not the core of Harmony.
- `opnsense-codegen`: XML model files → IR → Rust structs with serde helpers for OPNsense wire format quirks (`opn_bool` for "0"/"1" strings, `opn_u16`/`opn_u32` for string-encoded numbers). Vendor sources are git submodules under `opnsense-codegen/vendor/`.
- `opnsense-api`: Hand-written `OpnsenseClient` + generated model types in `src/generated/`.
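As a hedged illustration of the wire-format quirk those helpers address — these are plain standalone functions, not the actual serde `deserialize_with` adapters in `opnsense-codegen`:

```rust
// OPNsense encodes booleans as the strings "0"/"1" and numbers as
// decimal strings on the wire. Sketch of the opn_bool / opn_u16 idea:
fn opn_bool(raw: &str) -> Option<bool> {
    match raw {
        "0" => Some(false),
        "1" => Some(true),
        _ => None, // anything else is not a valid OPNsense boolean
    }
}

fn opn_u16(raw: &str) -> Option<u16> {
    raw.parse::<u16>().ok()
}

fn main() {
    assert_eq!(opn_bool("1"), Some(true));
    assert_eq!(opn_bool("0"), Some(false));
    assert_eq!(opn_bool("yes"), None);
    assert_eq!(opn_u16("8443"), Some(8443));
}
```

In the generated structs the same conversion runs inside serde, so model fields are typed `bool`/`u16` even though the JSON carries strings.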
## Key Design Decisions (ADRs in docs/adr/)
- **ADR-001**: Rust chosen for type system, refactoring safety, and performance
- **ADR-002**: Hexagonal architecture — domain isolated from adapters
- **ADR-003**: Infrastructure abstractions at domain level, not provider level (no vendor lock-in)
- **ADR-005**: Custom Rust DSL over YAML/Score-spec — real language, Cargo deps, composable
- **ADR-007**: K3D as default runtime (K8s-certified, lightweight, cross-platform)
- **ADR-009**: Helm charts inflated to vanilla K8s YAML, then deployed via existing code paths
- **ADR-015**: Higher-order topologies via blanket trait impls (zero-cost composition)
- **ADR-016**: Agent-based architecture with NATS JetStream for real-time failover and distributed consensus
- **ADR-020**: Unified config+secret management — Rust struct is the schema, resolution chain: env → store → prompt
## Capability and Score Design Rules
**Capabilities are industry concepts, not tools.** A capability trait represents a standard infrastructure need (e.g., `DnsServer`, `LoadBalancer`, `Router`, `CertificateManagement`) that can be fulfilled by different products. OPNsense provides `DnsServer` today; CoreDNS or Route53 could provide it tomorrow. Scores must not break when the backend changes.
**Exception:** When the developer fundamentally needs to know the implementation. `PostgreSQL` is a capability (not `Database`) because the developer writes PostgreSQL-specific SQL and replication configs. Swapping to MariaDB would break the application, not just the infrastructure.
**Test:** If you could swap the underlying tool without rewriting any Score that uses the capability, the boundary is correct.
**Don't name capabilities after tools.** `SecretVault` not `OpenbaoStore`. `IdentityProvider` not `ZitadelAuth`. Think: what is the core developer need that leads to using this tool?
**Scores encapsulate operational complexity.** Move procedural knowledge (init sequences, retry logic, distribution-specific config) into Scores. A high-level example should be ~15 lines, not ~400 lines of imperative orchestration.
**Scores must be idempotent.** Running twice = same result as once. Use create-or-update, handle "already exists" gracefully.
**Scores must not depend on execution order.** Declare capability requirements via trait bounds, don't assume another Score ran first. If Score B needs what Score A provides, Score B should declare that capability as a trait bound.
See `docs/guides/writing-a-score.md` for the full guide.
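The idempotency rule can be sketched as a create-or-update that maps naturally onto NOOP/SUCCESS-style outcomes. A toy in-memory state map stands in for the real topology — `ensure_host` and the string outcomes are illustrative, not framework API:

```rust
use std::collections::HashMap;

// Create-or-update: applying the same desired entry twice leaves the
// state identical to applying it once, and never errors on "exists".
fn ensure_host(state: &mut HashMap<String, String>, host: &str, ip: &str) -> &'static str {
    match state.insert(host.to_string(), ip.to_string()) {
        None => "created",                      // entry did not exist
        Some(prev) if prev == ip => "noop",     // already in desired state
        Some(_) => "updated",                   // existed with a different IP
    }
}

fn main() {
    let mut dns = HashMap::new();
    assert_eq!(ensure_host(&mut dns, "node1", "10.0.0.5"), "created");
    assert_eq!(ensure_host(&mut dns, "node1", "10.0.0.5"), "noop");
    assert_eq!(ensure_host(&mut dns, "node1", "10.0.0.9"), "updated");
}
```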
## Conventions
- **Rust edition 2024**, resolver v2
- **Conventional commits**: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- **Small PRs**: max ~200 lines (excluding generated code), single-purpose
- **License**: GNU AGPL v3
- **Quality bar**: This framework demands high-quality engineering. The type system is a feature, not a burden. Leverage it. Prefer compile-time guarantees over runtime checks. Abstractions should be domain-level, not provider-specific.

Cargo.lock (generated)

@@ -148,7 +148,7 @@ dependencies = [
"bytes",
"bytestring",
"cfg-if",
"cookie",
"cookie 0.16.2",
"derive_more",
"encoding_rs",
"foldhash",
@@ -1681,6 +1681,34 @@ dependencies = [
"version_check",
]
[[package]]
name = "cookie"
version = "0.17.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7efb37c3e1ccb1ff97164ad95ac1606e8ccd35b3fa0a7d99a304c7f4a428cc24"
dependencies = [
"percent-encoding",
"time",
"version_check",
]
[[package]]
name = "cookie_store"
version = "0.20.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "387461abbc748185c3a6e1673d826918b450b87ff22639429c694619a83b6cf6"
dependencies = [
"cookie 0.17.0",
"idna 0.3.0",
"log",
"publicsuffix",
"serde",
"serde_derive",
"serde_json",
"time",
"url",
]
[[package]]
name = "core-foundation"
version = "0.9.4"
@@ -2260,6 +2288,15 @@ dependencies = [
"dirs-sys",
]
[[package]]
name = "dirs"
version = "6.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3e8aa94d75141228480295a7d0e7feb620b1a5ad9f12bc40be62411e38cce4e"
dependencies = [
"dirs-sys",
]
[[package]]
name = "dirs-sys"
version = "0.5.0"
@@ -2597,6 +2634,34 @@ dependencies = [
"url",
]
[[package]]
name = "example-harmony-sso"
version = "0.1.0"
dependencies = [
"anyhow",
"clap",
"directories",
"env_logger",
"harmony",
"harmony-k8s",
"harmony_cli",
"harmony_config",
"harmony_macros",
"harmony_secret",
"harmony_types",
"interactive-parse",
"k3d-rs",
"k8s-openapi",
"kube",
"log",
"reqwest 0.12.28",
"schemars 0.8.22",
"serde",
"serde_json",
"tokio",
"url",
]
[[package]]
name = "example-k8s-drain-node"
version = "0.1.0"
@@ -3486,6 +3551,7 @@ dependencies = [
"helm-wrapper-rs",
"hex",
"http 1.4.0",
"httptest",
"inquire 0.7.5",
"k3d-rs",
"k8s-openapi",
@@ -3495,6 +3561,7 @@ dependencies = [
"log",
"non-blank-string-rs",
"once_cell",
"opnsense-api",
"opnsense-config",
"opnsense-config-xml",
"option-ext",
@@ -3611,6 +3678,7 @@ dependencies = [
"blake3",
"clap",
"directories",
"env_logger",
"futures-util",
"httptest",
"indicatif",
@@ -3768,12 +3836,14 @@ dependencies = [
"lazy_static",
"log",
"pretty_assertions",
"reqwest 0.12.28",
"schemars 0.8.22",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.18",
"tokio",
"url",
"vaultrs",
]
@@ -4322,6 +4392,16 @@ version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b9e0384b61958566e926dc50660321d12159025e767c18e043daf26b70104c39"
[[package]]
name = "idna"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e14ddfc70884202db2244c223200c204c2bda1bc6e0998d11b5e024d657209e6"
dependencies = [
"unicode-bidi",
"unicode-normalization",
]
[[package]]
name = "idna"
version = "1.1.0"
@@ -4779,6 +4859,17 @@ dependencies = [
"tracing",
]
[[package]]
name = "kvm-vm-examples"
version = "0.1.0"
dependencies = [
"clap",
"env_logger",
"harmony",
"log",
"tokio",
]
[[package]]
name = "language-tags"
version = "0.3.2"
@@ -5034,6 +5125,29 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "network_stress_test"
version = "0.1.0"
dependencies = [
"actix-web",
"askama",
"async-stream",
"async-trait",
"brocade",
"chrono",
"env_logger",
"harmony_types",
"log",
"opnsense-api",
"rand 0.9.2",
"russh",
"russh-keys",
"serde",
"serde_json",
"sqlx",
"tokio",
]
[[package]]
name = "newline-converter"
version = "0.2.2"
@@ -5246,6 +5360,43 @@ version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7c87def4c32ab89d880effc9e097653c8da5d6ef28e6b539d313baaacfbafcbe"
[[package]]
name = "opnsense-api"
version = "0.1.0"
dependencies = [
"async-trait",
"base64 0.22.1",
"env_logger",
"harmony_types",
"http 1.4.0",
"inquire 0.7.5",
"log",
"opnsense-config",
"pretty_assertions",
"reqwest 0.12.28",
"serde",
"serde_json",
"thiserror 2.0.18",
"tokio",
"tokio-test",
]
[[package]]
name = "opnsense-codegen"
version = "0.1.0"
dependencies = [
"clap",
"env_logger",
"heck",
"log",
"pretty_assertions",
"quick-xml",
"serde",
"serde_json",
"thiserror 2.0.18",
"toml",
]
[[package]]
name = "opnsense-config"
version = "0.1.0"
@@ -5254,8 +5405,10 @@ dependencies = [
"async-trait",
"chrono",
"env_logger",
"harmony_types",
"httptest",
"log",
"opnsense-config-xml",
"opnsense-api",
"pretty_assertions",
"russh",
"russh-keys",
@@ -5266,6 +5419,7 @@ dependencies = [
"thiserror 1.0.69",
"tokio",
"tokio-stream",
"tokio-test",
"tokio-util",
"uuid",
]
@@ -5288,6 +5442,46 @@ dependencies = [
"yaserde_derive",
]
[[package]]
name = "opnsense-pair-integration"
version = "0.1.0"
dependencies = [
"dirs",
"env_logger",
"harmony",
"harmony_cli",
"harmony_inventory_agent",
"harmony_macros",
"harmony_types",
"log",
"opnsense-api",
"opnsense-config",
"reqwest 0.12.28",
"russh",
"serde_json",
"tokio",
]
[[package]]
name = "opnsense-vm-integration"
version = "0.1.0"
dependencies = [
"dirs",
"env_logger",
"harmony",
"harmony_cli",
"harmony_inventory_agent",
"harmony_macros",
"harmony_types",
"log",
"opnsense-api",
"opnsense-config",
"reqwest 0.12.28",
"russh",
"serde_json",
"tokio",
]
[[package]]
name = "option-ext"
version = "0.2.0"
@@ -5750,12 +5944,38 @@ dependencies = [
"unicode-ident",
]
[[package]]
name = "psl-types"
version = "2.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33cb294fe86a74cbcf50d4445b37da762029549ebeea341421c7c70370f86cac"
[[package]]
name = "publicsuffix"
version = "2.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6f42ea446cab60335f76979ec15e12619a2165b5ae2c12166bef27d283a9fadf"
dependencies = [
"idna 1.1.0",
"psl-types",
]
[[package]]
name = "punycode"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e9e1dcb320d6839f6edb64f7a4a59d39b30480d4d1765b56873f7c858538a5fe"
[[package]]
name = "quick-xml"
version = "0.37.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "331e97a1af0bf59823e6eadffe373d7b27f485be8748f71471c662c1f269b7fb"
dependencies = [
"memchr",
"serde",
]
[[package]]
name = "quinn"
version = "0.11.9"
@@ -6047,6 +6267,8 @@ checksum = "dd67538700a17451e7cba03ac727fb961abb7607553461627b97de0b89cf4a62"
dependencies = [
"base64 0.21.7",
"bytes",
"cookie 0.17.0",
"cookie_store",
"encoding_rs",
"futures-core",
"futures-util",
@@ -8199,7 +8421,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ff67a8a4397373c3ef660812acab3268222035010ab8680ec4215f38ba3d0eed"
dependencies = [
"form_urlencoded",
"idna",
"idna 1.1.0",
"percent-encoding",
"serde",
"serde_derive",


@@ -16,6 +16,7 @@ members = [
"harmony_inventory_agent",
"harmony_secret_derive",
"harmony_secret",
"network_stress_test",
"examples/kvm_okd_ha_cluster",
"examples/example_linux_vm",
"harmony_config_derive",
@@ -25,7 +26,7 @@ members = [
"harmony_agent/deploy",
"harmony_node_readiness",
"harmony-k8s",
"harmony_assets",
"harmony_assets", "opnsense-codegen", "opnsense-api",
]
[workspace.package]
@@ -43,6 +44,7 @@ tokio = { version = "1.40", features = [
"io-util",
"fs",
"macros",
"net",
"rt-multi-thread",
] }
tokio-retry = "0.3.0"
@@ -93,3 +95,4 @@ reqwest = { version = "0.12", features = [
assertor = "0.0.4"
tokio-test = "0.4"
anyhow = "1.0"
clap = { version = "4", features = ["derive"] }


@@ -1,6 +1,6 @@
# Harmony Roadmap
Six phases to take Harmony from working prototype to production-ready open-source project.
Eight phases to take Harmony from working prototype to production-ready open-source project.
| # | Phase | Status | Depends On | Detail |
|---|-------|--------|------------|--------|
@@ -10,17 +10,29 @@ Six phases to take Harmony from working prototype to production-ready open-sourc
| 4 | [Publish to GitHub](ROADMAP/04-publish-github.md) | Not started | 3 | Clean history, set up GitHub as community hub, CI on self-hosted runners |
| 5 | [E2E tests: PostgreSQL & RustFS](ROADMAP/05-e2e-tests-simple.md) | Not started | 1 | k3d-based test harness, two passing E2E tests, CI job |
| 6 | [E2E tests: OKD HA on KVM](ROADMAP/06-e2e-tests-kvm.md) | Not started | 5 | KVM test infrastructure, full OKD installation test, nightly CI |
| 7 | [OPNsense & Bare-Metal Network Automation](ROADMAP/07-opnsense-bare-metal.md) | **In progress** | — | Full OPNsense API coverage, Brocade switch integration, HA cluster network provisioning |
| 8 | [HA OKD Production Deployment](ROADMAP/08-ha-okd-production.md) | Not started | 7 | LAGG/CARP/multi-WAN/BINAT cluster with UpdateHostScore, end-to-end bare-metal automation |
| 9 | [SSO + Config Hardening](ROADMAP/09-sso-config-hardening.md) | **In progress** | 1 | Builder pattern for OpenbaoSecretStore, ZitadelScore PG fix, CoreDNSRewriteScore, integration tests |
## Current State (as of branch `feature/kvm-module`)
## Current State (as of branch `feat/opnsense-codegen`)
- `harmony_config` crate exists with `EnvSource`, `LocalFileSource`, `PromptSource`, `StoreSource`. 12 unit tests. **Zero consumers** in workspace — everything still uses `harmony_secret::SecretManager` directly (19 call sites).
- `harmony_assets` crate exists with `Asset`, `LocalCache`, `LocalStore`, `S3Store`. **No tests. Zero consumers.** The `k3d` crate has its own `DownloadableAsset` with identical functionality and full test coverage.
- `harmony_secret` has `LocalFileSecretStore`, `OpenbaoSecretStore` (token/userpass only), `InfisicalSecretStore`. Works but no Zitadel OIDC integration.
- `harmony_secret` has `LocalFileSecretStore`, `OpenbaoSecretStore` (token/userpass/OIDC device flow + JWT exchange), `InfisicalSecretStore`. Zitadel OIDC integration **implemented** with session caching.
- **SSO example** (`examples/harmony_sso/`): deploys Zitadel + OpenBao on k3d, provisions identity resources, authenticates via device flow, stores config in OpenBao. `OpenbaoSetupScore` and `ZitadelSetupScore` encapsulate day-two operations.
- KVM module exists on this branch with `KvmExecutor`, VM lifecycle, ISO download, two examples (`example_linux_vm`, `kvm_okd_ha_cluster`).
- RustFS module exists on `feat/rustfs` branch (2 commits ahead of master).
- 39 example crates, **zero E2E tests**. Unit tests pass across workspace (~240 tests).
- CI runs `cargo check`, `fmt`, `clippy`, `test` on Gitea. No E2E job.
### OPNsense & Bare-Metal (as of branch `feat/opnsense-codegen`)
- **9 OPNsense Scores** implemented: VlanScore, LaggScore, VipScore, DnatScore, FirewallRuleScore, OutboundNatScore, BinatScore, NodeExporterScore, OPNsenseShellCommandScore. All tested against a 4-NIC VM.
- **opnsense-codegen** pipeline operational: XML → IR → typed Rust structs with serde helpers. 11 generated API modules (26.5K lines).
- **opnsense-config** has 13 modules: DHCP (dnsmasq), DNS, firewall, LAGG, VIP, VLAN, load balancer (HAProxy), Caddy, TFTP, node exporter, and legacy DHCP.
- **Brocade switch integration** on `feat/brocade-client-add-vlans`: full VLAN CRUD, interface speed config, port-channel management, new `BrocadeSwitchConfigurationScore`. Breaking API changes (InterfaceConfig replaces tuples).
- **Missing for production**:
  - `UpdateHostScore` — update the MAC in DHCP for PXE boot, plus host network setup for LAGG LACP 802.3ad
  - `HostNetworkConfigurationScore` rework for LAGG/LACP — it currently only creates bonds on the host and doesn't configure the LAGG on the OPNsense side
  - Merge the brocade branch and adapt `harmony/src/infra/brocade.rs` to its new API
## Guiding Principles
- **Zero-setup first**: A new user clones, runs `cargo run`, gets prompted for config, values persist to local SQLite. No env vars, no external services required.


@@ -16,7 +16,7 @@ Make `harmony_config` production-ready with a seamless first-run experience: clo
- `SqliteSource` — **NEW**: reads/writes to SQLite database
- `PromptSource` — returns `None` / no-op on set (placeholder for TUI integration)
- `StoreSource<S: SecretStore>` — wraps any `harmony_secret::SecretStore` backend
- 24 unit tests (mock source, env, local file, sqlite, prompt, integration)
- 26 unit tests (mock source, env, local file, sqlite, prompt, integration, store graceful fallback)
- Global `CONFIG_MANAGER` static with `init()`, `get()`, `get_or_prompt()`, `set()`
- Two examples: `basic` and `prompting` in `harmony_config/examples/`
- **Zero workspace consumers** — nothing calls `harmony_config` yet
@@ -130,12 +130,461 @@ for source in &self.sources {
### 1.4 Validate Zitadel + OpenBao integration path ⏳
**Status**: Not yet implemented
**Status**: Planning phase - detailed execution plan below
Remaining work:
- Validate that `ConfigManager::new(vec![EnvSource, SqliteSource, StoreSource<Openbao>])` compiles
- When OpenBao is unreachable, chain falls through to SQLite gracefully
- Document target Zitadel OIDC flow as ADR
**Background**: ADR 020-1 documents the target architecture for Zitadel OIDC + OpenBao integration. This task validates the full chain by deploying Zitadel and OpenBao on a local k3d cluster and demonstrating an end-to-end example.
**Architecture Overview**:
```
┌─────────────────────────────────────────────────────────────────────┐
│ Harmony CLI / App │
│ │
│ ConfigManager: │
│ 1. EnvSource ← HARMONY_CONFIG_* env vars (highest priority) │
│ 2. SqliteSource ← ~/.local/share/harmony/config/config.db │
│ 3. StoreSource ← OpenBao (team-scale, via Zitadel OIDC) │
│ │
│ When StoreSource fails (OpenBao unreachable): │
│ → returns Ok(None), chain falls through to SqliteSource │
└─────────────────────────────────────────────────────────────────────┘
┌──────────────────┐ ┌──────────────────┐
│ Zitadel │ │ OpenBao │
│ (IdP + OIDC) │ │ (Secret Store) │
│ │ │ │
│ Device Auth │────JWT──▶│ JWT Auth │
│ Flow (RFC 8628)│ │ Method │
└──────────────────┘ └──────────────────┘
```
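The fall-through behavior in the diagram can be sketched with mock sources. This is an illustrative stand-in, not the real `harmony_config` API: `ConfigSource`, `Unreachable`, and `Fixed` here are simplified types invented for the example.

```rust
// Simplified stand-in for the harmony_config source trait (illustrative only).
trait ConfigSource {
    fn get(&self, key: &str) -> Result<Option<String>, String>;
}

// Models OpenBao being down: a StoreSource maps connection errors to Ok(None)
// so the chain continues instead of aborting.
struct Unreachable;
impl ConfigSource for Unreachable {
    fn get(&self, _key: &str) -> Result<Option<String>, String> {
        Ok(None)
    }
}

// Models a local source (e.g. SQLite) that actually holds a value.
struct Fixed(&'static str, &'static str); // (key, value)
impl ConfigSource for Fixed {
    fn get(&self, key: &str) -> Result<Option<String>, String> {
        Ok((key == self.0).then(|| self.1.to_string()))
    }
}

// First source returning Some wins; Ok(None) falls through to the next one.
fn resolve(sources: &[&dyn ConfigSource], key: &str) -> Option<String> {
    for s in sources {
        if let Ok(Some(v)) = s.get(key) {
            return Some(v);
        }
    }
    None
}

fn main() {
    // The unreachable store is skipped; the SQLite-like source answers.
    let got = resolve(&[&Unreachable, &Fixed("host", "db-host")], "host");
    println!("resolved: {got:?}");
}
```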
**Prerequisites**:
- Docker running (for k3d)
- Rust toolchain (edition 2024)
- Network access to download Helm charts
- `kubectl` (installed automatically with k3d, or pre-installed)
**Step-by-Step Execution Plan**:
#### Step 1: Create k3d cluster for local development
When you run `cargo run -p example-zitadel` (or any example using `K8sAnywhereTopology::from_env()`), Harmony automatically provisions a k3d cluster if one does not exist. By default:
- `use_local_k3d = true` (env: `HARMONY_USE_LOCAL_K3D`, default `true`)
- `autoinstall = true` (env: `HARMONY_AUTOINSTALL`, default `true`)
- Cluster name: **`harmony`** (hardcoded in `K3DInstallationScore::default()`)
- k3d binary is downloaded to `~/.local/share/harmony/k3d/`
- Kubeconfig is merged into `~/.kube/config`, context set to `k3d-harmony`
No manual `k3d cluster create` is needed. If you want to create the cluster manually first:
```bash
# Install k3d (requires sudo or install to user path)
curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
# Create the cluster with the same name Harmony expects
k3d cluster create harmony
kubectl cluster-info --context k3d-harmony
```
**Validation**: `kubectl get nodes --context k3d-harmony` shows 1 server node (k3d default)
**Note**: The existing examples use hardcoded external hostnames (e.g., `sso.sto1.nationtech.io`) for ingress. On a local k3d cluster, these hostnames are not routable. For local development you must either:
- Use `kubectl port-forward` to access services directly
- Configure `/etc/hosts` entries pointing to `127.0.0.1`
- Use a k3d loadbalancer with `--port` mappings
#### Step 2: Deploy Zitadel
Zitadel requires the topology to implement `Topology + K8sclient + HelmCommand + PostgreSQL`. The `K8sAnywhereTopology` satisfies all four.
```bash
cargo run -p example-zitadel
```
**What happens internally** (see `harmony/src/modules/zitadel/mod.rs`):
1. Creates `zitadel` namespace via `K8sResourceScore`
2. Deploys a CNPG PostgreSQL cluster:
   - Name: `zitadel-pg`
   - Instances: **2** (not 1)
   - Storage: 10Gi
   - Namespace: `zitadel`
3. Resolves the internal DB endpoint (`host:port`) from the CNPG cluster
4. Generates a 32-byte alphanumeric masterkey, stores it as Kubernetes Secret `zitadel-masterkey` (idempotent: skips if it already exists)
5. Generates a 16-char admin password (guaranteed 1+ uppercase, lowercase, digit, symbol)
6. Deploys Zitadel Helm chart (`zitadel/zitadel` from `https://charts.zitadel.com`):
   - `chart_version: None` -- **uses latest chart version** (not pinned)
   - No `--wait` flag -- returns before pods are ready
   - Ingress annotations are **OpenShift-oriented** (`route.openshift.io/termination: edge`, `cert-manager.io/cluster-issuer: letsencrypt-prod`). On k3d these annotations are silently ignored.
   - Ingress includes TLS config with `secretName: "{host}-tls"`, which requires cert-manager. Without cert-manager, TLS termination does not happen at the ingress level.
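The class guarantee in step 5 can be sketched as follows. This is illustrative only: the real generator presumably uses a proper CSPRNG, while the xorshift here is deliberately simple and **not** secure, and `gen_password` is a hypothetical name.

```rust
// Sketch: generate a password of `len` chars (assumes len >= 4) with at least
// one uppercase, one lowercase, one digit, and one symbol guaranteed.
fn gen_password(len: usize, mut seed: u64) -> String {
    const UPPER: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const LOWER: &[u8] = b"abcdefghijklmnopqrstuvwxyz";
    const DIGIT: &[u8] = b"0123456789";
    const SYMBOL: &[u8] = b"!@#$%^&*";

    // Toy xorshift PRNG -- NOT cryptographically secure, for illustration only.
    let mut next = move || {
        seed ^= seed << 13;
        seed ^= seed >> 7;
        seed ^= seed << 17;
        seed
    };
    let pick = |set: &[u8], n: u64| set[(n % set.len() as u64) as usize] as char;
    let all: Vec<u8> = [UPPER, LOWER, DIGIT, SYMBOL].concat();

    // Seed one char from each class, then fill the rest from the full set.
    let mut chars: Vec<char> = vec![
        pick(UPPER, next()),
        pick(LOWER, next()),
        pick(DIGIT, next()),
        pick(SYMBOL, next()),
    ];
    while chars.len() < len {
        chars.push(pick(&all, next()));
    }
    // Fisher-Yates shuffle so the guaranteed chars are not always in front.
    for i in (1..chars.len()).rev() {
        chars.swap(i, (next() % (i as u64 + 1)) as usize);
    }
    chars.into_iter().collect()
}

fn main() {
    println!("{}", gen_password(16, 0x9E37_79B9_7F4A_7C15));
}
```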
**Key Helm values set by ZitadelScore**:
- `zitadel.configmapConfig.ExternalDomain`: the `host` field (e.g., `sso.sto1.nationtech.io`)
- `zitadel.configmapConfig.ExternalSecure: true`
- `zitadel.configmapConfig.TLS.Enabled: false` (TLS at ingress, not in Zitadel)
- Admin user: `UserName: "admin"`, Email: **`admin@zitadel.example.com`** (hardcoded, not derived from host)
- Database credentials: injected via `env[].valueFrom.secretKeyRef` from secret `zitadel-pg-superuser` (both user and admin use the same superuser -- there is a TODO to fix this)
**Expected output**:
```
===== ZITADEL DEPLOYMENT COMPLETE =====
Login URL: https://sso.sto1.nationtech.io
Username: admin@zitadel.sso.sto1.nationtech.io
Password: <generated 16-char password>
```
**Note on the success message**: The printed username `admin@zitadel.{host}` does not match the actual configured email `admin@zitadel.example.com`. The actual login username in Zitadel is `admin` (the `UserName` field). This discrepancy exists in the current code.
**Validation on k3d**:
```bash
# Wait for pods to be ready (Helm returns before readiness)
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=zitadel -n zitadel --timeout=300s
# Port-forward to access Zitadel (ingress won't work without proper DNS/TLS on k3d)
kubectl port-forward svc/zitadel -n zitadel 8080:8080
# Access at http://localhost:8080 (note: ExternalSecure=true may cause redirect issues)
```
**Known issues for k3d deployment**:
- `ExternalSecure: true` tells Zitadel to expect HTTPS, but k3d port-forward is HTTP. This may cause redirect loops. Workaround: modify the example to set `ExternalSecure: false` for local dev.
- The CNPG operator must be installed on the cluster. `K8sAnywhereTopology` handles this via the `PostgreSQL` trait implementation, which deploys the operator first.
#### Step 3: Deploy OpenBao
OpenBao requires only `Topology + K8sclient + HelmCommand` (no PostgreSQL dependency).
```bash
cargo run -p example-openbao
```
**What happens internally** (see `harmony/src/modules/openbao/mod.rs`):
1. `OpenbaoScore` directly delegates to `HelmChartScore.create_interpret()` -- there is no custom `execute()` logic, no namespace creation step, no secret generation
2. Deploys OpenBao Helm chart (`openbao/openbao` from `https://openbao.github.io/openbao-helm`):
   - `chart_version: None` -- **uses latest chart version** (not pinned)
   - `create_namespace: true` -- the `openbao` namespace is created by Helm
   - `install_only: false` -- uses `helm upgrade --install`
**Exact Helm values set by OpenbaoScore**:
```yaml
global:
  openshift: true  # <-- PROBLEM: hardcoded, see below
server:
  standalone:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        tls_disable = true
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "file" {
        path = "/openbao/data"
      }
  service:
    enabled: true
  ingress:
    enabled: true
    hosts:
      - host: <host field>  # e.g., openbao.sebastien.sto1.nationtech.io
  dataStorage:
    enabled: true
    size: 10Gi
    storageClass: null  # uses cluster default
    accessMode: ReadWriteOnce
  auditStorage:
    enabled: true
    size: 10Gi
    storageClass: null
    accessMode: ReadWriteOnce
ui:
  enabled: true
```
**Critical issue: `global.openshift: true` is hardcoded.** The OpenBao Helm chart default is `global.openshift: false`. When set to `true`, the chart adjusts security contexts and may create OpenShift Routes instead of standard Kubernetes Ingress resources. **On k3d (vanilla k8s), this will produce resources that may not work correctly.** Before deploying on k3d, this must be overridden.
**Fix required for k3d**: Either:
1. Modify `OpenbaoScore` to accept an `openshift: bool` field (preferred long-term fix)
2. Or for this example, create a custom example that passes `values_overrides` with `global.openshift=false`
**Post-deployment initialization** (manual -- the TODO in `mod.rs` acknowledges this is not automated):
OpenBao starts in a sealed state. You must initialize and unseal it manually. See https://openbao.org/docs/platform/k8s/helm/run/
```bash
# Initialize OpenBao (generates unseal keys + root token)
kubectl exec -n openbao openbao-0 -- bao operator init
# Save the output! It contains 5 unseal keys and the root token.
# Example output:
# Unseal Key 1: abc123...
# Unseal Key 2: def456...
# ...
# Initial Root Token: hvs.xxxxx
# Unseal (requires 3 of 5 keys by default)
kubectl exec -n openbao openbao-0 -- bao operator unseal <key1>
kubectl exec -n openbao openbao-0 -- bao operator unseal <key2>
kubectl exec -n openbao openbao-0 -- bao operator unseal <key3>
```
**Validation**:
```bash
kubectl exec -n openbao openbao-0 -- bao status
# Should show "Sealed: false"
```
**Note**: The ingress has **no TLS configuration** (unlike Zitadel's ingress). Access is HTTP-only unless you configure TLS separately.
#### Step 4: Configure OpenBao for Harmony
Two paths are available depending on the authentication method:
##### Path A: Userpass auth (simpler, for local dev)
The current `OpenbaoSecretStore` supports **token** and **userpass** authentication. It does NOT yet implement the JWT/OIDC device flow described in ADR 020-1.
```bash
# Port-forward to access OpenBao API
kubectl port-forward svc/openbao -n openbao 8200:8200 &
export BAO_ADDR="http://127.0.0.1:8200"
export BAO_TOKEN="<root token from init>"
# Enable KV v2 secrets engine (default mount "secret")
bao secrets enable -path=secret kv-v2
# Enable userpass auth method
bao auth enable userpass
# Create policy granting read/write on harmony/* paths
cat <<'EOF' | bao policy write harmony-dev -
path "secret/data/harmony/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
path "secret/metadata/harmony/*" {
  capabilities = ["list", "read", "delete"]
}
EOF
# Create the user with the policy attached
bao write auth/userpass/users/harmony \
  password="harmony-dev-password" \
  policies="harmony-dev"
```
**Bug in `OpenbaoSecretStore::authenticate_userpass()`**: The `kv_mount` parameter (default `"secret"`) is passed to `vaultrs::auth::userpass::login()` as the auth mount path. This means it calls `POST /v1/auth/secret/login/{username}` instead of the correct `POST /v1/auth/userpass/login/{username}`. **The auth mount and KV mount are conflated into one parameter.**
**Workaround**: Set `OPENBAO_KV_MOUNT=userpass` so the auth call hits the correct mount path. But then KV operations would use mount `userpass` instead of `secret`, which is wrong.
**Proper fix needed**: Split `kv_mount` into two separate parameters: one for the KV v2 engine mount (`secret`) and one for the auth mount (`userpass`). This is a bug in `harmony_secret/src/store/openbao.rs:234`.
**For this example**: Use **token auth** instead of userpass to sidestep the bug:
```bash
# Set env vars for the example
export OPENBAO_URL="http://127.0.0.1:8200"
export OPENBAO_TOKEN="<root token from init>"
export OPENBAO_KV_MOUNT="secret"
```
##### Path B: JWT auth with Zitadel (target architecture, per ADR 020-1)
This is the production path described in the ADR. It requires the device flow code that is **not yet implemented** in `OpenbaoSecretStore`. The current code only supports token and userpass.
When implemented, the flow will be:
1. Enable JWT auth method in OpenBao
2. Configure it to trust Zitadel's OIDC discovery URL
3. Create a role that maps Zitadel JWT claims to OpenBao policies
```bash
# Enable JWT auth
bao auth enable jwt
# Configure JWT auth to trust Zitadel
bao write auth/jwt/config \
oidc_discovery_url="https://<zitadel-host>" \
bound_issuer="https://<zitadel-host>"
# Create role for Harmony developers
bao write auth/jwt/role/harmony-developer \
role_type="jwt" \
bound_audiences="<harmony_client_id>" \
user_claim="email" \
groups_claim="urn:zitadel:iam:org:project:roles" \
policies="harmony-dev" \
ttl="4h" \
max_ttl="24h" \
token_type="service"
```
**Zitadel application setup** (in Zitadel console):
1. Create project: `Harmony`
2. Add application: `Harmony CLI` (Native app type)
3. Enable Device Authorization grant type
4. Set scopes: `openid email profile offline_access`
5. Note the `client_id`
This path is deferred until the device flow is implemented in `OpenbaoSecretStore`.
#### Step 5: Write end-to-end example
The example uses `StoreSource<OpenbaoSecretStore>` with token auth to avoid the userpass mount bug.
**Environment variables required** (from `harmony_secret/src/config.rs`):
| Variable | Required | Default | Notes |
|---|---|---|---|
| `OPENBAO_URL` | Yes | None | Falls back to `VAULT_ADDR` |
| `OPENBAO_TOKEN` | For token auth | None | Root or user token |
| `OPENBAO_USERNAME` | For userpass | None | Requires `OPENBAO_PASSWORD` too |
| `OPENBAO_PASSWORD` | For userpass | None | |
| `OPENBAO_KV_MOUNT` | No | `"secret"` | KV v2 engine mount path. **Also used as userpass auth mount -- this is a bug.** |
| `OPENBAO_SKIP_TLS` | No | `false` | Set `"true"` to disable TLS verification |
**Note**: `OpenbaoSecretStore::new()` is `async` and **requires a running OpenBao** at construction time (it validates the token if using cached auth). If OpenBao is unreachable during construction, the call will fail. The graceful fallback only applies to `StoreSource::get()` calls after construction -- the `ConfigManager` must be built with a live store, or the store must be wrapped in a lazy initialization pattern.
```rust
// harmony_config/examples/openbao_chain.rs
use harmony_config::{ConfigManager, EnvSource, SqliteSource, StoreSource};
use harmony_secret::OpenbaoSecretStore;
use serde::{Deserialize, Serialize};
use std::sync::Arc;

#[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq)]
struct AppConfig {
    host: String,
    port: u16,
}

impl harmony_config::Config for AppConfig {
    const KEY: &'static str = "AppConfig";
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    env_logger::init();

    // Build the source chain
    let env_source: Arc<dyn harmony_config::ConfigSource> = Arc::new(EnvSource);
    let sqlite = Arc::new(
        SqliteSource::default()
            .await
            .expect("Failed to open SQLite"),
    );

    // OpenBao store -- requires OPENBAO_URL and OPENBAO_TOKEN env vars.
    // Falls back gracefully if OpenBao is unreachable at query time.
    let openbao_url = std::env::var("OPENBAO_URL")
        .or_else(|_| std::env::var("VAULT_ADDR"))
        .ok();

    let sources: Vec<Arc<dyn harmony_config::ConfigSource>> = if let Some(url) = openbao_url {
        let kv_mount = std::env::var("OPENBAO_KV_MOUNT")
            .unwrap_or_else(|_| "secret".to_string());
        let skip_tls = std::env::var("OPENBAO_SKIP_TLS")
            .map(|v| v == "true")
            .unwrap_or(false);
        match OpenbaoSecretStore::new(
            url,
            kv_mount,
            skip_tls,
            std::env::var("OPENBAO_TOKEN").ok(),
            std::env::var("OPENBAO_USERNAME").ok(),
            std::env::var("OPENBAO_PASSWORD").ok(),
        )
        .await
        {
            Ok(store) => {
                let store_source = Arc::new(StoreSource::new("harmony".to_string(), store));
                vec![env_source, Arc::clone(&sqlite) as _, store_source]
            }
            Err(e) => {
                eprintln!("Warning: OpenBao unavailable ({e}), using local sources only");
                vec![env_source, sqlite]
            }
        }
    } else {
        println!("No OPENBAO_URL set, using local sources only");
        vec![env_source, sqlite]
    };

    let manager = ConfigManager::new(sources);

    // Scenario 1: get() with nothing stored -- returns NotFound
    let result = manager.get::<AppConfig>().await;
    println!("Get (empty): {:?}", result);

    // Scenario 2: set() then get()
    let config = AppConfig {
        host: "production.example.com".to_string(),
        port: 443,
    };
    manager.set(&config).await?;
    println!("Set: {:?}", config);

    let retrieved = manager.get::<AppConfig>().await?;
    println!("Get (after set): {:?}", retrieved);
    assert_eq!(config, retrieved);

    println!("End-to-end chain validated!");
    Ok(())
}
```
**Key behaviors demonstrated**:
1. **Graceful construction fallback**: If `OPENBAO_URL` is not set or OpenBao is unreachable at startup, the chain is built without it
2. **Graceful query fallback**: `StoreSource::get()` returns `Ok(None)` on any error, so the chain continues to SQLite
3. **Environment override**: `HARMONY_CONFIG_AppConfig='{"host":"env-host","port":9090}'` bypasses all backends
#### Step 6: Validate graceful fallback
Already validated via unit tests (26 tests pass):
- `test_store_source_error_falls_through_to_sqlite` -- `StoreSource` with `AlwaysErrorStore` returns connection error, chain falls through to `SqliteSource`
- `test_store_source_not_found_falls_through_to_sqlite` -- `StoreSource` returns `NotFound`, chain falls through to `SqliteSource`
**Code path (FIXED in `harmony_config/src/source/store.rs`)**:
```rust
// StoreSource::get() -- returns Ok(None) on ANY error, allowing chain to continue
match self.store.get_raw(&self.namespace, key).await {
    Ok(bytes) => { /* deserialize and return */ Ok(Some(value)) }
    Err(SecretStoreError::NotFound { .. }) => Ok(None),
    Err(_) => Ok(None), // Connection errors, timeouts, etc.
}
```
#### Step 7: Known issues and blockers
| Issue | Location | Severity | Status |
|---|---|---|---|
| `global.openshift: true` hardcoded | `harmony/src/modules/openbao/mod.rs:32` | **Blocker for k3d** | ✅ Fixed: Added `openshift: bool` field to `OpenbaoScore` (defaults to `false`) |
| `kv_mount` used as auth mount path | `harmony_secret/src/store/openbao.rs:234` | **Bug** | ✅ Fixed: Added separate `auth_mount` parameter; added `OPENBAO_AUTH_MOUNT` env var |
| Admin email hardcoded `admin@zitadel.example.com` | `harmony/src/modules/zitadel/mod.rs:314` | Minor | Cosmetic mismatch with success message |
| `ExternalSecure: true` hardcoded | `harmony/src/modules/zitadel/mod.rs:306` | **Issue for k3d** | ✅ Fixed: Zitadel now detects Kubernetes distribution and uses appropriate settings (OpenShift = TLS + cert-manager annotations, k3d = plain nginx ingress without TLS) |
| No Helm chart version pinning | Both modules | Risk | Non-deterministic deploys |
| No `--wait` on Helm install | `harmony/src/modules/helm/chart.rs` | UX | Must manually wait for readiness |
| `get_version()`/`get_status()` are `todo!()` | Both modules | Panic risk | Do not call these methods |
| JWT/OIDC device flow not implemented | `harmony_secret/src/store/openbao.rs` | **Gap** | ✅ Implemented: `ZitadelOidcAuth` in `harmony_secret/src/store/zitadel.rs` |
| `HARMONY_SECRET_NAMESPACE` panics if not set | `harmony_secret/src/config.rs:5` | Runtime panic | Only affects `SecretManager`, not `StoreSource` directly |
**Remaining work**:
- [x] `StoreSource<OpenbaoSecretStore>` integration validates compilation
- [x] StoreSource returns `Ok(None)` on connection error (not `Err`)
- [x] Graceful fallback tests pass when OpenBao is unreachable (2 new tests)
- [x] Fix `global.openshift: true` in `OpenbaoScore` for k3d compatibility
- [x] Fix `kv_mount` / auth mount conflation bug in `OpenbaoSecretStore`
- [x] Create and test `harmony_config/examples/openbao_chain.rs` against real k3d deployment
- [x] Implement JWT/OIDC device flow in `OpenbaoSecretStore` (ADR 020-1) — `ZitadelOidcAuth` implemented and wired into `OpenbaoSecretStore::new()` auth chain
- [x] Fix Zitadel distribution detection — Zitadel now uses `k8s_client.get_k8s_distribution()` to detect OpenShift vs k3d and applies appropriate Helm values (TLS + cert-manager for OpenShift, plain nginx for k3d)
### 1.5 UX validation checklist ⏳
@@ -153,8 +602,8 @@ Remaining work:
- [x] Fix `get_or_prompt` to persist to first writable source (via `should_persist()`), not all sources
- [x] Integration tests for full resolution chain
- [x] Branch-switching deserialization failure test
- [ ] `StoreSource<OpenbaoSecretStore>` integration validated (compiles, graceful fallback)
- [ ] ADR for Zitadel OIDC target architecture
- [x] `StoreSource<OpenbaoSecretStore>` integration validated (compiles, graceful fallback)
- [x] ADR for Zitadel OIDC target architecture
- [ ] Update docs to reflect final implementation and behavior
## Key Implementation Notes
@@ -168,3 +617,7 @@ Remaining work:
4. **Env var precedence**: Environment variables always take precedence over SQLite in the resolution chain
5. **Testing**: All tests use `tempfile::NamedTempFile` for temporary database paths, ensuring test isolation
6. **Graceful fallback**: `StoreSource::get()` returns `Ok(None)` on any error (connection refused, timeout, etc.), allowing the chain to fall through to the next source. This ensures OpenBao unavailability doesn't break the config chain.
7. **StoreSource errors don't block chain**: When OpenBao is unreachable, `StoreSource::get()` returns `Ok(None)` and the `ConfigManager` continues to the next source (typically `SqliteSource`). This is validated by `test_store_source_error_falls_through_to_sqlite` and `test_store_source_not_found_falls_through_to_sqlite`.


@@ -0,0 +1,57 @@
# Phase 7: OPNsense & Bare-Metal Network Automation
## Goal
Complete the OPNsense API coverage and Brocade switch integration to enable fully automated bare-metal HA cluster provisioning with LAGG, CARP VIP, multi-WAN, and BINAT.
## Status: In Progress
### Done
- opnsense-codegen pipeline: XML model parsing, IR generation, Rust code generation with serde helpers
- 11 generated API modules covering firewall, interfaces (VLAN, LAGG, VIP), HAProxy, DNSMasq, Caddy, WireGuard
- 9 OPNsense Scores: VlanScore, LaggScore, VipScore, DnatScore, FirewallRuleScore, OutboundNatScore, BinatScore, NodeExporterScore, OPNsenseShellCommandScore
- 13 opnsense-config modules with high-level Rust APIs
- E2E tests for DNSMasq CRUD, HAProxy service lifecycle, interface settings
- Brocade branch with VLAN CRUD, interface speed config, port-channel management
### Remaining
#### UpdateHostScore (new)
A Score that updates a host's configuration in the DHCP server and prepares it for PXE boot. Core responsibilities:
1. **Update MAC address in DHCP**: When hardware is replaced or NICs are swapped, update the DHCP static mapping with the new MAC address(es). This is the most critical function — without it, PXE boot targets the wrong hardware.
2. **Configure PXE boot options**: Set next-server, boot filename (BIOS/UEFI/iPXE) for the specific host.
3. **Host network setup for LAGG LACP 802.3ad**: Configure the host's network interfaces for link aggregation. This replaces the current `HostNetworkConfigurationScore` approach which only handles bond creation on the host side — the new approach must also create the corresponding LAGG interface on OPNsense and configure the Brocade switch port-channel with LACP.
The existing `DhcpHostBindingScore` handles bulk MAC-to-IP registration but lacks the ability to _update_ an existing mapping (the `remove_static_mapping` and `list_static_mappings` methods on `OPNSenseFirewall` are still `todo!()`).
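The update semantics needed here can be sketched with an in-memory mock. The `DhcpTable` type and its methods are hypothetical stand-ins for what `OPNSenseFirewall` would expose once `list_static_mappings` and `remove_static_mapping` are filled in; only the upsert pattern itself is the point.

```rust
use std::collections::HashMap;

// Mock DHCP static-mapping table (hypothetical; the real calls would go
// through OPNSenseFirewall against the OPNsense API).
#[derive(Default)]
struct DhcpTable {
    by_ip: HashMap<String, String>, // ip -> mac
}

impl DhcpTable {
    fn list_static_mappings(&self) -> Vec<(String, String)> {
        self.by_ip.iter().map(|(ip, mac)| (ip.clone(), mac.clone())).collect()
    }
    fn remove_static_mapping(&mut self, ip: &str) {
        self.by_ip.remove(ip);
    }
    fn add_static_mapping(&mut self, ip: &str, mac: &str) {
        self.by_ip.insert(ip.to_string(), mac.to_string());
    }
    // Idempotent upsert: swapping a NIC means the same IP gets a new MAC.
    // Running it twice with the same arguments is a no-op.
    fn upsert_static_mapping(&mut self, ip: &str, mac: &str) {
        if self.by_ip.get(ip).map(String::as_str) == Some(mac) {
            return; // mapping already correct
        }
        self.remove_static_mapping(ip);
        self.add_static_mapping(ip, mac);
    }
}

fn main() {
    let mut table = DhcpTable::default();
    table.upsert_static_mapping("192.168.1.10", "aa:bb:cc:dd:ee:01");
    table.upsert_static_mapping("192.168.1.10", "aa:bb:cc:dd:ee:02"); // NIC swapped
    println!("{:?}", table.list_static_mappings());
}
```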
#### Merge Brocade branch
The `feat/brocade-client-add-vlans` branch has breaking API changes:
- `configure_interfaces` now takes `Vec<InterfaceConfig>` instead of `Vec<(String, PortOperatingMode)>`
- `InterfaceType` changed from `Ethernet(String)` to specific variants (TenGigabitEthernet, FortyGigabitEthernet)
- `harmony/src/infra/brocade.rs` needs adaptation to the new API
#### HostNetworkConfigurationScore rework
The current implementation (`harmony/src/modules/okd/host_network.rs`) has documented limitations:
- Not idempotent (running twice may duplicate bond configs)
- No rollback logic
- Doesn't wait for switch config propagation
- All tests are `#[ignore]` due to requiring interactive TTY (inquire prompts)
- Doesn't create LAGG on OPNsense — only bonds on the host and port-channels on the switch
For LAGG LACP 802.3ad the flow needs to be:
1. Create LAGG interface on OPNsense (LaggScore already exists)
2. Create port-channel on Brocade switch (BrocadeSwitchConfigurationScore)
3. Configure bond on host via NMState (existing NetworkManager)
4. All three must be coordinated and idempotent
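The "coordinated and idempotent" requirement boils down to a check-then-apply pattern shared by all three steps. A minimal sketch, with a hypothetical `EnsureStep` trait standing in for the OPNsense, Brocade, and NMState operations:

```rust
// Hypothetical trait: each step can report whether it is already applied,
// so a re-run skips it (idempotency) while preserving ordering.
trait EnsureStep {
    fn name(&self) -> &str;
    fn is_applied(&self) -> bool;
    fn apply(&mut self);
}

struct Step { label: String, applied: bool }
impl EnsureStep for Step {
    fn name(&self) -> &str { &self.label }
    fn is_applied(&self) -> bool { self.applied }
    fn apply(&mut self) { self.applied = true; }
}

// Run steps in order; already-applied steps are skipped.
// Returns the names of the steps actually applied on this run.
fn ensure_all(steps: &mut [Box<dyn EnsureStep>]) -> Vec<String> {
    let mut changed = Vec::new();
    for step in steps.iter_mut() {
        if !step.is_applied() {
            step.apply();
            changed.push(step.name().to_string());
        }
    }
    changed
}

fn main() {
    let mut steps: Vec<Box<dyn EnsureStep>> = vec![
        Box::new(Step { label: "opnsense-lagg".into(), applied: false }),
        Box::new(Step { label: "brocade-port-channel".into(), applied: false }),
        Box::new(Step { label: "host-bond".into(), applied: false }),
    ];
    println!("first run applied:  {:?}", ensure_all(&mut steps));
    println!("second run applied: {:?}", ensure_all(&mut steps)); // no-op
}
```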
#### Fill remaining OPNsense `todo!()` stubs
- `OPNSenseFirewall::remove_static_mapping` — needed by UpdateHostScore
- `OPNSenseFirewall::list_static_mappings` — needed for idempotent updates
- `OPNSenseFirewall::Firewall` trait (add_rule, remove_rule, list_rules) — stub only
- `OPNSenseFirewall::dns::register_dhcp_leases` — stub only


@@ -0,0 +1,56 @@
# Phase 8: HA OKD Production Deployment
## Goal
Deploy a production HAClusterTopology OKD cluster in UPI mode with full LAGG LACP 802.3ad, CARP VIP, multi-WAN, and BINAT for customer traffic — entirely automated through Harmony Scores.
## Status: Not Started
## Prerequisites
- Phase 7 (OPNsense & Bare-Metal) substantially complete
- Brocade branch merged and adapted
- UpdateHostScore implemented and tested
## Deployment Stack
### Network Layer (OPNsense)
- **LAGG interfaces** (802.3ad LACP) for all cluster hosts — redundant links via LaggScore
- **CARP VIPs** for high availability — failover IPs via VipScore
- **Multi-WAN** configuration — multiple uplinks with gateway groups
- **BINAT** for customer-facing IPs — 1:1 NAT via BinatScore
- **Firewall rules** per-customer with proper source/dest filtering via FirewallRuleScore
- **Outbound NAT** for cluster egress via OutboundNatScore
### Switch Layer (Brocade)
- **VLAN** per network segment (management, cluster, customer, storage)
- **Port-channels** (LACP) matching OPNsense LAGG interfaces
- **Interface speed** configuration for 10G/40G links
### Host Layer
- **PXE boot** via UpdateHostScore (MAC → DHCP → TFTP → iPXE → SCOS)
- **Network bonds** (LACP) via reworked HostNetworkConfigurationScore
- **NMState** for persistent bond configuration on OpenShift nodes
### Cluster Layer
- OKD UPI installation via existing OKDSetup01-04 Scores
- HAProxy load balancer for API and ingress via LoadBalancerScore
- DNS via OKDDnsScore
- Monitoring via NodeExporterScore + Prometheus stack
## New Scores Needed
1. **UpdateHostScore** — Update MAC in DHCP, configure PXE boot, prepare host network for LAGG LACP
2. **MultiWanScore** — Configure OPNsense gateway groups for multi-WAN failover
3. **CustomerBinatScore** (optional) — Higher-level Score combining BinatScore + FirewallRuleScore + DnatScore per customer
## Validation Checklist
- [ ] All hosts PXE boot successfully after MAC update
- [ ] LAGG/LACP active on all host links (verify via `teamdctl` or `nmcli`)
- [ ] CARP VIPs fail over within expected time window
- [ ] BINAT customers reachable from external networks
- [ ] Multi-WAN failover tested (pull one uplink, verify traffic shifts)
- [ ] Full OKD installation completes end-to-end
- [ ] Cluster API accessible via CARP VIP
- [ ] Customer workloads routable via BINAT


@@ -0,0 +1,125 @@
# Phase 9: SSO + Config System Hardening
## Goal
Make the Zitadel + OpenBao SSO config management stack production-ready, well-tested, and reusable across deployments. The `harmony_sso` example demonstrates the full loop: deploy infrastructure, authenticate via SSO, store and retrieve config -- all in one `cargo run`.
## Current State (as of `feat/opnsense-codegen`)
The SSO example works end-to-end:
- k3d cluster + OpenBao + Zitadel deployed via Scores
- `OpenbaoSetupScore`: init, unseal, policies, userpass, JWT auth
- `ZitadelSetupScore`: project + device-code app provisioning via Management API (PAT auth)
- JWT exchange: Zitadel id_token → OpenBao client token via `/v1/auth/jwt/login`
- Device flow triggers in terminal, user logs in via browser, config stored in OpenBao KV v2
- CoreDNS patched for in-cluster hostname resolution (K3sFamily only)
- Discovery cache invalidation after CRD installation
- Session caching with TTL
### What's solid
- **Score composition**: 4 Scores orchestrate the full stack in ~280 lines
- **Config trait**: clean `Serialize + Deserialize + JsonSchema`, developer doesn't see OpenBao or Zitadel
- **Auth chain transparency**: token → cached session → OIDC device flow → userpass; the right fallback happens automatically
- **Idempotency**: all Scores safe to re-run, cached sessions skip login
### What needs work
See tasks below.
## Tasks
### 9.1 Builder pattern for `OpenbaoSecretStore` — HIGH
**Problem**: `OpenbaoSecretStore::new()` has 11 positional arguments. Adding JWT params made it worse. Callers pass `None, None, None, None` for unused options.
**Fix**: Replace with a builder:
```rust
OpenbaoSecretStore::builder()
.url("http://127.0.0.1:8200")
.kv_mount("secret")
.skip_tls(true)
.zitadel_sso("http://sso.harmony.local:8080", "client-id-123")
.jwt_auth("harmony-developer", "jwt")
.build()
.await?
```
**Impact**: All callers updated (lib.rs, openbao_chain example, harmony_sso example). Breaking API change.
**Files**: `harmony_secret/src/store/openbao.rs`, all callers
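A minimal sketch of what the builder could look like. Field and method names are assumptions modeled on the call chain above, not the real `harmony_secret` API; `build()` validates required fields so callers never pass `None, None, None, None`:

```rust
// Hypothetical builder shape for OpenbaoSecretStore (names assumed).
#[derive(Debug)]
pub struct OpenbaoSecretStore {
    pub url: String,
    pub kv_mount: String,
    pub skip_tls: bool,
    pub zitadel_sso: Option<(String, String)>, // (issuer URL, client id)
    pub jwt_auth: Option<(String, String)>,    // (role, auth mount)
}

#[derive(Debug, Default)]
pub struct OpenbaoSecretStoreBuilder {
    url: Option<String>,
    kv_mount: Option<String>,
    skip_tls: bool,
    zitadel_sso: Option<(String, String)>,
    jwt_auth: Option<(String, String)>,
}

impl OpenbaoSecretStoreBuilder {
    pub fn url(mut self, url: &str) -> Self {
        self.url = Some(url.to_string());
        self
    }
    pub fn kv_mount(mut self, mount: &str) -> Self {
        self.kv_mount = Some(mount.to_string());
        self
    }
    pub fn skip_tls(mut self, skip: bool) -> Self {
        self.skip_tls = skip;
        self
    }
    pub fn zitadel_sso(mut self, issuer: &str, client_id: &str) -> Self {
        self.zitadel_sso = Some((issuer.to_string(), client_id.to_string()));
        self
    }
    pub fn jwt_auth(mut self, role: &str, mount: &str) -> Self {
        self.jwt_auth = Some((role.to_string(), mount.to_string()));
        self
    }
    // Required fields fail fast with a clear error; optional ones default.
    pub fn build(self) -> Result<OpenbaoSecretStore, String> {
        Ok(OpenbaoSecretStore {
            url: self.url.ok_or("url is required")?,
            kv_mount: self.kv_mount.unwrap_or_else(|| "secret".to_string()),
            skip_tls: self.skip_tls,
            zitadel_sso: self.zitadel_sso,
            jwt_auth: self.jwt_auth,
        })
    }
}
```

The real `build()` is `async` (it opens the store); the validation logic stays the same.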
### 9.2 Fix ZitadelScore PG readiness — HIGH
**Problem**: `ZitadelScore` calls `topology.get_endpoint()` immediately after deploying the CNPG Cluster CR. The PG `-rw` service takes 15-30s to appear. This forces a retry loop in the caller (the example).
**Fix**: Add a wait loop inside `ZitadelScore`'s interpret, after `topology.deploy(&pg_config)`, that polls for the `-rw` service to exist before calling `get_endpoint()`. Use `K8sClient::get_resource::<Service>()` with a poll loop.
**Impact**: Eliminates the retry wrapper in the harmony_sso example and any other Zitadel consumer.
**Files**: `harmony/src/modules/zitadel/mod.rs`
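The wait loop boils down to a generic poll helper; the sketch below is illustrative (names and signature assumed). Inside the Score it would be async, using `tokio::time::sleep` and `K8sClient::get_resource::<Service>()` as the check:

```rust
use std::time::Duration;

// Poll `check` until it yields a value or attempts are exhausted.
fn poll_until<T>(
    mut check: impl FnMut() -> Option<T>,
    attempts: u32,
    delay: Duration,
) -> Result<T, String> {
    for i in 0..attempts {
        if let Some(found) = check() {
            return Ok(found);
        }
        if i + 1 < attempts {
            std::thread::sleep(delay); // tokio::time::sleep(...).await in the real Score
        }
    }
    Err(format!("resource did not appear after {attempts} attempts"))
}
```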
### 9.3 `CoreDNSRewriteScore` — MEDIUM
**Problem**: CoreDNS patching logic lives in the harmony_sso example. It's a general pattern: any service with ingress-based Host routing needs in-cluster DNS resolution.
**Fix**: Extract into `harmony/src/modules/k8s/coredns.rs` as a proper Score:
```rust
pub struct CoreDNSRewriteScore {
pub rewrites: Vec<(String, String)>, // (hostname, service FQDN)
}
impl<T: Topology + K8sclient> Score<T> for CoreDNSRewriteScore { ... }
```
K3sFamily only. No-op on OpenShift. Idempotent.
**Files**: `harmony/src/modules/k8s/coredns.rs` (new), `harmony/src/modules/k8s/mod.rs`
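For reference, the rewrites map onto CoreDNS `rewrite name` directives. A sketch of the rendering step (the real Score would patch the `coredns` ConfigMap; this only shows the text form, and the helper name is assumed):

```rust
// Render (hostname, service FQDN) pairs as Corefile rewrite directives.
fn render_rewrites(rewrites: &[(String, String)]) -> String {
    rewrites
        .iter()
        .map(|(hostname, service_fqdn)| format!("rewrite name {hostname} {service_fqdn}"))
        .collect::<Vec<_>>()
        .join("\n")
}
```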
### 9.4 Integration tests for Scores — MEDIUM
**Problem**: Zero tests for `OpenbaoSetupScore`, `ZitadelSetupScore`, `CoreDNSRewriteScore`. The Scores are testable against a running k3d cluster.
**Fix**: Add `#[ignore]` integration tests that require a running cluster:
- `test_openbao_setup_score`: deploy OpenBao + run setup, verify KV works
- `test_zitadel_setup_score`: deploy Zitadel + run setup, verify project/app exist
- `test_config_round_trip`: store + retrieve config via SSO-authenticated OpenBao
Run with `cargo test -- --ignored` after deploying the example.
**Files**: `harmony/tests/integration/` (new directory)
### 9.5 Remove `resolve()` DNS hack — LOW
**Problem**: `ZitadelOidcAuth::http_client()` hardcodes `resolve(host, 127.0.0.1:port)`. This only works for local k3d development.
**Fix**: Make it configurable. Add an optional `resolve_to: Option<SocketAddr>` field to `ZitadelOidcAuth`. The example passes `Some(127.0.0.1:8080)` for k3d; production passes `None` (uses real DNS). Or better: detect whether the host resolves and only apply the override if it doesn't.
**Files**: `harmony_secret/src/store/zitadel.rs`
### 9.6 Typed Zitadel API client — LOW
**Problem**: `ZitadelSetupScore` uses hand-written JSON with string parsing for Management API calls. No type safety on request/response.
**Fix**: Create typed request/response structs for the Management API v1 endpoints used (projects, apps, users). Use `serde` for serialization. This doesn't need to be a full API client -- just the endpoints we use.
**Files**: `harmony/src/modules/zitadel/api.rs` (new)
### 9.7 Capability traits for secret vault + identity — FUTURE
**Problem**: `OpenbaoScore` and `ZitadelScore` are tool-specific. No capability abstraction for "I need a secret vault" or "I need an identity provider".
**Fix**: Design `SecretVault` and `IdentityProvider` capability traits on topologies. This is a significant architectural decision that needs an ADR.
**Blocked by**: Real-world use of a second implementation (e.g., HashiCorp Vault, Keycloak) to validate the abstraction boundary.
### 9.8 Auto-unseal for OpenBao — FUTURE
**Problem**: Every pod restart requires manual unseal. `OpenbaoSetupScore` handles this, but requires re-running the Score.
**Fix**: Configure Transit auto-unseal (using a second OpenBao/Vault instance) or cloud KMS auto-unseal. This is an operational concern that should be configurable in `OpenbaoSetupScore`.
## Relationship to Other Phases
- **Phase 1** (config crate): SSO flow builds directly on `harmony_config` + `StoreSource<OpenbaoSecretStore>`. Phase 1 task 1.4 is now **complete** via the harmony_sso example.
- **Phase 2** (migrate to harmony_config): The 19 `SecretManager` call sites should migrate to `ConfigManager` with the OpenbaoSecretStore backend. The SSO flow validates this pattern works.
- **Phase 5** (E2E tests): The harmony_sso example is a candidate for the first E2E test -- it deploys k3d, exercises multiple Scores, and verifies config storage.


@@ -0,0 +1,49 @@
# Phase 10: Firewall Pair Topology & HA Firewall Automation
## Goal
Provide first-class support for managing OPNsense (and future) HA firewall pairs through a higher-order topology, including CARP VIP orchestration, per-device config differentiation, and integration testing.
## Current State
`FirewallPairTopology` is implemented as a concrete wrapper around two `OPNSenseFirewall` instances. It applies uniform scores to both firewalls and differentiates CARP VIP advskew (primary=0, backup=configurable). All existing OPNsense scores (Lagg, Vlan, Firewall Rules, DNAT, BINAT, Outbound NAT, DHCP) work with the pair topology. QC1 uses it for its NT firewall pair.
## Tasks
### 10.1 Generic FirewallPair over a capability trait
**Priority**: MEDIUM
**Status**: Not started
`FirewallPairTopology` is currently concrete over `OPNSenseFirewall`. This breaks extensibility — a pfSense or VyOS firewall pair would need a separate type. Introduce a `FirewallAppliance` capability trait that `OPNSenseFirewall` implements, and make `FirewallPairTopology<T: FirewallAppliance>` generic. The blanket-impl pattern from ADR-015 then gives automatic pair support for any appliance type.
Key challenge: the trait needs to expose enough for `CarpVipScore` to configure VIPs with per-device advskew, without leaking OPNsense-specific APIs.
### 10.2 Delegation macro for higher-order topologies
**Priority**: MEDIUM
**Status**: Not started
The "delegate to both" pattern used by uniform pair scores is pure boilerplate. Every `Score<FirewallPairTopology>` impl for uniform scores follows the same structure: create the inner `Score<OPNSenseFirewall>` interpret, execute against primary, then backup.
Design a proc macro (e.g., `#[derive(DelegatePair)]` or `delegate_score_to_pair!`) that generates these impls automatically. This would also apply to `DecentralizedTopology` (delegate to all sites) and future higher-order topologies.
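The shape the macro would generate can be illustrated with a plain `macro_rules!` sketch. Everything here (the `Pair` struct, the method being delegated) is an assumption for illustration; the real solution would be a proc macro emitting full `Score` impls:

```rust
// A pair of identically configured appliances.
pub struct Pair<T> {
    pub primary: T,
    pub backup: T,
}

// Delegate one call to both members: primary first, then backup.
// Returns the first error encountered, Ok(()) if both succeed.
macro_rules! delegate_to_pair {
    ($pair:expr, $method:ident ( $($arg:expr),* )) => {{
        let first = $pair.primary.$method($($arg),*);
        let second = $pair.backup.$method($($arg),*);
        first.and(second)
    }};
}
```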
### 10.3 XMLRPC sync support
**Priority**: LOW
**Status**: Not started
Add optional `FirewallPairTopology::sync_from_primary()` that triggers OPNsense XMLRPC config sync from primary to backup. Useful for settings that must be identical and don't need per-device differentiation. Not blocking — independent application to both firewalls achieves the same config state.
### 10.4 Integration test with CARP/LACP failover
**Priority**: LOW
**Status**: Not started
Extend the existing OPNsense example deployment to create a firewall pair test fixture:
- Two OPNsense VMs in CARP configuration
- A third VM as a client verifying connectivity
- Automated failover testing: disconnect primary's virtual NIC, verify CARP failover to backup, reconnect, verify failback
- LACP failover: disconnect one LAGG member, verify traffic continues on remaining member
This builds on the KVM test harness from Phase 6.


@@ -0,0 +1,77 @@
# Phase 11: Named Config Instances & Cross-Namespace Access
## Goal
Allow multiple instances of the same config type within a single namespace, identified by name. Also allow explicit namespace specification when retrieving config items, enabling cross-deployment orchestration.
## Context
The current `harmony_config` system identifies config items by type only (`T::KEY` from `#[derive(Config)]`). This works for singletons but breaks when you need multiple instances of the same type:
- **Firewall pair**: primary and backup need separate `OPNSenseApiCredentials` (different API keys for different devices)
- **Worker nodes**: each BMC has its own `IpmiCredentials` with different username/password
- **Firewall administrators**: multiple `OPNSenseApiCredentials` with different permission levels
- **Multi-tenant**: customer firewalls vs. NationTech infrastructure firewalls need separate credential sets
Using separate namespaces per device is not the answer — a firewall pair belongs to a single deployment, and forcing namespace switches for each device in a pair adds unnecessary friction.
Cross-namespace access is a separate but related need: the NT firewall pair and C1 customer firewall pair live in separate namespaces (the customer manages their own firewall), but NationTech needs read access to the C1 namespace for BINAT coordination.
## Tasks
### 11.1 Named config instances within a namespace
**Priority**: HIGH
**Status**: Not started
Extend the `Config` trait and `ConfigManager` to support an optional instance name:
```rust
// Current (singleton): gets "OPNSenseApiCredentials" from the active namespace
let creds = ConfigManager::get::<OPNSenseApiCredentials>().await?;
// New (named): gets "OPNSenseApiCredentials/fw-primary" from the active namespace
let primary_creds = ConfigManager::get_named::<OPNSenseApiCredentials>("fw-primary").await?;
let backup_creds = ConfigManager::get_named::<OPNSenseApiCredentials>("fw-backup").await?;
```
Storage key becomes `{T::KEY}/{instance_name}` (or similar). The unnamed `get()` remains unchanged for backward compatibility.
This needs to work across all config sources:
- `EnvSource`: `HARMONY_CONFIG_{KEY}_{NAME}` (e.g., `HARMONY_CONFIG_OPNSENSE_API_CREDENTIALS_FW_PRIMARY`)
- `SqliteSource`: composite key `{key}/{name}`
- `StoreSource` (OpenBao): path `{namespace}/{key}/{name}`
- `PromptSource`: prompt includes the instance name for clarity
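The key schemes above can be sketched as two small helpers. This is an assumed scheme taken from the task text, not an existing API; the upper-snake form for `EnvSource` is passed in directly rather than derived from the Rust type name:

```rust
// Composite storage key: "{T::KEY}/{instance_name}", or just the key when unnamed.
fn storage_key(type_key: &str, instance: Option<&str>) -> String {
    match instance {
        Some(name) => format!("{type_key}/{name}"),
        None => type_key.to_string(),
    }
}

// EnvSource variable name: instance names are upper-snaked and appended.
fn env_var_name(upper_snake_key: &str, instance: Option<&str>) -> String {
    match instance {
        Some(name) => format!(
            "HARMONY_CONFIG_{}_{}",
            upper_snake_key,
            name.to_uppercase().replace('-', "_")
        ),
        None => format!("HARMONY_CONFIG_{upper_snake_key}"),
    }
}
```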
### 11.2 Cross-namespace config access
**Priority**: MEDIUM
**Status**: Not started
Allow specifying an explicit namespace when retrieving a config item:
```rust
// Get from the active namespace (current behavior)
let nt_creds = ConfigManager::get::<OPNSenseApiCredentials>().await?;
// Get from a specific namespace
let c1_creds = ConfigManager::get_from_namespace::<OPNSenseApiCredentials>("c1").await?;
```
This enables orchestration across deployments: the NT deployment can read C1's firewall credentials for BINAT coordination without switching the global namespace.
For the `StoreSource` (OpenBao), this maps to reading from a different KV path prefix. For `SqliteSource`, it maps to a different database file or a namespace column. For `EnvSource`, it could use a different prefix (`HARMONY_CONFIG_C1_{KEY}`).
### 11.3 Update FirewallPairTopology to use named configs
**Priority**: MEDIUM
**Status**: Blocked by 11.1
Once named config instances are available, update `FirewallPairTopology::opnsense_from_config()` to use them:
```rust
let primary_creds = ConfigManager::get_named::<OPNSenseApiCredentials>("fw-primary").await?;
let backup_creds = ConfigManager::get_named::<OPNSenseApiCredentials>("fw-backup").await?;
```
This removes the current limitation of shared credentials between primary and backup.


@@ -13,6 +13,7 @@ If you're new to Harmony, start here:
See how to use Harmony to solve real-world problems.
- [**OPNsense VM Integration**](./use-cases/opnsense-vm-integration.md): Boot a real OPNsense firewall in a local KVM VM and configure it entirely through Harmony. Fully automated, zero manual steps — the flashiest demo. Requires Linux with KVM.
- [**PostgreSQL on Local K3D**](./use-cases/postgresql-on-local-k3d.md): Deploy a production-grade PostgreSQL cluster on a local K3D cluster. The fastest way to get started.
- [**OKD on Bare Metal**](./use-cases/okd-on-bare-metal.md): A detailed walkthrough of bootstrapping a high-availability OKD cluster from physical hardware.


@@ -8,6 +8,7 @@
## Use Cases
- [PostgreSQL on Local K3D](./use-cases/postgresql-on-local-k3d.md)
- [OPNsense VM Integration](./use-cases/opnsense-vm-integration.md)
- [OKD on Bare Metal](./use-cases/okd-on-bare-metal.md)
## Component Catalogs


@@ -0,0 +1,181 @@
# Harmony Architecture — Three Open Challenges
Three problems that, if solved well, would make Harmony the most capable infrastructure automation framework in existence.
## 1. Topology Evolution During Deployment
### The problem
A bare-metal OKD deployment is a multi-hour process where the infrastructure's capabilities change as the deployment progresses:
```
Phase 0: Network only → OPNsense reachable, Brocade reachable, no hosts
Phase 1: Discovery → PXE boots work, hosts appear via mDNS, no k8s
Phase 2: Bootstrap → openshift-install running, API partially available
Phase 3: Control plane → k8s API available, operators converging, no workers
Phase 4: Workers → Full cluster, apps can be deployed
Phase 5: Day-2 → Monitoring, alerting, tenant onboarding
```
Today, `HAClusterTopology` implements _all_ capability traits from the start. If a Score calls `k8s_client()` during Phase 0, it hits `DummyInfra` which panics. The type system says "this is valid" but the runtime says "this will crash."
### Why it matters
- Scores that require k8s compile and register happily at Phase 0, then panic if accidentally executed too early
- The pipeline is ordered by convention (Stage 01 → 02 → 03 → ...) but nothing enforces that Stage 04 can't run before Stage 02
- Adding new capabilities (like "cluster has monitoring installed") requires editing the topology struct, not declaring the capability was acquired
### Design direction
The topology should evolve through **phases** where capabilities are _acquired_, not assumed. Two possible approaches:
**A. Phase-gated topology (runtime)**
The topology tracks which phase it's in. Capability methods check the phase before executing and return a meaningful error instead of panicking:
```rust
impl K8sclient for HAClusterTopology {
async fn k8s_client(&self) -> Result<Arc<K8sClient>, String> {
if self.phase < Phase::ControlPlaneReady {
return Err(format!("k8s API not available yet (current phase: {:?})", self.phase));
}
// ... actual implementation
}
}
```
Scores that fail due to phase mismatch get a clear error message, not a panic. The Maestro can validate phase requirements before executing a Score.
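A derived-`Ord` enum is enough to support that runtime check. The variant names below mirror the phase list earlier in this section and are assumptions, not an existing Harmony type:

```rust
// Deployment phases in order; deriving Ord makes `self.phase < Phase::X` work.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Phase {
    NetworkOnly,
    Discovery,
    Bootstrapping,
    ControlPlaneReady,
    WorkersReady,
    DayTwo,
}
```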
**B. Typestate topology (compile-time)**
Use Rust's type system to make invalid phase transitions unrepresentable:
```rust
struct Topology<P: Phase> { ... }
impl Topology<NetworkReady> {
fn bootstrap(self) -> Topology<Bootstrapping> { ... }
}
impl Topology<Bootstrapping> {
fn promote(self) -> Topology<ClusterReady> { ... }
}
// Only ClusterReady implements K8sclient
impl K8sclient for Topology<ClusterReady> { ... }
```
This is the "correct" Rust approach but requires significant refactoring and may be too rigid for real deployments where phases overlap.
**Recommendation**: Start with (A) — runtime phase tracking. It's additive (no breaking changes), catches the DummyInfra panic problem immediately, and provides the data needed for (B) later.
---
## 2. Runtime Plan & Validation Phase
### The problem
Harmony validates Scores at compile time: if a Score requires `DhcpServer + TftpServer`, the topology must implement both traits or the program won't compile. This is powerful but insufficient.
What compile-time _cannot_ check:
- Is the OPNsense API actually reachable right now?
- Does VLAN 100 already exist (so we can skip creating it)?
- Is there already a DHCP entry for this MAC address?
- Will this firewall rule conflict with an existing one?
- Is there enough disk space on the TFTP server for the boot images?
Today, these are discovered at execution time, deep inside an Interpret's `execute()` method. A failure at minute 45 of a deployment is expensive.
### Why it matters
- No way to preview what Harmony will do before it does it
- No way to detect conflicts or precondition failures early
- Operators must read logs to understand what happened — there's no structured "here's what I did" report
- Re-running a deployment is scary because you don't know what will be re-applied vs skipped
### Design direction
Add a **validate** phase to the Score/Interpret lifecycle:
```rust
#[async_trait]
pub trait Interpret<T>: Debug + Send {
/// Check preconditions and return what this interpret WOULD do.
/// Default implementation returns "will execute" (opt-in validation).
async fn validate(
&self,
inventory: &Inventory,
topology: &T,
) -> Result<ValidationReport, InterpretError> {
Ok(ValidationReport::will_execute(self.get_name()))
}
/// Execute the interpret (existing method, unchanged).
async fn execute(
&self,
inventory: &Inventory,
topology: &T,
) -> Result<Outcome, InterpretError>;
// ... existing methods
}
```
A `ValidationReport` would contain:
- **Status**: `WillCreate`, `WillUpdate`, `WillDelete`, `AlreadyApplied`, `Blocked(reason)`
- **Details**: human-readable description of planned changes
- **Preconditions**: list of checks performed and their results
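One possible concrete shape for the report, matching the bullets above (field names are assumptions):

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ValidationStatus {
    WillCreate,
    WillUpdate,
    WillDelete,
    AlreadyApplied,
    Blocked(String),
}

#[derive(Debug, Clone)]
pub struct ValidationReport {
    pub score_name: String,
    pub status: ValidationStatus,
    /// Human-readable description of planned changes.
    pub details: String,
    /// (description of the check, whether it passed)
    pub preconditions: Vec<(String, bool)>,
}

impl ValidationReport {
    /// Default report for Interprets that have not opted in to validation.
    pub fn will_execute(score_name: String) -> Self {
        Self {
            score_name,
            status: ValidationStatus::WillCreate,
            details: "no validation implemented; will execute".to_string(),
            preconditions: Vec::new(),
        }
    }
}
```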
The Maestro would run validation for all registered Scores before executing any of them, producing a plan that the operator reviews.
This is opt-in: Scores that don't implement `validate()` get a default "will execute" report. Over time, each Score adds validation logic. The OPNsense Scores are ideal first candidates since they can query current state via the API.
### Relationship to state
This approach does _not_ require a state file. Validation queries the infrastructure directly — the same philosophy Harmony already follows. The "plan" is computed fresh every time by asking the infrastructure what exists right now.
---
## 3. TUI as Primary Interface
### The problem
The TUI (`harmony_tui`) exists with ratatui, crossterm, and tui-logger, but it's underused. The CLI (`harmony_cli`) is the primary interface. During a multi-hour deployment, operators watch scrolling log output with no structure, no ability to drill into a specific Score's progress, and no overview of where they are in the pipeline.
### Why it matters
- Log output during interactive prompts corrupts the terminal
- No way to see "I'm on Stage 3 of 7, 2 hours elapsed, 3 Scores completed successfully"
- No way to inspect a Score's configuration or outcome without reading logs
- The pipeline feels like a black box during execution
### Design direction
The TUI should provide three views:
**Pipeline view** — the default. Shows the ordered list of Scores with their status:
```
OKD HA Cluster Deployment [Stage 3/7 — 1h 42m elapsed]
──────────────────────────────────────────────────────────────────
✅ OKDIpxeScore 2m 14s
✅ OKDSetup01InventoryScore 8m 03s
✅ OKDSetup02BootstrapScore 34m 21s
▶ OKDSetup03ControlPlaneScore ... running
⏳ OKDSetupPersistNetworkBondScore
⏳ OKDSetup04WorkersScore
⏳ OKDSetup06InstallationReportScore
```
**Detail view** — press Enter on a Score to see its Outcome details, sub-score executions, and logs.
**Log view** — the current tui-logger panel, filtered to the selected Score.
The TUI already has the Score widget and log integration. What's missing is the pipeline-level orchestration view and the duration/status data — which the recently added `Score::interpret` timing now provides.
### Immediate enablers
The instrumentation event system (`HarmonyEvent`) already captures start/finish with execution IDs. The TUI subscriber just needs to:
1. Track the ordered list of Scores from the Maestro
2. Update status as `InterpretExecutionStarted`/`Finished` events arrive
3. Render the pipeline view using ratatui
This doesn't require architectural changes — it's a TUI feature built on existing infrastructure.
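Step 2 is a small fold over the event stream. The sketch below uses simplified stand-ins for the real instrumentation types (the actual `HarmonyEvent` variants carry execution IDs and more):

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScoreStatus {
    Pending,
    Running,
    Done,
    Failed,
}

// Simplified stand-in for the real instrumentation event type.
pub enum HarmonyEvent {
    InterpretExecutionStarted { score: String },
    InterpretExecutionFinished { score: String, success: bool },
}

// Fold one event into the per-Score status map the pipeline view renders from.
pub fn apply_event(statuses: &mut BTreeMap<String, ScoreStatus>, event: HarmonyEvent) {
    match event {
        HarmonyEvent::InterpretExecutionStarted { score } => {
            statuses.insert(score, ScoreStatus::Running);
        }
        HarmonyEvent::InterpretExecutionFinished { score, success } => {
            let status = if success { ScoreStatus::Done } else { ScoreStatus::Failed };
            statuses.insert(score, status);
        }
    }
}
```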


@@ -156,9 +156,56 @@ impl<T: Topology + K8sclient> Interpret<T> for MyInterpret {
}
```
## Design Principles
### Capabilities are industry concepts, not tools
A capability trait must represent a **standard infrastructure need** that could be fulfilled by multiple tools. The developer who writes a Score should not need to know which product provides the capability.
Good capabilities: `DnsServer`, `LoadBalancer`, `DhcpServer`, `CertificateManagement`, `Router`
These are industry-standard concepts. OPNsense provides `DnsServer` via Unbound; a future topology could provide it via CoreDNS or AWS Route53. The Score doesn't care.
The one exception is when the developer fundamentally needs to know the implementation: `PostgreSQL` is a capability (not `Database`) because the developer writes PostgreSQL-specific SQL, replication configs, and connection strings. Swapping it for MariaDB would break the application, not just the infrastructure.
**Test:** If you could swap the underlying tool without breaking any Score that uses the capability, you've drawn the boundary correctly. If swapping would require rewriting Scores, the capability is too tool-specific.
### One Score per concern, one capability per concern
A Score should express a single infrastructure intent. A capability should expose a single infrastructure concept.
If you're building a deployment that combines multiple concerns (e.g., "deploy Zitadel" requires PostgreSQL + Helm + K8s + Ingress), the Score **declares all of them as trait bounds** and the Topology provides them:
```rust
impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Score<T> for ZitadelScore
```
If you're building a tool that provides multiple capabilities (e.g., OpenBao provides secret storage, KV versioning, JWT auth, policy management), each capability should be a **separate trait** that can be implemented independently. This way, a Score that only needs secret storage doesn't pull in JWT auth machinery.
### Scores encapsulate operational complexity
The value of a Score is turning tribal knowledge into compiled, type-checked infrastructure. The `ZitadelScore` knows that you need to create a namespace, deploy a PostgreSQL cluster via CNPG, wait for the cluster to be ready, create a masterkey secret, generate a secure admin password, detect the K8s distribution, build distribution-specific Helm values, and deploy the chart. A developer using it writes:
```rust
let zitadel = ZitadelScore { host: "sso.example.com".to_string(), ..Default::default() };
```
Move procedural complexity into opinionated Scores. This makes them easy to test against various topologies (k3d, OpenShift, kubeadm, bare metal) and easy to compose in high-level examples.
### Scores must be idempotent
Running a Score twice should produce the same result as running it once. Use create-or-update semantics, check for existing state before acting, and handle "already exists" responses gracefully.
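A hedged sketch of create-or-update semantics. The map stands in for querying the firewall's current state over the API; the function and outcome names are illustrative:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum EnsureOutcome {
    Created,
    Updated,
    AlreadyApplied,
}

// Check current state first; only write when something actually differs.
pub fn ensure_vlan(
    existing: &mut HashMap<u16, String>,
    tag: u16,
    description: &str,
) -> EnsureOutcome {
    match existing.get(&tag) {
        Some(current) if current == description => EnsureOutcome::AlreadyApplied,
        Some(_) => {
            existing.insert(tag, description.to_string());
            EnsureOutcome::Updated
        }
        None => {
            existing.insert(tag, description.to_string());
            EnsureOutcome::Created
        }
    }
}
```

Running the same call twice yields `AlreadyApplied` the second time, which is exactly the observable property an idempotency test should assert.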
### Scores must not depend on other Scores running first
A Score declares its capability requirements via trait bounds. It does **not** assume that another Score has run before it. If your Score needs PostgreSQL, it declares `T: PostgreSQL` and lets the Topology handle whether PostgreSQL needs to be installed first.
If you find yourself writing "run Score A, then run Score B", consider whether Score B should declare the capability that Score A provides, or whether both should be orchestrated by a higher-level Score that composes them.
## Best Practices
- **Keep Scores focused** — one Score per concern (deployment, monitoring, networking)
- **Use `..Default::default()`** for optional fields so callers only need to specify what they care about
- **Return `Outcome`** — use `Outcome::success`, `Outcome::failure`, or `Outcome::success_with_details` to communicate results clearly
- **Handle errors gracefully** — return meaningful `InterpretError` messages that help operators debug issues
- **Design capabilities around the developer's need** — not around the tool that fulfills it. Ask: "what is the core need that leads a developer to use this tool?"
- **Don't name capabilities after tools** — `SecretVault` not `OpenbaoStore`, `IdentityProvider` not `ZitadelAuth`


@@ -4,9 +4,13 @@ Real-world scenarios demonstrating Harmony in action.
## Available Use Cases
### [OPNsense VM Integration](./opnsense-vm-integration.md)
Boot a real OPNsense firewall in a local KVM VM and configure it entirely through Harmony — load balancer, DHCP, TFTP, VLANs, firewall rules, NAT, VIPs, and link aggregation. Fully automated, zero manual steps. The best way to see Harmony in action.
### [PostgreSQL on Local K3D](./postgresql-on-local-k3d.md)
Deploy a fully functional PostgreSQL cluster on a local K3D cluster in under 10 minutes. The quickest way to see Harmony's Kubernetes capabilities.
### [OKD on Bare Metal](./okd-on-bare-metal.md)


@@ -0,0 +1,234 @@
# Use Case: OPNsense VM Integration
Boot a real OPNsense firewall in a local KVM virtual machine and configure it entirely through Harmony — load balancer, DHCP, TFTP, VLANs, firewall rules, NAT, VIPs, and link aggregation. Fully automated, zero manual steps, CI-friendly.
This is the best way to discover Harmony: you'll see 11 different Scores configure a production firewall through type-safe Rust code and the OPNsense REST API.
## What you'll have at the end
A local OPNsense VM fully configured by Harmony with:
- HAProxy load balancer with health-checked backends
- DHCP server with static host bindings and PXE boot options
- TFTP server serving boot files
- Prometheus node exporter enabled
- 2 VLANs on the LAN interface
- Firewall filter rules, outbound NAT, and bidirectional NAT
- Virtual IPs (IP aliases)
- Port forwarding (DNAT) rules
- LAGG interface (link aggregation)
All applied idempotently through the OPNsense REST API — the same Scores used in production bare-metal deployments.
## Prerequisites
- **Linux** with KVM support (Intel VT-x/AMD-V enabled in BIOS)
- **libvirt + QEMU** installed and running (`libvirtd` service active)
- **~10 GB** free disk space
- **~15 minutes** for the first run (image download + OPNsense firmware update)
- Docker running (if installed — the setup handles compatibility)
Supported distributions: Arch, Manjaro, Fedora, Ubuntu, Debian.
## Quick start (single command)
```bash
# One-time: install libvirt and configure permissions
./examples/opnsense_vm_integration/setup-libvirt.sh
newgrp libvirt
# Verify
cargo run -p opnsense-vm-integration -- --check
# Boot + bootstrap + run all 11 Scores (fully unattended)
cargo run -p opnsense-vm-integration -- --full
```
That's it. No browser clicks, no manual SSH configuration, no wizard interaction.
## What happens step by step
### Phase 1: Boot the VM
Downloads the OPNsense 26.1 nano image (~350 MB, cached after first run), injects a `config.xml` with virtio NIC assignments, creates a 4 GiB qcow2 disk, and boots the VM with 4 NICs:
```
vtnet0 = LAN (192.168.1.1/24) -- management
vtnet1 = WAN (DHCP) -- internet access
vtnet2 = LAGG member 1 -- for aggregation test
vtnet3 = LAGG member 2 -- for aggregation test
```
### Phase 2: Automated bootstrap
Once the web UI responds (~20 seconds after boot), `OPNsenseBootstrap` takes over:
1. **Logs in** to the web UI (root/opnsense) with automatic CSRF token handling
2. **Aborts the initial setup wizard** via the OPNsense API
3. **Enables SSH** with root login and password authentication
4. **Changes the web GUI port** to 9443 (prevents HAProxy conflicts on standard ports)
5. **Restarts lighttpd** via SSH to apply the port change
No browser, no Playwright, no expect scripts — just HTTP requests with session cookies and SSH commands.
### Phase 3: Run 11 Scores
Creates an API key via SSH, then configures the entire firewall:
| # | Score | What it configures |
|---|-------|--------------------|
| 1 | `LoadBalancerScore` | HAProxy with 2 frontends (ports 16443 and 18443), backends with health checks |
| 2 | `DhcpScore` | DHCP range, 2 static host bindings (MAC-to-IP), PXE boot options |
| 3 | `TftpScore` | TFTP server serving PXE boot files |
| 4 | `NodeExporterScore` | Prometheus node exporter on OPNsense |
| 5 | `VlanScore` | 2 test VLANs (tags 100 and 200) on vtnet0 |
| 6 | `FirewallRuleScore` | Firewall filter rules (allow/block with logging) |
| 7 | `OutboundNatScore` | Source NAT rule for outbound traffic |
| 8 | `BinatScore` | Bidirectional 1:1 NAT |
| 9 | `VipScore` | Virtual IPs (IP aliases for CARP/HA) |
| 10 | `DnatScore` | Port forwarding rules |
| 11 | `LaggScore` | Link aggregation group (failover on vtnet2+vtnet3) |
Each Score reports its status:
```
[LoadBalancerScore] SUCCESS in 2.2s -- Load balancer configured 2 services
[DhcpScore] SUCCESS in 1.4s -- Dhcp Interpret execution successful
[VlanScore] SUCCESS in 0.2s -- Configured 2 VLANs
...
PASSED -- All OPNsense integration tests successful
```
### Phase 4: Verify
After all Scores run, the integration test verifies each configuration via the REST API:
- HAProxy has 2+ frontends
- Dnsmasq has 2+ static hosts and a DHCP range
- TFTP is enabled
- Node exporter is enabled
- 2+ VLANs exist
- Firewall filter rules are present
- VIPs, DNAT, BINAT, SNAT rules are configured
- LAGG interface exists
## Explore in the web UI
After the test completes, open https://192.168.1.1:9443 (login: root/opnsense) and explore:
- **Services > HAProxy > Settings** -- frontends, backends, servers with health checks
- **Services > Dnsmasq DNS > Settings** -- host overrides (static DHCP entries)
- **Services > TFTP** -- enabled with uploaded files
- **Interfaces > Other Types > VLAN** -- two tagged VLANs
- **Firewall > Automation > Filter** -- filter rules created by Harmony
- **Firewall > NAT > Port Forward** -- DNAT rules
- **Firewall > NAT > Outbound** -- SNAT rules
- **Firewall > NAT > One-to-One** -- BINAT rules
- **Interfaces > Virtual IPs > Settings** -- IP aliases
- **Interfaces > Other Types > LAGG** -- link aggregation group
## Clean up
```bash
cargo run -p opnsense-vm-integration -- --clean
```
Destroys the VM and virtual networks. The cached OPNsense image is kept for next time.
## How it works
### Architecture
```
Your workstation OPNsense VM (KVM)
+--------------------+ +---------------------+
| Harmony | | OPNsense 26.1 |
| +---------------+ | REST API | +---------------+ |
| | OPNsense |----(HTTPS:9443)---->| | API + Plugins | |
| | Scores | | | +---------------+ |
| +---------------+ | SSH | +---------------+ |
| +---------------+ |----(port 22)----->| | FreeBSD Shell | |
| | OPNsense- | | | +---------------+ |
| | Bootstrap | | HTTP session | |
| +---------------+ |----(HTTPS:443)--->| (first-boot only) |
| +---------------+ | | |
| | opnsense- | | | LAN: 192.168.1.1 |
| | config | | | WAN: DHCP |
| +---------------+ | +---------------------+
+--------------------+
```
The stack has four layers:
1. **`opnsense-api`** -- auto-generated typed Rust client from OPNsense XML model files
2. **`opnsense-config`** -- high-level configuration modules (DHCP, firewall, load balancer, etc.)
3. **`OPNsenseBootstrap`** -- first-boot automation via HTTP session auth (login, wizard, SSH, webgui port)
4. **Harmony Scores** -- declarative desired-state descriptions that make the firewall match
### The Score pattern
```rust
// 1. Declare desired state
let score = VlanScore {
vlans: vec![
VlanDef { parent: "vtnet0", tag: 100, description: "management" },
VlanDef { parent: "vtnet0", tag: 200, description: "storage" },
],
};
// 2. Execute against topology -- queries current state, applies diff
score.interpret(&inventory, &topology).await?;
// Output: [VlanScore] SUCCESS in 0.9s -- Created 2 VLANs
```
Scores are idempotent: running the same Score twice produces the same result.
## Network architecture
```
Host (192.168.1.10) --- virbr-opn bridge --- OPNsense LAN (192.168.1.1)
                        192.168.1.0/24       vtnet0
                        NAT to internet

                    --- virbr0 (default) --- OPNsense WAN (DHCP)
                        192.168.122.0/24     vtnet1
                        NAT to internet
```
## Available commands
| Command | Description |
|---------|-------------|
| `--check` | Verify prerequisites (libvirtd, virsh, qemu-img) |
| `--download` | Download the OPNsense image (cached) |
| `--boot` | Create VM + automated bootstrap |
| (default) | Run integration test (assumes VM is bootstrapped) |
| `--full` | Boot + bootstrap + integration test (CI mode) |
| `--status` | Show VM state, ports, and connectivity |
| `--clean` | Destroy VM and networks |
## Environment variables
| Variable | Default | Description |
|----------|---------|-------------|
| `RUST_LOG` | (unset) | Log level: `info`, `debug`, `trace` |
| `HARMONY_KVM_URI` | `qemu:///system` | Libvirt connection URI |
| `HARMONY_KVM_IMAGE_DIR` | `~/.local/share/harmony/kvm/images` | Cached disk images |
## Troubleshooting
**VM won't start / permission denied**
Ensure your user is in the `libvirt` group and that the image directory is traversable by the qemu user. Run `setup-libvirt.sh` to fix.
**192.168.1.0/24 conflict**
If your host network already uses this subnet, the VM will be unreachable. Edit the constants in `src/main.rs` to use a different subnet.
**Web GUI didn't come up after bootstrap**
The bootstrap runs `diagnose_via_ssh()` automatically when the web UI doesn't respond. Check the diagnostic output for lighttpd status and listening ports. You can also access the serial console: `virsh -c qemu:///system console opn-integration`
**HAProxy install fails**
OPNsense may need a firmware update first. The integration test handles this automatically, but the update + reboot cycle can take a few minutes.
## What's next
- **[OPNsense Firewall Pair](../../examples/opnsense_pair_integration/README.md)** -- boot two VMs, configure CARP HA failover with `FirewallPairTopology` and `CarpVipScore`. Uses NIC link control to bootstrap both VMs sequentially despite sharing the same default IP.
- [OKD on Bare Metal](./okd-on-bare-metal.md) -- the full 7-stage OKD installation pipeline using OPNsense as the infrastructure backbone
- [PostgreSQL on Local K3D](./postgresql-on-local-k3d.md) -- a simpler starting point using Kubernetes


@@ -18,6 +18,8 @@ This directory contains runnable examples demonstrating Harmony's capabilities.
| `remove_rook_osd` | Remove a Rook OSD | — | ✅ | Rook/Ceph |
| `brocade_snmp_server` | Configure Brocade switch SNMP | — | ✅ | Brocade switch |
| `opnsense_node_exporter` | Node exporter on OPNsense | — | ✅ | OPNsense firewall |
| `opnsense_vm_integration` | Full OPNsense firewall automation (11 Scores) | ✅ | — | KVM/libvirt |
| `opnsense_pair_integration` | OPNsense HA pair with CARP failover | ✅ | — | KVM/libvirt |
| `okd_pxe` | PXE boot configuration for OKD | — | — | ✅ |
| `okd_installation` | Full OKD bare-metal install | — | — | ✅ |
| `okd_cluster_alerts` | OKD cluster monitoring alerts | — | ✅ | OKD cluster |
@@ -75,6 +77,8 @@ This directory contains runnable examples demonstrating Harmony's capabilities.
- **`application_monitoring_with_tenant`** — App monitoring with tenant isolation
### Infrastructure & Bare Metal
- **`opnsense_vm_integration`** — **Recommended demo.** Boot an OPNsense VM and configure it with 11 Scores (load balancer, DHCP, TFTP, VLANs, firewall rules, NAT, VIPs, LAGG). Fully automated, requires only KVM. See the [detailed guide](../docs/use-cases/opnsense-vm-integration.md).
- **`opnsense_pair_integration`** — Boot two OPNsense VMs and configure a CARP HA firewall pair with `FirewallPairTopology` and `CarpVipScore`. Demonstrates NIC link control for sequential bootstrap.
- **`okd_installation`** — Full OKD cluster from scratch
- **`okd_pxe`** — PXE boot configuration for OKD
- **`sttest`** — Full OKD stack test with specific hardware


@@ -0,0 +1,30 @@
[package]
name = "example-harmony-sso"
edition = "2024"
version.workspace = true
readme.workspace = true
license.workspace = true
[dependencies]
harmony = { path = "../../harmony" }
harmony_cli = { path = "../../harmony_cli" }
harmony_config = { path = "../../harmony_config" }
harmony_macros = { path = "../../harmony_macros" }
harmony_secret = { path = "../../harmony_secret" }
harmony_types = { path = "../../harmony_types" }
harmony-k8s = { path = "../../harmony-k8s" }
k3d-rs = { path = "../../k3d" }
k8s-openapi.workspace = true
kube.workspace = true
tokio.workspace = true
url.workspace = true
log.workspace = true
env_logger.workspace = true
serde.workspace = true
serde_json.workspace = true
anyhow.workspace = true
reqwest.workspace = true
clap = { version = "4", features = ["derive"] }
schemars = "0.8"
interactive-parse = "0.1.5"
directories = "6.0.0"


@@ -0,0 +1,90 @@
# Harmony SSO Example
Deploys Zitadel (identity provider) and OpenBao (secrets management) on a local k3d cluster, then demonstrates using them as `harmony_config` backends for shared config and secret management.
## Prerequisites
- Docker running
- Ports 8080 and 8200 free
- `/etc/hosts` entries (or use a local DNS resolver):
```
127.0.0.1 sso.harmony.local
127.0.0.1 bao.harmony.local
```
## Usage
### Full deployment
```bash
# Deploy everything (OpenBao + Zitadel)
cargo run -p example-harmony-sso
# OpenBao only (faster, skip Zitadel)
cargo run -p example-harmony-sso -- --skip-zitadel
```
### Config storage demo (token auth)
After deployment, run the config demo to verify `harmony_config` works with OpenBao:
```bash
cargo run -p example-harmony-sso -- --demo
```
This writes and reads a `SsoExampleConfig` through the `ConfigManager` chain (`EnvSource -> StoreSource<OpenbaoSecretStore>`), demonstrating environment variable overrides and persistent storage in OpenBao KV v2.
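A rough sketch of that first-source-wins chain, with hypothetical simplified types standing in for `harmony_config`'s real trait and the OpenBao-backed store:

```rust
/// Hypothetical simplification of a config source: may or may not know a key.
trait ConfigSource {
    fn get(&self, key: &str) -> Option<String>;
}

struct EnvSource;
impl ConfigSource for EnvSource {
    fn get(&self, key: &str) -> Option<String> {
        // Env vars override everything, mirroring HARMONY_CONFIG_* handling.
        std::env::var(format!("HARMONY_CONFIG_{key}")).ok()
    }
}

/// Stand-in for StoreSource<OpenbaoSecretStore>: persistent KV storage.
struct StoreSource {
    stored: Vec<(String, String)>,
}
impl ConfigSource for StoreSource {
    fn get(&self, key: &str) -> Option<String> {
        self.stored.iter().find(|(k, _)| k == key).map(|(_, v)| v.clone())
    }
}

/// The chain: the first source that knows the key wins.
fn resolve(sources: &[&dyn ConfigSource], key: &str) -> Option<String> {
    sources.iter().find_map(|s| s.get(key))
}

fn main() {
    let store = StoreSource {
        stored: vec![("team_name".into(), "platform-team".into())],
    };
    let chain: Vec<&dyn ConfigSource> = vec![&EnvSource, &store];
    // No env override set, so the stored value wins.
    assert_eq!(resolve(&chain, "team_name").as_deref(), Some("platform-team"));
    println!("resolved from store");
}
```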
### SSO device flow demo
Requires a Zitadel application configured for device code grant:
```bash
HARMONY_SSO_CLIENT_ID=<zitadel-app-client-id> \
cargo run -p example-harmony-sso -- --sso-demo
```
### Cleanup
```bash
cargo run -p example-harmony-sso -- --cleanup
```
## What gets deployed
| Component | Namespace | Access |
|---|---|---|
| OpenBao (standalone, file storage) | `openbao` | `http://bao.harmony.local:8200` |
| Zitadel (with CNPG PostgreSQL) | `zitadel` | `http://sso.harmony.local:8080` |
### OpenBao configuration
- **Auth methods:** userpass, JWT
- **Secrets engine:** KV v2 at `secret/`
- **Policy:** `harmony-dev` grants CRUD on `secret/data/harmony/*`
- **Userpass credentials:** `harmony` / `harmony-dev-password`
- **JWT auth:** configured with Zitadel as OIDC provider, role `harmony-developer`
- **Unseal keys:** saved to `~/.local/share/harmony/openbao/unseal-keys.json`
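The `harmony-dev` policy grants both `secret/data/...` and `secret/metadata/...` because KV v2 inserts a path segment between the mount and the logical secret path; a small sketch of that mapping:

```rust
/// Map a logical KV v2 path like "harmony/SsoExampleConfig" to the two
/// API paths a policy must cover (mount assumed to be "secret").
fn kv2_paths(mount: &str, logical: &str) -> (String, String) {
    (
        format!("{mount}/data/{logical}"),     // read/write the secret itself
        format!("{mount}/metadata/{logical}"), // list keys, read versions
    )
}

fn main() {
    let (data, meta) = kv2_paths("secret", "harmony/SsoExampleConfig");
    assert_eq!(data, "secret/data/harmony/SsoExampleConfig");
    assert_eq!(meta, "secret/metadata/harmony/SsoExampleConfig");
    println!("{data}\n{meta}");
}
```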
## Architecture
```
Developer CLI
  |
  |-- harmony_config::ConfigManager
  |     |-- EnvSource (HARMONY_CONFIG_* env vars)
  |     |-- StoreSource<OpenbaoSecretStore>
  |           |-- Token auth (OPENBAO_TOKEN)
  |           |-- Cached token validation
  |           |-- Zitadel OIDC device flow (RFC 8628)
  |           |-- Userpass fallback
  |
  v
k3d cluster (harmony-example)
  |-- OpenBao (KV v2 secrets engine)
  |     |-- JWT auth -> validates Zitadel id_tokens
  |     |-- userpass auth -> dev credentials
  |
  |-- Zitadel (OpenID Connect IdP)
        |-- Device authorization grant
        |-- Federated login (Google, GitHub, Entra ID)
```


@@ -0,0 +1,155 @@
# Harmony SSO Plan
## Context
Deploy Zitadel and OpenBao on a local k3d cluster, use them as `harmony_config` backends, and demonstrate end-to-end config storage authenticated via SSO. The goal: rock-solid deployment so teams and collaborators can reliably share config and secrets through OpenBao with Zitadel SSO authentication.
## Status
### Phase A: MVP with Token Auth -- DONE
- [x] A.1 -- CLI argument parsing (`--demo`, `--sso-demo`, `--skip-zitadel`, `--cleanup`)
- [x] A.2 -- Zitadel deployment via `ZitadelScore` (`external_secure: false` for k3d)
- [x] A.3 -- OpenBao JWT auth method + `harmony-dev` policy configuration
- [x] A.4 -- `--demo` flag: config storage demo with token auth via `ConfigManager`
- [x] A.5 -- Hardening: retry loops for pod readiness, HTTP readiness checks, `--cleanup`
- [x] A.6 -- README with prerequisites, usage, and architecture
Verified end-to-end: fresh `k3d cluster delete` -> `cargo run -p example-harmony-sso` -> `--demo` succeeds.
### Phase B: OIDC Device Flow + JWT Exchange -- TODO
The Zitadel OIDC device flow code exists (`harmony_secret/src/store/zitadel.rs`) but the **JWT exchange** step is missing: `process_token_response()` currently stores the OIDC `access_token` as `openbao_token` directly, whereas per ADR 020-1 the `id_token` should be exchanged at OpenBao's `/v1/auth/jwt/login` endpoint.
**B.1 -- Implement JWT exchange in `harmony_secret/src/store/zitadel.rs`:**
- Add `openbao_url`, `jwt_auth_mount`, `jwt_role` fields to `ZitadelOidcAuth`
- Add `exchange_jwt_for_openbao_token(id_token)` using raw `reqwest` (vaultrs 0.7.4 has no JWT auth module)
- POST `{openbao_url}/v1/auth/{jwt_auth_mount}/login` with `{"role": "...", "jwt": "..."}`
- Modify `process_token_response()` to use exchange when `openbao_url` is set
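A sketch of the request B.1 describes, assuming only the URL and body shape given above (the real implementation would send it with `reqwest` and JSON-escape the values):

```rust
/// Build the OpenBao JWT login request from B.1 (sketch only; values
/// are interpolated naively here, real code must JSON-escape them).
fn jwt_login_request(
    openbao_url: &str,
    jwt_auth_mount: &str,
    role: &str,
    id_token: &str,
) -> (String, String) {
    let url = format!("{openbao_url}/v1/auth/{jwt_auth_mount}/login");
    // Per the plan: {"role": "...", "jwt": "..."} -- the id_token, not
    // the access_token, is what OpenBao validates against Zitadel's JWKS.
    let body = format!(r#"{{"role":"{role}","jwt":"{id_token}"}}"#);
    (url, body)
}

fn main() {
    let (url, body) =
        jwt_login_request("http://127.0.0.1:8200", "jwt", "harmony-developer", "eyJ...");
    assert_eq!(url, "http://127.0.0.1:8200/v1/auth/jwt/login");
    assert!(body.contains(r#""role":"harmony-developer""#));
    println!("{url}\n{body}");
}
```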
**B.2 -- Wire JWT params through `harmony_secret/src/store/openbao.rs`:**
- Pass `base_url`, `jwt_auth_mount`, `jwt_role` to `ZitadelOidcAuth::new()` in `authenticate_zitadel_oidc()`
- Update `OpenbaoSecretStore::new()` signature for optional `jwt_role` and `jwt_auth_mount`
**B.3 -- Add env vars to `harmony_secret/src/config.rs`:**
- `OPENBAO_JWT_AUTH_MOUNT` (default: `jwt`)
- `OPENBAO_JWT_ROLE` (default: `harmony-developer`)
**B.4 -- Silent refresh:**
- Add `refresh_token()` method to `ZitadelOidcAuth`
- Update auth chain in `openbao.rs`: cached session -> silent refresh -> device flow
**B.5 -- `--sso-demo` flag:**
- Already stubbed in `examples/harmony_sso/src/main.rs`
- Requires a Zitadel device code application (manual setup, accept `HARMONY_SSO_CLIENT_ID` env var)
**B.6 -- Solve in-cluster DNS for JWT auth config:**
- OpenBao JWT auth needs `oidc_discovery_url` to fetch Zitadel's JWKS
- Zitadel requires `Host` header matching `ExternalDomain` on ALL endpoints (including `/oauth/v2/keys`)
- So `oidc_discovery_url=http://zitadel.zitadel.svc.cluster.local:8080` gets 404 from Zitadel
- Options: (a) CoreDNS rewrite rule mapping `sso.harmony.local` -> `zitadel.zitadel.svc`, (b) Kubernetes ExternalName service, (c) `Zitadel.AdditionalDomains` Helm config to accept the internal hostname
- Currently non-fatal (warning only), needed before `--sso-demo` can work
### Phase C: Testing & Automation -- TODO
**C.1 -- Integration tests** (`examples/harmony_sso/tests/integration.rs`, `#[ignore]`):
- `test_openbao_health` -- health endpoint
- `test_zitadel_openid_config` -- OIDC discovery
- `test_openbao_userpass_auth` -- write/read secret
- `test_config_manager_openbao_backend` -- full ConfigManager chain
- `test_openbao_jwt_auth_configured` -- verify JWT auth method + role exist
**C.2 -- Zitadel application automation** (`examples/harmony_sso/src/zitadel_setup.rs`):
- Automate project + device code app creation via Zitadel Management API
- Extract and save `client_id`
---
## Tricky Things / Lessons Learned
### ZitadelScore on k3d -- security context
The Zitadel container image (`ghcr.io/zitadel/zitadel`) defines `User: "zitadel"` (non-numeric string). With `runAsNonRoot: true` and `runAsUser: null`, kubelet can't verify the user is non-root and fails with `CreateContainerConfigError`. **Fix:** set `runAsUser: 1000` explicitly (that's the UID for `zitadel` in `/etc/passwd`). This applies to all security contexts: `podSecurityContext`, `securityContext`, `initJob`, `setupJob`, and `login`.
Changed in `harmony/src/modules/zitadel/mod.rs` for the `K3sFamily | Default` branch.
### ZitadelScore on k3d -- ingress class
The K3sFamily Helm values had `kubernetes.io/ingress.class: nginx` annotations. k3d ships with traefik, not nginx. The nginx annotation caused traefik to ignore the ingress entirely (404 on all routes). **Fix:** removed the explicit ingress class annotations -- traefik picks up ingresses without an explicit class by default.
Changed in `harmony/src/modules/zitadel/mod.rs` for the `K3sFamily | Default` branch.
### CNPG CRD registration race
After `helm install cloudnative-pg`, the operator deployment becomes ready but the CRD (`clusters.postgresql.cnpg.io`) is not yet registered in the API server's discovery cache. The kube client caches API discovery at init time, so even after the CRD registers, a reused client won't see it. **Fix:** the example creates a **fresh topology** (and therefore fresh kube client) on each retry attempt. Up to 5 retries with 15s delay.
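The fresh-client-per-attempt pattern can be sketched generically (a hypothetical helper, not code from the example): the key point is that the resource is *rebuilt* inside the loop, so each attempt re-runs discovery.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry an operation, constructing a *fresh* resource on every attempt.
/// A reused kube client keeps its stale discovery cache; a new one
/// re-discovers the API and sees the freshly registered CRD.
fn retry_with_fresh<R, T, E>(
    attempts: usize,
    delay: Duration,
    make: impl Fn() -> R,
    op: impl Fn(&R) -> Result<T, E>,
) -> Result<T, E> {
    let mut last = None;
    for i in 0..attempts {
        let resource = make(); // fresh client every time
        match op(&resource) {
            Ok(v) => return Ok(v),
            Err(e) => last = Some(e),
        }
        if i + 1 < attempts {
            sleep(delay);
        }
    }
    Err(last.expect("attempts must be > 0"))
}

fn main() {
    // Simulate a CRD that only becomes visible on the third attempt.
    let calls = std::cell::Cell::new(0);
    let result = retry_with_fresh(5, Duration::from_millis(1), || (), |_| {
        calls.set(calls.get() + 1);
        if calls.get() >= 3 { Ok("crd ready") } else { Err("not found") }
    });
    assert_eq!(result, Ok("crd ready"));
    println!("succeeded after {} attempts", calls.get());
}
```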
### CNPG PostgreSQL cluster readiness
After the CNPG `Cluster` CR is created, the PostgreSQL pods and the `-rw` service take 15-30s to come up. `ZitadelScore` immediately calls `topology.get_endpoint()` which looks for the `zitadel-pg-rw` service. If the service doesn't exist yet, it fails with "not found for cluster". **Fix:** same retry loop catches this error pattern.
### Zitadel Helm init job timing
The Zitadel Helm chart runs a `zitadel-init` pre-install/pre-upgrade Job that connects to PostgreSQL. If the PG cluster isn't fully ready (primary not accepting connections), the init job hangs until Helm's 5-minute timeout. On a cold start from scratch, the sequence is: CNPG operator install -> CRD registration (5-15s) -> PG cluster creation -> PG pod scheduling + init (~30s) -> PG primary ready -> Zitadel init job can connect. The retry loop handles this by allowing the full sequence to settle between attempts.
### Zitadel Host header validation
Zitadel validates the `Host` header on **all** HTTP endpoints against its `ExternalDomain` config (`sso.harmony.local`). This means:
- The OIDC discovery endpoint (`/.well-known/openid-configuration`) returns 404 if called via the internal service URL without the correct Host header
- The JWKS endpoint (`/oauth/v2/keys`) also requires the correct Host
- OpenBao's JWT auth `oidc_discovery_url` can't use `http://zitadel.zitadel.svc.cluster.local:8080` because Zitadel rejects the Host
- From outside the cluster, use `127.0.0.1:8080` with `Host: sso.harmony.local` header (or add /etc/hosts entry)
- Phase B needs to solve in-cluster DNS resolution for `sso.harmony.local`
### Both services share one port
Both Zitadel and OpenBao are exposed through traefik ingress on port 80 (mapped to host port 8080). Traefik routes by `Host` header: `sso.harmony.local` -> Zitadel, `bao.harmony.local` -> OpenBao. The original plan had separate port mappings (8080 for Zitadel, 8200 for OpenBao) but the 8200 mapping was useless since traefik only listens on 80/443.
For `--demo` mode, the port-forward bypasses traefik and connects directly to the OpenBao service on port 8200 (no Host header needed).
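The routing decision traefik makes here can be sketched as a plain match on the Host header (illustrative only; the real routes come from Ingress resources, and the backend addresses below are assumptions):

```rust
/// Route a request by Host header the way the shared-port setup works:
/// one listener, two backends, 404 for anything else.
fn route(host: &str) -> Option<&'static str> {
    match host {
        "sso.harmony.local" => Some("zitadel.zitadel.svc:8080"),
        "bao.harmony.local" => Some("openbao.openbao.svc:8200"),
        _ => None, // unknown Host (e.g. a raw IP) gets a 404
    }
}

fn main() {
    assert_eq!(route("sso.harmony.local"), Some("zitadel.zitadel.svc:8080"));
    assert_eq!(route("bao.harmony.local"), Some("openbao.openbao.svc:8200"));
    assert_eq!(route("127.0.0.1"), None);
    println!("host-based routing ok");
}
```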
### `run_bao_command` and shell escaping
The `run_bao_command` function runs `kubectl exec ... -- sh -c "export VAULT_TOKEN=xxx && bao ..."`. Two gotchas:
1. Must use `export VAULT_TOKEN=...` (not just `VAULT_TOKEN=...` prefix) because piped commands after `|` don't inherit the prefix env var
2. The policy creation uses `printf '...' | bao policy write harmony-dev -` which needs careful quoting inside the `sh -c` wrapper. Using `run_bao_command_raw()` avoids double-wrapping.
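Gotcha 1 can be illustrated with a sketch of how the exec command string is assembled (hypothetical helper; the real function also has to worry about quoting the bao arguments):

```rust
/// Build the shell command executed inside the OpenBao pod.
fn bao_exec_command(token: &str, bao_args: &str) -> String {
    // `export` is required: with a plain `VAULT_TOKEN=... cmd | other`,
    // only `cmd` sees the variable -- anything after the pipe does not.
    format!("export VAULT_TOKEN={token} && bao {bao_args}")
}

fn main() {
    let cmd = bao_exec_command("s.xyz", "policy list");
    assert!(cmd.starts_with("export VAULT_TOKEN="));
    assert_eq!(cmd, "export VAULT_TOKEN=s.xyz && bao policy list");
    println!("{cmd}");
}
```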
### FIXMEs for future refactoring
The user flagged several areas that should use `harmony-k8s` instead of raw `kubectl`:
- `wait_for_pod_running()` -- harmony-k8s has pod wait functionality
- `init_openbao()`, `unseal_openbao()` -- exec into pods via kubectl
- `get_k3d_binary_path()`, `get_openbao_data_path()` -- leaking implementation details from k3d/openbao crates
- `configure_openbao()` -- future candidate for an OpenBao/Vault capability trait
---
## Files Modified (Phase A)
| File | Change |
|---|---|
| `examples/harmony_sso/Cargo.toml` | Added clap, schemars, interactive-parse |
| `examples/harmony_sso/src/main.rs` | Complete rewrite: CLI args, Zitadel deploy, JWT auth config, demo modes, hardening |
| `examples/harmony_sso/README.md` | New: prerequisites, usage, architecture |
| `harmony/src/modules/zitadel/mod.rs` | Fixed K3s security context (`runAsUser: 1000`), removed nginx ingress annotations |
## Files to Modify (Phase B)
| File | Change |
|---|---|
| `harmony_secret/src/store/zitadel.rs` | JWT exchange, silent refresh |
| `harmony_secret/src/store/openbao.rs` | Wire JWT params, refresh in auth chain |
| `harmony_secret/src/config.rs` | OPENBAO_JWT_AUTH_MOUNT, OPENBAO_JWT_ROLE env vars |
## Verification
**Phase A (verified 2026-03-28):**
- `cargo run -p example-harmony-sso` -> deploys k3d + OpenBao + Zitadel (with retry for CNPG CRD + PG readiness)
- `curl -H "Host: bao.harmony.local" http://127.0.0.1:8080/v1/sys/health` -> OpenBao healthy (initialized, unsealed)
- `curl -H "Host: sso.harmony.local" http://127.0.0.1:8080/.well-known/openid-configuration` -> Zitadel OIDC config with device_authorization_endpoint
- `cargo run -p example-harmony-sso -- --demo` -> writes/reads config via ConfigManager + OpenbaoSecretStore, env override works
**Phase B:**
- `HARMONY_SSO_URL=http://sso.harmony.local HARMONY_SSO_CLIENT_ID=<id> cargo run -p example-harmony-sso -- --sso-demo`
- Device code appears, login in browser, config stored via SSO-authenticated OpenBao token
**Phase C:**
- `cargo test -p example-harmony-sso -- --ignored` -> integration tests pass


@@ -0,0 +1,407 @@
use anyhow::Context;
use clap::Parser;
use harmony::inventory::Inventory;
use harmony::modules::k8s::coredns::{CoreDNSRewrite, CoreDNSRewriteScore};
use harmony::modules::openbao::{
OpenbaoJwtAuth, OpenbaoPolicy, OpenbaoScore, OpenbaoSetupScore, OpenbaoUser,
};
use harmony::modules::zitadel::{
ZitadelAppType, ZitadelApplication, ZitadelClientConfig, ZitadelScore, ZitadelSetupScore,
};
use harmony::score::Score;
use harmony::topology::{K8sclient, Topology};
use harmony_config::{Config, ConfigManager, EnvSource, StoreSource};
use harmony_k8s::K8sClient;
use harmony_secret::OpenbaoSecretStore;
use k3d_rs::{K3d, PortMapping};
use log::info;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use std::path::PathBuf;
use std::sync::Arc;
const CLUSTER_NAME: &str = "harmony-example";
const ZITADEL_HOST: &str = "sso.harmony.local";
const OPENBAO_HOST: &str = "bao.harmony.local";
const HTTP_PORT: u32 = 8080;
const OPENBAO_NAMESPACE: &str = "openbao";
const OPENBAO_POD: &str = "openbao-0";
const APP_NAME: &str = "harmony-cli";
const PROJECT_NAME: &str = "harmony";
#[derive(Parser)]
#[command(
name = "harmony-sso",
about = "Deploy Zitadel + OpenBao on k3d, authenticate via SSO, store config"
)]
struct Args {
/// Skip Zitadel deployment (OpenBao only, faster iteration)
#[arg(long)]
skip_zitadel: bool,
/// Delete the k3d cluster and exit
#[arg(long)]
cleanup: bool,
}
// ---------------------------------------------------------------------------
// Config type stored via SSO-authenticated OpenBao
// ---------------------------------------------------------------------------
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, PartialEq)]
struct SsoExampleConfig {
team_name: String,
environment: String,
max_replicas: u16,
}
impl Default for SsoExampleConfig {
fn default() -> Self {
Self {
team_name: "platform-team".to_string(),
environment: "staging".to_string(),
max_replicas: 3,
}
}
}
impl Config for SsoExampleConfig {
const KEY: &'static str = "SsoExampleConfig";
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
fn harmony_data_dir() -> PathBuf {
directories::BaseDirs::new()
.map(|dirs| dirs.data_dir().join("harmony"))
.unwrap_or_else(|| PathBuf::from("/tmp/harmony"))
}
fn create_k3d() -> K3d {
let base_dir = harmony_data_dir().join("k3d");
std::fs::create_dir_all(&base_dir).expect("Failed to create k3d data directory");
K3d::new(base_dir, Some(CLUSTER_NAME.to_string()))
.with_port_mappings(vec![PortMapping::new(HTTP_PORT, 80)])
}
fn create_topology(k3d: &K3d) -> harmony::topology::K8sAnywhereTopology {
let context = k3d
.context_name()
.unwrap_or_else(|| format!("k3d-{}", CLUSTER_NAME));
unsafe {
std::env::set_var("HARMONY_USE_LOCAL_K3D", "false");
std::env::set_var("HARMONY_AUTOINSTALL", "false");
std::env::set_var("HARMONY_K8S_CONTEXT", &context);
}
harmony::topology::K8sAnywhereTopology::from_env()
}
fn harmony_dev_policy() -> OpenbaoPolicy {
OpenbaoPolicy {
name: "harmony-dev".to_string(),
hcl: r#"path "secret/data/harmony/*" { capabilities = ["create","read","update","delete","list"] }
path "secret/metadata/harmony/*" { capabilities = ["list","read"] }"#
.to_string(),
}
}
// ---------------------------------------------------------------------------
// Zitadel deployment (with CNPG retry)
// ---------------------------------------------------------------------------
async fn deploy_zitadel(k3d: &K3d) -> anyhow::Result<()> {
info!("Deploying Zitadel (this may take several minutes)...");
let zitadel = ZitadelScore {
host: ZITADEL_HOST.to_string(),
zitadel_version: "v4.12.1".to_string(),
external_secure: false,
};
let topology = create_topology(k3d);
topology
.ensure_ready()
.await
.context("Topology init failed")?;
zitadel
.interpret(&Inventory::autoload(), &topology)
.await
.context("Zitadel deployment failed")?;
info!("Zitadel deployed successfully");
Ok(())
}
async fn wait_for_zitadel_ready() -> anyhow::Result<()> {
info!("Waiting for Zitadel to be ready...");
let client = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(5))
.build()?;
for attempt in 1..=90 {
match client
.get(format!(
"http://127.0.0.1:{}/.well-known/openid-configuration",
HTTP_PORT
))
.header("Host", ZITADEL_HOST)
.send()
.await
{
Ok(resp) if resp.status().is_success() => {
info!("Zitadel is ready");
return Ok(());
}
Ok(resp) if attempt % 10 == 0 => {
info!("Zitadel HTTP {}, attempt {}/90", resp.status(), attempt);
}
Err(e) if attempt % 10 == 0 => {
info!("Zitadel not reachable: {}, attempt {}/90", e, attempt);
}
_ => {}
}
tokio::time::sleep(tokio::time::Duration::from_secs(2)).await;
}
anyhow::bail!("Timed out waiting for Zitadel")
}
// ---------------------------------------------------------------------------
// Cluster lifecycle
// ---------------------------------------------------------------------------
async fn ensure_k3d_cluster(k3d: &K3d) -> anyhow::Result<()> {
info!("Ensuring k3d cluster '{}' is running...", CLUSTER_NAME);
k3d.ensure_installed()
.await
.map_err(|e| anyhow::anyhow!("k3d setup failed: {}", e))?;
info!("k3d cluster '{}' is ready", CLUSTER_NAME);
Ok(())
}
fn cleanup_cluster(k3d: &K3d) -> anyhow::Result<()> {
let name = k3d
.cluster_name()
.ok_or_else(|| anyhow::anyhow!("No cluster name"))?;
info!("Deleting k3d cluster '{}'...", name);
k3d.run_k3d_command(["cluster", "delete", name])
.map_err(|e| anyhow::anyhow!("{}", e))?;
info!("Cluster '{}' deleted", name);
Ok(())
}
async fn cleanup_openbao_webhook(k8s: &K8sClient) -> anyhow::Result<()> {
use k8s_openapi::api::admissionregistration::v1::MutatingWebhookConfiguration;
if k8s
.get_resource::<MutatingWebhookConfiguration>("openbao-agent-injector-cfg", None)
.await?
.is_some()
{
info!("Deleting conflicting OpenBao webhook...");
k8s.delete_resource::<MutatingWebhookConfiguration>("openbao-agent-injector-cfg", None)
.await?;
}
Ok(())
}
// ---------------------------------------------------------------------------
// Main
// ---------------------------------------------------------------------------
#[tokio::main]
async fn main() -> anyhow::Result<()> {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
let args = Args::parse();
let k3d = create_k3d();
if args.cleanup {
return cleanup_cluster(&k3d);
}
info!("===========================================");
info!("Harmony SSO Example");
info!("===========================================");
// --- Phase 1: Infrastructure ---
ensure_k3d_cluster(&k3d).await?;
let topology = create_topology(&k3d);
topology
.ensure_ready()
.await
.context("Topology init failed")?;
let k8s = topology
.k8s_client()
.await
.map_err(|e| anyhow::anyhow!("K8s client: {}", e))?;
// Deploy + configure OpenBao (no JWT auth yet -- Zitadel isn't up)
cleanup_openbao_webhook(&k8s).await?;
OpenbaoScore {
host: OPENBAO_HOST.to_string(),
openshift: false,
}
.interpret(&Inventory::autoload(), &topology)
.await
.context("OpenBao deploy failed")?;
OpenbaoSetupScore {
policies: vec![harmony_dev_policy()],
users: vec![OpenbaoUser {
username: "harmony".to_string(),
password: "harmony-dev-password".to_string(),
policies: vec!["harmony-dev".to_string()],
}],
jwt_auth: None, // Phase 2 adds JWT after Zitadel is ready
..Default::default()
}
.interpret(&Inventory::autoload(), &topology)
.await
.context("OpenBao setup failed")?;
if args.skip_zitadel {
info!("=== Skipping Zitadel (--skip-zitadel) ===");
info!("OpenBao: http://{}:{}", OPENBAO_HOST, HTTP_PORT);
return Ok(());
}
// --- Phase 2: Identity + SSO Wiring ---
CoreDNSRewriteScore {
rewrites: vec![
CoreDNSRewrite {
hostname: ZITADEL_HOST.to_string(),
target: "zitadel.zitadel.svc.cluster.local".to_string(),
},
CoreDNSRewrite {
hostname: OPENBAO_HOST.to_string(),
target: "openbao.openbao.svc.cluster.local".to_string(),
},
],
}
.interpret(&Inventory::autoload(), &topology)
.await
.context("CoreDNS rewrite failed")?;
deploy_zitadel(&k3d).await?;
wait_for_zitadel_ready().await?;
// Provision Zitadel project + device-code application
ZitadelSetupScore {
host: ZITADEL_HOST.to_string(),
port: HTTP_PORT as u16,
skip_tls: true,
applications: vec![ZitadelApplication {
project_name: PROJECT_NAME.to_string(),
app_name: APP_NAME.to_string(),
app_type: ZitadelAppType::DeviceCode,
}],
machine_users: vec![],
}
.interpret(&Inventory::autoload(), &topology)
.await
.context("Zitadel setup failed")?;
// Read the client_id from the cache written by ZitadelSetupScore
let zitadel_config =
ZitadelClientConfig::load().context("ZitadelSetupScore did not produce a client config")?;
let client_id = zitadel_config
.client_id(APP_NAME)
.context("No client_id for harmony-cli app")?
.clone();
info!("Zitadel app '{}' client_id: {}", APP_NAME, client_id);
// Now configure OpenBao JWT auth with the real client_id
OpenbaoSetupScore {
policies: vec![harmony_dev_policy()],
users: vec![OpenbaoUser {
username: "harmony".to_string(),
password: "harmony-dev-password".to_string(),
policies: vec!["harmony-dev".to_string()],
}],
jwt_auth: Some(OpenbaoJwtAuth {
oidc_discovery_url: format!("http://{}:{}", ZITADEL_HOST, HTTP_PORT),
bound_issuer: format!("http://{}:{}", ZITADEL_HOST, HTTP_PORT),
role_name: "harmony-developer".to_string(),
bound_audiences: client_id.clone(),
user_claim: "email".to_string(),
policies: vec!["harmony-dev".to_string()],
ttl: "4h".to_string(),
max_ttl: "24h".to_string(),
}),
..Default::default()
}
.interpret(&Inventory::autoload(), &topology)
.await
.context("OpenBao JWT auth setup failed")?;
// --- Phase 3: Config via SSO ---
info!("===========================================");
info!("Storing config via SSO-authenticated OpenBao");
info!("===========================================");
let _pf = k8s
.port_forward(OPENBAO_POD, OPENBAO_NAMESPACE, 8200, 8200)
.await
.context("Port-forward to OpenBao failed")?;
tokio::time::sleep(tokio::time::Duration::from_secs(1)).await;
let openbao_url = format!("http://127.0.0.1:{}", _pf.port());
let sso_url = format!("http://{}:{}", ZITADEL_HOST, HTTP_PORT);
let store = OpenbaoSecretStore::new(
openbao_url,
"secret".to_string(),
"jwt".to_string(),
true,
None,
None,
None,
Some(sso_url),
Some(client_id),
Some("harmony-developer".to_string()),
Some("jwt".to_string()),
)
.await
.context("SSO authentication failed")?;
let manager = ConfigManager::new(vec![
Arc::new(EnvSource) as Arc<dyn harmony_config::ConfigSource>,
Arc::new(StoreSource::new("harmony".to_string(), store)),
]);
// Try to load existing config (succeeds on re-run)
match manager.get::<SsoExampleConfig>().await {
Ok(config) => {
info!("Config loaded from OpenBao: {:?}", config);
}
Err(harmony_config::ConfigError::NotFound { .. }) => {
info!("No config found, storing default...");
let config = SsoExampleConfig::default();
manager.set(&config).await?;
info!("Config stored: {:?}", config);
let retrieved: SsoExampleConfig = manager.get().await?;
info!("Config verified: {:?}", retrieved);
assert_eq!(config, retrieved);
}
Err(e) => return Err(e.into()),
}
info!("===========================================");
info!("Success! Config managed via Zitadel SSO + OpenBao");
info!("===========================================");
info!("OpenBao: http://{}:{}", OPENBAO_HOST, HTTP_PORT);
info!("Zitadel: http://{}:{}", ZITADEL_HOST, HTTP_PORT);
info!("Run again to verify cached session works.");
info!("cargo run -p example-harmony-sso -- --cleanup # teardown");
Ok(())
}


@@ -0,0 +1,16 @@
[package]
name = "kvm-vm-examples"
version.workspace = true
edition = "2024"
license.workspace = true
[[bin]]
name = "kvm-vm-examples"
path = "src/main.rs"
[dependencies]
harmony = { path = "../../harmony" }
tokio.workspace = true
log.workspace = true
env_logger.workspace = true
clap = { version = "4", features = ["derive"] }


@@ -0,0 +1,47 @@
# KVM VM Examples
Demonstrates creating VMs with various configurations using Harmony's KVM module. These examples exercise the same infrastructure primitives needed for the full OKD HA cluster with OPNsense, control plane, and workers with Ceph.
## Prerequisites
A working KVM/libvirt setup:
```bash
# Manjaro / Arch
sudo pacman -S qemu-full libvirt virt-install dnsmasq ebtables
sudo systemctl enable --now libvirtd
sudo usermod -aG libvirt $USER
# Log out and back in for group membership to take effect
```
## Scenarios
| Scenario | VMs | Disks | NICs | Purpose |
|----------|-----|-------|------|---------|
| `alpine` | 1 | 1x2G | 1 | Minimal VM, fast boot (~5s) |
| `ubuntu` | 1 | 1x25G | 1 | Standard server setup |
| `worker` | 1 | 3 (60G+100G+100G) | 1 | Multi-disk for Ceph OSD |
| `gateway` | 1 | 1x10G | 2 (WAN+LAN) | Dual-NIC firewall |
| `ha-cluster` | 7 | mixed | 1 each | Full HA: gateway + 3 CP + 3 workers |
## Usage
```bash
# Deploy a scenario
cargo run -p kvm-vm-examples -- alpine
cargo run -p kvm-vm-examples -- ubuntu
cargo run -p kvm-vm-examples -- worker
cargo run -p kvm-vm-examples -- gateway
cargo run -p kvm-vm-examples -- ha-cluster
# Check status
cargo run -p kvm-vm-examples -- status alpine
# Clean up
cargo run -p kvm-vm-examples -- clean alpine
```
## Environment variables
- `HARMONY_KVM_URI`: libvirt URI (default: `qemu:///system`)
- `HARMONY_KVM_IMAGE_DIR`: where disk images and ISOs are stored


@@ -0,0 +1,358 @@
//! KVM VM examples demonstrating various configurations.
//!
//! Each subcommand creates a different VM setup. All VMs are managed
//! via libvirt — you need a working KVM hypervisor on the host.
//!
//! # Prerequisites
//!
//! ```bash
//! # Manjaro / Arch
//! sudo pacman -S qemu-full libvirt virt-install dnsmasq ebtables
//! sudo systemctl enable --now libvirtd
//! sudo usermod -aG libvirt $USER
//! ```
//!
//! # Environment variables
//!
//! - `HARMONY_KVM_URI`: libvirt URI (default: `qemu:///system`)
//! - `HARMONY_KVM_IMAGE_DIR`: disk image directory (default: `~/.local/share/harmony/kvm/images`)
//!
//! # Usage
//!
//! ```bash
//! # Simple Alpine VM (tiny, boots in seconds — great for testing)
//! cargo run -p kvm-vm-examples -- alpine
//!
//! # Ubuntu Server with cloud-init
//! cargo run -p kvm-vm-examples -- ubuntu
//!
//! # Multi-disk worker node (Ceph OSD style)
//! cargo run -p kvm-vm-examples -- worker
//!
//! # Multi-NIC gateway (OPNsense style: WAN + LAN)
//! cargo run -p kvm-vm-examples -- gateway
//!
//! # Full HA cluster: 1 gateway + 3 control plane + 3 workers
//! cargo run -p kvm-vm-examples -- ha-cluster
//!
//! # Clean up all VMs and networks from a scenario
//! cargo run -p kvm-vm-examples -- clean <scenario>
//! ```
use clap::{Parser, Subcommand};
use harmony::modules::kvm::config::init_executor;
use harmony::modules::kvm::{
BootDevice, ForwardMode, KvmExecutor, NetworkConfig, NetworkRef, VmConfig, VmStatus,
};
use log::info;
#[derive(Parser)]
#[command(name = "kvm-vm-examples")]
#[command(about = "KVM VM examples for various infrastructure setups")]
struct Cli {
#[command(subcommand)]
command: Commands,
}
#[derive(Subcommand)]
enum Commands {
/// Minimal Alpine Linux VM — fast boot, ~150MB ISO
Alpine,
/// Ubuntu Server 24.04 — standard server with 1 disk
Ubuntu,
/// Worker node with multiple disks (OS + Ceph OSD storage)
Worker,
/// Gateway/firewall with 2 NICs (WAN + LAN)
Gateway,
/// Full HA cluster: gateway + 3 control plane + 3 worker nodes
HaCluster,
/// Tear down all VMs and networks for a scenario
Clean {
/// Scenario to clean: alpine, ubuntu, worker, gateway, ha-cluster
scenario: String,
},
/// Show status of all VMs in a scenario
Status {
/// Scenario: alpine, ubuntu, worker, gateway, ha-cluster
scenario: String,
},
}
const ALPINE_ISO: &str =
"https://dl-cdn.alpinelinux.org/alpine/v3.21/releases/x86_64/alpine-virt-3.21.3-x86_64.iso";
const UBUNTU_ISO: &str = "https://releases.ubuntu.com/24.04.2/ubuntu-24.04.2-live-server-amd64.iso";
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
let cli = Cli::parse();
let executor = init_executor()?;
match cli.command {
Commands::Alpine => deploy_alpine(&executor).await?,
Commands::Ubuntu => deploy_ubuntu(&executor).await?,
Commands::Worker => deploy_worker(&executor).await?,
Commands::Gateway => deploy_gateway(&executor).await?,
Commands::HaCluster => deploy_ha_cluster(&executor).await?,
Commands::Clean { scenario } => clean(&executor, &scenario).await?,
Commands::Status { scenario } => status(&executor, &scenario).await?,
}
Ok(())
}
// ── Alpine: minimal VM ──────────────────────────────────────────────────
async fn deploy_alpine(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
let net = NetworkConfig::builder("alpine-net")
.subnet("192.168.110.1", 24)
.forward(ForwardMode::Nat)
.build();
executor.ensure_network(net).await?;
let vm = VmConfig::builder("alpine-vm")
.vcpus(1)
.memory_mib(512)
.disk(2)
.network(NetworkRef::named("alpine-net"))
.cdrom(ALPINE_ISO)
.boot_order([BootDevice::Cdrom, BootDevice::Disk])
.build();
executor.ensure_vm(vm.clone()).await?;
executor.start_vm(&vm.name).await?;
info!("Alpine VM running. Connect: virsh console {}", vm.name);
info!("Login: root (no password). Install: setup-alpine");
Ok(())
}
// ── Ubuntu Server: standard setup ───────────────────────────────────────
async fn deploy_ubuntu(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
let net = NetworkConfig::builder("ubuntu-net")
.subnet("192.168.120.1", 24)
.forward(ForwardMode::Nat)
.build();
executor.ensure_network(net).await?;
let vm = VmConfig::builder("ubuntu-server")
.vcpus(2)
.memory_gb(4)
.disk(25)
.network(NetworkRef::named("ubuntu-net"))
.cdrom(UBUNTU_ISO)
.boot_order([BootDevice::Cdrom, BootDevice::Disk])
.build();
executor.ensure_vm(vm.clone()).await?;
executor.start_vm(&vm.name).await?;
info!(
"Ubuntu Server VM running. Connect: virsh console {}",
vm.name
);
info!("Follow the interactive installer to complete setup.");
Ok(())
}
// ── Worker: multi-disk for Ceph ─────────────────────────────────────────
async fn deploy_worker(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
let net = NetworkConfig::builder("worker-net")
.subnet("192.168.130.1", 24)
.forward(ForwardMode::Nat)
.build();
executor.ensure_network(net).await?;
let vm = VmConfig::builder("worker-node")
.vcpus(4)
.memory_gb(8)
.disk(60) // vda: OS
.disk(100) // vdb: Ceph OSD 1
.disk(100) // vdc: Ceph OSD 2
.network(NetworkRef::named("worker-net"))
.cdrom(ALPINE_ISO) // Use Alpine for fast testing
.boot_order([BootDevice::Cdrom, BootDevice::Disk])
.build();
executor.ensure_vm(vm.clone()).await?;
executor.start_vm(&vm.name).await?;
info!("Worker node running with 3 disks (vda=60G OS, vdb=100G OSD, vdc=100G OSD)");
info!("Connect: virsh console {}", vm.name);
Ok(())
}
// ── Gateway: dual-NIC firewall ──────────────────────────────────────────
async fn deploy_gateway(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
// WAN: NAT network (internet access)
let wan = NetworkConfig::builder("gw-wan")
.subnet("192.168.140.1", 24)
.forward(ForwardMode::Nat)
.build();
// LAN: isolated network (no internet, internal only)
let lan = NetworkConfig::builder("gw-lan")
.subnet("10.100.0.1", 24)
.isolated()
.build();
executor.ensure_network(wan).await?;
executor.ensure_network(lan).await?;
let vm = VmConfig::builder("gateway-vm")
.vcpus(2)
.memory_gb(2)
.disk(10)
.network(NetworkRef::named("gw-wan")) // First NIC = WAN
.network(NetworkRef::named("gw-lan")) // Second NIC = LAN
.cdrom(ALPINE_ISO)
.boot_order([BootDevice::Cdrom, BootDevice::Disk])
.build();
executor.ensure_vm(vm.clone()).await?;
executor.start_vm(&vm.name).await?;
info!("Gateway VM running with 2 NICs: WAN (gw-wan) + LAN (gw-lan)");
info!("Connect: virsh console {}", vm.name);
Ok(())
}
// ── HA Cluster: full OKD-style deployment ───────────────────────────────
async fn deploy_ha_cluster(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
// Network: NAT for external access, all nodes on the same subnet
let cluster_net = NetworkConfig::builder("ha-cluster")
.bridge("virbr-ha")
.subnet("10.200.0.1", 24)
.forward(ForwardMode::Nat)
.build();
executor.ensure_network(cluster_net).await?;
// Gateway / firewall / load balancer
let gateway = VmConfig::builder("ha-gateway")
.vcpus(2)
.memory_gb(2)
.disk(10)
.network(NetworkRef::named("ha-cluster"))
.boot_order([BootDevice::Network, BootDevice::Disk])
.build();
executor.ensure_vm(gateway.clone()).await?;
info!("Defined: {} (gateway/firewall)", gateway.name);
// Control plane nodes
for i in 1..=3 {
let cp = VmConfig::builder(format!("ha-cp-{i}"))
.vcpus(4)
.memory_gb(16)
.disk(120)
.network(NetworkRef::named("ha-cluster"))
.boot_order([BootDevice::Network, BootDevice::Disk])
.build();
executor.ensure_vm(cp.clone()).await?;
info!("Defined: {} (control plane)", cp.name);
}
// Worker nodes with Ceph storage
for i in 1..=3 {
let worker = VmConfig::builder(format!("ha-worker-{i}"))
.vcpus(8)
.memory_gb(32)
.disk(120) // vda: OS
.disk(200) // vdb: Ceph OSD
.network(NetworkRef::named("ha-cluster"))
.boot_order([BootDevice::Network, BootDevice::Disk])
.build();
executor.ensure_vm(worker.clone()).await?;
info!("Defined: {} (worker + Ceph)", worker.name);
}
info!("HA cluster defined (7 VMs). Start individually or use PXE boot.");
info!(
"To start all: for vm in ha-gateway ha-cp-{{1..3}} ha-worker-{{1..3}}; do virsh start $vm; done"
);
Ok(())
}
// ── Clean up ────────────────────────────────────────────────────────────
async fn clean(executor: &KvmExecutor, scenario: &str) -> Result<(), Box<dyn std::error::Error>> {
let (vms, nets) = match scenario {
"alpine" => (vec!["alpine-vm"], vec!["alpine-net"]),
"ubuntu" => (vec!["ubuntu-server"], vec!["ubuntu-net"]),
"worker" => (vec!["worker-node"], vec!["worker-net"]),
"gateway" => (vec!["gateway-vm"], vec!["gw-wan", "gw-lan"]),
"ha-cluster" => (
vec![
"ha-gateway",
"ha-cp-1",
"ha-cp-2",
"ha-cp-3",
"ha-worker-1",
"ha-worker-2",
"ha-worker-3",
],
vec!["ha-cluster"],
),
other => {
eprintln!("Unknown scenario: {other}");
eprintln!("Available: alpine, ubuntu, worker, gateway, ha-cluster");
std::process::exit(1);
}
};
for vm in &vms {
info!("Cleaning up VM: {vm}");
let _ = executor.destroy_vm(vm).await;
let _ = executor.undefine_vm(vm).await;
}
for net in &nets {
info!("Cleaning up network: {net}");
let _ = executor.delete_network(net).await;
}
info!("Cleanup complete for scenario: {scenario}");
Ok(())
}
// ── Status ──────────────────────────────────────────────────────────────
async fn status(executor: &KvmExecutor, scenario: &str) -> Result<(), Box<dyn std::error::Error>> {
let vms: Vec<&str> = match scenario {
"alpine" => vec!["alpine-vm"],
"ubuntu" => vec!["ubuntu-server"],
"worker" => vec!["worker-node"],
"gateway" => vec!["gateway-vm"],
"ha-cluster" => vec![
"ha-gateway",
"ha-cp-1",
"ha-cp-2",
"ha-cp-3",
"ha-worker-1",
"ha-worker-2",
"ha-worker-3",
],
other => {
eprintln!("Unknown scenario: {other}");
std::process::exit(1);
}
};
println!("{:<20} {}", "VM", "STATUS");
println!("{}", "-".repeat(35));
for vm in &vms {
let status = match executor.vm_status(vm).await {
Ok(s) => format!("{s:?}"),
Err(_) => "not found".to_string(),
};
println!("{:<20} {}", vm, status);
}
Ok(())
}

View File

@@ -1,6 +1,7 @@
use brocade::BrocadeOptions;
use cidr::Ipv4Cidr;
use harmony::{
config::secret::{OPNSenseApiCredentials, OPNSenseFirewallCredentials},
hardware::{Location, SwitchGroup},
infra::{
brocade::{BrocadeSwitchClient, BrocadeSwitchConfig},
@@ -11,20 +12,12 @@ use harmony::{
topology::{HAClusterTopology, LogicalHost, UnmanagedRouter},
};
use harmony_macros::{ip, ipv4};
use harmony_secret::{Secret, SecretManager};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use harmony_secret::SecretManager;
use std::{
net::IpAddr,
sync::{Arc, OnceLock},
};
#[derive(Secret, Serialize, Deserialize, JsonSchema, Debug, PartialEq)]
struct OPNSenseFirewallConfig {
username: String,
password: String,
}
pub async fn get_topology() -> HAClusterTopology {
let firewall = harmony::topology::LogicalHost {
ip: ip!("192.168.1.1"),
@@ -50,17 +43,16 @@ pub async fn get_topology() -> HAClusterTopology {
let switch_client = Arc::new(switch_client);
let config = SecretManager::get_or_prompt::<OPNSenseFirewallConfig>().await;
let config = config.unwrap();
let ssh_creds = SecretManager::get_or_prompt::<OPNSenseFirewallCredentials>()
.await
.unwrap();
let api_creds = SecretManager::get_or_prompt::<OPNSenseApiCredentials>()
.await
.unwrap();
let opnsense = Arc::new(
harmony::infra::opnsense::OPNSenseFirewall::new(
firewall,
None,
&config.username,
&config.password,
)
.await,
harmony::infra::opnsense::OPNSenseFirewall::new(firewall, None, &api_creds, &ssh_creds)
.await,
);
let lan_subnet = ipv4!("192.168.1.0");
let gateway_ipv4 = ipv4!("192.168.1.1");

View File

@@ -43,17 +43,17 @@ pub async fn get_topology() -> HAClusterTopology {
let switch_client = Arc::new(switch_client);
let config = SecretManager::get_or_prompt::<OPNSenseFirewallCredentials>().await;
let config = config.unwrap();
let ssh_creds = SecretManager::get_or_prompt::<OPNSenseFirewallCredentials>()
.await
.unwrap();
let api_creds =
SecretManager::get_or_prompt::<harmony::config::secret::OPNSenseApiCredentials>()
.await
.unwrap();
let opnsense = Arc::new(
harmony::infra::opnsense::OPNSenseFirewall::new(
firewall,
None,
&config.username,
&config.password,
)
.await,
harmony::infra::opnsense::OPNSenseFirewall::new(firewall, None, &api_creds, &ssh_creds)
.await,
);
let lan_subnet = ipv4!("192.168.1.0");
let gateway_ipv4 = ipv4!("192.168.1.1");

View File

@@ -6,6 +6,7 @@ use harmony::{
async fn main() {
let openbao = OpenbaoScore {
host: "openbao.sebastien.sto1.nationtech.io".to_string(),
openshift: false,
};
harmony_cli::run(

View File

@@ -1,5 +1,5 @@
use harmony::{
config::secret::OPNSenseFirewallCredentials,
config::secret::{OPNSenseApiCredentials, OPNSenseFirewallCredentials},
infra::opnsense::OPNSenseFirewall,
inventory::Inventory,
modules::{dhcp::DhcpScore, opnsense::OPNsenseShellCommandScore},
@@ -17,17 +17,14 @@ async fn main() {
name: String::from("opnsense-1"),
};
let opnsense_auth = SecretManager::get_or_prompt::<OPNSenseFirewallCredentials>()
let ssh_creds = SecretManager::get_or_prompt::<OPNSenseFirewallCredentials>()
.await
.expect("Failed to get credentials");
.expect("Failed to get SSH credentials");
let api_creds = SecretManager::get_or_prompt::<OPNSenseApiCredentials>()
.await
.expect("Failed to get API credentials");
let opnsense = OPNSenseFirewall::new(
firewall,
None,
&opnsense_auth.username,
&opnsense_auth.password,
)
.await;
let opnsense = OPNSenseFirewall::new(firewall, None, &api_creds, &ssh_creds).await;
let dhcp_score = DhcpScore {
dhcp_range: (

View File

@@ -48,8 +48,17 @@ async fn main() {
name: String::from("fw0"),
};
let api_creds = harmony::config::secret::OPNSenseApiCredentials {
key: "root".to_string(),
secret: "opnsense".to_string(),
};
let ssh_creds = harmony::config::secret::OPNSenseFirewallCredentials {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let opnsense = Arc::new(
harmony::infra::opnsense::OPNSenseFirewall::new(firewall, None, "root", "opnsense").await,
harmony::infra::opnsense::OPNSenseFirewall::new(firewall, None, &api_creds, &ssh_creds)
.await,
);
let topology = OpnSenseTopology {

View File

@@ -0,0 +1,25 @@
[package]
name = "opnsense-pair-integration"
version.workspace = true
edition = "2024"
license.workspace = true
[[bin]]
name = "opnsense-pair-integration"
path = "src/main.rs"
[dependencies]
harmony = { path = "../../harmony" }
harmony_cli = { path = "../../harmony_cli" }
harmony_inventory_agent = { path = "../../harmony_inventory_agent" }
harmony_macros = { path = "../../harmony_macros" }
harmony_types = { path = "../../harmony_types" }
opnsense-api = { path = "../../opnsense-api" }
opnsense-config = { path = "../../opnsense-config" }
tokio.workspace = true
log.workspace = true
env_logger.workspace = true
reqwest.workspace = true
russh.workspace = true
serde_json.workspace = true
dirs = "6"

View File

@@ -0,0 +1,64 @@
# OPNsense Firewall Pair Integration Example
Boots two OPNsense VMs, bootstraps both with automated SSH/API setup, then configures a CARP HA firewall pair using `FirewallPairTopology` and `CarpVipScore`. Fully automated, CI-friendly.
## Quick start
```bash
# Prerequisites (same as single-VM example)
./examples/opnsense_vm_integration/setup-libvirt.sh
# Boot + bootstrap + pair test (fully unattended)
cargo run -p opnsense-pair-integration -- --full
```
## What it does
1. Creates a shared LAN network + 2 OPNsense VMs (2 NICs each: LAN + WAN)
2. Bootstraps both VMs sequentially using NIC link control to avoid IP conflicts:
- Disables backup's LAN NIC
- Bootstraps primary on .1 (login, SSH, webgui port 9443)
- Changes primary's LAN IP from .1 to .2
- Swaps NICs (disable primary, enable backup)
- Bootstraps backup on .1
- Changes backup's LAN IP from .1 to .3
- Re-enables all NICs
3. Applies pair scores via `FirewallPairTopology`:
- `CarpVipScore` — CARP VIP at .1 (primary advskew=0, backup advskew=100)
- `VlanScore` — VLAN 100 on both
- `FirewallRuleScore` — ICMP allow on both
4. Verifies CARP VIPs and VLANs via REST API on both firewalls
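The same endpoint the verification step queries can be spot-checked by hand. A sketch, assuming the usual OPNsense REST path scheme (`/api/<module>/<controller>/<command>`) and that `$API_KEY`/`$API_SECRET` hold the credentials created during bootstrap:

```shell
# Build the VIP search endpoint on the primary firewall (.2, webgui on 9443)
BASE="https://192.168.1.2:9443/api"
VIP_URL="$BASE/interfaces/vip_settings/searchItem"
# curl -sk -u "$API_KEY:$API_SECRET" "$VIP_URL"   # run against a live pair
```

The `rowCount` field of the JSON response should be at least 1 once `CarpVipScore` has been applied.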
## Network topology
```
Host (192.168.1.10)
|
+--- virbr-pair (192.168.1.0/24, NAT)
| | |
| fw-primary fw-backup
| vtnet0=.2 vtnet0=.3
| (CARP VIP: .1)
|
+--- virbr0 (default, DHCP)
| |
fw-primary fw-backup
vtnet1=dhcp vtnet1=dhcp (WAN)
```
Both VMs boot with OPNsense's default LAN IP of 192.168.1.1. The NIC juggling sequence ensures only one VM has its LAN NIC active at a time during bootstrap, avoiding address conflicts.
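The juggling itself is ordinary libvirt link control; a rough manual equivalent with plain `virsh` (VM names are from this example, the MAC addresses are placeholders — read the real ones from `virsh domiflist <vm>`):

```shell
# Dry-run by default so the sequence can be inspected; set DRY_RUN=0 to execute.
run() {
  CMDLOG="$CMDLOG; virsh $*"
  [ "${DRY_RUN:-1}" = "0" ] && virsh "$@" || echo "virsh $*"
}

run domif-setlink opn-pair-backup 52:54:00:aa:bb:02 down   # backup off the LAN
# ...bootstrap primary on 192.168.1.1, move it to .2...
run domif-setlink opn-pair-primary 52:54:00:aa:bb:01 down  # primary off the LAN
run domif-setlink opn-pair-backup 52:54:00:aa:bb:02 up     # backup's turn on .1
# ...bootstrap backup on 192.168.1.1, move it to .3...
run domif-setlink opn-pair-primary 52:54:00:aa:bb:01 up    # everyone back online
```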
## Requirements
Same as the single-VM example: Linux with KVM, libvirt, ~20 GB disk space, ~20 minutes first run.
## Commands
| Command | Description |
|---------|-------------|
| `--check` | Verify prerequisites |
| `--boot` | Boot + bootstrap both VMs |
| (default) | Run pair integration test |
| `--full` | Boot + bootstrap + test (CI mode) |
| `--status` | Show both VMs' status |
| `--clean` | Destroy both VMs and networks |

View File

@@ -0,0 +1,690 @@
//! OPNsense firewall pair integration example.
//!
//! Boots two OPNsense VMs, bootstraps both (login, SSH, webgui port),
//! then applies `FirewallPairTopology` + `CarpVipScore` for CARP HA testing.
//!
//! Both VMs share a LAN bridge but boot with the same default IP (.1).
//! The bootstrap sequence disables one VM's LAN NIC while bootstrapping
//! the other, then changes IPs over SSH to avoid conflicts.
//!
//! # Usage
//!
//! ```bash
//! cargo run -p opnsense-pair-integration -- --check # verify prerequisites
//! cargo run -p opnsense-pair-integration -- --boot # boot + bootstrap both VMs
//! cargo run -p opnsense-pair-integration # run pair integration test
//! cargo run -p opnsense-pair-integration -- --full # boot + bootstrap + test (CI mode)
//! cargo run -p opnsense-pair-integration -- --status # check both VMs
//! cargo run -p opnsense-pair-integration -- --clean # tear down everything
//! ```
use std::net::IpAddr;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use harmony::config::secret::{OPNSenseApiCredentials, OPNSenseFirewallCredentials};
use harmony::infra::opnsense::OPNSenseFirewall;
use harmony::inventory::Inventory;
use harmony::modules::kvm::config::init_executor;
use harmony::modules::kvm::{
BootDevice, ForwardMode, KvmExecutor, NetworkConfig, NetworkRef, VmConfig,
};
use harmony::modules::opnsense::bootstrap::OPNsenseBootstrap;
use harmony::modules::opnsense::firewall::{FilterRuleDef, FirewallRuleScore};
use harmony::modules::opnsense::vip::VipDef;
use harmony::modules::opnsense::vlan::{VlanDef, VlanScore};
use harmony::score::Score;
use harmony::topology::{CarpVipScore, FirewallPairTopology, LogicalHost};
use harmony_types::firewall::{Direction, FirewallAction, IpProtocol, NetworkProtocol, VipMode};
use log::info;
const OPNSENSE_IMG_URL: &str =
"https://mirror.ams1.nl.leaseweb.net/opnsense/releases/26.1/OPNsense-26.1-nano-amd64.img.bz2";
const OPNSENSE_IMG_NAME: &str = "OPNsense-26.1-nano-amd64.img";
const VM_PRIMARY: &str = "opn-pair-primary";
const VM_BACKUP: &str = "opn-pair-backup";
const NET_LAN: &str = "opn-pair-lan";
/// Both VMs boot on this IP (OPNsense default, ignores injected config.xml).
/// We bootstrap one at a time by toggling LAN NICs, then change IPs over SSH.
const BOOT_IP: &str = "192.168.1.1";
const HOST_IP: &str = "192.168.1.10";
/// After bootstrap, primary gets .2, backup gets .3, CARP VIP stays at .1
const PRIMARY_IP: &str = "192.168.1.2";
const BACKUP_IP: &str = "192.168.1.3";
const CARP_VIP: &str = "192.168.1.1";
const API_PORT: u16 = 9443;
const CARP_PASSWORD: &str = "pair-test-carp";
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
harmony_cli::cli_logger::init();
let args: Vec<String> = std::env::args().collect();
if args.iter().any(|a| a == "--check") {
return check_prerequisites();
}
if args.iter().any(|a| a == "--download") {
download_image().await?;
return Ok(());
}
let executor = init_executor()?;
if args.iter().any(|a| a == "--clean") {
return clean(&executor).await;
}
if args.iter().any(|a| a == "--status") {
return status(&executor).await;
}
if args.iter().any(|a| a == "--boot") {
let img_path = download_image().await?;
return boot_pair(&executor, &img_path).await;
}
if args.iter().any(|a| a == "--full") {
let img_path = download_image().await?;
boot_pair(&executor, &img_path).await?;
return run_pair_test().await;
}
// Default: run pair test (assumes VMs are bootstrapped)
check_prerequisites()?;
run_pair_test().await
}
// ── Phase 1: Boot and bootstrap both VMs ───────────────────────────
async fn boot_pair(
executor: &KvmExecutor,
img_path: &Path,
) -> Result<(), Box<dyn std::error::Error>> {
info!("Creating shared LAN network and two OPNsense VMs...");
// Create the shared LAN network
let network = NetworkConfig::builder(NET_LAN)
.bridge("virbr-pair")
.subnet(HOST_IP, 24)
.forward(ForwardMode::Nat)
.build();
executor.ensure_network(network).await?;
// Prepare disk images for both VMs
for vm_name in [VM_PRIMARY, VM_BACKUP] {
prepare_vm_disk(vm_name, img_path)?;
}
// Define and start both VMs (2 NICs each: LAN + WAN)
for vm_name in [VM_PRIMARY, VM_BACKUP] {
let disk = image_dir().join(format!("{vm_name}.qcow2"));
let vm = VmConfig::builder(vm_name)
.vcpus(1)
.memory_mib(1024)
.disk_from_path(disk.to_string_lossy().to_string())
.network(NetworkRef::named(NET_LAN)) // vtnet0 = LAN
.network(NetworkRef::named("default")) // vtnet1 = WAN
.boot_order([BootDevice::Disk])
.build();
executor.ensure_vm(vm).await?;
executor.start_vm(vm_name).await?;
}
// Get MAC addresses for LAN NICs (first interface on each VM)
let primary_interfaces = executor.list_interfaces(VM_PRIMARY).await?;
let backup_interfaces = executor.list_interfaces(VM_BACKUP).await?;
let primary_lan_mac = &primary_interfaces[0].mac;
let backup_lan_mac = &backup_interfaces[0].mac;
info!("Primary LAN MAC: {primary_lan_mac}, Backup LAN MAC: {backup_lan_mac}");
// ── Sequential bootstrap with NIC juggling ─────────────────────
//
// Both VMs boot on .1 (OPNsense default). We disable backup's LAN
// NIC so primary gets exclusive access to .1, bootstrap it, change
// its IP, then do the same for backup.
// Step 1: Disable backup's LAN NIC
info!("Disabling backup LAN NIC for primary bootstrap...");
executor
.set_interface_link(VM_BACKUP, backup_lan_mac, false)
.await?;
// Step 2: Wait for primary web UI and bootstrap
info!("Waiting for primary web UI at https://{BOOT_IP}...");
wait_for_https(BOOT_IP, 443).await?;
bootstrap_vm("primary", BOOT_IP).await?;
// Step 3: Change primary's LAN IP from .1 to .2 via SSH
info!("Changing primary LAN IP to {PRIMARY_IP}...");
change_lan_ip_via_ssh(BOOT_IP, PRIMARY_IP, 24).await?;
// Step 4: Wait for primary to come back on new IP
info!("Waiting for primary on new IP {PRIMARY_IP}:{API_PORT}...");
OPNsenseBootstrap::wait_for_ready(
&format!("https://{PRIMARY_IP}:{API_PORT}"),
std::time::Duration::from_secs(60),
)
.await?;
// Step 5: Disable primary's LAN NIC, enable backup's
info!("Swapping NICs: disabling primary, enabling backup...");
executor
.set_interface_link(VM_PRIMARY, primary_lan_mac, false)
.await?;
executor
.set_interface_link(VM_BACKUP, backup_lan_mac, true)
.await?;
// Step 6: Wait for backup web UI and bootstrap
info!("Waiting for backup web UI at https://{BOOT_IP}...");
wait_for_https(BOOT_IP, 443).await?;
bootstrap_vm("backup", BOOT_IP).await?;
// Step 7: Change backup's LAN IP from .1 to .3 via SSH
info!("Changing backup LAN IP to {BACKUP_IP}...");
change_lan_ip_via_ssh(BOOT_IP, BACKUP_IP, 24).await?;
// Step 8: Re-enable primary's LAN NIC
info!("Re-enabling primary LAN NIC...");
executor
.set_interface_link(VM_PRIMARY, primary_lan_mac, true)
.await?;
// Step 9: Wait for both to be reachable on their final IPs
info!("Waiting for both VMs on final IPs...");
OPNsenseBootstrap::wait_for_ready(
&format!("https://{PRIMARY_IP}:{API_PORT}"),
std::time::Duration::from_secs(60),
)
.await?;
OPNsenseBootstrap::wait_for_ready(
&format!("https://{BACKUP_IP}:{API_PORT}"),
std::time::Duration::from_secs(60),
)
.await?;
println!();
println!("OPNsense firewall pair is running and bootstrapped:");
println!(" Primary: https://{PRIMARY_IP}:{API_PORT} (root/opnsense)");
println!(" Backup: https://{BACKUP_IP}:{API_PORT} (root/opnsense)");
println!(" CARP VIP: {CARP_VIP} (will be configured by pair scores)");
println!();
println!("Run the pair integration test:");
println!(" cargo run -p opnsense-pair-integration");
Ok(())
}
async fn bootstrap_vm(role: &str, ip: &str) -> Result<(), Box<dyn std::error::Error>> {
info!("Bootstrapping {role} firewall at {ip}...");
let bootstrap = OPNsenseBootstrap::new(&format!("https://{ip}"));
bootstrap.login("root", "opnsense").await?;
bootstrap.abort_wizard().await?;
bootstrap.enable_ssh(true, true).await?;
bootstrap.set_webgui_port(API_PORT, ip, false).await?;
// Wait for webgui on new port
OPNsenseBootstrap::wait_for_ready(
&format!("https://{ip}:{API_PORT}"),
std::time::Duration::from_secs(120),
)
.await?;
// Verify SSH
for _ in 0..15 {
if check_tcp_port(ip, 22).await {
break;
}
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
if !check_tcp_port(ip, 22).await {
return Err(format!("SSH not reachable on {role} after bootstrap").into());
}
info!("{role} bootstrap complete");
Ok(())
}
/// Change the LAN interface IP via SSH (PHP edit of config.xml, then `configctl interface reconfigure lan`).
async fn change_lan_ip_via_ssh(
current_ip: &str,
new_ip: &str,
subnet: u8,
) -> Result<(), Box<dyn std::error::Error>> {
use opnsense_config::config::{OPNsenseShell, SshCredentials, SshOPNSenseShell};
let ssh_config = Arc::new(russh::client::Config {
inactivity_timeout: None,
..<_>::default()
});
let credentials = SshCredentials::Password {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let ip: IpAddr = current_ip.parse()?;
let shell = SshOPNSenseShell::new((ip, 22), credentials, ssh_config);
// Use a PHP script to update config.xml and apply
let php_script = format!(
r#"<?php
require_once '/usr/local/etc/inc/config.inc';
$config = OPNsense\Core\Config::getInstance();
$config->object()->interfaces->lan->ipaddr = '{new_ip}';
$config->object()->interfaces->lan->subnet = '{subnet}';
$config->save();
echo "OK\n";
"#
);
shell
.write_content_to_file(&php_script, "/tmp/change_ip.php")
.await?;
let output = shell
.exec("php /tmp/change_ip.php && rm /tmp/change_ip.php && configctl interface reconfigure lan")
.await?;
info!("IP change result: {}", output.trim());
Ok(())
}
// ── Phase 2: Pair integration test ─────────────────────────────────
async fn run_pair_test() -> Result<(), Box<dyn std::error::Error>> {
// Verify both VMs are reachable
info!("Checking primary at {PRIMARY_IP}:{API_PORT}...");
if !check_tcp_port(PRIMARY_IP, API_PORT).await {
return Err(format!("Primary not reachable at {PRIMARY_IP}:{API_PORT}").into());
}
info!("Checking backup at {BACKUP_IP}:{API_PORT}...");
if !check_tcp_port(BACKUP_IP, API_PORT).await {
return Err(format!("Backup not reachable at {BACKUP_IP}:{API_PORT}").into());
}
// Create API keys on both
info!("Creating API keys...");
let primary_ip: IpAddr = PRIMARY_IP.parse()?;
let backup_ip: IpAddr = BACKUP_IP.parse()?;
let (primary_key, primary_secret) = create_api_key_ssh(&primary_ip).await?;
let (backup_key, backup_secret) = create_api_key_ssh(&backup_ip).await?;
info!("API keys created for both firewalls");
// Build FirewallPairTopology
let primary_host = LogicalHost {
ip: primary_ip.into(),
name: VM_PRIMARY.to_string(),
};
let backup_host = LogicalHost {
ip: backup_ip.into(),
name: VM_BACKUP.to_string(),
};
let primary_api_creds = OPNSenseApiCredentials {
key: primary_key.clone(),
secret: primary_secret.clone(),
};
let backup_api_creds = OPNSenseApiCredentials {
key: backup_key.clone(),
secret: backup_secret.clone(),
};
let ssh_creds = OPNSenseFirewallCredentials {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let primary_fw = OPNSenseFirewall::with_api_port(
primary_host,
None,
API_PORT,
&primary_api_creds,
&ssh_creds,
)
.await;
let backup_fw =
OPNSenseFirewall::with_api_port(backup_host, None, API_PORT, &backup_api_creds, &ssh_creds)
.await;
let pair = FirewallPairTopology {
primary: primary_fw,
backup: backup_fw,
};
// Build pair scores
let carp_score = CarpVipScore {
vips: vec![VipDef {
mode: VipMode::Carp,
interface: "lan".to_string(),
subnet: CARP_VIP.to_string(),
subnet_bits: 24,
vhid: Some(1),
advbase: Some(1),
advskew: None, // handled by CarpVipScore (primary=0, backup=100)
password: Some(CARP_PASSWORD.to_string()),
peer: None,
}],
backup_advskew: Some(100),
};
let vlan_score = VlanScore {
vlans: vec![VlanDef {
parent_interface: "vtnet0".to_string(),
tag: 100,
description: "pair-test-vlan-100".to_string(),
}],
};
let fw_rule_score = FirewallRuleScore {
rules: vec![FilterRuleDef {
action: FirewallAction::Pass,
direction: Direction::In,
interface: "lan".to_string(),
ip_protocol: IpProtocol::Inet,
protocol: NetworkProtocol::Icmp,
source_net: "any".to_string(),
destination_net: "any".to_string(),
destination_port: None,
gateway: None,
description: "pair-test-allow-icmp".to_string(),
log: false,
}],
};
// Run pair scores
info!("Running pair scores...");
let scores: Vec<Box<dyn Score<FirewallPairTopology>>> = vec![
Box::new(carp_score),
Box::new(vlan_score),
Box::new(fw_rule_score),
];
let args = harmony_cli::Args {
yes: true,
filter: None,
interactive: false,
all: true,
number: 0,
list: false,
};
harmony_cli::run_cli(Inventory::autoload(), pair, scores, args).await?;
// Verify CARP VIPs via API
info!("Verifying CARP VIPs...");
let primary_client = opnsense_api::OpnsenseClient::builder()
.base_url(format!("https://{PRIMARY_IP}:{API_PORT}/api"))
.auth_from_key_secret(&primary_key, &primary_secret)
.skip_tls_verify()
.timeout_secs(60)
.build()?;
let backup_client = opnsense_api::OpnsenseClient::builder()
.base_url(format!("https://{BACKUP_IP}:{API_PORT}/api"))
.auth_from_key_secret(&backup_key, &backup_secret)
.skip_tls_verify()
.timeout_secs(60)
.build()?;
let primary_vips: serde_json::Value = primary_client
.get_typed("interfaces", "vip_settings", "searchItem")
.await?;
let backup_vips: serde_json::Value = backup_client
.get_typed("interfaces", "vip_settings", "searchItem")
.await?;
let primary_vip_count = primary_vips["rowCount"].as_i64().unwrap_or(0);
let backup_vip_count = backup_vips["rowCount"].as_i64().unwrap_or(0);
info!(" Primary VIPs: {primary_vip_count}");
info!(" Backup VIPs: {backup_vip_count}");
assert!(primary_vip_count >= 1, "Primary should have at least 1 VIP");
assert!(backup_vip_count >= 1, "Backup should have at least 1 VIP");
// Verify VLANs on both
let primary_vlans: serde_json::Value = primary_client
.get_typed("interfaces", "vlan_settings", "get")
.await?;
let backup_vlans: serde_json::Value = backup_client
.get_typed("interfaces", "vlan_settings", "get")
.await?;
let p_vlan_count = primary_vlans["vlan"]["vlan"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
let b_vlan_count = backup_vlans["vlan"]["vlan"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
info!(" Primary VLANs: {p_vlan_count}");
info!(" Backup VLANs: {b_vlan_count}");
assert!(p_vlan_count >= 1, "Primary should have at least 1 VLAN");
assert!(b_vlan_count >= 1, "Backup should have at least 1 VLAN");
println!();
println!("PASSED - OPNsense firewall pair integration test:");
println!(
" - CarpVipScore: CARP VIP {CARP_VIP} on both (primary advskew=0, backup advskew=100)"
);
println!(" - VlanScore: VLAN 100 on both");
println!(" - FirewallRuleScore: ICMP allow on both");
println!();
println!("VMs are running. Use --clean to tear down.");
Ok(())
}
// ── Helpers ────────────────────────────────────────────────────────
fn prepare_vm_disk(vm_name: &str, img_path: &Path) -> Result<(), Box<dyn std::error::Error>> {
let vm_raw = image_dir().join(format!("{vm_name}.img"));
if !vm_raw.exists() {
info!("Copying nano image for {vm_name}...");
std::fs::copy(img_path, &vm_raw)?;
info!("Injecting config.xml for {vm_name}...");
let config =
harmony::modules::opnsense::image::minimal_config_xml("vtnet1", "vtnet0", BOOT_IP, 24);
harmony::modules::opnsense::image::replace_config_xml(&vm_raw, &config)?;
}
let vm_disk = image_dir().join(format!("{vm_name}.qcow2"));
if !vm_disk.exists() {
info!("Converting {vm_name} to qcow2...");
run_cmd(
"qemu-img",
&[
"convert",
"-f",
"raw",
"-O",
"qcow2",
&vm_raw.to_string_lossy(),
&vm_disk.to_string_lossy(),
],
)?;
run_cmd("qemu-img", &["resize", &vm_disk.to_string_lossy(), "4G"])?;
}
Ok(())
}
fn check_prerequisites() -> Result<(), Box<dyn std::error::Error>> {
let mut ok = true;
for (cmd, test_args) in [
("virsh", vec!["-c", "qemu:///system", "version"]),
("qemu-img", vec!["--version"]),
("bunzip2", vec!["--help"]),
] {
match std::process::Command::new(cmd).args(&test_args).output() {
Ok(out) if out.status.success() => println!("[ok] {cmd}"),
_ => {
println!("[FAIL] {cmd}");
ok = false;
}
}
}
if !ok {
return Err("Prerequisites not met".into());
}
println!("All prerequisites met.");
Ok(())
}
fn run_cmd(cmd: &str, args: &[&str]) -> Result<(), Box<dyn std::error::Error>> {
let status = std::process::Command::new(cmd).args(args).status()?;
if !status.success() {
return Err(format!("{cmd} failed").into());
}
Ok(())
}
fn image_dir() -> PathBuf {
let dir = std::env::var("HARMONY_KVM_IMAGE_DIR").unwrap_or_else(|_| {
dirs::data_dir()
.unwrap_or_else(|| PathBuf::from("/tmp"))
.join("harmony")
.join("kvm")
.join("images")
.to_string_lossy()
.to_string()
});
PathBuf::from(dir)
}
async fn download_image() -> Result<PathBuf, Box<dyn std::error::Error>> {
let dir = image_dir();
std::fs::create_dir_all(&dir)?;
let img_path = dir.join(OPNSENSE_IMG_NAME);
if img_path.exists() {
info!("Image cached: {}", img_path.display());
return Ok(img_path);
}
let bz2_path = dir.join(format!("{OPNSENSE_IMG_NAME}.bz2"));
if !bz2_path.exists() {
info!("Downloading OPNsense nano image (~350MB)...");
let response = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(600))
.build()?
.get(OPNSENSE_IMG_URL)
.send()
.await?;
if !response.status().is_success() {
return Err(format!("Download failed: HTTP {}", response.status()).into());
}
let bytes = response.bytes().await?;
std::fs::write(&bz2_path, &bytes)?;
}
info!("Decompressing...");
run_cmd("bunzip2", &["--keep", &bz2_path.to_string_lossy()])?;
Ok(img_path)
}
async fn clean(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
info!("Cleaning up pair integration...");
for vm_name in [VM_PRIMARY, VM_BACKUP] {
let _ = executor.destroy_vm(vm_name).await;
let _ = executor.undefine_vm(vm_name).await;
for ext in ["img", "qcow2"] {
let path = image_dir().join(format!("{vm_name}.{ext}"));
if path.exists() {
std::fs::remove_file(&path)?;
info!("Removed: {}", path.display());
}
}
}
let _ = executor.delete_network(NET_LAN).await;
info!("Done.");
Ok(())
}
async fn status(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
for (vm_name, ip) in [(VM_PRIMARY, PRIMARY_IP), (VM_BACKUP, BACKUP_IP)] {
match executor.vm_status(vm_name).await {
Ok(s) => {
let api = check_tcp_port(ip, API_PORT).await;
let ssh = check_tcp_port(ip, 22).await;
println!("{vm_name}: {s:?}");
println!(" LAN IP: {ip}");
println!(
" API: {}",
if api { "responding" } else { "not responding" }
);
println!(
" SSH: {}",
if ssh { "responding" } else { "not responding" }
);
}
Err(_) => println!("{vm_name}: not found"),
}
}
Ok(())
}
async fn wait_for_https(ip: &str, port: u16) -> Result<(), Box<dyn std::error::Error>> {
let client = reqwest::Client::builder()
.danger_accept_invalid_certs(true)
.timeout(std::time::Duration::from_secs(5))
.build()?;
let url = format!("https://{ip}:{port}");
for i in 0..60 {
if client.get(&url).send().await.is_ok() {
info!("Web UI responding at {url} (attempt {i})");
return Ok(());
}
if i % 10 == 0 {
info!("Waiting for {url}... (attempt {i})");
}
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
}
Err(format!("{url} did not respond within 5 minutes").into())
}
async fn check_tcp_port(ip: &str, port: u16) -> bool {
tokio::time::timeout(
std::time::Duration::from_secs(3),
tokio::net::TcpStream::connect(format!("{ip}:{port}")),
)
.await
.map(|r| r.is_ok())
.unwrap_or(false)
}
async fn create_api_key_ssh(ip: &IpAddr) -> Result<(String, String), Box<dyn std::error::Error>> {
use opnsense_config::config::{OPNsenseShell, SshCredentials, SshOPNSenseShell};
let ssh_config = Arc::new(russh::client::Config {
inactivity_timeout: None,
..<_>::default()
});
let credentials = SshCredentials::Password {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let shell = SshOPNSenseShell::new((*ip, 22), credentials, ssh_config);
let php_script = r#"<?php
require_once '/usr/local/etc/inc/config.inc';
$key = bin2hex(random_bytes(20));
$secret = bin2hex(random_bytes(40));
$config = OPNsense\Core\Config::getInstance();
foreach ($config->object()->system->user as $user) {
if ((string)$user->name === 'root') {
if (!isset($user->apikeys)) { $user->addChild('apikeys'); }
$item = $user->apikeys->addChild('item');
$item->addChild('key', $key);
$item->addChild('secret', crypt($secret, '$6$' . bin2hex(random_bytes(8)) . '$'));
$config->save();
echo $key . "\n" . $secret . "\n";
exit(0);
}
}
echo "ERROR: root user not found\n";
exit(1);
"#;
shell
.write_content_to_file(php_script, "/tmp/create_api_key.php")
.await?;
let output = shell
.exec("php /tmp/create_api_key.php && rm /tmp/create_api_key.php")
.await?;
let lines: Vec<&str> = output.trim().lines().collect();
if lines.len() >= 2 && !lines[0].starts_with("ERROR") {
Ok((lines[0].to_string(), lines[1].to_string()))
} else {
Err(format!("API key creation failed: {output}").into())
}
}


@@ -0,0 +1,25 @@
[package]
name = "opnsense-vm-integration"
version.workspace = true
edition = "2024"
license.workspace = true
[[bin]]
name = "opnsense-vm-integration"
path = "src/main.rs"
[dependencies]
harmony = { path = "../../harmony" }
harmony_cli = { path = "../../harmony_cli" }
harmony_inventory_agent = { path = "../../harmony_inventory_agent" }
harmony_macros = { path = "../../harmony_macros" }
harmony_types = { path = "../../harmony_types" }
opnsense-api = { path = "../../opnsense-api" }
opnsense-config = { path = "../../opnsense-config" }
tokio.workspace = true
log.workspace = true
env_logger.workspace = true
reqwest.workspace = true
russh.workspace = true
serde_json.workspace = true
dirs = "6"


@@ -0,0 +1,151 @@
# OPNsense VM Integration Example
Fully automated end-to-end integration test: boots an OPNsense VM via KVM, bootstraps SSH and API access without any manual browser interaction, installs packages, and runs 11 Harmony Scores against it. CI-friendly.
## Quick start
```bash
# 1. One-time setup (libvirt, Docker compatibility)
./examples/opnsense_vm_integration/setup-libvirt.sh
# 2. Verify prerequisites
cargo run -p opnsense-vm-integration -- --check
# 3. Boot + bootstrap + integration test (fully unattended)
cargo run -p opnsense-vm-integration -- --full
# 4. Clean up
cargo run -p opnsense-vm-integration -- --clean
```
That's it. No browser clicks, no manual SSH setup, no wizard interaction.
## What happens during `--full`
1. Downloads OPNsense 26.1 nano image (~350MB, cached after first download)
2. Injects `config.xml` with virtio interface assignments (vtnet0=LAN, vtnet1=WAN)
3. Creates a 4 GiB qcow2 disk and boots via KVM (1 vCPU, 1GB RAM, 4 NICs)
4. Waits for web UI to respond (~20s)
5. **Automated bootstrap** via `OPNsenseBootstrap`:
- Logs in (root/opnsense) with CSRF token handling
- Aborts the initial setup wizard
- Enables SSH with root login and password auth
- Changes web GUI port to 9443 (avoids HAProxy conflicts)
- Restarts lighttpd via SSH to apply the port change
6. Creates OPNsense API key via SSH (PHP script)
7. Installs `os-haproxy` via firmware API
8. Runs 11 Scores configuring the entire firewall
9. Verifies all configurations via REST API assertions
## Step-by-step mode
If you prefer to separate boot and test:
```bash
# Boot + bootstrap (creates VM, enables SSH, sets port)
cargo run -p opnsense-vm-integration -- --boot
# Run integration test (assumes VM is bootstrapped)
cargo run -p opnsense-vm-integration
# Check VM status at any time
cargo run -p opnsense-vm-integration -- --status
```
## Prerequisites
### System requirements
- **Linux** with KVM support (Intel VT-x/AMD-V)
- **~10 GB** free disk space
- **~15 minutes** for first run (image download + firmware update)
- Subsequent runs: ~2 minutes
### Required packages
**Arch/Manjaro:**
```bash
sudo pacman -S libvirt qemu-full dnsmasq
```
**Fedora:**
```bash
sudo dnf install libvirt qemu-kvm dnsmasq
```
**Ubuntu/Debian:**
```bash
sudo apt install libvirt-daemon-system qemu-kvm dnsmasq
```
### Automated setup
```bash
./examples/opnsense_vm_integration/setup-libvirt.sh
```
This handles: user group membership, libvirtd startup, default storage pool, Docker FORWARD policy conflict.
After running setup, apply group membership:
```bash
newgrp libvirt
```
### Docker + libvirt compatibility
Docker sets the iptables FORWARD policy to DROP, which blocks libvirt's NAT networking. The setup script detects this and switches libvirt to the iptables firewall backend so both coexist.
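The detection boils down to a single grep on libvirt's `network.conf`. A minimal sketch of that check, run against a temporary copy so it does not touch the real config (the path and file contents here are illustrative, not your actual system state):

```bash
# Sketch of the backend check setup-libvirt.sh performs, against a temp file.
conf="$(mktemp)"
printf 'firewall_backend = "nftables"\n' > "$conf"

if grep -q '^firewall_backend.*iptables' "$conf"; then
  backend="iptables"
else
  # With this backend, Docker's FORWARD DROP would block libvirt NAT.
  backend="nftables-or-unset"
fi
echo "detected backend: $backend"
rm -f "$conf"
```

The real script then rewrites (or appends) the `firewall_backend` line with `sed`/`tee` and restarts `libvirtd`.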
## Scores applied
| # | Score | What it configures |
|---|-------|--------------------|
| 1 | `LoadBalancerScore` | HAProxy with 2 frontends, backends with TCP health checks |
| 2 | `DhcpScore` | DHCP range, 2 static host bindings, PXE boot options |
| 3 | `TftpScore` | TFTP server serving boot files |
| 4 | `NodeExporterScore` | Prometheus node exporter |
| 5 | `VlanScore` | 2 VLANs (tags 100, 200) on vtnet0 |
| 6 | `FirewallRuleScore` | Firewall filter rules with logging |
| 7 | `OutboundNatScore` | Source NAT for outbound traffic |
| 8 | `BinatScore` | Bidirectional 1:1 NAT |
| 9 | `VipScore` | Virtual IPs (IP aliases) |
| 10 | `DnatScore` | Port forwarding rules |
| 11 | `LaggScore` | Link aggregation (vtnet2+vtnet3) |
All Scores are idempotent: running them twice produces the same result.
## Network architecture
```
Host (192.168.1.10) --- virbr-opn bridge ----- OPNsense LAN (192.168.1.1) vtnet0
                        192.168.1.0/24
                        NAT to internet

                    --- virbr0 (default) ----- OPNsense WAN (DHCP)     vtnet1
                        192.168.122.0/24
                        NAT to internet
```
## Environment variables
| Variable | Default | Description |
|----------|---------|-------------|
| `RUST_LOG` | (unset) | Log level: `info`, `debug`, `trace` |
| `HARMONY_KVM_URI` | `qemu:///system` | Libvirt connection URI |
| `HARMONY_KVM_IMAGE_DIR` | `~/.local/share/harmony/kvm/images` | Cached disk images |
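For example, to point the image cache somewhere disposable for one session (the directory below is just an example):

```bash
# Cache images under /tmp for this shell session (example path).
export HARMONY_KVM_IMAGE_DIR="${TMPDIR:-/tmp}/harmony-kvm-images"
mkdir -p "$HARMONY_KVM_IMAGE_DIR"
echo "images cached in: $HARMONY_KVM_IMAGE_DIR"

# Subsequent invocations pick the directory up from the environment:
# cargo run -p opnsense-vm-integration -- --download
```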
## Troubleshooting
**VM won't start / permission denied**
Ensure your user is in the `libvirt` group and that the image directory is traversable by the qemu user. The setup script handles this.
**192.168.1.0/24 conflict**
If your host network already uses this subnet, the VM will be unreachable. Edit the constants in `src/main.rs` to use a different subnet.
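A quick way to check for an overlap before booting the VM (a sketch that only inspects the host routing table):

```bash
# Detect whether 192.168.1.0/24 is already routed on this host.
if ip -4 route show 2>/dev/null | grep -q '192\.168\.1\.'; then
  conflict="yes"
else
  conflict="no"
fi
echo "192.168.1.0/24 already in use: $conflict"
```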
**HAProxy install fails**
OPNsense may need a firmware update first. The integration test attempts this automatically. If it fails, connect to the web UI at https://192.168.1.1:9443 and update manually.
**Serial console access**
```bash
virsh -c qemu:///system console opn-integration
# Press Ctrl+] to exit
```


@@ -0,0 +1,140 @@
#!/bin/bash
set -euo pipefail
# Setup sudo-less libvirt access for KVM-based harmony examples.
#
# Run once on a fresh machine. After this, all KVM operations work
# without sudo — libvirt authenticates via group membership.
#
# Usage:
# ./setup-libvirt.sh # interactive, asks before each step
# ./setup-libvirt.sh --yes # non-interactive, runs everything
USER="${USER:-$(whoami)}"
AUTO_YES=false
[[ "${1:-}" == "--yes" ]] && AUTO_YES=true
green() { printf '\033[32m%s\033[0m\n' "$*"; }
red() { printf '\033[31m%s\033[0m\n' "$*"; }
bold() { printf '\033[1m%s\033[0m\n' "$*"; }
confirm() {
if $AUTO_YES; then return 0; fi
read -rp "$1 [Y/n] " answer
[[ -z "$answer" || "$answer" =~ ^[Yy] ]]
}
bold "Harmony KVM/libvirt setup"
echo
# ── Step 1: Install packages ────────────────────────────────────────────
echo "Checking required packages..."
MISSING=()
for pkg in qemu-full libvirt dnsmasq ebtables; do
if ! pacman -Qi "$pkg" &>/dev/null; then
MISSING+=("$pkg")
fi
done
if [[ ${#MISSING[@]} -gt 0 ]]; then
echo "Missing packages: ${MISSING[*]}"
if confirm "Install them?"; then
sudo pacman -S --needed "${MISSING[@]}"
else
red "Skipped package installation"
fi
else
green "[ok] All packages installed"
fi
# ── Step 2: Add user to libvirt group ────────────────────────────────────
if groups "$USER" 2>/dev/null | grep -qw libvirt; then
green "[ok] $USER is in libvirt group"
else
echo "$USER is NOT in the libvirt group"
if confirm "Add $USER to libvirt group?"; then
sudo usermod -aG libvirt "$USER"
green "[ok] Added $USER to libvirt group"
echo " Note: you need to log out and back in (or run 'newgrp libvirt') for this to take effect"
fi
fi
# ── Step 3: Start libvirtd ───────────────────────────────────────────────
if systemctl is-active --quiet libvirtd; then
green "[ok] libvirtd is running"
else
echo "libvirtd is not running"
if confirm "Enable and start libvirtd?"; then
sudo systemctl enable --now libvirtd
green "[ok] libvirtd started"
fi
fi
# ── Step 4: Default storage pool ─────────────────────────────────────────
if virsh -c qemu:///system pool-info default &>/dev/null; then
green "[ok] Default storage pool exists"
else
echo "Default storage pool does not exist"
if confirm "Create default storage pool at /var/lib/libvirt/images?"; then
sudo virsh pool-define-as default dir --target /var/lib/libvirt/images
sudo virsh pool-autostart default
sudo virsh pool-start default
green "[ok] Default storage pool created"
fi
fi
# ── Step 5: Fix Docker + libvirt FORWARD conflict ────────────────────────
# Docker sets iptables FORWARD policy to DROP, which blocks libvirt NAT.
# Libvirt defaults to the nftables backend, whose rules do not override Docker's iptables FORWARD DROP.
# Fix: switch libvirt to iptables backend so rules coexist with Docker.
if docker info &>/dev/null; then
echo "Docker detected."
NETCONF="/etc/libvirt/network.conf"
if grep -q '^firewall_backend' "$NETCONF" 2>/dev/null; then
CURRENT=$(grep '^firewall_backend' "$NETCONF" | head -1)
if echo "$CURRENT" | grep -q 'iptables'; then
green "[ok] libvirt firewall_backend is already iptables"
else
echo "libvirt firewall_backend is: $CURRENT"
echo "Docker's iptables FORWARD DROP will block libvirt NAT."
if confirm "Switch libvirt to iptables backend?"; then
sudo sed -i 's/^firewall_backend.*/firewall_backend = "iptables"/' "$NETCONF"
echo "Restarting libvirtd to apply..."
sudo systemctl restart libvirtd
green "[ok] Switched to iptables backend"
fi
fi
else
echo "libvirt uses nftables (default), but Docker's iptables FORWARD DROP blocks NAT."
if confirm "Set libvirt to use iptables backend (recommended with Docker)?"; then
echo 'firewall_backend = "iptables"' | sudo tee -a "$NETCONF" >/dev/null
echo "Restarting libvirtd to apply..."
sudo systemctl restart libvirtd
# Re-activate networks so they get iptables rules
for net in $(virsh -c qemu:///system net-list --name 2>/dev/null); do
# `|| true` keeps `set -e` from aborting when a network is already stopped.
virsh -c qemu:///system net-destroy "$net" 2>/dev/null || true
virsh -c qemu:///system net-start "$net" 2>/dev/null || true
done
green "[ok] Switched to iptables backend and restarted networks"
fi
fi
else
green "[ok] Docker not detected, no FORWARD conflict"
fi
# ── Done ─────────────────────────────────────────────────────────────────
echo
bold "Setup complete."
echo
echo "If you were added to the libvirt group, apply it now:"
echo " newgrp libvirt"
echo
echo "Then verify:"
echo " cargo run -p opnsense-vm-integration -- --check"


@@ -0,0 +1,937 @@
//! OPNsense VM integration example.
//!
//! Fully unattended workflow — no manual browser interaction required:
//!
//! 1. `--boot` — creates a KVM VM, waits for web UI, bootstraps SSH + webgui port
//! 2. (default run) — creates API key via SSH, installs packages, runs Scores
//! 3. `--full` — does both in a single invocation (CI-friendly)
//!
//! # Usage
//!
//! ```bash
//! cargo run -p opnsense-vm-integration -- --check # verify prerequisites
//! cargo run -p opnsense-vm-integration -- --download # download OPNsense image
//! cargo run -p opnsense-vm-integration -- --boot # create VM + automated bootstrap
//! cargo run -p opnsense-vm-integration # run integration test
//! cargo run -p opnsense-vm-integration -- --full # boot + bootstrap + test (CI mode)
//! cargo run -p opnsense-vm-integration -- --status # check VM state
//! cargo run -p opnsense-vm-integration -- --clean # tear down everything
//! ```
use std::net::IpAddr;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use harmony::config::secret::{OPNSenseApiCredentials, OPNSenseFirewallCredentials};
use harmony::hardware::{HostCategory, PhysicalHost};
use harmony::infra::opnsense::OPNSenseFirewall;
use harmony::inventory::Inventory;
use harmony::modules::dhcp::DhcpScore;
use harmony::modules::kvm::config::init_executor;
use harmony::modules::kvm::{
BootDevice, ForwardMode, KvmExecutor, NetworkConfig, NetworkRef, VmConfig,
};
use harmony::modules::load_balancer::LoadBalancerScore;
use harmony::modules::opnsense::bootstrap::OPNsenseBootstrap;
use harmony::modules::opnsense::dnat::{DnatRuleDef, DnatScore};
use harmony::modules::opnsense::firewall::{
BinatRuleDef, BinatScore, FilterRuleDef, FirewallRuleScore, OutboundNatScore, SnatRuleDef,
};
use harmony::modules::opnsense::lagg::{LaggDef, LaggScore};
use harmony::modules::opnsense::node_exporter::NodeExporterScore;
use harmony::modules::opnsense::vip::{VipDef, VipScore};
use harmony::modules::opnsense::vlan::{VlanDef, VlanScore};
use harmony::modules::tftp::TftpScore;
use harmony::score::Score;
use harmony::topology::{
BackendServer, HealthCheck, HostBinding, HostConfig, LoadBalancerService, LogicalHost,
};
use harmony_inventory_agent::hwinfo::NetworkInterface;
use harmony_macros::ip;
use harmony_types::firewall::{
Direction, FirewallAction, IpProtocol, LaggProtocol, NetworkProtocol, VipMode,
};
use harmony_types::id::Id;
use harmony_types::net::{MacAddress, Url};
use log::{info, warn};
const OPNSENSE_IMG_URL: &str =
"https://mirror.ams1.nl.leaseweb.net/opnsense/releases/26.1/OPNsense-26.1-nano-amd64.img.bz2";
const OPNSENSE_IMG_NAME: &str = "OPNsense-26.1-nano-amd64.img";
const VM_NAME: &str = "opn-integration";
const NET_NAME: &str = "opn-test";
// OPNsense nano defaults LAN to 192.168.1.1/24.
// The libvirt network uses the same subnet so the host can reach the VM.
const HOST_IP: &str = "192.168.1.10";
const OPN_LAN_IP: &str = "192.168.1.1";
/// Web GUI/API port — moved from 443 to avoid HAProxy conflicts.
/// Applied automatically during bootstrap (System > Settings > Administration > TCP Port).
const OPN_API_PORT: u16 = 9443;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
harmony_cli::cli_logger::init();
let args: Vec<String> = std::env::args().collect();
if args.iter().any(|a| a == "--setup") {
print_setup();
return Ok(());
}
if args.iter().any(|a| a == "--check") {
return check_prerequisites();
}
if args.iter().any(|a| a == "--download") {
download_image().await?;
return Ok(());
}
let executor = init_executor()?;
if args.iter().any(|a| a == "--clean") {
return clean(&executor).await;
}
if args.iter().any(|a| a == "--status") {
return status(&executor).await;
}
if args.iter().any(|a| a == "--boot") {
let img_path = download_image().await?;
return boot_vm(&executor, &img_path).await;
}
if args.iter().any(|a| a == "--full") {
// CI mode: boot + bootstrap + integration test in one shot
let img_path = download_image().await?;
boot_vm(&executor, &img_path).await?;
return run_integration().await;
}
// Default: run the integration test (assumes VM is booted + bootstrapped)
check_prerequisites()?;
run_integration().await
}
// ── Phase 1: Boot VM ────────────────────────────────────────────────────
async fn boot_vm(
executor: &KvmExecutor,
img_path: &Path,
) -> Result<(), Box<dyn std::error::Error>> {
info!("Creating network and OPNsense VM...");
let network = NetworkConfig::builder(NET_NAME)
.bridge("virbr-opn")
.subnet(HOST_IP, 24)
.forward(ForwardMode::Nat)
.build();
executor.ensure_network(network).await?;
// Copy and convert the nano image
let vm_raw = image_dir().join(format!("{VM_NAME}-boot.img"));
if !vm_raw.exists() {
info!("Copying nano image...");
std::fs::copy(img_path, &vm_raw)?;
// Inject config.xml with virtio interface names
info!("Injecting config.xml for virtio NICs...");
let config = harmony::modules::opnsense::image::minimal_config_xml(
"vtnet1", "vtnet0", OPN_LAN_IP, 24,
);
harmony::modules::opnsense::image::replace_config_xml(&vm_raw, &config)?;
}
let vm_disk = image_dir().join(format!("{VM_NAME}-boot.qcow2"));
if !vm_disk.exists() {
info!("Converting to qcow2...");
run_cmd(
"qemu-img",
&[
"convert",
"-f",
"raw",
"-O",
"qcow2",
&vm_raw.to_string_lossy(),
&vm_disk.to_string_lossy(),
],
)?;
run_cmd("qemu-img", &["resize", &vm_disk.to_string_lossy(), "4G"])?;
}
let vm = VmConfig::builder(VM_NAME)
.vcpus(1)
.memory_mib(1024)
.disk_from_path(vm_disk.to_string_lossy().to_string())
.network(NetworkRef::named(NET_NAME)) // vtnet0 = LAN
.network(NetworkRef::named("default")) // vtnet1 = WAN
.network(NetworkRef::named(NET_NAME)) // vtnet2 = LAGG member 1
.network(NetworkRef::named(NET_NAME)) // vtnet3 = LAGG member 2
.boot_order([BootDevice::Disk])
.build();
executor.ensure_vm(vm).await?;
executor.start_vm(VM_NAME).await?;
info!("VM started. Waiting for web UI at https://{OPN_LAN_IP} ...");
wait_for_https(OPN_LAN_IP, 443).await?;
// ── Automated bootstrap (replaces manual browser interaction) ───
info!("Bootstrapping OPNsense: login, abort wizard, enable SSH, set webgui port...");
let bootstrap = OPNsenseBootstrap::new(&format!("https://{OPN_LAN_IP}"));
bootstrap.login("root", "opnsense").await?;
bootstrap.abort_wizard().await?;
bootstrap.enable_ssh(true, true).await?;
bootstrap
.set_webgui_port(OPN_API_PORT, OPN_LAN_IP, false)
.await?;
// Wait for the web UI to come back on the new port
info!("Waiting for web UI on new port {OPN_API_PORT}...");
if let Err(e) = OPNsenseBootstrap::wait_for_ready(
&format!("https://{OPN_LAN_IP}:{OPN_API_PORT}"),
std::time::Duration::from_secs(120),
)
.await
{
warn!("Web UI did not come up on port {OPN_API_PORT}: {e}");
info!("Running diagnostics via SSH...");
match OPNsenseBootstrap::diagnose_via_ssh(OPN_LAN_IP).await {
Ok(report) => {
info!("Diagnostic report:\n{}", report);
}
Err(diag_err) => warn!("Diagnostics failed: {diag_err}"),
}
return Err(e.into());
}
// Verify SSH is reachable
info!("Verifying SSH is reachable...");
for _ in 0..30 {
if check_tcp_port(OPN_LAN_IP, 22).await {
break;
}
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
if !check_tcp_port(OPN_LAN_IP, 22).await {
return Err("SSH did not become reachable after bootstrap".into());
}
println!();
println!("OPNsense VM is running and fully bootstrapped:");
println!(" Web UI: https://{OPN_LAN_IP}:{OPN_API_PORT}");
println!(" SSH: root@{OPN_LAN_IP} (password: opnsense)");
println!(" Login: root / opnsense");
println!();
println!("Run the integration test:");
println!(" cargo run -p opnsense-vm-integration");
println!();
println!("Or use --full to boot + test in one shot (CI mode):");
println!(" cargo run -p opnsense-vm-integration -- --full");
Ok(())
}
// ── Phase 2: Integration test ───────────────────────────────────────────
async fn run_integration() -> Result<(), Box<dyn std::error::Error>> {
let vm_ip: IpAddr = OPN_LAN_IP.parse().unwrap();
// Verify SSH is reachable (bootstrap should have enabled it)
info!("Checking SSH at {OPN_LAN_IP}:22...");
if !check_tcp_port(OPN_LAN_IP, 22).await {
eprintln!("SSH is not reachable at {OPN_LAN_IP}:22");
eprintln!("Run '--boot' first (it will automatically enable SSH).");
return Err("SSH not available".into());
}
info!("SSH is reachable");
// Create API key
info!("Creating API key via SSH...");
let (api_key, api_secret) = create_api_key_ssh(&vm_ip).await?;
info!("API key created: {}...", &api_key[..api_key.len().min(12)]);
// Build topology
let firewall_host = LogicalHost {
ip: vm_ip.into(),
name: VM_NAME.to_string(),
};
let api_creds = OPNSenseApiCredentials {
key: api_key.clone(),
secret: api_secret.clone(),
};
let ssh_creds = OPNSenseFirewallCredentials {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let opnsense =
OPNSenseFirewall::with_api_port(firewall_host, None, OPN_API_PORT, &api_creds, &ssh_creds)
.await;
// Install packages
let config = opnsense.get_opnsense_config();
if !config.is_package_installed("os-haproxy").await {
info!("Installing os-haproxy (may need firmware update first)...");
match config.install_package("os-haproxy").await {
Ok(()) => info!("os-haproxy installed"),
Err(e) => {
warn!("os-haproxy install failed: {e}");
info!("Attempting firmware update...");
// Trigger firmware update then retry
let _: serde_json::Value = config
.client()
.post_typed("core", "firmware", "update", None::<&()>)
.await
.map_err(|e| format!("firmware update failed: {e}"))?;
// Poll for completion
for _ in 0..120 {
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
let status: serde_json::Value = match config
.client()
.get_typed("core", "firmware", "upgradestatus")
.await
{
Ok(s) => s,
Err(_) => continue, // VM may be rebooting
};
if status["status"].as_str() == Some("done")
|| status["status"].as_str() == Some("reboot")
{
break;
}
}
info!("Firmware updated, retrying package install...");
// Wait for the API to come back if a reboot was needed.
// The web GUI was moved to OPN_API_PORT during bootstrap, so poll that port.
wait_for_https(OPN_LAN_IP, OPN_API_PORT).await?;
config.install_package("os-haproxy").await?;
}
}
} else {
info!("os-haproxy already installed");
}
// ── Build all Scores ──────────────────────────────────────────────
// 1. LoadBalancerScore — HAProxy with 2 frontends
let lb_score = LoadBalancerScore {
public_services: vec![
LoadBalancerService {
listening_port: format!("{OPN_LAN_IP}:16443").parse()?,
backend_servers: vec![
BackendServer {
address: "10.50.0.10".into(),
port: 6443,
},
BackendServer {
address: "10.50.0.11".into(),
port: 6443,
},
BackendServer {
address: "10.50.0.12".into(),
port: 6443,
},
],
health_check: Some(HealthCheck::TCP(None)),
},
LoadBalancerService {
listening_port: format!("{OPN_LAN_IP}:18443").parse()?,
backend_servers: vec![
BackendServer {
address: "10.50.0.10".into(),
port: 443,
},
BackendServer {
address: "10.50.0.11".into(),
port: 443,
},
],
health_check: Some(HealthCheck::TCP(None)),
},
],
private_services: vec![],
wan_firewall_ports: vec![],
};
// 2. DhcpScore — DHCP range + 2 static host bindings
let dhcp_score = DhcpScore::new(
vec![
make_host_binding(
"node1",
ip!("192.168.1.50"),
[0x52, 0x54, 0x00, 0xAA, 0xBB, 0x01],
),
make_host_binding(
"node2",
ip!("192.168.1.51"),
[0x52, 0x54, 0x00, 0xAA, 0xBB, 0x02],
),
],
None, // next_server
None, // boot_filename
None, // filename (BIOS)
None, // filename64 (EFI)
None, // filenameipxe
(ip!("192.168.1.100"), ip!("192.168.1.200")), // dhcp_range
Some("test.local".to_string()), // domain
);
// 3. TftpScore — install os-tftp, configure, serve a dummy file
let tftp_dir = std::env::temp_dir().join("harmony-tftp-test");
std::fs::create_dir_all(&tftp_dir)?;
std::fs::write(tftp_dir.join("test.txt"), "harmony integration test\n")?;
let tftp_score = TftpScore::new(Url::LocalFolder(tftp_dir.to_string_lossy().to_string()));
// 4. NodeExporterScore — install + enable Prometheus node exporter
let node_exporter_score = NodeExporterScore {};
// 5. VlanScore — create test VLANs on vtnet0
let vlan_score = VlanScore {
vlans: vec![
VlanDef {
parent_interface: "vtnet0".to_string(),
tag: 100,
description: "test-vlan-100".to_string(),
},
VlanDef {
parent_interface: "vtnet0".to_string(),
tag: 200,
description: "test-vlan-200".to_string(),
},
],
};
// 6. FirewallRuleScore — create test filter rules
let fw_rule_score = FirewallRuleScore {
rules: vec![FilterRuleDef {
action: FirewallAction::Pass,
direction: Direction::In,
interface: "lan".to_string(),
ip_protocol: IpProtocol::Inet,
protocol: NetworkProtocol::Tcp,
source_net: "any".to_string(),
destination_net: "any".to_string(),
destination_port: Some("8080".to_string()),
gateway: None,
description: "harmony-test-allow-8080".to_string(),
log: true,
}],
};
// 7. OutboundNatScore — create test SNAT rule
let snat_score = OutboundNatScore {
rules: vec![SnatRuleDef {
interface: "wan".to_string(),
ip_protocol: IpProtocol::Inet,
protocol: NetworkProtocol::Any,
source_net: "192.168.1.0/24".to_string(),
destination_net: "any".to_string(),
target: "wanip".to_string(),
description: "harmony-test-snat-lan".to_string(),
log: false,
nonat: false,
}],
};
// 8. BinatScore — create test 1:1 NAT rule
let binat_score = BinatScore {
rules: vec![BinatRuleDef {
interface: "wan".to_string(),
source_net: "192.168.1.50".to_string(),
external: "10.0.0.50".to_string(),
description: "harmony-test-binat".to_string(),
log: false,
}],
};
// 9. VipScore — IP alias on LAN
let vip_score = VipScore {
vips: vec![VipDef {
mode: VipMode::IpAlias,
interface: "lan".to_string(),
subnet: "192.168.1.250".to_string(),
subnet_bits: 32,
vhid: None,
advbase: None,
advskew: None,
password: None,
peer: None,
}],
};
// 10. DnatScore — port forward 8443 → 192.168.1.50:443
let dnat_score = DnatScore {
rules: vec![DnatRuleDef {
interface: "wan".to_string(),
ip_protocol: IpProtocol::Inet,
protocol: NetworkProtocol::Tcp,
destination: "wanip".to_string(),
destination_port: "8443".to_string(),
target: "192.168.1.50".to_string(),
local_port: Some("443".to_string()),
description: "harmony-test-dnat-8443".to_string(),
log: false,
register_rule: true,
}],
};
// 11. LaggScore — failover LAGG with vtnet2 + vtnet3
let lagg_score = LaggScore {
laggs: vec![LaggDef {
members: vec!["vtnet2".to_string(), "vtnet3".to_string()],
protocol: LaggProtocol::Failover,
description: "harmony-test-lagg".to_string(),
mtu: None,
lacp_fast_timeout: false,
}],
};
// ── Run all Scores ──────────────────────────────────────────────
info!("Running all Scores...");
let scores: Vec<Box<dyn Score<OPNSenseFirewall>>> = vec![
Box::new(lb_score),
Box::new(dhcp_score),
Box::new(tftp_score),
Box::new(node_exporter_score),
Box::new(vlan_score),
Box::new(fw_rule_score),
Box::new(snat_score),
Box::new(binat_score),
Box::new(vip_score),
Box::new(dnat_score),
Box::new(lagg_score),
];
let args = harmony_cli::Args {
yes: true,
filter: None,
interactive: false,
all: true,
number: 0,
list: false,
};
harmony_cli::run_cli(Inventory::autoload(), opnsense, scores, args).await?;
// ── Verify via API ──────────────────────────────────────────────
info!("Verifying all Scores via API...");
let client = opnsense_api::OpnsenseClient::builder()
.base_url(format!("https://{OPN_LAN_IP}:{OPN_API_PORT}/api"))
.auth_from_key_secret(&api_key, &api_secret)
.skip_tls_verify()
.timeout_secs(60)
.build()?;
// Verify HAProxy
let haproxy: serde_json::Value = client.get_typed("haproxy", "settings", "get").await?;
let frontends = haproxy["haproxy"]["frontends"]["frontend"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
info!(" HAProxy frontends: {frontends}");
assert!(
frontends >= 2,
"Expected at least 2 HAProxy frontends, got {frontends}"
);
// Verify DHCP (dnsmasq hosts)
let dnsmasq: serde_json::Value = client.get_typed("dnsmasq", "settings", "get").await?;
let hosts = dnsmasq["dnsmasq"]["hosts"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
info!(" Dnsmasq hosts: {hosts}");
assert!(hosts >= 2, "Expected at least 2 dnsmasq hosts, got {hosts}");
// Verify DHCP range
let ranges = dnsmasq["dnsmasq"]["dhcp_ranges"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
info!(" Dnsmasq DHCP ranges: {ranges}");
assert!(ranges >= 1, "Expected at least 1 DHCP range, got {ranges}");
// Verify TFTP
let tftp: serde_json::Value = client.get_typed("tftp", "general", "get").await?;
let tftp_enabled = tftp["general"]["enabled"].as_str() == Some("1");
info!(" TFTP enabled: {tftp_enabled}");
assert!(tftp_enabled, "TFTP should be enabled");
// Verify Node Exporter
let ne: serde_json::Value = client.get_typed("nodeexporter", "general", "get").await?;
let ne_enabled = ne["general"]["enabled"].as_str() == Some("1");
info!(" Node Exporter enabled: {ne_enabled}");
assert!(ne_enabled, "Node Exporter should be enabled");
// Verify VLANs
let vlans: serde_json::Value = client
.get_typed("interfaces", "vlan_settings", "get")
.await?;
let vlan_count = vlans["vlan"]["vlan"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
info!(" VLANs: {vlan_count}");
assert!(
vlan_count >= 2,
"Expected at least 2 VLANs, got {vlan_count}"
);
// Verify firewall rules (search endpoint returns rows)
let fw_rules: serde_json::Value = client.get_typed("firewall", "filter", "searchRule").await?;
let fw_count = fw_rules["rowCount"].as_i64().unwrap_or(0);
info!(" Firewall rules: {fw_count}");
assert!(
fw_count >= 1,
"Expected at least 1 firewall rule, got {fw_count}"
);
// Verify VIPs
let vips: serde_json::Value = client
.get_typed("interfaces", "vip_settings", "searchItem")
.await?;
let vip_count = vips["rowCount"].as_i64().unwrap_or(0);
info!(" VIPs: {vip_count}");
assert!(vip_count >= 1, "Expected at least 1 VIP, got {vip_count}");
// Verify DNat rules
let dnat_rules: serde_json::Value = client.get_typed("firewall", "d_nat", "searchRule").await?;
let dnat_count = dnat_rules["rowCount"].as_i64().unwrap_or(0);
info!(" DNat rules: {dnat_count}");
assert!(
dnat_count >= 1,
"Expected at least 1 DNat rule, got {dnat_count}"
);
// Verify LAGGs
let laggs: serde_json::Value = client
.get_typed("interfaces", "lagg_settings", "get")
.await?;
let lagg_count = laggs["lagg"]["lagg"]
.as_object()
.map(|m| m.len())
.unwrap_or(0);
info!(" LAGGs: {lagg_count}");
assert!(
lagg_count >= 1,
"Expected at least 1 LAGG, got {lagg_count}"
);
// Clean up temp files
let _ = std::fs::remove_dir_all(&tftp_dir);
println!();
println!("PASSED — All OPNsense integration tests successful:");
println!(" - LoadBalancerScore: {frontends} HAProxy frontends configured");
println!(" - DhcpScore: {hosts} static hosts, {ranges} DHCP range(s)");
println!(" - TftpScore: TFTP server enabled");
println!(" - NodeExporterScore: Node Exporter enabled");
println!(" - VlanScore: {vlan_count} VLANs configured");
println!(" - FirewallRuleScore: {fw_count} filter rules");
println!(" - OutboundNatScore: SNAT rule configured");
println!(" - BinatScore: 1:1 NAT rule configured");
println!(" - VipScore: {vip_count} VIPs configured");
println!(" - DnatScore: {dnat_count} DNat rules");
println!(" - LaggScore: {lagg_count} LAGGs configured");
println!();
println!("VM is running at {OPN_LAN_IP}. Use --clean to tear down.");
Ok(())
}
// ── Helpers ─────────────────────────────────────────────────────────────
fn print_setup() {
println!("Run the setup script for sudo-less libvirt access:");
println!(" ./examples/opnsense_vm_integration/setup-libvirt.sh");
println!();
println!("Verify with:");
println!(" cargo run -p opnsense-vm-integration -- --check");
}
fn check_prerequisites() -> Result<(), Box<dyn std::error::Error>> {
let mut ok = true;
let libvirtd = std::process::Command::new("systemctl")
.args(["is-active", "libvirtd"])
.output();
match libvirtd {
Ok(out) if out.status.success() => println!("[ok] libvirtd is running"),
_ => {
println!("[FAIL] libvirtd is not running");
ok = false;
}
}
let virsh = std::process::Command::new("virsh")
.args(["-c", "qemu:///system", "version"])
.output();
match virsh {
Ok(out) if out.status.success() => {
let v = String::from_utf8_lossy(&out.stdout);
println!("[ok] virsh connects: {}", v.lines().next().unwrap_or("?"));
}
_ => {
println!("[FAIL] Cannot connect to qemu:///system");
ok = false;
}
}
let pool = std::process::Command::new("virsh")
.args(["-c", "qemu:///system", "pool-info", "default"])
.output();
match pool {
Ok(out) if out.status.success() => println!("[ok] Default storage pool exists"),
_ => {
println!("[FAIL] Default storage pool not found");
ok = false;
}
}
if which("bunzip2") {
println!("[ok] bunzip2 available");
} else {
println!("[FAIL] bunzip2 not found");
ok = false;
}
if which("qemu-img") {
println!("[ok] qemu-img available");
} else {
println!("[FAIL] qemu-img not found");
ok = false;
}
// Check Docker + libvirt FORWARD conflict
if which("docker") {
let fw_backend = std::fs::read_to_string("/etc/libvirt/network.conf").unwrap_or_default();
if fw_backend
.lines()
.any(|l| l.trim().starts_with("firewall_backend") && l.contains("iptables"))
{
println!("[ok] libvirt uses iptables backend (Docker compatible)");
} else {
println!("[WARN] Docker detected but libvirt uses nftables backend");
println!(" VM NAT may not work. Run setup-libvirt.sh to fix.");
}
}
if !ok {
println!("\nRun --setup for setup instructions.");
return Err("Prerequisites not met".into());
}
println!("\nAll prerequisites met.");
Ok(())
}
fn which(cmd: &str) -> bool {
std::process::Command::new("which")
.arg(cmd)
.output()
.map(|o| o.status.success())
.unwrap_or(false)
}
fn run_cmd(cmd: &str, args: &[&str]) -> Result<(), Box<dyn std::error::Error>> {
let status = std::process::Command::new(cmd).args(args).status()?;
if !status.success() {
return Err(format!("{cmd} failed").into());
}
Ok(())
}
fn image_dir() -> PathBuf {
let dir = std::env::var("HARMONY_KVM_IMAGE_DIR").unwrap_or_else(|_| {
dirs::data_dir()
.unwrap_or_else(|| PathBuf::from("/tmp"))
.join("harmony")
.join("kvm")
.join("images")
.to_string_lossy()
.to_string()
});
PathBuf::from(dir)
}
/// FIXME this should be using the harmony-asset crate
async fn download_image() -> Result<PathBuf, Box<dyn std::error::Error>> {
let dir = image_dir();
std::fs::create_dir_all(&dir)?;
let img_path = dir.join(OPNSENSE_IMG_NAME);
if img_path.exists() {
info!("Image cached: {}", img_path.display());
return Ok(img_path);
}
let bz2_path = dir.join(format!("{OPNSENSE_IMG_NAME}.bz2"));
if !bz2_path.exists() {
info!("Downloading OPNsense nano image (~350MB)...");
let response = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(600))
.build()?
.get(OPNSENSE_IMG_URL)
.send()
.await?;
if !response.status().is_success() {
return Err(format!("Download failed: HTTP {}", response.status()).into());
}
let bytes = response.bytes().await?;
std::fs::write(&bz2_path, &bytes)?;
info!("Downloaded {} bytes", bytes.len());
}
info!("Decompressing...");
run_cmd("bunzip2", &["--keep", &bz2_path.to_string_lossy()])?;
info!("Image ready: {}", img_path.display());
Ok(img_path)
}
async fn clean(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
info!("Cleaning up...");
let _ = executor.destroy_vm(VM_NAME).await;
let _ = executor.undefine_vm(VM_NAME).await;
let _ = executor.delete_network(NET_NAME).await;
for ext in ["img", "qcow2"] {
let path = image_dir().join(format!("{VM_NAME}-boot.{ext}"));
if path.exists() {
std::fs::remove_file(&path)?;
info!("Removed: {}", path.display());
}
}
info!("Done. (Original image cached at {})", image_dir().display());
Ok(())
}
async fn status(executor: &KvmExecutor) -> Result<(), Box<dyn std::error::Error>> {
match executor.vm_status(VM_NAME).await {
Ok(s) => {
println!("{VM_NAME}: {s:?}");
if let Ok(Some(ip)) = executor.vm_ip(VM_NAME).await {
println!(" WAN IP: {ip}");
}
println!(" LAN IP: {OPN_LAN_IP} (static)");
let https_default = check_tcp_port(OPN_LAN_IP, 443).await;
let https_custom = check_tcp_port(OPN_LAN_IP, OPN_API_PORT).await;
let ssh = check_tcp_port(OPN_LAN_IP, 22).await;
if https_custom {
println!(" API: responding on port {OPN_API_PORT}");
} else if https_default {
println!(" API: responding on port 443 (change to {OPN_API_PORT} in web UI)");
} else {
println!(" API: not responding");
}
println!(
" SSH: {}",
if ssh { "responding" } else { "not responding" }
);
}
Err(_) => println!("{VM_NAME}: not found (run --boot first)"),
}
Ok(())
}
async fn wait_for_https(ip: &str, port: u16) -> Result<(), Box<dyn std::error::Error>> {
let client = reqwest::Client::builder()
.danger_accept_invalid_certs(true)
.timeout(std::time::Duration::from_secs(5))
.build()?;
let url = format!("https://{ip}:{port}");
for i in 0..60 {
if client.get(&url).send().await.is_ok() {
info!("Web UI responding (attempt {i})");
return Ok(());
}
if i % 10 == 0 {
info!("Waiting for OPNsense... (attempt {i})");
}
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
}
Err("OPNsense web UI did not respond within 5 minutes".into())
}
async fn check_tcp_port(ip: &str, port: u16) -> bool {
tokio::time::timeout(
std::time::Duration::from_secs(3),
tokio::net::TcpStream::connect(format!("{ip}:{port}")),
)
.await
.map(|r| r.is_ok())
.unwrap_or(false)
}
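A synchronous, stdlib-only analogue of this probe can be written with `TcpStream::connect_timeout`; `tcp_port_open` is an illustrative helper name, not part of this crate:

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::Duration;

/// Blocking analogue of `check_tcp_port`: true if a TCP connect to `addr`
/// succeeds within `timeout`.
fn tcp_port_open(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

fn main() {
    // Bind an ephemeral local listener so the probe has something to hit.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    assert!(tcp_port_open(addr, Duration::from_secs(3)));
    // A closed port would typically fail fast with ECONNREFUSED.
}
```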
/// Build a HostBinding from a name, IP, and MAC bytes for use with DhcpScore.
fn make_host_binding(name: &str, ip: IpAddr, mac: [u8; 6]) -> HostBinding {
let logical = LogicalHost {
ip,
name: name.to_string(),
};
let physical = PhysicalHost {
id: Id::from(name.to_string()),
category: HostCategory::Server,
network: vec![NetworkInterface {
name: "eth0".to_string(),
mac_address: MacAddress(mac),
speed_mbps: None,
is_up: true,
mtu: 1500,
ipv4_addresses: vec![ip.to_string()],
ipv6_addresses: vec![],
driver: String::new(),
firmware_version: None,
}],
storage: vec![],
labels: vec![],
memory_modules: vec![],
cpus: vec![],
};
HostBinding::new(logical, physical, HostConfig::new(None))
}
async fn create_api_key_ssh(ip: &IpAddr) -> Result<(String, String), Box<dyn std::error::Error>> {
use opnsense_config::config::{OPNsenseShell, SshCredentials, SshOPNSenseShell};
let ssh_config = Arc::new(russh::client::Config {
inactivity_timeout: None,
..<_>::default()
});
let credentials = SshCredentials::Password {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let shell = SshOPNSenseShell::new((*ip, 22), credentials, ssh_config);
let php_script = r#"<?php
require_once '/usr/local/etc/inc/config.inc';
$key = bin2hex(random_bytes(20));
$secret = bin2hex(random_bytes(40));
$config = OPNsense\Core\Config::getInstance();
foreach ($config->object()->system->user as $user) {
if ((string)$user->name === 'root') {
if (!isset($user->apikeys)) { $user->addChild('apikeys'); }
$item = $user->apikeys->addChild('item');
$item->addChild('key', $key);
$item->addChild('secret', crypt($secret, '$6$' . bin2hex(random_bytes(8)) . '$'));
$config->save();
echo $key . "\n" . $secret . "\n";
exit(0);
}
}
echo "ERROR: root user not found\n";
exit(1);
"#;
info!("Writing API key script...");
shell
.write_content_to_file(php_script, "/tmp/create_api_key.php")
.await?;
info!("Executing API key generation...");
let output = shell
.exec("php /tmp/create_api_key.php && rm /tmp/create_api_key.php")
.await?;
let lines: Vec<&str> = output.trim().lines().collect();
if lines.len() >= 2 && !lines[0].starts_with("ERROR") {
Ok((lines[0].to_string(), lines[1].to_string()))
} else {
Err(format!("API key creation failed: {output}").into())
}
}


@@ -32,17 +32,21 @@ pub async fn get_topology() -> HAClusterTopology {
     let switch_client = Arc::new(switch_client);

-    let config = SecretManager::get_or_prompt::<OPNSenseFirewallConfig>().await;
-    let config = config.unwrap();
+    let config = SecretManager::get_or_prompt::<OPNSenseFirewallConfig>()
+        .await
+        .unwrap();
+    let api_creds = harmony::config::secret::OPNSenseApiCredentials {
+        key: config.username.clone(),
+        secret: config.password.clone(),
+    };
+    let ssh_creds = harmony::config::secret::OPNSenseFirewallCredentials {
+        username: config.username,
+        password: config.password,
+    };

     let opnsense = Arc::new(
-        harmony::infra::opnsense::OPNSenseFirewall::new(
-            firewall,
-            None,
-            &config.username,
-            &config.password,
-        )
-        .await,
+        harmony::infra::opnsense::OPNSenseFirewall::new(firewall, None, &api_creds, &ssh_creds)
+            .await,
     );

     let lan_subnet = ipv4!("192.168.40.0");
     let gateway_ipv4 = ipv4!("192.168.40.1");


@@ -69,5 +69,6 @@ fn build_large_score() -> LoadBalancerScore {
             lb_service.clone(),
             lb_service.clone(),
         ],
+        wan_firewall_ports: vec![],
     }
 }


@@ -7,6 +7,7 @@ async fn main() {
     let zitadel = ZitadelScore {
         host: "sso.sto1.nationtech.io".to_string(),
         zitadel_version: "v4.12.1".to_string(),
+        external_secure: true,
     };

     harmony_cli::run(

@@ -4,7 +4,7 @@ use kube::config::{KubeConfigOptions, Kubeconfig};
 use kube::{Client, Config, Discovery, Error};
 use log::error;
 use serde::Serialize;
-use tokio::sync::OnceCell;
+use tokio::sync::{OnceCell, RwLock};

 use crate::types::KubernetesDistribution;
@@ -23,7 +23,9 @@ pub struct K8sClient {
     /// to stdout instead. Initialised from the `DRY_RUN` environment variable.
     pub(crate) dry_run: bool,
     pub(crate) k8s_distribution: Arc<OnceCell<KubernetesDistribution>>,
-    pub(crate) discovery: Arc<OnceCell<Discovery>>,
+    /// API discovery cache. Wrapped in `RwLock` so it can be invalidated
+    /// after installing CRDs or operators that register new API groups.
+    pub(crate) discovery: Arc<RwLock<Option<Arc<Discovery>>>>,
 }

 impl Serialize for K8sClient {
@@ -52,7 +54,7 @@ impl K8sClient {
             dry_run: read_dry_run_from_env(),
             client,
             k8s_distribution: Arc::new(OnceCell::new()),
-            discovery: Arc::new(OnceCell::new()),
+            discovery: Arc::new(RwLock::new(None)),
         }
     }


@@ -1,3 +1,4 @@
+use std::sync::Arc;
 use std::time::Duration;

 use kube::{Discovery, Error};
@@ -15,38 +16,55 @@ impl K8sClient {
         self.client.clone().apiserver_version().await
     }

-    /// Runs (and caches) Kubernetes API discovery with exponential-backoff retries.
-    pub async fn discovery(&self) -> Result<&Discovery, Error> {
+    /// Runs API discovery, caching the result. Call [`invalidate_discovery`]
+    /// after installing CRDs or operators to force a refresh on the next call.
+    pub async fn discovery(&self) -> Result<Arc<Discovery>, Error> {
+        // Fast path: return cached discovery
+        {
+            let guard = self.discovery.read().await;
+            if let Some(d) = guard.as_ref() {
+                return Ok(Arc::clone(d));
+            }
+        }
+
+        // Slow path: run discovery with retries
         let retry_strategy = ExponentialBackoff::from_millis(1000)
             .max_delay(Duration::from_secs(32))
            .take(6);
         let attempt = Mutex::new(0u32);

-        Retry::spawn(retry_strategy, || async {
+        let d = Retry::spawn(retry_strategy, || async {
             let mut n = attempt.lock().await;
             *n += 1;
-            match self
-                .discovery
-                .get_or_try_init(async || {
-                    debug!("Running Kubernetes API discovery (attempt {})", *n);
-                    let d = Discovery::new(self.client.clone()).run().await?;
-                    debug!("Kubernetes API discovery completed");
-                    Ok(d)
-                })
+            debug!("Running Kubernetes API discovery (attempt {})", *n);
+            Discovery::new(self.client.clone())
+                .run()
                 .await
-            {
-                Ok(d) => Ok(d),
-                Err(e) => {
+                .map_err(|e| {
                     warn!("Kubernetes API discovery failed (attempt {}): {}", *n, e);
-                    Err(e)
-                }
-            }
+                    e
+                })
         })
         .await
         .map_err(|e| {
             error!("Kubernetes API discovery failed after all retries: {}", e);
             e
-        })
+        })?;
+
+        debug!("Kubernetes API discovery completed");
+        let d = Arc::new(d);
+        let mut guard = self.discovery.write().await;
+        *guard = Some(Arc::clone(&d));
+        Ok(d)
     }

+    /// Clears the cached API discovery so the next call to [`discovery`]
+    /// re-fetches from the API server. Call this after installing CRDs or
+    /// operators that register new API groups.
+    pub async fn invalidate_discovery(&self) {
+        let mut guard = self.discovery.write().await;
+        *guard = None;
+        debug!("API discovery cache invalidated");
+    }
+
     /// Detect which Kubernetes distribution is running. Result is cached for
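The fast-path/slow-path caching with invalidation used here can be sketched synchronously with `std::sync::RwLock`; the `Cache` type and its `compute` closure below are illustrative stand-ins for the client and the discovery run, not crate APIs:

```rust
use std::sync::{Arc, RwLock};

struct Cache {
    slot: RwLock<Option<Arc<String>>>,
}

impl Cache {
    fn get_or_init(&self, compute: impl FnOnce() -> String) -> Arc<String> {
        // Fast path: read lock only.
        if let Some(v) = self.slot.read().unwrap().as_ref() {
            return Arc::clone(v);
        }
        // Slow path: compute, then publish under the write lock.
        let v = Arc::new(compute());
        *self.slot.write().unwrap() = Some(Arc::clone(&v));
        v
    }

    fn invalidate(&self) {
        *self.slot.write().unwrap() = None;
    }
}

fn main() {
    let cache = Cache { slot: RwLock::new(None) };
    let a = cache.get_or_init(|| "v1".to_string());
    let b = cache.get_or_init(|| unreachable!("served from cache"));
    assert!(Arc::ptr_eq(&a, &b));
    cache.invalidate();
    assert_eq!(*cache.get_or_init(|| "v2".to_string()), "v2");
}
```

As in the async version, two callers racing past the fast path may both compute; the last write wins, which is benign for an idempotent discovery run.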
/// Detect which Kubernetes distribution is running. Result is cached for


@@ -6,8 +6,10 @@ pub mod discovery;
 pub mod helper;
 pub mod node;
 pub mod pod;
+pub mod port_forward;
 pub mod resources;
 pub mod types;

 pub use client::K8sClient;
+pub use port_forward::PortForwardHandle;
 pub use types::{DrainOptions, KubernetesDistribution, NodeFile, ScopeResolver, WriteMode};


@@ -190,4 +190,77 @@ impl K8sClient {
}
}
}
/// Execute a command in a specific pod by name, capturing stdout.
///
/// Returns the captured stdout on success. On failure, the error string
/// includes stderr output from the remote command.
pub async fn exec_pod_capture_output(
&self,
pod_name: &str,
namespace: Option<&str>,
command: Vec<&str>,
) -> Result<String, String> {
let api: Api<Pod> = match namespace {
Some(ns) => Api::namespaced(self.client.clone(), ns),
None => Api::default_namespaced(self.client.clone()),
};
match api
.exec(
pod_name,
command,
&AttachParams::default().stdout(true).stderr(true),
)
.await
{
Err(e) => Err(e.to_string()),
Ok(mut process) => {
let status = process
.take_status()
.expect("No status handle")
.await
.expect("Status channel closed");
let mut stdout_buf = String::new();
if let Some(mut stdout) = process.stdout() {
stdout
.read_to_string(&mut stdout_buf)
.await
.map_err(|e| format!("Failed to read stdout: {e}"))?;
}
let mut stderr_buf = String::new();
if let Some(mut stderr) = process.stderr() {
stderr
.read_to_string(&mut stderr_buf)
.await
.map_err(|e| format!("Failed to read stderr: {e}"))?;
}
if let Some(s) = status.status {
debug!("exec_pod status: {} - {:?}", s, status.details);
if s == "Success" {
Ok(stdout_buf)
} else {
Err(stderr_buf)
}
} else {
Err("No inner status from pod exec".to_string())
}
}
}
}
/// Execute a command in a specific pod by name (no output capture).
pub async fn exec_pod(
&self,
pod_name: &str,
namespace: Option<&str>,
command: Vec<&str>,
) -> Result<(), String> {
self.exec_pod_capture_output(pod_name, namespace, command)
.await
.map(|_| ())
}
}
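The Ok(stdout)/Err(stderr) contract of `exec_pod_capture_output` can be mirrored locally with `std::process::Command`; `run_capture` is an illustrative helper (assumes a POSIX `sh` is available), not part of this crate:

```rust
use std::process::Command;

/// Run a command, returning captured stdout on success or captured stderr
/// on failure, mirroring `exec_pod_capture_output`'s contract.
fn run_capture(program: &str, args: &[&str]) -> Result<String, String> {
    let out = Command::new(program)
        .args(args)
        .output()
        .map_err(|e| e.to_string())?;
    if out.status.success() {
        Ok(String::from_utf8_lossy(&out.stdout).into_owned())
    } else {
        Err(String::from_utf8_lossy(&out.stderr).into_owned())
    }
}

fn main() {
    let ok = run_capture("sh", &["-c", "echo hello"]).unwrap();
    assert_eq!(ok.trim(), "hello");
    let err = run_capture("sh", &["-c", "echo oops >&2; exit 1"]).unwrap_err();
    assert_eq!(err.trim(), "oops");
}
```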


@@ -0,0 +1,133 @@
use std::net::SocketAddr;
use k8s_openapi::api::core::v1::Pod;
use kube::{Api, Error, error::DiscoveryError};
use log::{debug, error, info};
use tokio::net::TcpListener;
use crate::client::K8sClient;
/// Handle to a running port-forward. The forward is stopped when the handle is
/// dropped (or when [`abort`](Self::abort) is called explicitly).
pub struct PortForwardHandle {
local_addr: SocketAddr,
abort_handle: tokio::task::AbortHandle,
}
impl PortForwardHandle {
/// The local address the listener is bound to.
pub fn local_addr(&self) -> SocketAddr {
self.local_addr
}
/// The local port (convenience for `local_addr().port()`).
pub fn port(&self) -> u16 {
self.local_addr.port()
}
/// Stop the port-forward and close the listener.
pub fn abort(&self) {
self.abort_handle.abort();
}
}
impl Drop for PortForwardHandle {
fn drop(&mut self) {
self.abort_handle.abort();
}
}
impl K8sClient {
/// Forward a pod port to a local TCP listener.
///
/// Binds `127.0.0.1:{local_port}` (pass 0 to let the OS pick a free port)
/// and proxies every incoming TCP connection to the pod's `remote_port`
/// through the Kubernetes API server's portforward subresource (WebSocket).
///
/// Returns a [`PortForwardHandle`] whose [`port()`](PortForwardHandle::port)
/// gives the actual bound port. The forward runs in a background task and
/// is automatically stopped when the handle is dropped.
pub async fn port_forward(
&self,
pod_name: &str,
namespace: &str,
local_port: u16,
remote_port: u16,
) -> Result<PortForwardHandle, Error> {
let listener = TcpListener::bind(SocketAddr::from(([127, 0, 0, 1], local_port)))
.await
.map_err(|e| {
Error::Discovery(DiscoveryError::MissingResource(format!(
"Failed to bind 127.0.0.1:{local_port}: {e}"
)))
})?;
let local_addr = listener.local_addr().map_err(|e| {
Error::Discovery(DiscoveryError::MissingResource(format!(
"Failed to get local address: {e}"
)))
})?;
info!(
"Port-forward {} -> {}/{}:{}",
local_addr, namespace, pod_name, remote_port
);
let client = self.client.clone();
let ns = namespace.to_string();
let pod = pod_name.to_string();
let task = tokio::spawn(async move {
let api: Api<Pod> = Api::namespaced(client, &ns);
loop {
let (mut tcp_stream, peer) = match listener.accept().await {
Ok(conn) => conn,
Err(e) => {
debug!("Port-forward listener accept error: {e}");
break;
}
};
debug!("Port-forward connection from {peer}");
let api = api.clone();
let pod = pod.clone();
tokio::spawn(async move {
let mut pf = match api.portforward(&pod, &[remote_port]).await {
Ok(pf) => pf,
Err(e) => {
error!("Port-forward WebSocket setup failed: {e}");
return;
}
};
let mut kube_stream = match pf.take_stream(remote_port) {
Some(s) => s,
None => {
error!("Port-forward: no stream for port {remote_port}");
return;
}
};
match tokio::io::copy_bidirectional(&mut tcp_stream, &mut kube_stream).await {
Ok((from_client, from_pod)) => {
debug!(
"Port-forward connection closed ({from_client} bytes sent, {from_pod} bytes received)"
);
}
Err(e) => {
debug!("Port-forward copy error: {e}");
}
}
drop(pf);
});
}
});
Ok(PortForwardHandle {
local_addr,
abort_handle: task.abort_handle(),
})
}
}
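The accept-loop-plus-bidirectional-copy shape above can be sketched with std threads and plain TCP (no kube/tokio); `proxy_to` is an illustrative stand-in where a local echo server plays the role of the pod:

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Minimal threaded analogue of the accept loop: every connection to the
/// returned local address is proxied to `backend`.
fn proxy_to(backend: SocketAddr) -> SocketAddr {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let local = listener.local_addr().unwrap();
    thread::spawn(move || {
        for client in listener.incoming().flatten() {
            let upstream = match TcpStream::connect(backend) {
                Ok(s) => s,
                Err(_) => continue,
            };
            // One thread per direction stands in for copy_bidirectional.
            let (mut c_r, mut c_w) = (client.try_clone().unwrap(), client);
            let (mut u_r, mut u_w) = (upstream.try_clone().unwrap(), upstream);
            thread::spawn(move || { let _ = std::io::copy(&mut c_r, &mut u_w); });
            thread::spawn(move || { let _ = std::io::copy(&mut u_r, &mut c_w); });
        }
    });
    local
}

fn main() {
    // Toy "pod": echoes four bytes back once.
    let backend = TcpListener::bind("127.0.0.1:0").unwrap();
    let backend_addr = backend.local_addr().unwrap();
    thread::spawn(move || {
        let (mut s, _) = backend.accept().unwrap();
        let mut buf = [0u8; 4];
        s.read_exact(&mut buf).unwrap();
        s.write_all(&buf).unwrap();
    });
    let local = proxy_to(backend_addr);
    let mut conn = TcpStream::connect(local).unwrap();
    conn.write_all(b"ping").unwrap();
    let mut buf = [0u8; 4];
    conn.read_exact(&mut buf).unwrap();
    assert_eq!(&buf, b"ping");
}
```

The real implementation differs mainly in transport: each connection gets its own WebSocket stream from `api.portforward`, and dropping the handle aborts the whole loop.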


@@ -151,6 +151,28 @@ impl K8sClient {
Ok(!crds.items.is_empty())
}
/// Polls until a CRD is registered in the API server.
pub async fn wait_for_crd(&self, name: &str, timeout: Option<Duration>) -> Result<(), Error> {
let timeout = timeout.unwrap_or(Duration::from_secs(60));
let start = std::time::Instant::now();
let poll = Duration::from_secs(2);
loop {
if self.has_crd(name).await? {
return Ok(());
}
if start.elapsed() > timeout {
return Err(Error::Discovery(
kube::error::DiscoveryError::MissingResource(format!(
"CRD '{name}' not registered within {}s",
timeout.as_secs()
)),
));
}
tokio::time::sleep(poll).await;
}
}
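The poll-until-timeout loop in `wait_for_crd` generalizes to any boolean condition; a synchronous sketch (`wait_until` is an illustrative helper, not a crate API):

```rust
use std::time::{Duration, Instant};

/// Poll `check` every `poll` until it returns true or `timeout` elapses,
/// mirroring the wait_for_crd loop.
fn wait_until(
    mut check: impl FnMut() -> bool,
    timeout: Duration,
    poll: Duration,
) -> Result<(), String> {
    let start = Instant::now();
    loop {
        if check() {
            return Ok(());
        }
        if start.elapsed() > timeout {
            return Err(format!("condition not met within {}s", timeout.as_secs()));
        }
        std::thread::sleep(poll);
    }
}

fn main() {
    let mut calls = 0;
    let ok = wait_until(
        || { calls += 1; calls >= 3 },
        Duration::from_secs(1),
        Duration::from_millis(10),
    );
    assert!(ok.is_ok());
    assert_eq!(calls, 3);
    let timed_out = wait_until(|| false, Duration::from_millis(30), Duration::from_millis(10));
    assert!(timed_out.is_err());
}
```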
pub async fn service_account_api(&self, namespace: &str) -> Api<ServiceAccount> {
Api::namespaced(self.client.clone(), namespace)
}
@@ -270,6 +292,23 @@ impl K8sClient {
api.get_opt(name).await
}
/// Deletes a single named resource. Returns `Ok(())` on success or if the
/// resource was already absent (idempotent).
pub async fn delete_resource<K>(&self, name: &str, namespace: Option<&str>) -> Result<(), Error>
where
K: Resource + Clone + std::fmt::Debug + DeserializeOwned,
<K as Resource>::Scope: ScopeResolver<K>,
<K as Resource>::DynamicType: Default,
{
let api: Api<K> =
<<K as Resource>::Scope as ScopeResolver<K>>::get_api(&self.client, namespace);
match api.delete(name, &kube::api::DeleteParams::default()).await {
Ok(_) => Ok(()),
Err(Error::Api(ErrorResponse { code: 404, .. })) => Ok(()),
Err(e) => Err(e),
}
}
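The idempotency trick here — treating "already absent" as success — is easy to isolate; a toy error type stands in for kube's `Error::Api(ErrorResponse { code: 404, .. })`:

```rust
/// Toy API error standing in for kube's ErrorResponse codes.
#[derive(Debug, PartialEq)]
enum ApiError {
    NotFound,  // HTTP 404
    Forbidden, // HTTP 403
}

/// Idempotent delete: "already gone" is success, everything else propagates,
/// mirroring the 404 handling in `delete_resource`.
fn delete_idempotent(result: Result<(), ApiError>) -> Result<(), ApiError> {
    match result {
        Ok(()) => Ok(()),
        Err(ApiError::NotFound) => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    assert_eq!(delete_idempotent(Ok(())), Ok(()));
    assert_eq!(delete_idempotent(Err(ApiError::NotFound)), Ok(()));
    assert_eq!(delete_idempotent(Err(ApiError::Forbidden)), Err(ApiError::Forbidden));
}
```

This makes `delete_resource` safe to call from retry loops and repeated reconciliation passes.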
pub async fn list_resources<K>(
&self,
namespace: Option<&str>,


@@ -12,6 +12,7 @@ testing = []
 hex = "0.4"
 reqwest = { version = "0.11", features = [
     "blocking",
+    "cookies",
     "json",
     "rustls-tls",
 ], default-features = false }
@@ -28,6 +29,7 @@ log.workspace = true
 env_logger.workspace = true
 async-trait.workspace = true
 cidr.workspace = true
+opnsense-api = { path = "../opnsense-api" }
 opnsense-config = { path = "../opnsense-config" }
 opnsense-config-xml = { path = "../opnsense-config-xml" }
 harmony_macros = { path = "../harmony_macros" }
@@ -89,3 +91,4 @@ virt = "0.4.3"
 [dev-dependencies]
 pretty_assertions.workspace = true
 assertor.workspace = true
+httptest = "0.16"


@@ -8,6 +8,12 @@ pub struct OPNSenseFirewallCredentials {
     pub password: String,
 }

+#[derive(Secret, Serialize, Deserialize, JsonSchema, Debug, PartialEq)]
+pub struct OPNSenseApiCredentials {
+    pub key: String,
+    pub secret: String,
+}
+// TODO we need a better way to handle multiple "instances" of the same secret structure.
 #[derive(Secret, Serialize, Deserialize, JsonSchema, Debug, PartialEq)]
 pub struct SshKeyPair {

@@ -2,6 +2,7 @@ use harmony_types::id::Id;
 use std::collections::BTreeMap;

 use async_trait::async_trait;
+use log::info;
 use serde::Serialize;
 use serde_value::Value;
@@ -12,6 +13,18 @@ use super::{
     topology::Topology,
 };

+/// Format a duration in a human-readable way.
+fn format_duration(d: std::time::Duration) -> String {
+    let secs = d.as_secs();
+    if secs < 60 {
+        format!("{:.1}s", d.as_secs_f64())
+    } else if secs < 3600 {
+        format!("{}m {}s", secs / 60, secs % 60)
+    } else {
+        format!("{}h {}m {}s", secs / 3600, (secs % 3600) / 60, secs % 60)
+    }
+}
+
 #[async_trait]
 pub trait Score<T: Topology>:
     std::fmt::Debug + ScoreToString<T> + Send + Sync + CloneBoxScore<T> + SerializeScore<T>
@@ -23,22 +36,47 @@ pub trait Score<T: Topology>:
     ) -> Result<Outcome, InterpretError> {
         let id = Id::default();
         let interpret = self.create_interpret();
+        let score_name = self.name();
+        let interpret_name = interpret.get_name().to_string();

         instrumentation::instrument(HarmonyEvent::InterpretExecutionStarted {
             execution_id: id.clone().to_string(),
             topology: topology.name().into(),
-            interpret: interpret.get_name().to_string(),
-            score: self.name(),
-            message: format!("{} running...", interpret.get_name()),
+            interpret: interpret_name.clone(),
+            score: score_name.clone(),
+            message: format!("{} running...", interpret_name),
         })
         .unwrap();

+        let start = std::time::Instant::now();
         let result = interpret.execute(inventory, topology).await;
+        let elapsed = start.elapsed();
+
+        match &result {
+            Ok(outcome) => {
+                info!(
+                    "[{}] {} in {} — {}",
+                    score_name,
+                    outcome.status,
+                    format_duration(elapsed),
+                    outcome.message
+                );
+            }
+            Err(e) => {
+                info!(
+                    "[{}] FAILED after {} — {}",
+                    score_name,
+                    format_duration(elapsed),
+                    e
+                );
+            }
+        }
+
         instrumentation::instrument(HarmonyEvent::InterpretExecutionFinished {
             execution_id: id.clone().to_string(),
             topology: topology.name().into(),
-            interpret: interpret.get_name().to_string(),
-            score: self.name(),
+            interpret: interpret_name,
+            score: score_name,
             outcome: result.clone(),
         })
         .unwrap();
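The `format_duration` helper introduced in this hunk is self-contained; copied standalone it behaves as follows:

```rust
/// Standalone copy of the `format_duration` helper from this change,
/// reproduced here for quick verification.
fn format_duration(d: std::time::Duration) -> String {
    let secs = d.as_secs();
    if secs < 60 {
        format!("{:.1}s", d.as_secs_f64())
    } else if secs < 3600 {
        format!("{}m {}s", secs / 60, secs % 60)
    } else {
        format!("{}h {}m {}s", secs / 3600, (secs % 3600) / 60, secs % 60)
    }
}

fn main() {
    use std::time::Duration;
    assert_eq!(format_duration(Duration::from_millis(2500)), "2.5s");
    assert_eq!(format_duration(Duration::from_secs(125)), "2m 5s");
    assert_eq!(format_duration(Duration::from_secs(3725)), "1h 2m 5s");
}
```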


@@ -0,0 +1,844 @@
//! Higher-order topology for managing an OPNsense firewall HA pair.
//!
//! Wraps a primary and backup `OPNSenseFirewall` instance. Most scores are
//! applied identically to both; CARP VIPs get differentiated advskew values
//! (primary=0, backup=configurable) to establish correct failover priority.
//!
//! See ROADMAP/10-firewall-pair-topology.md for future work (generic trait,
//! delegation macro, XMLRPC sync, integration tests).
//! See ROADMAP/11-named-config-instances.md for per-device credential support.
use std::net::IpAddr;
use std::str::FromStr;
use async_trait::async_trait;
use harmony_types::firewall::VipMode;
use harmony_types::id::Id;
use harmony_types::net::{IpAddress, MacAddress};
use log::info;
use serde::Serialize;
use crate::config::secret::{OPNSenseApiCredentials, OPNSenseFirewallCredentials};
use crate::data::Version;
use crate::executors::ExecutorError;
use crate::infra::opnsense::OPNSenseFirewall;
use crate::interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome};
use crate::inventory::Inventory;
use crate::modules::opnsense::dnat::DnatScore;
use crate::modules::opnsense::firewall::{BinatScore, FirewallRuleScore, OutboundNatScore};
use crate::modules::opnsense::lagg::LaggScore;
use crate::modules::opnsense::vip::VipDef;
use crate::modules::opnsense::vlan::VlanScore;
use crate::score::Score;
use crate::topology::{
DHCPStaticEntry, DhcpServer, LogicalHost, PreparationError, PreparationOutcome, PxeOptions,
Topology,
};
use harmony_secret::SecretManager;
// ── FirewallPairTopology ───────────────────────────────────────────
/// An OPNsense HA firewall pair managed via CARP.
///
/// Configuration is applied independently to both firewalls (not via XMLRPC
/// sync), since some settings like CARP advskew intentionally differ between
/// primary and backup.
#[derive(Debug, Clone)]
pub struct FirewallPairTopology {
pub primary: OPNSenseFirewall,
pub backup: OPNSenseFirewall,
}
impl FirewallPairTopology {
/// Construct a firewall pair from the harmony config system.
///
/// Reads the following environment variables:
/// - `OPNSENSE_PRIMARY_IP` — IP address of the primary firewall
/// - `OPNSENSE_BACKUP_IP` — IP address of the backup firewall
/// - `OPNSENSE_API_PORT` — API/web GUI port (default: 443)
///
/// Credentials are loaded via `SecretManager::get_or_prompt`.
pub async fn opnsense_from_config() -> Self {
// TODO: both firewalls share the same credentials. Once named config
// instances are available (ROADMAP/11), use per-device credentials:
// ConfigManager::get_named::<OPNSenseApiCredentials>("fw-primary")
let ssh_creds = SecretManager::get_or_prompt::<OPNSenseFirewallCredentials>()
.await
.expect("Failed to get SSH credentials");
let api_creds = SecretManager::get_or_prompt::<OPNSenseApiCredentials>()
.await
.expect("Failed to get API credentials");
Self::opnsense_with_credentials(&ssh_creds, &api_creds, &ssh_creds, &api_creds).await
}
pub async fn opnsense_with_credentials(
primary_ssh_creds: &OPNSenseFirewallCredentials,
primary_api_creds: &OPNSenseApiCredentials,
backup_ssh_creds: &OPNSenseFirewallCredentials,
backup_api_creds: &OPNSenseApiCredentials,
) -> Self {
let primary_ip =
std::env::var("OPNSENSE_PRIMARY_IP").expect("OPNSENSE_PRIMARY_IP must be set");
let backup_ip =
std::env::var("OPNSENSE_BACKUP_IP").expect("OPNSENSE_BACKUP_IP must be set");
let api_port: u16 = std::env::var("OPNSENSE_API_PORT")
.ok()
.map(|p| {
p.parse()
.expect("OPNSENSE_API_PORT must be a valid port number")
})
.unwrap_or(443);
let primary_host = LogicalHost {
ip: IpAddr::from_str(&primary_ip).expect("OPNSENSE_PRIMARY_IP must be a valid IP"),
name: "fw-primary".to_string(),
};
let backup_host = LogicalHost {
ip: IpAddr::from_str(&backup_ip).expect("OPNSENSE_BACKUP_IP must be a valid IP"),
name: "fw-backup".to_string(),
};
info!("Connecting to primary firewall at {primary_ip}:{api_port}");
let primary = OPNSenseFirewall::with_api_port(
primary_host,
None,
api_port,
primary_api_creds,
primary_ssh_creds,
)
.await;
info!("Connecting to backup firewall at {backup_ip}:{api_port}");
let backup = OPNSenseFirewall::with_api_port(
backup_host,
None,
api_port,
backup_api_creds,
backup_ssh_creds,
)
.await;
Self { primary, backup }
}
}
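The `OPNSENSE_API_PORT` handling above — optional value, validated parse, default of 443 — can be isolated for testing; `parse_api_port` is an illustrative helper, not part of the crate:

```rust
/// Mirror of the pair topology's OPNSENSE_API_PORT handling: an optional
/// env value, parsed strictly, defaulting to 443.
fn parse_api_port(raw: Option<String>) -> u16 {
    raw.map(|p| {
        p.parse()
            .expect("OPNSENSE_API_PORT must be a valid port number")
    })
    .unwrap_or(443)
}

fn main() {
    // In practice std::env::var("OPNSENSE_API_PORT").ok() feeds this.
    assert_eq!(parse_api_port(None), 443);
    assert_eq!(parse_api_port(Some("8443".to_string())), 8443);
}
```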
#[async_trait]
impl Topology for FirewallPairTopology {
fn name(&self) -> &str {
"FirewallPairTopology"
}
async fn ensure_ready(&self) -> Result<PreparationOutcome, PreparationError> {
let primary_outcome = self.primary.ensure_ready().await?;
let backup_outcome = self.backup.ensure_ready().await?;
match (primary_outcome, backup_outcome) {
(PreparationOutcome::Noop, PreparationOutcome::Noop) => Ok(PreparationOutcome::Noop),
(p, b) => {
let mut details = Vec::new();
if let PreparationOutcome::Success { details: d } = p {
details.push(format!("Primary: {}", d));
}
if let PreparationOutcome::Success { details: d } = b {
details.push(format!("Backup: {}", d));
}
Ok(PreparationOutcome::Success {
details: details.join(", "),
})
}
}
}
}
// ── DhcpServer delegation ──────────────────────────────────────────
//
// Required so that DhcpScore (which uses `impl<T: Topology + DhcpServer> Score<T>`)
// automatically works with FirewallPairTopology.
#[async_trait]
impl DhcpServer for FirewallPairTopology {
async fn commit_config(&self) -> Result<(), ExecutorError> {
self.primary.commit_config().await?;
self.backup.commit_config().await
}
async fn add_static_mapping(&self, entry: &DHCPStaticEntry) -> Result<(), ExecutorError> {
self.primary.add_static_mapping(entry).await?;
self.backup.add_static_mapping(entry).await
}
async fn remove_static_mapping(&self, mac: &MacAddress) -> Result<(), ExecutorError> {
self.primary.remove_static_mapping(mac).await?;
self.backup.remove_static_mapping(mac).await
}
async fn list_static_mappings(&self) -> Vec<(MacAddress, IpAddress)> {
// Return primary's view — both should be identical
self.primary.list_static_mappings().await
}
/// Returns the primary firewall's IP. In a CARP setup, callers
/// typically want the CARP VIP instead — use the VIP address directly.
fn get_ip(&self) -> IpAddress {
self.primary.get_ip()
}
/// Returns the primary firewall's host. See `get_ip()` note.
fn get_host(&self) -> LogicalHost {
self.primary.get_host()
}
async fn set_pxe_options(&self, options: PxeOptions) -> Result<(), ExecutorError> {
// PXE options are the same on both; construct a second copy for backup
let backup_options = PxeOptions {
ipxe_filename: options.ipxe_filename.clone(),
bios_filename: options.bios_filename.clone(),
efi_filename: options.efi_filename.clone(),
tftp_ip: options.tftp_ip,
};
self.primary.set_pxe_options(options).await?;
self.backup.set_pxe_options(backup_options).await
}
async fn set_dhcp_range(
&self,
start: &IpAddress,
end: &IpAddress,
) -> Result<(), ExecutorError> {
self.primary.set_dhcp_range(start, end).await?;
self.backup.set_dhcp_range(start, end).await
}
}
// ── Helper for uniform score delegation ────────────────────────────
/// Standard boilerplate for Interpret methods on pair scores.
macro_rules! pair_interpret_boilerplate {
($name:expr) => {
fn get_name(&self) -> InterpretName {
InterpretName::Custom($name)
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
};
}
// ── LaggScore for FirewallPairTopology ──────────────────────────────
impl Score<FirewallPairTopology> for LaggScore {
fn name(&self) -> String {
"LaggScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(LaggPairInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct LaggPairInterpret {
score: LaggScore,
}
#[async_trait]
impl Interpret<FirewallPairTopology> for LaggPairInterpret {
async fn execute(
&self,
inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let inner = self.score.create_interpret();
info!("Applying LaggScore to primary firewall");
inner.execute(inventory, &topology.primary).await?;
info!("Applying LaggScore to backup firewall");
inner.execute(inventory, &topology.backup).await
}
pair_interpret_boilerplate!("LaggScore (pair)");
}
// ── VlanScore for FirewallPairTopology ──────────────────────────────
impl Score<FirewallPairTopology> for VlanScore {
fn name(&self) -> String {
"VlanScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(VlanPairInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct VlanPairInterpret {
score: VlanScore,
}
#[async_trait]
impl Interpret<FirewallPairTopology> for VlanPairInterpret {
async fn execute(
&self,
inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let inner = self.score.create_interpret();
info!("Applying VlanScore to primary firewall");
inner.execute(inventory, &topology.primary).await?;
info!("Applying VlanScore to backup firewall");
inner.execute(inventory, &topology.backup).await
}
pair_interpret_boilerplate!("VlanScore (pair)");
}
// ── FirewallRuleScore for FirewallPairTopology ─────────────────────
impl Score<FirewallPairTopology> for FirewallRuleScore {
fn name(&self) -> String {
"FirewallRuleScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(FirewallRulePairInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct FirewallRulePairInterpret {
score: FirewallRuleScore,
}
#[async_trait]
impl Interpret<FirewallPairTopology> for FirewallRulePairInterpret {
async fn execute(
&self,
inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let inner = self.score.create_interpret();
info!("Applying FirewallRuleScore to primary firewall");
inner.execute(inventory, &topology.primary).await?;
info!("Applying FirewallRuleScore to backup firewall");
inner.execute(inventory, &topology.backup).await
}
pair_interpret_boilerplate!("FirewallRuleScore (pair)");
}
// ── BinatScore for FirewallPairTopology ────────────────────────────
impl Score<FirewallPairTopology> for BinatScore {
fn name(&self) -> String {
"BinatScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(BinatPairInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct BinatPairInterpret {
score: BinatScore,
}
#[async_trait]
impl Interpret<FirewallPairTopology> for BinatPairInterpret {
async fn execute(
&self,
inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let inner = self.score.create_interpret();
info!("Applying BinatScore to primary firewall");
inner.execute(inventory, &topology.primary).await?;
info!("Applying BinatScore to backup firewall");
inner.execute(inventory, &topology.backup).await
}
pair_interpret_boilerplate!("BinatScore (pair)");
}
// ── OutboundNatScore for FirewallPairTopology ──────────────────────
impl Score<FirewallPairTopology> for OutboundNatScore {
fn name(&self) -> String {
"OutboundNatScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(OutboundNatPairInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct OutboundNatPairInterpret {
score: OutboundNatScore,
}
#[async_trait]
impl Interpret<FirewallPairTopology> for OutboundNatPairInterpret {
async fn execute(
&self,
inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let inner = self.score.create_interpret();
info!("Applying OutboundNatScore to primary firewall");
inner.execute(inventory, &topology.primary).await?;
info!("Applying OutboundNatScore to backup firewall");
inner.execute(inventory, &topology.backup).await
}
pair_interpret_boilerplate!("OutboundNatScore (pair)");
}
// ── DnatScore for FirewallPairTopology ─────────────────────────────
impl Score<FirewallPairTopology> for DnatScore {
fn name(&self) -> String {
"DnatScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(DnatPairInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct DnatPairInterpret {
score: DnatScore,
}
#[async_trait]
impl Interpret<FirewallPairTopology> for DnatPairInterpret {
async fn execute(
&self,
inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let inner = self.score.create_interpret();
info!("Applying DnatScore to primary firewall");
inner.execute(inventory, &topology.primary).await?;
info!("Applying DnatScore to backup firewall");
inner.execute(inventory, &topology.backup).await
}
pair_interpret_boilerplate!("DnatScore (pair)");
}
// ── CarpVipScore ───────────────────────────────────────────────────
/// CARP-aware VIP score for firewall pairs.
///
/// Applies VIPs to both firewalls with differentiated CARP priority:
/// - Primary always gets `advskew=0` (highest priority, becomes CARP master)
/// - Backup gets `backup_advskew` (default 100, lower priority)
///
/// Non-CARP VIPs (IP alias, ProxyARP) are applied identically to both.
///
/// This is a distinct type from `VipScore` because the caller does not
/// specify advskew per-firewall — the pair semantics enforce it.
#[derive(Debug, Clone, Serialize)]
pub struct CarpVipScore {
pub vips: Vec<VipDef>,
/// advskew applied to backup firewall for CARP VIPs (default 100).
/// Primary always gets advskew=0.
pub backup_advskew: Option<u16>,
}
impl Score<FirewallPairTopology> for CarpVipScore {
fn name(&self) -> String {
"CarpVipScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<FirewallPairTopology>> {
Box::new(CarpVipInterpret {
score: self.clone(),
})
}
}
#[derive(Debug, Clone, Serialize)]
struct CarpVipInterpret {
score: CarpVipScore,
}
impl CarpVipInterpret {
async fn apply_vips_to(
&self,
firewall: &OPNSenseFirewall,
role: &str,
carp_advskew: u16,
) -> Result<(), InterpretError> {
let vip_config = firewall.get_opnsense_config().vip();
for vip in &self.score.vips {
let advskew = if vip.mode == VipMode::Carp {
Some(carp_advskew)
} else {
vip.advskew
};
info!(
"Ensuring VIP {} on {} {} (advskew={:?})",
vip.subnet, role, vip.interface, advskew
);
vip_config
.ensure_vip_from(
&vip.mode,
&vip.interface,
&vip.subnet,
vip.subnet_bits,
vip.vhid,
vip.advbase,
advskew,
vip.password.as_deref(),
vip.peer.as_deref(),
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Ok(())
}
}
#[async_trait]
impl Interpret<FirewallPairTopology> for CarpVipInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &FirewallPairTopology,
) -> Result<Outcome, InterpretError> {
let backup_skew = self.score.backup_advskew.unwrap_or(100);
self.apply_vips_to(&topology.primary, "primary", 0).await?;
self.apply_vips_to(&topology.backup, "backup", backup_skew)
.await?;
Ok(Outcome::success(format!(
"Configured {} VIPs on pair (primary advskew=0, backup advskew={})",
self.score.vips.len(),
backup_skew
)))
}
pair_interpret_boilerplate!("CarpVipScore");
}
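The advskew selection inside `apply_vips_to` can be isolated into a small pure function, which makes the pair semantics easy to test: CARP VIPs get the role's skew (0 for primary, `backup_advskew` or 100 for backup), while non-CARP VIPs keep whatever advskew their definition carries. A minimal sketch with hypothetical stand-in types:

```rust
// Stand-in for the VIP mode enum; only the Carp/non-Carp distinction matters here.
#[derive(PartialEq)]
enum VipMode {
    Carp,
    IpAlias,
}

// Mirror of the advskew decision in CarpVipInterpret::apply_vips_to:
// CARP VIPs are forced to the role's skew, others pass through untouched.
fn effective_advskew(
    mode: &VipMode,
    vip_advskew: Option<u16>,
    role_is_primary: bool,
    backup_advskew: Option<u16>,
) -> Option<u16> {
    if *mode == VipMode::Carp {
        Some(if role_is_primary {
            0 // primary always wins CARP mastership
        } else {
            backup_advskew.unwrap_or(100) // backup default
        })
    } else {
        vip_advskew
    }
}

fn main() {
    assert_eq!(effective_advskew(&VipMode::Carp, Some(30), true, Some(50)), Some(0));
    assert_eq!(effective_advskew(&VipMode::Carp, None, false, None), Some(100));
    assert_eq!(effective_advskew(&VipMode::IpAlias, Some(7), false, Some(50)), Some(7));
    println!("ok");
}
```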
#[cfg(test)]
mod tests {
use super::*;
use httptest::{Expectation, Server, matchers::request, responders::*};
use opnsense_api::OpnsenseClient;
use std::sync::Arc;
/// Dummy SSH shell for tests — never called, satisfies the `OPNsenseShell` trait.
#[derive(Debug)]
struct NoopShell;
#[async_trait]
impl opnsense_config::config::OPNsenseShell for NoopShell {
async fn exec(&self, _cmd: &str) -> Result<String, opnsense_config::Error> {
unimplemented!("test-only shell")
}
async fn write_content_to_temp_file(
&self,
_content: &str,
) -> Result<String, opnsense_config::Error> {
unimplemented!("test-only shell")
}
async fn write_content_to_file(
&self,
_content: &str,
_filename: &str,
) -> Result<String, opnsense_config::Error> {
unimplemented!("test-only shell")
}
async fn upload_folder(
&self,
_source: &str,
_destination: &str,
) -> Result<String, opnsense_config::Error> {
unimplemented!("test-only shell")
}
}
fn mock_opnsense_config(server: &Server) -> opnsense_config::Config {
let url = server.url("/api").to_string();
let client = OpnsenseClient::builder()
.base_url(url)
.auth_from_key_secret("test_key", "test_secret")
.build()
.unwrap();
let shell: Arc<dyn opnsense_config::config::OPNsenseShell> = Arc::new(NoopShell);
opnsense_config::Config::new(client, shell)
}
fn mock_firewall(server: &Server, name: &str) -> OPNSenseFirewall {
let host = LogicalHost {
ip: "127.0.0.1".parse().unwrap(),
name: name.to_string(),
};
OPNSenseFirewall::from_config(host, mock_opnsense_config(server))
}
fn mock_pair(primary_server: &Server, backup_server: &Server) -> FirewallPairTopology {
FirewallPairTopology {
primary: mock_firewall(primary_server, "fw-primary"),
backup: mock_firewall(backup_server, "fw-backup"),
}
}
fn vip_search_empty() -> serde_json::Value {
serde_json::json!({ "rows": [] })
}
fn vip_add_ok() -> serde_json::Value {
serde_json::json!({ "uuid": "new-uuid" })
}
fn vip_reconfigure_ok() -> serde_json::Value {
serde_json::json!({ "status": "ok" })
}
/// Set up a mock server to expect a VIP creation (search → add → reconfigure).
fn expect_vip_creation(server: &Server) {
server.expect(
Expectation::matching(request::method_path(
"GET",
"/api/interfaces/vip_settings/searchItem",
))
.respond_with(json_encoded(vip_search_empty())),
);
server.expect(
Expectation::matching(request::method_path(
"POST",
"/api/interfaces/vip_settings/addItem",
))
.respond_with(json_encoded(vip_add_ok())),
);
server.expect(
Expectation::matching(request::method_path(
"POST",
"/api/interfaces/vip_settings/reconfigure",
))
.respond_with(json_encoded(vip_reconfigure_ok())),
);
}
// ── ensure_ready tests ─────────────────────────────────────────
#[tokio::test]
async fn ensure_ready_merges_both_success() {
let s1 = Server::run();
let s2 = Server::run();
let pair = mock_pair(&s1, &s2);
let result = pair.ensure_ready().await.unwrap();
match result {
PreparationOutcome::Success { details } => {
assert!(details.contains("Primary"));
assert!(details.contains("Backup"));
}
PreparationOutcome::Noop => panic!("Expected Success, got Noop"),
}
}
// ── CarpVipScore tests ─────────────────────────────────────────
#[tokio::test]
async fn carp_vip_score_applies_to_both_firewalls() {
let primary_server = Server::run();
let backup_server = Server::run();
// Both firewalls should receive VIP creation calls
expect_vip_creation(&primary_server);
expect_vip_creation(&backup_server);
let pair = mock_pair(&primary_server, &backup_server);
let inventory = Inventory::empty();
let score = CarpVipScore {
vips: vec![VipDef {
mode: VipMode::Carp,
interface: "lan".to_string(),
subnet: "192.168.1.1".to_string(),
subnet_bits: 24,
vhid: Some(1),
advbase: Some(1),
advskew: None,
password: Some("secret".to_string()),
peer: None,
}],
backup_advskew: Some(100),
};
let result = score.interpret(&inventory, &pair).await;
assert!(result.is_ok(), "CarpVipScore should succeed: {:?}", result);
let outcome = result.unwrap();
assert!(
outcome.message.contains("primary advskew=0"),
"Message should mention primary advskew: {}",
outcome.message
);
assert!(
outcome.message.contains("backup advskew=100"),
"Message should mention backup advskew: {}",
outcome.message
);
}
#[tokio::test]
async fn carp_vip_score_sends_to_both_and_reports_advskew() {
let primary_server = Server::run();
let backup_server = Server::run();
// Both firewalls should receive VIP creation calls
expect_vip_creation(&primary_server);
expect_vip_creation(&backup_server);
let pair = mock_pair(&primary_server, &backup_server);
let inventory = Inventory::empty();
let score = CarpVipScore {
vips: vec![VipDef {
mode: VipMode::Carp,
interface: "lan".to_string(),
subnet: "10.0.0.1".to_string(),
subnet_bits: 32,
vhid: Some(1),
advbase: Some(1),
advskew: None,
password: Some("pass".to_string()),
peer: None,
}],
backup_advskew: Some(50),
};
let result = score.interpret(&inventory, &pair).await;
assert!(result.is_ok(), "CarpVipScore should succeed: {:?}", result);
let outcome = result.unwrap();
assert!(
outcome.message.contains("backup advskew=50"),
"Custom backup_advskew should be respected: {}",
outcome.message
);
// httptest verifies both servers received exactly the expected API calls
}
#[tokio::test]
async fn carp_vip_score_default_backup_advskew_is_100() {
let primary_server = Server::run();
let backup_server = Server::run();
expect_vip_creation(&primary_server);
expect_vip_creation(&backup_server);
let pair = mock_pair(&primary_server, &backup_server);
let inventory = Inventory::empty();
// backup_advskew is None — should default to 100
let score = CarpVipScore {
vips: vec![VipDef {
mode: VipMode::Carp,
interface: "lan".to_string(),
subnet: "10.0.0.1".to_string(),
subnet_bits: 32,
vhid: Some(1),
advbase: Some(1),
advskew: None,
password: None,
peer: None,
}],
backup_advskew: None,
};
let result = score.interpret(&inventory, &pair).await;
assert!(result.is_ok());
let outcome = result.unwrap();
assert!(
outcome.message.contains("backup advskew=100"),
"Default backup advskew should be 100: {}",
outcome.message
);
}
// ── Uniform score delegation tests ─────────────────────────────
#[tokio::test]
async fn vlan_score_applies_to_both_firewalls() {
let primary_server = Server::run();
let backup_server = Server::run();
// VLAN API: GET .../get to list, POST .../addItem to create, POST .../reconfigure to apply
fn expect_vlan_creation(server: &Server) {
server.expect(
Expectation::matching(request::method_path(
"GET",
"/api/interfaces/vlan_settings/get",
))
.respond_with(json_encoded(serde_json::json!({
"vlan": { "vlan": [] }
}))),
);
server.expect(
Expectation::matching(request::method_path(
"POST",
"/api/interfaces/vlan_settings/addItem",
))
.respond_with(json_encoded(serde_json::json!({ "uuid": "vlan-uuid" }))),
);
server.expect(
Expectation::matching(request::method_path(
"POST",
"/api/interfaces/vlan_settings/reconfigure",
))
.respond_with(json_encoded(serde_json::json!({ "status": "ok" }))),
);
}
expect_vlan_creation(&primary_server);
expect_vlan_creation(&backup_server);
let pair = mock_pair(&primary_server, &backup_server);
let inventory = Inventory::empty();
let score = VlanScore {
vlans: vec![crate::modules::opnsense::vlan::VlanDef {
parent_interface: "lagg0".to_string(),
tag: 50,
description: "test_vlan".to_string(),
}],
};
let result = score.interpret(&inventory, &pair).await;
assert!(result.is_ok(), "VlanScore should succeed: {:?}", result);
// httptest verifies both servers received the expected calls
}
}
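Every uniform pair interpret in this file follows the same shape: apply the single-firewall interpret to the primary, then to the backup, with `?` short-circuiting so the backup is never touched if the primary fails. A synchronous sketch of that delegation pattern, using hypothetical stand-in types rather than the real crate API:

```rust
#[derive(Debug, PartialEq)]
struct Outcome(String);

#[derive(Debug, PartialEq)]
struct InterpretError(String);

// Simplified, synchronous stand-in for the Interpret<T> trait.
trait Interpret<T> {
    fn execute(&self, target: &T) -> Result<Outcome, InterpretError>;
}

struct Firewall {
    name: String,
}

struct Pair {
    primary: Firewall,
    backup: Firewall,
}

struct EchoScore;

// Single-firewall behavior.
impl Interpret<Firewall> for EchoScore {
    fn execute(&self, fw: &Firewall) -> Result<Outcome, InterpretError> {
        Ok(Outcome(format!("applied to {}", fw.name)))
    }
}

// Pair behavior: primary first, then backup; `?` stops on the first error.
impl Interpret<Pair> for EchoScore {
    fn execute(&self, pair: &Pair) -> Result<Outcome, InterpretError> {
        Interpret::<Firewall>::execute(self, &pair.primary)?;
        Interpret::<Firewall>::execute(self, &pair.backup)
    }
}

fn main() {
    let pair = Pair {
        primary: Firewall { name: "fw-primary".into() },
        backup: Firewall { name: "fw-backup".into() },
    };
    let out = Interpret::<Pair>::execute(&EchoScore, &pair).unwrap();
    // The returned outcome is the backup's, matching the real interprets,
    // which return the result of the second execute call.
    assert_eq!(out.0, "applied to fw-backup");
    println!("{}", out.0);
}
```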

View File

@@ -204,6 +204,9 @@ impl LoadBalancer for HAClusterTopology {
async fn reload_restart(&self) -> Result<(), ExecutorError> {
self.load_balancer.reload_restart().await
}
async fn ensure_wan_access(&self, port: u16) -> Result<(), ExecutorError> {
self.load_balancer.ensure_wan_access(port).await
}
}
#[async_trait]

View File

@@ -30,6 +30,18 @@ pub trait LoadBalancer: Send + Sync {
self.add_service(service).await?;
Ok(())
}
/// Ensure a TCP port is open for inbound WAN traffic.
///
/// This creates a firewall rule to accept traffic on the given port
/// from the WAN interface. Used by load balancers that need to receive
/// external traffic (e.g., OKD ingress on ports 80/443).
///
/// Default implementation is a no-op for topologies that don't manage
/// firewall rules (e.g., cloud environments with security groups).
async fn ensure_wan_access(&self, _port: u16) -> Result<(), ExecutorError> {
Ok(())
}
}
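The default-method pattern used by `ensure_wan_access` can be sketched on its own: topologies without firewall control inherit the no-op, while firewall-backed implementations override it, ideally idempotently. All types below are illustrative, not the real trait:

```rust
#[derive(Debug)]
enum ExecutorError {
    Unexpected(String),
}

trait LoadBalancer {
    fn name(&self) -> &str;

    // Default: nothing to do (e.g. a cloud security group already allows the port).
    fn ensure_wan_access(&mut self, _port: u16) -> Result<(), ExecutorError> {
        Ok(())
    }
}

struct CloudLb;
impl LoadBalancer for CloudLb {
    fn name(&self) -> &str {
        "cloud"
    }
    // Inherits the no-op default.
}

struct FirewallLb {
    open_ports: Vec<u16>,
}
impl LoadBalancer for FirewallLb {
    fn name(&self) -> &str {
        "opnsense"
    }
    // Firewall-backed override: record the open port, idempotently.
    fn ensure_wan_access(&mut self, port: u16) -> Result<(), ExecutorError> {
        if !self.open_ports.contains(&port) {
            self.open_ports.push(port);
        }
        Ok(())
    }
}

fn main() {
    let mut cloud = CloudLb;
    assert!(cloud.ensure_wan_access(443).is_ok()); // inherited no-op

    let mut fw = FirewallLb { open_ports: vec![] };
    fw.ensure_wan_access(443).unwrap();
    fw.ensure_wan_access(443).unwrap(); // calling twice adds the rule once
    assert_eq!(fw.open_ports, vec![443]);
    println!("{} open: {:?}", fw.name(), fw.open_ports);
}
```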
#[derive(Debug, PartialEq, Clone, Serialize)]

View File

@@ -1,10 +1,12 @@
pub mod decentralized;
mod failover;
pub mod firewall_pair;
mod ha_cluster;
pub mod ingress;
pub mod node_exporter;
pub mod opnsense;
pub use failover::*;
pub use firewall_pair::*;
use harmony_types::net::IpAddress;
mod host_binding;
mod http;

View File

@@ -1,6 +1,6 @@
use async_trait::async_trait;
use harmony_types::net::MacAddress;
use log::info;
use log::{info, warn};
use crate::{
executors::ExecutorError,
@@ -19,24 +19,46 @@ impl DhcpServer for OPNSenseFirewall {
async fn add_static_mapping(&self, entry: &DHCPStaticEntry) -> Result<(), ExecutorError> {
let mac: Vec<String> = entry.mac.iter().map(MacAddress::to_string).collect();
{
let mut writable_opnsense = self.opnsense_config.write().await;
writable_opnsense
.dhcp()
.add_static_mapping(&mac, &entry.ip, &entry.name)
.unwrap();
}
self.opnsense_config
.dhcp()
.add_static_mapping(&mac, &entry.ip, &entry.name)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
info!("Registered {:?}", entry);
Ok(())
}
async fn remove_static_mapping(&self, _mac: &MacAddress) -> Result<(), ExecutorError> {
todo!()
async fn remove_static_mapping(&self, mac: &MacAddress) -> Result<(), ExecutorError> {
self.opnsense_config
.dhcp()
.remove_static_mapping(&mac.to_string())
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
info!("Removed static mapping for MAC {}", mac);
Ok(())
}
async fn list_static_mappings(&self) -> Vec<(MacAddress, IpAddress)> {
todo!()
match self.opnsense_config.dhcp().list_static_mappings().await {
Ok(mappings) => mappings
.into_iter()
.filter_map(|(mac_str, ipv4)| {
let mac = MacAddress::try_from(mac_str.clone())
.map_err(|e| {
warn!("Skipping invalid MAC '{}': {}", mac_str, e);
e
})
.ok()?;
Some((mac, IpAddress::V4(ipv4)))
})
.collect(),
Err(e) => {
warn!("Failed to list static mappings: {}", e);
vec![]
}
}
}
fn get_ip(&self) -> IpAddress {
@@ -48,14 +70,13 @@ impl DhcpServer for OPNSenseFirewall {
}
async fn set_pxe_options(&self, options: PxeOptions) -> Result<(), ExecutorError> {
let mut writable_opnsense = self.opnsense_config.write().await;
let PxeOptions {
ipxe_filename,
bios_filename,
efi_filename,
tftp_ip,
} = options;
writable_opnsense
self.opnsense_config
.dhcp()
.set_pxe_options(
tftp_ip.map(|i| i.to_string()),
@@ -74,8 +95,7 @@ impl DhcpServer for OPNSenseFirewall {
start: &IpAddress,
end: &IpAddress,
) -> Result<(), ExecutorError> {
let mut writable_opnsense = self.opnsense_config.write().await;
writable_opnsense
self.opnsense_config
.dhcp()
.set_dhcp_range(&start.to_string(), &end.to_string())
.await

View File

@@ -11,22 +11,7 @@ use super::OPNSenseFirewall;
#[async_trait]
impl DnsServer for OPNSenseFirewall {
async fn register_hosts(&self, _hosts: Vec<DnsRecord>) -> Result<(), ExecutorError> {
todo!("Refactor this to use dnsmasq")
// let mut writable_opnsense = self.opnsense_config.write().await;
// let mut dns = writable_opnsense.dns();
// let hosts = hosts
// .iter()
// .map(|h| {
// Host::new(
// h.host.clone(),
// h.domain.clone(),
// h.record_type.to_string(),
// h.value.to_string(),
// )
// })
// .collect();
// dns.add_static_mapping(hosts);
// Ok(())
todo!("Refactor this to use dnsmasq API")
}
fn remove_record(
@@ -38,26 +23,7 @@ impl DnsServer for OPNSenseFirewall {
}
async fn list_records(&self) -> Vec<crate::topology::DnsRecord> {
todo!("Refactor this to use dnsmasq")
// self.opnsense_config
// .write()
// .await
// .dns()
// .get_hosts()
// .iter()
// .map(|h| DnsRecord {
// host: h.hostname.clone(),
// domain: h.domain.clone(),
// record_type: h
// .rr
// .parse()
// .expect("received invalid record type {h.rr} from opnsense"),
// value: h
// .server
// .parse()
// .expect("received invalid ipv4 record from opnsense {h.server}"),
// })
// .collect()
todo!("Refactor this to use dnsmasq API")
}
fn get_ip(&self) -> IpAddress {
@@ -69,23 +35,11 @@ impl DnsServer for OPNSenseFirewall {
}
async fn register_dhcp_leases(&self, _register: bool) -> Result<(), ExecutorError> {
todo!("Refactor this to use dnsmasq")
// let mut writable_opnsense = self.opnsense_config.write().await;
// let mut dns = writable_opnsense.dns();
// dns.register_dhcp_leases(register);
//
// Ok(())
todo!("Refactor this to use dnsmasq API")
}
async fn commit_config(&self) -> Result<(), ExecutorError> {
let opnsense = self.opnsense_config.read().await;
opnsense
.save()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
opnsense
self.opnsense_config
.restart_dns()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))

View File

@@ -8,6 +8,53 @@ use harmony_types::net::IpAddress;
use harmony_types::net::Url;
const OPNSENSE_HTTP_ROOT_PATH: &str = "/usr/local/http";
/// Download a remote URL into a temporary directory, returning the temp dir path.
///
/// The file is saved with its original filename (extracted from the URL path).
/// The caller can then use `upload_files` to SFTP the whole temp dir contents
/// to the OPNsense appliance.
pub(in crate::infra::opnsense) async fn download_url_to_temp_dir(
url: &url::Url,
) -> Result<String, ExecutorError> {
let client = reqwest::Client::new();
let response =
client.get(url.as_str()).send().await.map_err(|e| {
ExecutorError::UnexpectedError(format!("Failed to download {url}: {e}"))
})?;
if !response.status().is_success() {
return Err(ExecutorError::UnexpectedError(format!(
"HTTP {} downloading {url}",
response.status()
)));
}
let file_name = url
.path_segments()
.and_then(|s| s.last())
.filter(|s| !s.is_empty())
.unwrap_or("download");
let temp_dir = std::env::temp_dir().join("harmony_url_downloads");
tokio::fs::create_dir_all(&temp_dir)
.await
.map_err(|e| ExecutorError::UnexpectedError(format!("Failed to create temp dir: {e}")))?;
let dest = temp_dir.join(file_name);
let bytes = response
.bytes()
.await
.map_err(|e| ExecutorError::UnexpectedError(format!("Failed to read response: {e}")))?;
tokio::fs::write(&dest, &bytes)
.await
.map_err(|e| ExecutorError::UnexpectedError(format!("Failed to write temp file: {e}")))?;
info!("Downloaded {} to {:?} ({} bytes)", url, dest, bytes.len());
Ok(temp_dir.to_string_lossy().to_string())
}
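The filename extraction above (last non-empty path segment, falling back to `"download"`) can be mirrored with std only. The real code goes through `url::Url::path_segments`; this sketch works on a raw path string for illustration:

```rust
// Take the final path component; if it is empty (trailing slash, bare "/",
// or empty path), fall back to "download" — same semantics as the
// path_segments()/last()/filter chain in download_url_to_temp_dir.
fn file_name_from_path(path: &str) -> &str {
    path.rsplit('/')
        .next()
        .filter(|s| !s.is_empty())
        .unwrap_or("download")
}

fn main() {
    assert_eq!(file_name_from_path("/images/fcos-live.iso"), "fcos-live.iso");
    assert_eq!(file_name_from_path("/images/"), "download");
    assert_eq!(file_name_from_path(""), "download");
    println!("ok");
}
```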
#[async_trait]
impl HttpServer for OPNSenseFirewall {
async fn serve_files(
@@ -15,7 +62,6 @@ impl HttpServer for OPNSenseFirewall {
url: &Url,
remote_path: &Option<String>,
) -> Result<(), ExecutorError> {
let config = self.opnsense_config.read().await;
info!("Uploading files from url {url} to {OPNSENSE_HTTP_ROOT_PATH}");
let remote_upload_path = remote_path
.clone()
@@ -23,12 +69,18 @@ impl HttpServer for OPNSenseFirewall {
.unwrap_or(OPNSENSE_HTTP_ROOT_PATH.to_string());
match url {
Url::LocalFolder(path) => {
config
self.opnsense_config
.upload_files(path, &remote_upload_path)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Url::Url(_url) => todo!(),
Url::Url(remote_url) => {
let local_dir = download_url_to_temp_dir(remote_url).await?;
self.opnsense_config
.upload_files(&local_dir, &remote_upload_path)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
}
Ok(())
}
@@ -45,9 +97,8 @@ impl HttpServer for OPNSenseFirewall {
}
};
let config = self.opnsense_config.read().await;
info!("Uploading file content to {}", path);
config
self.opnsense_config
.upload_file_content(&path, &file.content)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
@@ -64,8 +115,6 @@ impl HttpServer for OPNSenseFirewall {
async fn reload_restart(&self) -> Result<(), ExecutorError> {
self.opnsense_config
.write()
.await
.caddy()
.reload_restart()
.await
@@ -73,20 +122,20 @@ impl HttpServer for OPNSenseFirewall {
}
async fn ensure_initialized(&self) -> Result<(), ExecutorError> {
let mut config = self.opnsense_config.write().await;
let caddy = config.caddy();
if caddy.get_full_config().is_none() {
info!("Http config not available in opnsense config, installing package");
config.install_package("os-caddy").await.map_err(|e| {
ExecutorError::UnexpectedError(format!(
"Executor failed when trying to install os-caddy package with error {e:?}"
))
})?;
if !self.opnsense_config.caddy().is_installed().await {
info!("Http config not available, installing os-caddy package");
self.opnsense_config
.install_package("os-caddy")
.await
.map_err(|e| {
ExecutorError::UnexpectedError(format!("Failed to install os-caddy: {e:?}"))
})?;
} else {
info!("Http config available in opnsense config, assuming it is already installed");
info!("Http config available, assuming Caddy is already installed");
}
info!("Adding custom caddy config files");
config
self.opnsense_config
.upload_files(
"./data/watchguard/caddy_config",
"/usr/local/etc/caddy/caddy.d/",
@@ -95,7 +144,11 @@ impl HttpServer for OPNSenseFirewall {
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
info!("Enabling http server");
config.caddy().enable(true);
self.opnsense_config
.caddy()
.enable(true)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(())
}

View File

@@ -1,9 +1,8 @@
use async_trait::async_trait;
use log::{debug, error, info, warn};
use opnsense_config_xml::{
Frontend, HAProxy, HAProxyBackend, HAProxyHealthCheck, HAProxyServer, MaybeString,
use opnsense_config::modules::load_balancer::{
HaproxyService, LbBackend, LbFrontend, LbHealthCheck, LbServer,
};
use uuid::Uuid;
use crate::{
executors::ExecutorError,
@@ -12,6 +11,7 @@ use crate::{
LogicalHost, SSL,
},
};
use harmony_types::firewall::{Direction, FirewallAction, IpProtocol, NetworkProtocol};
use harmony_types::net::IpAddress;
use super::OPNSenseFirewall;
@@ -26,15 +26,13 @@ impl LoadBalancer for OPNSenseFirewall {
}
async fn add_service(&self, service: &LoadBalancerService) -> Result<(), ExecutorError> {
let mut config = self.opnsense_config.write().await;
let mut load_balancer = config.load_balancer();
let (frontend, backend, servers, healthcheck) = harmony_service_to_lb_types(service);
let (frontend, backend, servers, healthcheck) =
harmony_load_balancer_service_to_haproxy_xml(service);
load_balancer.configure_service(frontend, backend, servers, healthcheck);
Ok(())
self.opnsense_config
.load_balancer()
.configure_service(frontend, backend, servers, healthcheck)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))
}
async fn remove_service(&self, service: &LoadBalancerService) -> Result<(), ExecutorError> {
@@ -47,8 +45,6 @@ impl LoadBalancer for OPNSenseFirewall {
async fn reload_restart(&self) -> Result<(), ExecutorError> {
self.opnsense_config
.write()
.await
.load_balancer()
.reload_restart()
.await
@@ -56,455 +52,214 @@ impl LoadBalancer for OPNSenseFirewall {
}
async fn ensure_initialized(&self) -> Result<(), ExecutorError> {
let mut config = self.opnsense_config.write().await;
let load_balancer = config.load_balancer();
if let Some(config) = load_balancer.get_full_config() {
debug!(
"HAProxy config available in opnsense config, assuming it is already installed, {config:?}"
);
let lb = self.opnsense_config.load_balancer();
if lb.is_installed().await {
debug!("HAProxy is installed");
} else {
config.install_package("os-haproxy").await.map_err(|e| {
ExecutorError::UnexpectedError(format!(
"Executor failed when trying to install os-haproxy package with error {e:?}"
))
})?;
self.opnsense_config
.install_package("os-haproxy")
.await
.map_err(|e| {
ExecutorError::UnexpectedError(format!("Failed to install os-haproxy: {e:?}"))
})?;
}
config.load_balancer().enable(true);
self.opnsense_config
.load_balancer()
.enable(true)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(())
}
async fn list_services(&self) -> Vec<LoadBalancerService> {
let mut config = self.opnsense_config.write().await;
let load_balancer = config.load_balancer();
let haproxy_xml_config = load_balancer.get_full_config();
haproxy_xml_config_to_harmony_loadbalancer(haproxy_xml_config)
match self.opnsense_config.load_balancer().list_services().await {
Ok(services) => services
.into_iter()
.filter_map(|svc| haproxy_service_to_harmony(&svc))
.collect(),
Err(e) => {
warn!("Failed to list HAProxy services: {e}");
vec![]
}
}
}
async fn ensure_wan_access(&self, port: u16) -> Result<(), ExecutorError> {
info!("Ensuring WAN firewall rule for TCP port {port}");
let fw = self.opnsense_config.firewall();
fw.ensure_filter_rule(
&FirewallAction::Pass,
&Direction::In,
"wan",
&IpProtocol::Inet,
&NetworkProtocol::Tcp,
"any",
"any",
Some(&port.to_string()),
None,
&format!("LB: Allow TCP/{port} ingress on WAN"),
false,
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
fw.apply()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(())
}
}
pub(crate) fn haproxy_xml_config_to_harmony_loadbalancer(
haproxy: &Option<HAProxy>,
) -> Vec<LoadBalancerService> {
let haproxy = match haproxy {
Some(haproxy) => haproxy,
None => return vec![],
};
fn haproxy_service_to_harmony(svc: &HaproxyService) -> Option<LoadBalancerService> {
let listening_port = svc.bind.parse().unwrap_or_else(|_| {
panic!(
"HAProxy frontend address should be a valid SocketAddr, got {}",
svc.bind
)
});
haproxy
.frontends
.frontend
.iter()
.map(|frontend| {
let mut backend_servers = vec![];
let matching_backend = haproxy
.backends
.backends
.iter()
.find(|b| Some(b.uuid.clone()) == frontend.default_backend);
let mut health_check = None;
match matching_backend {
Some(backend) => {
backend_servers.append(&mut get_servers_for_backend(backend, haproxy));
health_check = get_health_check_for_backend(backend, haproxy);
}
None => {
warn!(
"HAProxy config could not find a matching backend for frontend {frontend:?}"
);
}
}
LoadBalancerService {
backend_servers,
listening_port: frontend.bind.parse().unwrap_or_else(|_| {
panic!(
"HAProxy frontend address should be a valid SocketAddr, got {}",
frontend.bind
)
}),
health_check,
}
})
.collect()
}
pub(crate) fn get_servers_for_backend(
backend: &HAProxyBackend,
haproxy: &HAProxy,
) -> Vec<BackendServer> {
let backend_servers: Vec<&str> = match &backend.linked_servers.content {
Some(linked_servers) => linked_servers.split(',').collect(),
None => {
info!("No server defined for HAProxy backend {:?}", backend);
return vec![];
}
};
haproxy
.servers
let backend_servers: Vec<BackendServer> = svc
.servers
.iter()
.filter_map(|server| {
let address = server.address.clone()?;
let port = server.port?;
if backend_servers.contains(&server.uuid.as_str()) {
return Some(BackendServer { address, port });
}
None
.map(|s| BackendServer {
address: s.address.clone(),
port: s.port,
})
.collect()
}
.collect();
pub(crate) fn get_health_check_for_backend(
backend: &HAProxyBackend,
haproxy: &HAProxy,
) -> Option<HealthCheck> {
let health_check_uuid = match &backend.health_check.content {
Some(uuid) => uuid,
None => return None,
};
let haproxy_health_check = haproxy
.healthchecks
.healthchecks
.iter()
.find(|h| &h.uuid == health_check_uuid)?;
let binding = haproxy_health_check.health_check_type.to_uppercase();
let uppercase = binding.as_str();
match uppercase {
"TCP" => {
if let Some(checkport) = haproxy_health_check.checkport.content.as_ref() {
if !checkport.is_empty() {
return Some(HealthCheck::TCP(Some(checkport.parse().unwrap_or_else(
|_| {
panic!(
"HAProxy check port should be a valid port number, got {checkport}"
)
},
))));
}
}
Some(HealthCheck::TCP(None))
}
"HTTP" => {
let path: String = haproxy_health_check
.http_uri
.content
.clone()
.unwrap_or_default();
let method: HttpMethod = haproxy_health_check
.http_method
.content
.clone()
.unwrap_or_default()
.into();
let status_code: HttpStatusCode = HttpStatusCode::Success2xx;
let ssl = match haproxy_health_check
.ssl
.content_string()
.to_uppercase()
.as_str()
{
"SSL" => SSL::SSL,
"SSLNI" => SSL::SNI,
"NOSSL" => SSL::Disabled,
"" => SSL::Default,
other => {
error!("Unknown haproxy health check ssl config {other}");
SSL::Other(other.to_string())
}
};
let port = haproxy_health_check
.checkport
.content_string()
.parse::<u16>()
.ok();
debug!("Found haproxy healthcheck port {port:?}");
Some(HealthCheck::HTTP(port, path, method, status_code, ssl))
}
_ => panic!("Received unsupported health check type {}", uppercase),
}
}
pub(crate) fn harmony_load_balancer_service_to_haproxy_xml(
service: &LoadBalancerService,
) -> (
Frontend,
HAProxyBackend,
Vec<HAProxyServer>,
Option<HAProxyHealthCheck>,
) {
// Here we have to build :
// One frontend
// One backend
// One Option<healthcheck>
// Vec of servers
//
// Then merge them with the haproxy config individually
//
// We also have to take into account that it is entirely possible that a backend uses a server
// with the same definition as in another backend. So when creating a new backend, we must not
// blindly create new servers because the backend does not exist yet. Even if it is a new
// backend, it may very well reuse existing servers
//
// Also we need to support router integration for port forwarding on WAN as a strategy to
// handle dyndns
// server is standalone
// backend points on server
// backend points to health check
// frontend points to backend
    // Removed: XML-specific health check construction
    let healthcheck = if let Some(health_check) = &service.health_check {
        match health_check {
            HealthCheck::HTTP(port, path, http_method, _http_status_code, ssl) => {
                let ssl: MaybeString = match ssl {
                    SSL::SSL => "ssl".into(),
                    SSL::SNI => "sslni".into(),
                    SSL::Disabled => "nossl".into(),
                    SSL::Default => "".into(),
                    SSL::Other(other) => other.as_str().into(),
                };
                let path_without_query = path.split_once('?').map_or(path.as_str(), |(p, _)| p);
                let (port, port_name) = match port {
                    Some(port) => (Some(port.to_string()), port.to_string()),
                    None => (None, "serverport".to_string()),
                };
                let haproxy_check = HAProxyHealthCheck {
                    name: format!("HTTP_{http_method}_{path_without_query}_{port_name}"),
                    uuid: Uuid::new_v4().to_string(),
                    http_method: http_method.to_string().to_lowercase().into(),
                    health_check_type: "http".to_string(),
                    http_uri: path.clone().into(),
                    interval: "2s".to_string(),
                    checkport: MaybeString::from(port.map(|p| p.to_string())),
                    ..Default::default()
                };
                Some(haproxy_check)
            }
            HealthCheck::TCP(port) => {
                let (port, port_name) = match port {
                    Some(port) => (Some(port.to_string()), port.to_string()),
                    None => (None, "serverport".to_string()),
                };
                let haproxy_check = HAProxyHealthCheck {
                    name: format!("TCP_{port_name}"),
                    uuid: Uuid::new_v4().to_string(),
                    health_check_type: "tcp".to_string(),
                    checkport: port.into(),
                    interval: "2s".to_string(),
                    ..Default::default()
                };
                Some(haproxy_check)
            }
        }
    } else {
        None
    };
    debug!("Built healthcheck {healthcheck:?}");
    // Added: parse the health check back out of the stored config
    let health_check = svc
        .health_check
        .as_ref()
        .and_then(|hc| match hc.check_type.as_str() {
            "TCP" => Some(HealthCheck::TCP(hc.checkport)),
            "HTTP" => {
                let path = hc.http_uri.clone().unwrap_or_default();
                let method: HttpMethod = hc.http_method.clone().unwrap_or_default().into();
                let ssl = match hc.ssl.as_deref().unwrap_or("").to_uppercase().as_str() {
                    "SSL" => SSL::SSL,
                    "SSLNI" => SSL::SNI,
                    "NOSSL" => SSL::Disabled,
                    "" => SSL::Default,
                    other => {
                        error!("Unknown haproxy health check ssl config {other}");
                        SSL::Other(other.to_string())
                    }
                };
                Some(HealthCheck::HTTP(
                    hc.checkport,
                    path,
                    method,
                    HttpStatusCode::Success2xx,
                    ssl,
                ))
            }
            _ => {
                warn!("Unsupported health check type: {}", hc.check_type);
                None
            }
        });
    Some(LoadBalancerService {
        backend_servers,
        listening_port,
        health_check,
    })
}
pub(crate) fn harmony_service_to_lb_types(
    service: &LoadBalancerService,
) -> (LbFrontend, LbBackend, Vec<LbServer>, Option<LbHealthCheck>) {
    let healthcheck = service.health_check.as_ref().map(|hc| match hc {
        HealthCheck::HTTP(port, path, http_method, _status_code, ssl) => {
            let ssl_str = match ssl {
                SSL::SSL => Some("ssl".to_string()),
                SSL::SNI => Some("sslni".to_string()),
                SSL::Disabled => Some("nossl".to_string()),
                SSL::Default => Some(String::new()),
                SSL::Other(other) => Some(other.clone()),
            };
            let path_without_query = path.split_once('?').map_or(path.as_str(), |(p, _)| p);
            let port_name = port
                .map(|p| p.to_string())
                .unwrap_or("serverport".to_string());
            LbHealthCheck {
                name: format!("HTTP_{http_method}_{path_without_query}_{port_name}"),
                check_type: "http".to_string(),
                interval: "2s".to_string(),
                http_method: Some(http_method.to_string().to_lowercase()),
                http_uri: Some(path.clone()),
                ssl: ssl_str,
                checkport: port.map(|p| p.to_string()),
            }
        }
        HealthCheck::TCP(port) => {
            let port_name = port
                .map(|p| p.to_string())
                .unwrap_or("serverport".to_string());
            LbHealthCheck {
                name: format!("TCP_{port_name}"),
                check_type: "tcp".to_string(),
                interval: "2s".to_string(),
                http_method: None,
                http_uri: None,
                ssl: None,
                checkport: port.map(|p| p.to_string()),
            }
        }
    });
    // Removed:
    let servers: Vec<HAProxyServer> = service
        .backend_servers
        .iter()
        .map(server_to_haproxy_server)
        .collect();
    // Added:
    let servers: Vec<LbServer> = service
        .backend_servers
        .iter()
        .map(|s| LbServer {
            name: format!("{}_{}", &s.address, &s.port),
            address: s.address.clone(),
            port: s.port,
            enabled: true,
            mode: "active".to_string(),
            server_type: "static".to_string(),
        })
        .collect();
debug!("Built servers {servers:?}");
    // Removed:
    let mut backend = HAProxyBackend {
        uuid: Uuid::new_v4().to_string(),
        enabled: 1,
        name: format!(
            "backend_{}",
            service.listening_port.to_string().replace(':', "_")
        ),
        stickiness_expire: "30m".to_string(),
        stickiness_size: "50k".to_string(),
        stickiness_conn_rate_period: "10s".to_string(),
        stickiness_sess_rate_period: "10s".to_string(),
        stickiness_http_req_rate_period: "10s".to_string(),
        stickiness_http_err_rate_period: "10s".to_string(),
        stickiness_bytes_in_rate_period: "1m".to_string(),
        stickiness_bytes_out_rate_period: "1m".to_string(),
        mode: "tcp".to_string(), // TODO do not depend on health check here
        ..Default::default()
    };
    info!("HAPRoxy backend algorithm is currently hardcoded to roundrobin");
    if let Some(hcheck) = &healthcheck {
        backend.health_check_enabled = 1;
        backend.health_check = hcheck.uuid.clone().into();
    }
    backend.linked_servers = servers
        .iter()
        .map(|s| s.uuid.as_str())
        .collect::<Vec<&str>>()
        .join(",")
        .into();
    // Added:
    let bind_str = service.listening_port.to_string();
    let safe_name = bind_str.replace(':', "_");
    let backend = LbBackend {
        name: format!("backend_{safe_name}"),
        mode: "tcp".to_string(),
        algorithm: "roundrobin".to_string(),
        enabled: true,
        health_check_enabled: healthcheck.is_some(),
        random_draws: Some(2),
        stickiness_expire: Some("30m".to_string()),
        stickiness_size: Some("50k".to_string()),
        stickiness_conn_rate_period: Some("10s".to_string()),
        stickiness_sess_rate_period: Some("10s".to_string()),
        stickiness_http_req_rate_period: Some("10s".to_string()),
        stickiness_http_err_rate_period: Some("10s".to_string()),
        stickiness_bytes_in_rate_period: Some("1m".to_string()),
        stickiness_bytes_out_rate_period: Some("1m".to_string()),
    };
    info!("HAProxy backend algorithm is currently hardcoded to roundrobin");
debug!("Built backend {backend:?}");
    // Removed:
    let frontend = Frontend {
        uuid: uuid::Uuid::new_v4().to_string(),
        enabled: 1,
        name: format!(
            "frontend_{}",
            service.listening_port.to_string().replace(':', "_")
        ),
        bind: service.listening_port.to_string(),
        mode: "tcp".to_string(), // TODO do not depend on health check here
        default_backend: Some(backend.uuid.clone()),
        stickiness_expire: "30m".to_string().into(),
        stickiness_size: "50k".to_string().into(),
        stickiness_conn_rate_period: "10s".to_string().into(),
        stickiness_sess_rate_period: "10s".to_string().into(),
        stickiness_http_req_rate_period: "10s".to_string().into(),
        stickiness_http_err_rate_period: "10s".to_string().into(),
        stickiness_bytes_in_rate_period: "1m".to_string().into(),
        stickiness_bytes_out_rate_period: "1m".to_string().into(),
        ssl_hsts_max_age: 15768000,
        ..Default::default()
    };
    info!("HAPRoxy frontend and backend mode currently hardcoded to tcp");
    // Added:
    let frontend = LbFrontend {
        name: format!("frontend_{safe_name}"),
        bind: bind_str,
        mode: "tcp".to_string(),
        enabled: true,
        default_backend: None, // Set by configure_service after creating backend
        stickiness_expire: Some("30m".to_string()),
        stickiness_size: Some("50k".to_string()),
        stickiness_conn_rate_period: Some("10s".to_string()),
        stickiness_sess_rate_period: Some("10s".to_string()),
        stickiness_http_req_rate_period: Some("10s".to_string()),
        stickiness_http_err_rate_period: Some("10s".to_string()),
        stickiness_bytes_in_rate_period: Some("1m".to_string()),
        stickiness_bytes_out_rate_period: Some("1m".to_string()),
        ssl_hsts_max_age: Some(15768000),
    };
    info!("HAProxy frontend and backend mode currently hardcoded to tcp");
debug!("Built frontend {frontend:?}");
(frontend, backend, servers, healthcheck)
}
fn server_to_haproxy_server(server: &BackendServer) -> HAProxyServer {
HAProxyServer {
uuid: Uuid::new_v4().to_string(),
name: format!("{}_{}", &server.address, &server.port),
enabled: 1,
address: Some(server.address.clone()),
port: Some(server.port),
mode: "active".to_string(),
server_type: "static".to_string(),
..Default::default()
}
}
#[cfg(test)]
mod tests {
use opnsense_config_xml::HAProxyServer;
use super::*;
#[test]
fn test_get_servers_for_backend_with_linked_servers() {
// Create a backend with linked servers
let mut backend = HAProxyBackend::default();
backend.linked_servers.content = Some("server1,server2".to_string());
// Create an HAProxy instance with servers
let mut haproxy = HAProxy::default();
let server = HAProxyServer {
uuid: "server1".to_string(),
address: Some("192.168.1.1".to_string()),
port: Some(80),
..Default::default()
};
haproxy.servers.servers.push(server);
// Call the function
let result = get_servers_for_backend(&backend, &haproxy);
// Check the result
assert_eq!(
result,
vec![BackendServer {
address: "192.168.1.1".to_string(),
port: 80,
},]
);
}
#[test]
fn test_get_servers_for_backend_no_linked_servers() {
// Create a backend with no linked servers
let backend = HAProxyBackend::default();
// Create an HAProxy instance with servers
let mut haproxy = HAProxy::default();
let server = HAProxyServer {
uuid: "server1".to_string(),
address: Some("192.168.1.1".to_string()),
port: Some(80),
..Default::default()
};
haproxy.servers.servers.push(server);
// Call the function
let result = get_servers_for_backend(&backend, &haproxy);
// Check the result
assert_eq!(result, vec![]);
}
#[test]
fn test_get_servers_for_backend_no_matching_servers() {
// Create a backend with linked servers that do not match any in HAProxy
let mut backend = HAProxyBackend::default();
backend.linked_servers.content = Some("server4,server5".to_string());
// Create an HAProxy instance with servers
let mut haproxy = HAProxy::default();
let server = HAProxyServer {
uuid: "server1".to_string(),
address: Some("192.168.1.1".to_string()),
port: Some(80),
..Default::default()
};
haproxy.servers.servers.push(server);
// Call the function
let result = get_servers_for_backend(&backend, &haproxy);
// Check the result
assert_eq!(result, vec![]);
}
#[test]
fn test_get_servers_for_backend_multiple_linked_servers() {
// Create a backend with multiple linked servers
#[allow(clippy::field_reassign_with_default)]
let mut backend = HAProxyBackend::default();
backend.linked_servers.content = Some("server1,server2".to_string());
//
// Create an HAProxy instance with matching servers
let mut haproxy = HAProxy::default();
let server = HAProxyServer {
uuid: "server1".to_string(),
address: Some("some-hostname.test.mcd".to_string()),
port: Some(80),
..Default::default()
};
haproxy.servers.servers.push(server);
let server = HAProxyServer {
uuid: "server2".to_string(),
address: Some("192.168.1.2".to_string()),
port: Some(8080),
..Default::default()
};
haproxy.servers.servers.push(server);
// Call the function
let result = get_servers_for_backend(&backend, &haproxy);
// Check the result
assert_eq!(
result,
vec![
BackendServer {
address: "some-hostname.test.mcd".to_string(),
port: 80,
},
BackendServer {
address: "192.168.1.2".to_string(),
port: 8080,
},
]
);
}
}

View File

@@ -9,14 +9,17 @@ mod tftp;
use std::sync::Arc;
pub use management::*;
use tokio::sync::RwLock;
use cidr::Ipv4Cidr;
use crate::config::secret::{OPNSenseApiCredentials, OPNSenseFirewallCredentials};
use crate::topology::Router;
use crate::{executors::ExecutorError, topology::LogicalHost};
use harmony_types::net::IpAddress;
#[derive(Debug, Clone)]
pub struct OPNSenseFirewall {
opnsense_config: Arc<RwLock<opnsense_config::Config>>,
opnsense_config: Arc<opnsense_config::Config>,
host: LogicalHost,
}
@@ -25,27 +28,87 @@ impl OPNSenseFirewall {
self.host.ip
}
/// panics : if the opnsense config file cannot be loaded by the underlying opnsense_config
/// crate
pub async fn new(host: LogicalHost, port: Option<u16>, username: &str, password: &str) -> Self {
/// Create a new OPNSenseFirewall.
///
/// Requires both API credentials (for configuration CRUD) and SSH
/// credentials (for file uploads, PXE config).
///
/// API port defaults to 443
pub async fn new(
host: LogicalHost,
ssh_port: Option<u16>,
api_creds: &OPNSenseApiCredentials,
ssh_creds: &OPNSenseFirewallCredentials,
) -> Self {
Self::with_api_port(host, ssh_port, 443, api_creds, ssh_creds).await
}
/// Like [`new`] but with a custom API/web GUI port.
pub async fn with_api_port(
host: LogicalHost,
port: Option<u16>,
api_port: u16,
api_creds: &OPNSenseApiCredentials,
ssh_creds: &OPNSenseFirewallCredentials,
) -> Self {
let config = opnsense_config::Config::from_credentials_with_api_port(
host.ip,
port,
api_port,
&api_creds.key,
&api_creds.secret,
&ssh_creds.username,
&ssh_creds.password,
)
.await
.expect("Failed to create OPNsense config");
Self {
opnsense_config: Arc::new(RwLock::new(
opnsense_config::Config::from_credentials(host.ip, port, username, password).await,
)),
opnsense_config: Arc::new(config),
host,
}
}
pub fn get_opnsense_config(&self) -> Arc<RwLock<opnsense_config::Config>> {
pub fn get_opnsense_config(&self) -> Arc<opnsense_config::Config> {
self.opnsense_config.clone()
}
/// Test-only constructor from a pre-built `Config`.
///
/// Allows creating an `OPNSenseFirewall` backed by a mock HTTP server
/// without needing real credentials or SSH connections.
#[cfg(test)]
pub fn from_config(host: LogicalHost, config: opnsense_config::Config) -> Self {
Self {
opnsense_config: Arc::new(config),
host,
}
}
async fn commit_config(&self) -> Result<(), ExecutorError> {
// With the API backend, mutations are applied per-call.
// This is now a no-op for backward compatibility.
self.opnsense_config
.read()
.await
.apply()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))
}
}
impl Router for OPNSenseFirewall {
fn get_gateway(&self) -> IpAddress {
self.host.ip
}
fn get_cidr(&self) -> Ipv4Cidr {
let ipv4 = match self.host.ip {
IpAddress::V4(ip) => ip,
IpAddress::V6(_) => panic!("IPv6 not supported for OPNSense router"),
};
Ipv4Cidr::new(ipv4, 24).unwrap()
}
fn get_host(&self) -> LogicalHost {
self.host.clone()
}
}

View File

@@ -9,36 +9,33 @@ use crate::{
#[async_trait]
impl NodeExporter for OPNSenseFirewall {
async fn ensure_initialized(&self) -> Result<(), ExecutorError> {
let mut config = self.opnsense_config.write().await;
let node_exporter = config.node_exporter();
if let Some(config) = node_exporter.get_full_config() {
debug!(
"Node exporter available in opnsense config, assuming it is already installed. {config:?}"
);
if self.opnsense_config.node_exporter().is_installed().await {
debug!("Node exporter is installed");
} else {
config
self.opnsense_config
.install_package("os-node_exporter")
.await
.map_err(|e| {
ExecutorError::UnexpectedError(format!("Executor failed when trying to install os-node_exporter package with error {e:?}"
))
})?;
ExecutorError::UnexpectedError(format!(
"Failed to install os-node_exporter: {e:?}"
))
})?;
}
config
self.opnsense_config
.node_exporter()
.enable(true)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(())
}
async fn commit_config(&self) -> Result<(), ExecutorError> {
OPNSenseFirewall::commit_config(self).await
}
async fn reload_restart(&self) -> Result<(), ExecutorError> {
self.opnsense_config
.write()
.await
.node_exporter()
.reload_restart()
.await

View File

@@ -12,16 +12,21 @@ impl TftpServer for OPNSenseFirewall {
async fn serve_files(&self, url: &Url) -> Result<(), ExecutorError> {
let tftp_root_path = "/usr/local/tftp";
let config = self.opnsense_config.read().await;
info!("Uploading files from url {url} to {tftp_root_path}");
match url {
Url::LocalFolder(path) => {
config
self.opnsense_config
.upload_files(path, tftp_root_path)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Url::Url(url) => todo!("This url is not supported yet {url}"),
Url::Url(url) => {
let local_dir = super::http::download_url_to_temp_dir(url).await?;
self.opnsense_config
.upload_files(&local_dir, tftp_root_path)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
}
Ok(())
}
@@ -33,11 +38,10 @@ impl TftpServer for OPNSenseFirewall {
async fn set_ip(&self, ip: IpAddress) -> Result<(), ExecutorError> {
info!("Setting listen_ip to {}", &ip);
self.opnsense_config
.write()
.await
.tftp()
.listen_ip(&ip.to_string());
Ok(())
.listen_ip(&ip.to_string())
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))
}
async fn commit_config(&self) -> Result<(), ExecutorError> {
@@ -46,8 +50,6 @@ impl TftpServer for OPNSenseFirewall {
async fn reload_restart(&self) -> Result<(), ExecutorError> {
self.opnsense_config
.write()
.await
.tftp()
.reload_restart()
.await
@@ -55,22 +57,23 @@ impl TftpServer for OPNSenseFirewall {
}
async fn ensure_initialized(&self) -> Result<(), ExecutorError> {
let mut config = self.opnsense_config.write().await;
let tftp = config.tftp();
if tftp.get_full_config().is_none() {
info!("Tftp config not available in opnsense config, installing package");
config.install_package("os-tftp").await.map_err(|e| {
ExecutorError::UnexpectedError(format!(
"Executor failed when trying to install os-tftp package with error {e:?}"
))
})?;
if !self.opnsense_config.tftp().is_installed().await {
info!("TFTP not installed, installing os-tftp package");
self.opnsense_config
.install_package("os-tftp")
.await
.map_err(|e| {
ExecutorError::UnexpectedError(format!("Failed to install os-tftp: {e:?}"))
})?;
} else {
info!("Tftp config available in opnsense config, assuming it is already installed");
info!("TFTP config available, assuming it is already installed");
}
info!("Enabling tftp server");
config.tftp().enable(true);
Ok(())
self.opnsense_config
.tftp()
.enable(true)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))
}
}

View File

@@ -192,7 +192,7 @@ impl DhcpHostBindingInterpret {
for entry in dhcp_entries.into_iter() {
match dhcp_server.add_static_mapping(&entry).await {
Ok(_) => info!("Successfully registered DHCPStaticEntry {}", entry),
Err(_) => todo!(),
// Removed the panic-on-error todo!() in favor of propagating the error.
Err(e) => return Err(InterpretError::from(e)),
}
}

View File

@@ -0,0 +1,181 @@
use async_trait::async_trait;
use k8s_openapi::api::core::v1::{ConfigMap, Pod};
use kube::api::ListParams;
use log::{debug, info};
use serde::Serialize;
use crate::{
data::Version,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
topology::{K8sclient, Topology},
};
use harmony_types::id::Id;
/// A DNS rewrite rule mapping a hostname to a cluster service FQDN.
#[derive(Debug, Clone, Serialize)]
pub struct CoreDNSRewrite {
/// The hostname to intercept (e.g., `"sso.harmony.local"`).
pub hostname: String,
/// The cluster service FQDN to resolve to (e.g., `"zitadel.zitadel.svc.cluster.local"`).
pub target: String,
}
/// Score that patches CoreDNS to add `rewrite name` rules.
///
/// Useful when in-cluster pods need to reach services by their external
/// hostnames (e.g., for Zitadel Host header validation, or OpenBao JWT
/// auth fetching JWKS from Zitadel).
///
/// Only applies to K3sFamily and Default distributions. No-op on OpenShift
/// (which uses a different DNS operator).
///
/// Idempotent: existing rules are detected and skipped. CoreDNS pods are
/// restarted only when new rules are added.
#[derive(Debug, Clone, Serialize)]
pub struct CoreDNSRewriteScore {
pub rewrites: Vec<CoreDNSRewrite>,
}
impl<T: Topology + K8sclient> Score<T> for CoreDNSRewriteScore {
fn name(&self) -> String {
"CoreDNSRewriteScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<T>> {
Box::new(CoreDNSRewriteInterpret {
rewrites: self.rewrites.clone(),
})
}
}
#[derive(Debug, Clone)]
struct CoreDNSRewriteInterpret {
rewrites: Vec<CoreDNSRewrite>,
}
#[async_trait]
impl<T: Topology + K8sclient> Interpret<T> for CoreDNSRewriteInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &T,
) -> Result<Outcome, InterpretError> {
let k8s = topology
.k8s_client()
.await
.map_err(|e| InterpretError::new(format!("Failed to get K8s client: {e}")))?;
let distro = k8s
.get_k8s_distribution()
.await
.map_err(|e| InterpretError::new(format!("Failed to detect distribution: {e}")))?;
if !matches!(
distro,
harmony_k8s::KubernetesDistribution::K3sFamily
| harmony_k8s::KubernetesDistribution::Default
) {
return Ok(Outcome::noop(
"Skipping CoreDNS patch (not K3sFamily)".to_string(),
));
}
let cm: ConfigMap = k8s
.get_resource::<ConfigMap>("coredns", Some("kube-system"))
.await
.map_err(|e| InterpretError::new(format!("Failed to get coredns ConfigMap: {e}")))?
.ok_or_else(|| {
InterpretError::new("CoreDNS ConfigMap not found in kube-system".to_string())
})?;
let corefile = cm
.data
.as_ref()
.and_then(|d| d.get("Corefile"))
.ok_or_else(|| InterpretError::new("CoreDNS ConfigMap has no Corefile key".into()))?;
let mut new_rules = Vec::new();
for r in &self.rewrites {
if !corefile.contains(&format!("rewrite name {} {}", r.hostname, r.target)) {
new_rules.push(format!(" rewrite name {} {}", r.hostname, r.target));
}
}
if new_rules.is_empty() {
return Ok(Outcome::noop(
"CoreDNS rewrite rules already present".to_string(),
));
}
let patched = corefile.replacen(
".:53 {\n",
&format!(".:53 {{\n{}\n", new_rules.join("\n")),
1,
);
debug!("[CoreDNS] Patched Corefile:\n{}", patched);
// Use apply_dynamic with force_conflicts since the ConfigMap is
// owned by the cluster deployer (e.g., k3d) and server-side apply
// would conflict without force.
let patch_obj: kube::api::DynamicObject = serde_json::from_value(serde_json::json!({
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": { "name": "coredns", "namespace": "kube-system" },
"data": { "Corefile": patched }
}))
.map_err(|e| InterpretError::new(format!("Failed to build patch: {e}")))?;
k8s.apply_dynamic(&patch_obj, Some("kube-system"), true)
.await
.map_err(|e| InterpretError::new(format!("Failed to apply CoreDNS patch: {e}")))?;
// Restart CoreDNS pods to pick up the new config
let pods = k8s
.list_resources::<Pod>(
Some("kube-system"),
Some(ListParams::default().labels("k8s-app=kube-dns")),
)
.await
.map_err(|e| InterpretError::new(format!("Failed to list CoreDNS pods: {e}")))?;
for pod in pods.items {
if let Some(name) = &pod.metadata.name {
let _ = k8s.delete_resource::<Pod>(name, Some("kube-system")).await;
}
}
// Brief pause for pods to restart
tokio::time::sleep(tokio::time::Duration::from_secs(3)).await;
info!("[CoreDNS] Patched with {} rewrite rule(s)", new_rules.len());
Ok(Outcome {
status: InterpretStatus::SUCCESS,
message: format!("{} CoreDNS rewrite rule(s) applied", new_rules.len()),
details: self
.rewrites
.iter()
.map(|r| format!("{} -> {}", r.hostname, r.target))
.collect(),
})
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("CoreDNSRewrite")
}
fn get_version(&self) -> Version {
todo!()
}
fn get_status(&self) -> InterpretStatus {
todo!()
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}

View File

@@ -1,4 +1,5 @@
pub mod apps;
pub mod coredns;
pub mod deployment;
mod failover;
pub mod ingress;

View File

@@ -31,6 +31,9 @@ pub enum KvmError {
#[error("ISO download failed: {0}")]
IsoDownload(String),
#[error("command failed: {0}")]
CommandFailed(String),
#[error("libvirt error: {0}")]
Libvirt(#[from] virt::error::Error),

View File

@@ -1,4 +1,5 @@
use log::{debug, info, warn};
use std::net::IpAddr;
use virt::connect::Connect;
use virt::domain::Domain;
use virt::network::Network;
@@ -7,7 +8,7 @@ use virt::storage_vol::StorageVol;
use virt::sys;
use super::error::KvmError;
use super::types::{CdromConfig, NetworkConfig, VmConfig, VmStatus};
use super::types::{CdromConfig, NetworkConfig, VmConfig, VmInterface, VmStatus};
use super::xml;
/// A handle to a libvirt hypervisor.
@@ -199,6 +200,11 @@ impl KvmExecutor {
let dom = Domain::lookup_by_name(&conn, name).map_err(|_| KvmError::VmNotFound {
name: name.to_string(),
})?;
let (state, _) = dom.get_state()?;
if state == sys::VIR_DOMAIN_RUNNING || state == sys::VIR_DOMAIN_BLOCKED {
debug!("VM '{name}' is already running, skipping start");
return Ok(());
}
dom.create()?;
info!("VM '{name}' started");
Ok(())
@@ -292,12 +298,154 @@ impl KvmExecutor {
Ok(status)
}
/// Returns the first IPv4 address of a running VM, or `None` if no
/// address has been assigned yet.
///
/// Uses the libvirt lease/agent source to discover the IP. This requires
/// the VM to have obtained an address via DHCP from the libvirt network.
pub async fn vm_ip(&self, name: &str) -> Result<Option<IpAddr>, KvmError> {
let executor = self.clone();
let name = name.to_string();
tokio::task::spawn_blocking(move || executor.vm_ip_blocking(&name))
.await
.expect("blocking task panicked")
}
fn vm_ip_blocking(&self, name: &str) -> Result<Option<IpAddr>, KvmError> {
let conn = self.open_connection()?;
let dom = Domain::lookup_by_name(&conn, name).map_err(|_| KvmError::VmNotFound {
name: name.to_string(),
})?;
// Try lease-based source first (works with libvirt's built-in DHCP)
let interfaces = dom
.interface_addresses(sys::VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_LEASE, 0)
.unwrap_or_default();
for iface in &interfaces {
for addr in &iface.addrs {
// typed == 0 means IPv4 (AF_INET)
if addr.typed == 0 {
if let Ok(ip) = addr.addr.parse::<IpAddr>() {
return Ok(Some(ip));
}
}
}
}
Ok(None)
}
/// Polls until a VM has an IP address, with a timeout.
///
/// Returns the IP once available, or an error if the timeout is reached.
pub async fn wait_for_ip(
&self,
name: &str,
timeout: std::time::Duration,
) -> Result<IpAddr, KvmError> {
let deadline = tokio::time::Instant::now() + timeout;
loop {
if let Some(ip) = self.vm_ip(name).await? {
info!("VM '{name}' has IP: {ip}");
return Ok(ip);
}
if tokio::time::Instant::now() > deadline {
return Err(KvmError::Io(std::io::Error::new(
std::io::ErrorKind::TimedOut,
format!("VM '{name}' did not obtain an IP within {timeout:?}"),
)));
}
tokio::time::sleep(std::time::Duration::from_secs(3)).await;
}
}
// -------------------------------------------------------------------------
// NIC link control
// -------------------------------------------------------------------------
/// Set the link state of a VM's network interface.
///
/// Brings a NIC up or down by MAC address. Useful for preventing IP
/// conflicts when multiple VMs boot with the same default IP — disable
/// all NICs, then enable one at a time for sequential bootstrapping.
///
/// Uses `virsh domif-setlink` under the hood.
pub async fn set_interface_link(
&self,
vm_name: &str,
mac: &str,
up: bool,
) -> Result<(), KvmError> {
let state = if up { "up" } else { "down" };
info!("Setting {vm_name} interface {mac} link {state}");
let output = tokio::process::Command::new("virsh")
.args(["-c", &self.uri, "domif-setlink", vm_name, mac, state])
.output()
.await?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(KvmError::CommandFailed(format!(
"domif-setlink failed: {}",
stderr.trim()
)));
}
Ok(())
}
/// List all network interfaces of a VM with their MAC addresses.
///
/// Returns a list of `(interface_type, source, mac, model)` tuples.
pub async fn list_interfaces(&self, vm_name: &str) -> Result<Vec<VmInterface>, KvmError> {
let output = tokio::process::Command::new("virsh")
.args(["-c", &self.uri, "domiflist", vm_name])
.output()
.await?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(KvmError::CommandFailed(format!(
"domiflist failed: {}",
stderr.trim()
)));
}
let stdout = String::from_utf8_lossy(&output.stdout);
let mut interfaces = Vec::new();
for line in stdout.lines().skip(2) {
// virsh domiflist columns: Interface, Type, Source, Model, MAC
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 5 {
interfaces.push(VmInterface {
interface_type: parts[1].to_string(),
source: parts[2].to_string(),
model: parts[3].to_string(),
mac: parts[4].to_string(),
});
}
}
Ok(interfaces)
}
// -------------------------------------------------------------------------
// Storage
// -------------------------------------------------------------------------
fn create_volumes_blocking(&self, conn: &Connect, config: &VmConfig) -> Result<(), KvmError> {
for disk in &config.disks {
// Skip volume creation for disks with an existing source path
if disk.source_path.is_some() {
debug!(
"Disk '{}' uses existing source, skipping volume creation",
disk.device
);
continue;
}
let pool = StoragePool::lookup_by_name(conn, &disk.pool).map_err(|_| {
KvmError::StoragePoolNotFound {
name: disk.pool.clone(),

View File

@@ -8,6 +8,6 @@ pub mod types;
pub use error::KvmError;
pub use executor::KvmExecutor;
pub use types::{
BootDevice, CdromConfig, DiskConfig, ForwardMode, NetworkConfig, NetworkConfigBuilder,
NetworkRef, VmConfig, VmConfigBuilder, VmStatus,
BootDevice, CdromConfig, DhcpHost, DiskConfig, ForwardMode, NetworkConfig,
NetworkConfigBuilder, NetworkRef, VmConfig, VmConfigBuilder, VmInterface, VmStatus,
};

View File

@@ -1,5 +1,18 @@
use serde::{Deserialize, Serialize};
/// Information about a VM's network interface, as reported by `virsh domiflist`.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VmInterface {
/// Interface type (e.g. "network", "bridge")
pub interface_type: String,
/// Source network or bridge name
pub source: String,
/// Device model (e.g. "virtio")
pub model: String,
/// MAC address
pub mac: String,
}
/// Specifies how a KVM host is accessed.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum KvmConnectionUri {
@@ -24,12 +37,14 @@ impl KvmConnectionUri {
/// Configuration for a virtual disk attached to a VM.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DiskConfig {
/// Disk size in gigabytes.
/// Disk size in gigabytes. Ignored when `source_path` is set.
pub size_gb: u32,
/// Target device name in the guest (e.g. `vda`, `vdb`).
pub device: String,
/// Storage pool to allocate the volume from. Defaults to `"default"`.
pub pool: String,
/// When set, use this existing disk image instead of creating a new volume.
pub source_path: Option<String>,
}
/// Configuration for a CD-ROM/ISO device attached to a VM.
@@ -51,6 +66,18 @@ impl DiskConfig {
size_gb,
device,
pool: "default".to_string(),
source_path: None,
}
}
/// Use an existing disk image file instead of creating a new volume.
pub fn from_path(path: impl Into<String>, index: u8) -> Self {
let device = format!("vd{}", (b'a' + index) as char);
Self {
size_gb: 0,
device,
pool: String::new(),
source_path: Some(path.into()),
}
}
@@ -179,6 +206,13 @@ impl VmConfigBuilder {
self
}
/// Appends a disk backed by an existing qcow2/raw image file.
pub fn disk_from_path(mut self, path: impl Into<String>) -> Self {
let idx = self.disks.len() as u8;
self.disks.push(DiskConfig::from_path(path, idx));
self
}
/// Appends a disk with an explicit pool override.
pub fn disk_from_pool(mut self, size_gb: u32, pool: impl Into<String>) -> Self {
let idx = self.disks.len() as u8;
@@ -222,6 +256,17 @@ impl VmConfigBuilder {
}
}
/// A DHCP static host entry for a libvirt network.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DhcpHost {
/// MAC address (e.g. `"52:54:00:00:50:01"`).
pub mac: String,
/// IP to assign (e.g. `"10.50.0.2"`).
pub ip: String,
/// Optional hostname.
pub name: Option<String>,
}
/// Configuration for an isolated virtual network.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NetworkConfig {
@@ -235,6 +280,11 @@ pub struct NetworkConfig {
pub prefix_len: u8,
/// Forward mode. When `None`, the network is fully isolated.
pub forward_mode: Option<ForwardMode>,
/// Optional DHCP range (start, end). When set, libvirt's built-in
/// DHCP server hands out addresses in this range.
pub dhcp_range: Option<(String, String)>,
/// Static DHCP host entries for fixed IP assignment by MAC.
pub dhcp_hosts: Vec<DhcpHost>,
}
/// Libvirt network forward mode.
@@ -258,6 +308,8 @@ pub struct NetworkConfigBuilder {
gateway_ip: String,
prefix_len: u8,
forward_mode: Option<ForwardMode>,
dhcp_range: Option<(String, String)>,
dhcp_hosts: Vec<DhcpHost>,
}
impl NetworkConfigBuilder {
@@ -268,6 +320,8 @@ impl NetworkConfigBuilder {
gateway_ip: "192.168.100.1".to_string(),
prefix_len: 24,
forward_mode: Some(ForwardMode::Nat),
dhcp_range: None,
dhcp_hosts: vec![],
}
}
@@ -293,6 +347,27 @@ impl NetworkConfigBuilder {
self
}
/// Enable libvirt's built-in DHCP server with the given range.
pub fn dhcp_range(mut self, start: impl Into<String>, end: impl Into<String>) -> Self {
self.dhcp_range = Some((start.into(), end.into()));
self
}
/// Add a static DHCP host entry (MAC → fixed IP).
pub fn dhcp_host(
mut self,
mac: impl Into<String>,
ip: impl Into<String>,
name: Option<String>,
) -> Self {
self.dhcp_hosts.push(DhcpHost {
mac: mac.into(),
ip: ip.into(),
name,
});
self
}
pub fn build(self) -> NetworkConfig {
NetworkConfig {
bridge: self
@@ -302,6 +377,8 @@ impl NetworkConfigBuilder {
gateway_ip: self.gateway_ip,
prefix_len: self.prefix_len,
forward_mode: self.forward_mode,
dhcp_range: self.dhcp_range,
dhcp_hosts: self.dhcp_hosts,
}
}
}

View File

@@ -1,3 +1,40 @@
//! Libvirt XML generation via string templates.
//!
//! # Why string templates?
//!
//! These functions build libvirt domain, network, and volume XML as formatted
//! strings rather than typed structs. This is fragile — there is no compile-time
//! guarantee that the output is valid XML, and tests rely on substring matching
//! rather than structural validation.
//!
//! We investigated typed alternatives (evaluated 2026-03-24):
//!
//! - **`libvirt-rust-xml`** (gen branch, by Marc-André Lureau / Red Hat):
//! <https://gitlab.com/marcandre.lureau/libvirt-rust-xml/-/tree/gen>
//! Uses `relaxng-gen` (<https://github.com/elmarco/relaxng-rust>) to generate
//! Rust structs from libvirt's official RelaxNG schemas. This is the correct
//! long-term solution — zero maintenance burden, schema-validated, round-trip
//! serialization. However, as of commit `baca481`, `virtxml-domain` and
//! `virtxml-storage-volume` do not compile (missing modules + type inference
//! errors in the generated code). Only `virtxml-network` compiles.
//!
//! - **`libvirt-go-xml-module`** (Go, official libvirt project):
//! <https://gitlab.com/libvirt/libvirt-go-xml-module>
//! 572 hand-maintained typed structs for domain XML alone. MIT licensed.
//! Could be ported to Rust, but maintaining a manual port is the burden we
//! want to avoid.
//!
//! - **`virt` crate** (0.4.3, already in use):
//! C bindings to libvirt. Handles API calls but provides no XML typing —
//! `Domain::define_xml()` takes `&str`. This stays regardless of XML approach.
//!
//! # When to revisit
//!
//! Track the `libvirt-rust-xml` gen branch. When `virtxml-domain` compiles,
//! replace these templates with typed struct construction + `quick-xml`
//! serialization. The `VmConfig`/`NetworkConfig` builder API stays unchanged —
//! only the internal XML generation changes.
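The substring-matching fragility described above can be seen with a minimal std-only sketch, simplified from the `<host>` formatting in `network_xml` below (names and indentation here are illustrative, not the real code):

```rust
// Sketch of the string-template approach: a DHCP <host> entry is assembled by
// string formatting, so tests can only assert on substrings -- nothing checks
// that the output is well-formed XML.
fn dhcp_host_xml(mac: &str, ip: &str, name: Option<&str>) -> String {
    let name_attr = name.map(|n| format!(" name='{n}'")).unwrap_or_default();
    format!("      <host mac='{mac}'{name_attr} ip='{ip}'/>\n")
}

fn main() {
    let xml = dhcp_host_xml("52:54:00:00:50:01", "10.50.0.2", Some("opnsense"));
    // Substring matching, exactly as in the tests further down.
    assert!(xml.contains("mac='52:54:00:00:50:01'"));
    assert!(xml.contains("name='opnsense'"));
    println!("{xml}");
}
```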
use super::types::{CdromConfig, DiskConfig, ForwardMode, NetworkConfig, VmConfig};
/// Renders the libvirt domain XML for a VM definition.
@@ -62,7 +99,10 @@ fn cdrom_devices(vm: &VmConfig) -> String {
}
fn format_disk(vm: &VmConfig, disk: &DiskConfig, image_dir: &str) -> String {
let path = format!("{image_dir}/{}-{}.qcow2", vm.name, disk.device);
let path = disk
.source_path
.clone()
.unwrap_or_else(|| format!("{image_dir}/{}-{}.qcow2", vm.name, disk.device));
format!(
r#" <disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
@@ -78,21 +118,15 @@ fn format_disk(vm: &VmConfig, disk: &DiskConfig, image_dir: &str) -> String {
fn format_cdrom(cdrom: &CdromConfig) -> String {
let source = &cdrom.source;
let dev = &cdrom.device;
let device_type = if source.starts_with("http://") || source.starts_with("https://") {
"cdrom"
} else {
"cdrom"
};
format!(
r#" <disk type='file' device='{device_type}'>
r#" <disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='{source}'/>
<target dev='{dev}' bus='ide'/>
<target dev='{dev}' bus='sata'/>
</disk>
"#,
source = source,
dev = dev,
device_type = device_type,
)
}
@@ -126,17 +160,42 @@ pub fn network_xml(cfg: &NetworkConfig) -> String {
None => "",
};
let dhcp = if cfg.dhcp_range.is_some() || !cfg.dhcp_hosts.is_empty() {
let mut dhcp_xml = String::from(" <dhcp>\n");
if let Some((start, end)) = &cfg.dhcp_range {
dhcp_xml.push_str(&format!(" <range start='{start}' end='{end}'/>\n"));
}
for host in &cfg.dhcp_hosts {
let name_attr = host
.name
.as_deref()
.map(|n| format!(" name='{n}'"))
.unwrap_or_default();
dhcp_xml.push_str(&format!(
" <host mac='{mac}'{name_attr} ip='{ip}'/>\n",
mac = host.mac,
ip = host.ip,
));
}
dhcp_xml.push_str(" </dhcp>\n");
dhcp_xml
} else {
String::new()
};
format!(
r#"<network>
<name>{name}</name>
<bridge name='{bridge}' stp='on' delay='0'/>
{forward} <ip address='{gateway}' prefix='{prefix}'/>
{forward} <ip address='{gateway}' prefix='{prefix}'>
{dhcp} </ip>
</network>"#,
name = cfg.name,
bridge = cfg.bridge,
forward = forward,
gateway = cfg.gateway_ip,
prefix = cfg.prefix_len,
dhcp = dhcp,
)
}
@@ -159,7 +218,11 @@ pub fn volume_xml(name: &str, size_gb: u32) -> String {
#[cfg(test)]
mod tests {
use super::*;
use crate::modules::kvm::types::{BootDevice, NetworkRef, VmConfig};
use crate::modules::kvm::types::{
BootDevice, ForwardMode, NetworkConfig, NetworkRef, VmConfig,
};
// ── Domain XML ──────────────────────────────────────────────────────
#[test]
fn domain_xml_contains_vm_name() {
@@ -179,9 +242,100 @@ mod tests {
}
#[test]
fn network_xml_isolated_has_no_forward() {
use crate::modules::kvm::types::NetworkConfig;
fn domain_xml_memory_conversion() {
let vm = VmConfig::builder("mem-test").memory_gb(8).build();
let xml = domain_xml(&vm, "/tmp");
// 8 GB = 8 * 1024 MiB = 8192 MiB = 8388608 KiB
assert!(xml.contains("<memory unit='KiB'>8388608</memory>"));
}
#[test]
fn domain_xml_multiple_disks() {
let vm = VmConfig::builder("multi-disk")
.disk(120) // vda
.disk(200) // vdb
.disk(500) // vdc
.build();
let xml = domain_xml(&vm, "/images");
assert!(xml.contains("multi-disk-vda.qcow2"));
assert!(xml.contains("multi-disk-vdb.qcow2"));
assert!(xml.contains("multi-disk-vdc.qcow2"));
assert!(xml.contains("dev='vda'"));
assert!(xml.contains("dev='vdb'"));
assert!(xml.contains("dev='vdc'"));
}
#[test]
fn domain_xml_multiple_nics() {
let vm = VmConfig::builder("multi-nic")
.network(NetworkRef::named("default"))
.network(NetworkRef::named("management"))
.network(NetworkRef::named("storage"))
.build();
let xml = domain_xml(&vm, "/tmp");
assert!(xml.contains("source network='default'"));
assert!(xml.contains("source network='management'"));
assert!(xml.contains("source network='storage'"));
// All NICs should be virtio
assert_eq!(xml.matches("model type='virtio'").count(), 3);
}
#[test]
fn domain_xml_nic_with_mac_address() {
let vm = VmConfig::builder("mac-test")
.network(NetworkRef::named("mynet").with_mac("52:54:00:AA:BB:CC"))
.build();
let xml = domain_xml(&vm, "/tmp");
assert!(xml.contains("mac address='52:54:00:AA:BB:CC'"));
}
#[test]
fn domain_xml_cdrom_device() {
let vm = VmConfig::builder("iso-test")
.cdrom("/path/to/image.iso")
.boot_order([BootDevice::Cdrom, BootDevice::Disk])
.build();
let xml = domain_xml(&vm, "/tmp");
assert!(xml.contains("device='cdrom'"));
assert!(xml.contains("source file='/path/to/image.iso'"));
assert!(xml.contains("bus='sata'"));
assert!(xml.contains("boot dev='cdrom'"));
}
#[test]
fn domain_xml_q35_machine_type() {
let vm = VmConfig::builder("q35-test").build();
let xml = domain_xml(&vm, "/tmp");
assert!(xml.contains("machine='q35'"));
assert!(xml.contains("<acpi/>"));
assert!(xml.contains("<apic/>"));
assert!(xml.contains("mode='host-model'"));
}
#[test]
fn domain_xml_serial_console() {
let vm = VmConfig::builder("console-test").build();
let xml = domain_xml(&vm, "/tmp");
assert!(xml.contains("<serial type='pty'>"));
assert!(xml.contains("<console type='pty'>"));
}
#[test]
fn domain_xml_empty_boot_order() {
let vm = VmConfig::builder("no-boot").build();
let xml = domain_xml(&vm, "/tmp");
// No boot entries should be present
assert!(!xml.contains("boot dev="));
}
// ── Network XML ─────────────────────────────────────────────────────
#[test]
fn network_xml_isolated_has_no_forward() {
let cfg = NetworkConfig::builder("testnet")
.subnet("10.0.0.1", 24)
.isolated()
@@ -190,5 +344,144 @@ mod tests {
let xml = network_xml(&cfg);
assert!(!xml.contains("<forward"));
assert!(xml.contains("10.0.0.1"));
assert!(xml.contains("prefix='24'"));
}
#[test]
fn network_xml_nat_mode() {
let cfg = NetworkConfig::builder("natnet")
.subnet("192.168.200.1", 24)
.forward(ForwardMode::Nat)
.build();
let xml = network_xml(&cfg);
assert!(xml.contains("<forward mode='nat'/>"));
assert!(xml.contains("192.168.200.1"));
}
#[test]
fn network_xml_route_mode() {
let cfg = NetworkConfig::builder("routenet")
.subnet("10.10.0.1", 16)
.forward(ForwardMode::Route)
.build();
let xml = network_xml(&cfg);
assert!(xml.contains("<forward mode='route'/>"));
assert!(xml.contains("prefix='16'"));
}
#[test]
fn network_xml_custom_bridge() {
let cfg = NetworkConfig::builder("custom")
.bridge("br-custom")
.subnet("172.16.0.1", 24)
.build();
let xml = network_xml(&cfg);
assert!(xml.contains("name='br-custom'"));
}
#[test]
fn network_xml_auto_bridge_name() {
let cfg = NetworkConfig::builder("harmony-test").isolated().build();
// Bridge auto-generated: virbr-{name} with hyphens removed from name
assert_eq!(cfg.bridge, "virbr-harmonytest");
}
// ── Volume XML ──────────────────────────────────────────────────────
#[test]
fn volume_xml_size_calculation() {
let xml = volume_xml("test-vol", 100);
// 100 GB = 100 * 1024^3 bytes = 107374182400
assert!(xml.contains("<capacity unit='bytes'>107374182400</capacity>"));
assert!(xml.contains("<name>test-vol.qcow2</name>"));
assert!(xml.contains("type='qcow2'"));
}
// ── Builder defaults ────────────────────────────────────────────────
#[test]
fn vm_builder_defaults() {
let vm = VmConfig::builder("defaults").build();
assert_eq!(vm.name, "defaults");
assert_eq!(vm.vcpus, 2);
assert_eq!(vm.memory_mib, 4096);
assert!(vm.disks.is_empty());
assert!(vm.networks.is_empty());
assert!(vm.cdroms.is_empty());
assert!(vm.boot_order.is_empty());
}
#[test]
fn network_builder_defaults() {
let net = NetworkConfig::builder("testnet").build();
assert_eq!(net.name, "testnet");
assert_eq!(net.gateway_ip, "192.168.100.1");
assert_eq!(net.prefix_len, 24);
assert!(matches!(net.forward_mode, Some(ForwardMode::Nat)));
}
#[test]
fn disk_sequential_naming() {
let vm = VmConfig::builder("seq")
.disk(10)
.disk(20)
.disk(30)
.disk(40)
.build();
assert_eq!(vm.disks[0].device, "vda");
assert_eq!(vm.disks[1].device, "vdb");
assert_eq!(vm.disks[2].device, "vdc");
assert_eq!(vm.disks[3].device, "vdd");
assert_eq!(vm.disks[0].size_gb, 10);
assert_eq!(vm.disks[3].size_gb, 40);
}
#[test]
fn network_xml_with_dhcp_range() {
let cfg = NetworkConfig::builder("dhcpnet")
.subnet("10.50.0.1", 24)
.dhcp_range("10.50.0.100", "10.50.0.200")
.build();
let xml = network_xml(&cfg);
assert!(xml.contains("<dhcp>"));
assert!(xml.contains("range start='10.50.0.100' end='10.50.0.200'"));
}
#[test]
fn network_xml_with_dhcp_host() {
let cfg = NetworkConfig::builder("hostnet")
.subnet("10.50.0.1", 24)
.dhcp_range("10.50.0.100", "10.50.0.200")
.dhcp_host(
"52:54:00:00:50:01",
"10.50.0.2",
Some("opnsense".to_string()),
)
.build();
let xml = network_xml(&cfg);
assert!(xml.contains("host mac='52:54:00:00:50:01'"));
assert!(xml.contains("name='opnsense'"));
assert!(xml.contains("ip='10.50.0.2'"));
}
#[test]
fn network_xml_no_dhcp_by_default() {
let cfg = NetworkConfig::builder("nodhcp").build();
let xml = network_xml(&cfg);
assert!(!xml.contains("<dhcp>"));
}
#[test]
fn disk_custom_pool() {
let vm = VmConfig::builder("pool-test")
.disk_from_pool(100, "ssd-pool")
.build();
assert_eq!(vm.disks[0].pool, "ssd-pool");
}
}


@@ -19,6 +19,12 @@ pub struct LoadBalancerScore {
// (listen_interface, LoadBalancerService) tuples or something like that
// I am not sure what to use as listen_interface, should it be interface name, ip address,
// uuid?
/// TCP ports that must be open for inbound WAN traffic.
///
/// The load balancer interpret calls `ensure_wan_access` for each of these
/// ports before configuring services, so the load balancer is reachable
/// from outside the LAN.
pub wan_firewall_ports: Vec<u16>,
}
impl<T: Topology + LoadBalancer> Score<T> for LoadBalancerScore {
@@ -60,6 +66,11 @@ impl<T: Topology + LoadBalancer> Interpret<T> for LoadBalancerInterpret {
load_balancer.ensure_initialized().await?
);
for port in &self.score.wan_firewall_ports {
info!("Ensuring WAN access for port {port}");
load_balancer.ensure_wan_access(*port).await?;
}
for service in self.score.public_services.iter() {
info!("Ensuring service exists {service:?}");


@@ -20,6 +20,7 @@ use async_trait::async_trait;
use derive_new::new;
use harmony_secret::SecretManager;
use harmony_types::id::Id;
use harmony_types::net::Url;
use log::{debug, info};
use serde::Serialize;
use std::path::PathBuf;
@@ -103,7 +104,7 @@ impl OKDSetup02BootstrapInterpret {
)));
} else {
info!(
"Created OKD installation directory {}",
"[Stage 02/Bootstrap] Created OKD installation directory {}",
okd_installation_path.to_string_lossy()
);
}
@@ -135,7 +136,7 @@ impl OKDSetup02BootstrapInterpret {
self.create_file(&install_config_backup, install_config_yaml.as_bytes())
.await?;
info!("Creating manifest files with openshift-install");
info!("[Stage 02/Bootstrap] Creating manifest files with openshift-install");
let output = Command::new(okd_bin_path.join("openshift-install"))
.args([
"create",
@@ -147,10 +148,19 @@ impl OKDSetup02BootstrapInterpret {
.await
.map_err(|e| InterpretError::new(format!("Failed to create okd manifest : {e}")))?;
let stdout = String::from_utf8(output.stdout).unwrap();
info!("openshift-install stdout :\n\n{}", stdout);
info!(
"[Stage 02/Bootstrap] openshift-install stdout :\n\n{}",
stdout
);
let stderr = String::from_utf8(output.stderr).unwrap();
info!("openshift-install stderr :\n\n{}", stderr);
info!("openshift-install exit status : {}", output.status);
info!(
"[Stage 02/Bootstrap] openshift-install stderr :\n\n{}",
stderr
);
info!(
"[Stage 02/Bootstrap] openshift-install exit status : {}",
output.status
);
if !output.status.success() {
return Err(InterpretError::new(format!(
"Failed to create okd manifest, exit code {} : {}",
@@ -158,7 +168,7 @@ impl OKDSetup02BootstrapInterpret {
)));
}
info!("Creating ignition files with openshift-install");
info!("[Stage 02/Bootstrap] Creating ignition files with openshift-install");
let output = Command::new(okd_bin_path.join("openshift-install"))
.args([
"create",
@@ -172,10 +182,19 @@ impl OKDSetup02BootstrapInterpret {
InterpretError::new(format!("Failed to create okd ignition config : {e}"))
})?;
let stdout = String::from_utf8(output.stdout).unwrap();
info!("openshift-install stdout :\n\n{}", stdout);
info!(
"[Stage 02/Bootstrap] openshift-install stdout :\n\n{}",
stdout
);
let stderr = String::from_utf8(output.stderr).unwrap();
info!("openshift-install stderr :\n\n{}", stderr);
info!("openshift-install exit status : {}", output.status);
info!(
"[Stage 02/Bootstrap] openshift-install stderr :\n\n{}",
stderr
);
info!(
"[Stage 02/Bootstrap] openshift-install exit status : {}",
output.status
);
if !output.status.success() {
return Err(InterpretError::new(format!(
"Failed to create okd manifest, exit code {} : {}",
@@ -189,7 +208,7 @@ impl OKDSetup02BootstrapInterpret {
let remote_path = ignition_files_http_path.join(filename);
info!(
"Preparing file content for local file : {} to remote : {}",
"[Stage 02/Bootstrap] Preparing ignition file : {} -> {}",
local_path.to_string_lossy(),
remote_path.to_string_lossy()
);
@@ -220,25 +239,27 @@ impl OKDSetup02BootstrapInterpret {
.interpret(inventory, topology)
.await?;
info!("Successfully prepared ignition files for OKD installation");
// ignition_files_http_path // = PathBuf::from("okd_ignition_files");
info!("[Stage 02/Bootstrap] Successfully prepared ignition files for OKD installation");
info!(
r#"Uploading images, they can be refreshed with a command similar to this one: openshift-install coreos print-stream-json | grep -Eo '"https.*(kernel.|initramfs.|rootfs.)\w+(\.img)?"' | grep x86_64 | xargs -n 1 curl -LO"#
"[Stage 02/Bootstrap] Uploading SCOS installer images from {} to HTTP server",
okd_images_path.to_string_lossy()
);
info!(
r#"[Stage 02/Bootstrap] Images can be refreshed with: openshift-install coreos print-stream-json | grep -Eo '"https.*(kernel.|initramfs.|rootfs.)\w+(\.img)?"' | grep x86_64 | xargs -n 1 curl -LO"#
);
inquire::Confirm::new(
&format!("push installer image files with `scp -r {}/* root@{}:/usr/local/http/scos/` until performance issue is resolved", okd_images_path.to_string_lossy(), topology.http_server.get_ip())).prompt().expect("Prompt error");
StaticFilesHttpScore {
folder_to_serve: Some(Url::LocalFolder(
okd_images_path.to_string_lossy().to_string(),
)),
remote_path: Some("scos".to_string()),
files: vec![],
}
.interpret(inventory, topology)
.await?;
// let scos_http_path = PathBuf::from("scos");
// StaticFilesHttpScore {
// folder_to_serve: Some(Url::LocalFolder(
// okd_images_path.to_string_lossy().to_string(),
// )),
// remote_path: Some(scos_http_path.to_string_lossy().to_string()),
// files: vec![],
// }
// .interpret(inventory, topology)
// .await?;
info!("[Stage 02/Bootstrap] SCOS images uploaded successfully");
Ok(())
}
@@ -255,7 +276,7 @@ impl OKDSetup02BootstrapInterpret {
physical_host,
host_config,
};
info!("Configuring host binding for bootstrap node {binding:?}");
info!("[Stage 02/Bootstrap] Configuring host binding for bootstrap node {binding:?}");
DhcpHostBindingScore {
host_binding: vec![binding],
@@ -308,7 +329,7 @@ impl OKDSetup02BootstrapInterpret {
let outcome = OKDBootstrapLoadBalancerScore::new(topology)
.interpret(inventory, topology)
.await?;
info!("Successfully executed OKDBootstrapLoadBalancerScore : {outcome:?}");
info!("[Stage 02/Bootstrap] Load balancer configured: {outcome:?}");
Ok(())
}
@@ -325,10 +346,52 @@ impl OKDSetup02BootstrapInterpret {
Ok(())
}
async fn wait_for_bootstrap_complete(&self) -> Result<(), InterpretError> {
// Placeholder: wait-for bootstrap-complete
info!("[Bootstrap] Waiting for bootstrap-complete …");
todo!("[Bootstrap] Waiting for bootstrap-complete …")
async fn wait_for_bootstrap_complete(
&self,
inventory: &Inventory,
) -> Result<(), InterpretError> {
info!("[Stage 02/Bootstrap] Waiting for bootstrap to complete...");
info!("[Stage 02/Bootstrap] Running: openshift-install wait-for bootstrap-complete");
let okd_installation_path =
format!("./data/okd/installation_files_{}", inventory.location.name);
let output = Command::new("./data/okd/bin/openshift-install")
.args([
"wait-for",
"bootstrap-complete",
"--dir",
&okd_installation_path,
"--log-level=info",
])
.output()
.await
.map_err(|e| {
InterpretError::new(format!(
"[Stage 02/Bootstrap] Failed to run openshift-install wait-for bootstrap-complete: {e}"
))
})?;
let stdout = String::from_utf8_lossy(&output.stdout);
let stderr = String::from_utf8_lossy(&output.stderr);
if !stdout.is_empty() {
info!("[Stage 02/Bootstrap] openshift-install stdout:\n{stdout}");
}
if !stderr.is_empty() {
info!("[Stage 02/Bootstrap] openshift-install stderr:\n{stderr}");
}
if !output.status.success() {
return Err(InterpretError::new(format!(
"[Stage 02/Bootstrap] bootstrap-complete failed (exit {}): {}",
output.status,
stderr.lines().last().unwrap_or("unknown error")
)));
}
info!("[Stage 02/Bootstrap] Bootstrap complete!");
Ok(())
}
async fn create_file(&self, path: &PathBuf, content: &[u8]) -> Result<(), InterpretError> {
@@ -381,7 +444,7 @@ impl Interpret<HAClusterTopology> for OKDSetup02BootstrapInterpret {
// self.validate_dns_config(inventory, topology).await?;
self.reboot_target().await?;
self.wait_for_bootstrap_complete().await?;
self.wait_for_bootstrap_complete(inventory).await?;
Ok(Outcome::success("Bootstrap phase complete".into()))
}


@@ -1,4 +1,4 @@
use std::net::SocketAddr;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use serde::Serialize;
@@ -19,27 +19,30 @@ pub struct OKDBootstrapLoadBalancerScore {
impl OKDBootstrapLoadBalancerScore {
pub fn new(topology: &HAClusterTopology) -> Self {
let private_ip = topology.router.get_gateway();
// Bind on 0.0.0.0 instead of the LAN IP to avoid CARP VIP race
// conditions where HAProxy fails to bind when the interface
// transitions back to master.
let bind_addr = IpAddr::V4(Ipv4Addr::UNSPECIFIED);
let private_services = vec![
LoadBalancerService {
backend_servers: Self::topology_to_backend_server(topology, 80),
listening_port: SocketAddr::new(private_ip, 80),
listening_port: SocketAddr::new(bind_addr, 80),
health_check: Some(HealthCheck::TCP(None)),
},
LoadBalancerService {
backend_servers: Self::topology_to_backend_server(topology, 443),
listening_port: SocketAddr::new(private_ip, 443),
listening_port: SocketAddr::new(bind_addr, 443),
health_check: Some(HealthCheck::TCP(None)),
},
LoadBalancerService {
backend_servers: Self::topology_to_backend_server(topology, 22623),
listening_port: SocketAddr::new(private_ip, 22623),
listening_port: SocketAddr::new(bind_addr, 22623),
health_check: Some(HealthCheck::TCP(None)),
},
LoadBalancerService {
backend_servers: Self::topology_to_backend_server(topology, 6443),
listening_port: SocketAddr::new(private_ip, 6443),
listening_port: SocketAddr::new(bind_addr, 6443),
health_check: Some(HealthCheck::HTTP(
None,
"/readyz".to_string(),
@@ -53,6 +56,7 @@ impl OKDBootstrapLoadBalancerScore {
load_balancer_score: LoadBalancerScore {
public_services: vec![],
private_services,
wan_firewall_ports: vec![80, 443],
},
}
}
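The bind-on-unspecified choice explained in the comment above can be demonstrated with std types alone (the port is one of the ones this score configures; the snippet is a sketch, not project code):

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

fn main() {
    // Binding on 0.0.0.0 listens on all interfaces, so HAProxy cannot hit a
    // bind failure when a CARP VIP flaps back to master on a single interface.
    let bind_addr = IpAddr::V4(Ipv4Addr::UNSPECIFIED);
    let api = SocketAddr::new(bind_addr, 6443);
    assert!(api.ip().is_unspecified());
    assert_eq!(api.to_string(), "0.0.0.0:6443");
    println!("{api}");
}
```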


@@ -78,9 +78,9 @@ impl OKDNodeInterpret {
let required_hosts: i16 = okd_host_properties.required_hosts();
info!(
"Discovery of {} {} hosts in progress, current number {}",
required_hosts,
"[{}] Discovery of {} hosts in progress, {} found so far",
self.host_role,
required_hosts,
hosts.len()
);
// This score triggers the discovery agent for a specific role.
@@ -118,8 +118,9 @@ impl OKDNodeInterpret {
nodes: &Vec<(PhysicalHost, HostConfig)>,
) -> Result<(), InterpretError> {
info!(
"[{}] Configuring host bindings for {} plane nodes.",
self.host_role, self.host_role,
"[{}] Configuring DHCP host bindings for {} nodes",
self.host_role,
nodes.len()
);
let host_properties = self.okd_role_properties(&self.host_role);
@@ -296,14 +297,18 @@ impl Interpret<HAClusterTopology> for OKDNodeInterpret {
// and the cluster becomes fully functional only once all nodes are Ready and the
// cluster operators report Available=True.
info!(
"[{}] Provisioning initiated. Monitor the cluster convergence manually.",
self.host_role
"[{}] Provisioning initiated for {} nodes. Monitor cluster convergence with: oc get nodes && oc get co",
self.host_role,
nodes.len()
);
Ok(Outcome::success(format!(
"{} provisioning has been successfully initiated.",
self.host_role
)))
Ok(Outcome::success_with_details(
format!("{} provisioning initiated", self.host_role),
nodes
.iter()
.map(|(host, _)| format!(" {} (MACs: {:?})", host.id, host.get_mac_address()))
.collect(),
))
}
fn get_name(&self) -> InterpretName {


@@ -74,14 +74,7 @@ impl<T: Topology + DhcpServer + TftpServer + HttpServer + Router> Interpret<T>
}),
Box::new(StaticFilesHttpScore {
remote_path: None,
// TODO The current russh based copy is way too slow, check for a lib update or use scp
// when available
//
// For now just run :
// scp -r data/pxe/okd/http_files/* root@192.168.1.1:/usr/local/http/
//
folder_to_serve: None,
// folder_to_serve: Some(Url::LocalFolder("./data/pxe/okd/http_files/".to_string())),
folder_to_serve: Some(Url::LocalFolder("./data/pxe/okd/http_files/".to_string())),
files: vec![
FileContent {
path: FilePath::Relative("boot.ipxe".to_string()),
@@ -123,9 +116,9 @@ impl<T: Topology + DhcpServer + TftpServer + HttpServer + Router> Interpret<T>
Err(e) => return Err(e),
};
}
inquire::Confirm::new(&format!("Execute the copy : `scp -r data/pxe/okd/http_files/* root@{}:/usr/local/http/` and confirm when done to continue", HttpServer::get_ip(topology))).prompt().expect("Prompt error");
Ok(Outcome::success("Ipxe installed".to_string()))
Ok(Outcome::success(
"iPXE boot infrastructure installed".to_string(),
))
}
fn get_name(&self) -> InterpretName {


@@ -1,4 +1,4 @@
use std::net::SocketAddr;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use serde::Serialize;
@@ -8,7 +8,7 @@ use crate::{
score::Score,
topology::{
BackendServer, HAClusterTopology, HealthCheck, HttpMethod, HttpStatusCode, LoadBalancer,
LoadBalancerService, LogicalHost, Router, SSL, Topology,
LoadBalancerService, SSL, Topology,
},
};
@@ -53,16 +53,19 @@ pub struct OKDLoadBalancerScore {
/// ```
impl OKDLoadBalancerScore {
pub fn new(topology: &HAClusterTopology) -> Self {
let public_ip = topology.router.get_gateway();
// Bind on 0.0.0.0 instead of the LAN IP to avoid CARP VIP race
// conditions where HAProxy fails to bind when the interface
// transitions back to master.
let bind_addr = IpAddr::V4(Ipv4Addr::UNSPECIFIED);
let public_services = vec![
LoadBalancerService {
backend_servers: Self::nodes_to_backend_server(topology, 80),
listening_port: SocketAddr::new(public_ip, 80),
listening_port: SocketAddr::new(bind_addr, 80),
health_check: None,
},
LoadBalancerService {
backend_servers: Self::nodes_to_backend_server(topology, 443),
listening_port: SocketAddr::new(public_ip, 443),
listening_port: SocketAddr::new(bind_addr, 443),
health_check: None,
},
];
@@ -70,7 +73,7 @@ impl OKDLoadBalancerScore {
let private_services = vec![
LoadBalancerService {
backend_servers: Self::nodes_to_backend_server(topology, 80),
listening_port: SocketAddr::new(public_ip, 80),
listening_port: SocketAddr::new(bind_addr, 80),
health_check: Some(HealthCheck::HTTP(
Some(25001),
"/health?check=okd_router_1936,node_ready".to_string(),
@@ -81,7 +84,7 @@ impl OKDLoadBalancerScore {
},
LoadBalancerService {
backend_servers: Self::nodes_to_backend_server(topology, 443),
listening_port: SocketAddr::new(public_ip, 443),
listening_port: SocketAddr::new(bind_addr, 443),
health_check: Some(HealthCheck::HTTP(
Some(25001),
"/health?check=okd_router_1936,node_ready".to_string(),
@@ -92,12 +95,12 @@ impl OKDLoadBalancerScore {
},
LoadBalancerService {
backend_servers: Self::control_plane_to_backend_server(topology, 22623),
listening_port: SocketAddr::new(public_ip, 22623),
listening_port: SocketAddr::new(bind_addr, 22623),
health_check: Some(HealthCheck::TCP(None)),
},
LoadBalancerService {
backend_servers: Self::control_plane_to_backend_server(topology, 6443),
listening_port: SocketAddr::new(public_ip, 6443),
listening_port: SocketAddr::new(bind_addr, 6443),
health_check: Some(HealthCheck::HTTP(
None,
"/readyz".to_string(),
@@ -111,6 +114,7 @@ impl OKDLoadBalancerScore {
load_balancer_score: LoadBalancerScore {
public_services,
private_services,
wan_firewall_ports: vec![80, 443],
},
}
}
@@ -165,7 +169,7 @@ mod tests {
use std::sync::{Arc, OnceLock};
use super::*;
use crate::topology::DummyInfra;
use crate::topology::{DummyInfra, LogicalHost, Router};
use harmony_macros::ip;
use harmony_types::net::IpAddress;
@@ -296,6 +300,30 @@ mod tests {
assert_eq!(public_service_443.backend_servers.len(), 5);
}
#[test]
fn test_all_services_bind_on_unspecified_address() {
let topology = create_test_topology();
let score = OKDLoadBalancerScore::new(&topology);
let unspecified = IpAddr::V4(Ipv4Addr::UNSPECIFIED);
for svc in &score.load_balancer_score.public_services {
assert_eq!(
svc.listening_port.ip(),
unspecified,
"Public service on port {} should bind on 0.0.0.0",
svc.listening_port.port()
);
}
for svc in &score.load_balancer_score.private_services {
assert_eq!(
svc.listening_port.ip(),
unspecified,
"Private service on port {} should bind on 0.0.0.0",
svc.listening_port.port()
);
}
}
#[test]
fn test_private_service_port_22623_only_control_plane() {
let topology = create_test_topology();
@@ -311,6 +339,13 @@ mod tests {
assert_eq!(private_service_22623.backend_servers.len(), 3);
}
#[test]
fn test_wan_firewall_ports_include_http_and_https() {
let topology = create_test_topology();
let score = OKDLoadBalancerScore::new(&topology);
assert_eq!(score.load_balancer_score.wan_firewall_ports, vec![80, 443]);
}
#[test]
fn test_all_backend_servers_have_correct_port() {
let topology = create_test_topology();


@@ -1,3 +1,5 @@
pub mod setup;
use std::str::FromStr;
use harmony_macros::hurl;
@@ -11,10 +13,15 @@ use crate::{
topology::{HelmCommand, K8sclient, Topology},
};
pub use setup::{OpenbaoJwtAuth, OpenbaoPolicy, OpenbaoSetupScore, OpenbaoUser};
#[derive(Debug, Serialize, Clone)]
pub struct OpenbaoScore {
/// Host used for external access (ingress)
pub host: String,
/// Set to true when deploying to OpenShift. Defaults to false for k3d/Kubernetes.
#[serde(default)]
pub openshift: bool,
}
impl<T: Topology + K8sclient + HelmCommand> Score<T> for OpenbaoScore {
@@ -24,12 +31,12 @@ impl<T: Topology + K8sclient + HelmCommand> Score<T> for OpenbaoScore {
#[doc(hidden)]
fn create_interpret(&self) -> Box<dyn Interpret<T>> {
// TODO exec pod commands to initialize secret store if not already done
let host = &self.host;
let openshift = self.openshift;
let values_yaml = Some(format!(
r#"global:
openshift: true
openshift: {openshift}
server:
standalone:
enabled: true


@@ -0,0 +1,527 @@
use std::path::PathBuf;
use async_trait::async_trait;
use log::{info, warn};
use serde::{Deserialize, Serialize};
use crate::{
data::Version,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
topology::{K8sclient, Topology},
};
use harmony_types::id::Id;
const DEFAULT_NAMESPACE: &str = "openbao";
const DEFAULT_POD: &str = "openbao-0";
const DEFAULT_KV_MOUNT: &str = "secret";
/// A policy to create in OpenBao.
#[derive(Debug, Clone, Serialize)]
pub struct OpenbaoPolicy {
pub name: String,
pub hcl: String,
}
/// A userpass user to create in OpenBao.
#[derive(Debug, Clone, Serialize)]
pub struct OpenbaoUser {
pub username: String,
pub password: String,
pub policies: Vec<String>,
}
/// JWT auth method configuration for OpenBao.
#[derive(Debug, Clone, Serialize)]
pub struct OpenbaoJwtAuth {
pub oidc_discovery_url: String,
pub bound_issuer: String,
pub role_name: String,
pub bound_audiences: String,
pub user_claim: String,
pub policies: Vec<String>,
pub ttl: String,
pub max_ttl: String,
}
/// Score that initializes, unseals, and configures an already-deployed OpenBao
/// instance.
///
/// This Score handles the operational lifecycle that follows the Helm
/// deployment (handled by [`OpenbaoScore`]):
///
/// 1. **Init** — `bao operator init`, stores unseal keys locally
/// 2. **Unseal** — applies stored unseal keys (3 of 5 by default)
/// 3. **KV v2** — enables the versioned KV secrets engine
/// 4. **Policies** — creates configurable access policies
/// 5. **Userpass** — creates dev/operator users with assigned policies
/// 6. **JWT auth** — (optional) configures JWT auth for OIDC-based access
///
/// All steps are idempotent: re-running skips already-completed work.
///
/// Unseal keys are cached at `~/.local/share/harmony/openbao/unseal-keys.json`
/// (with `0600` permissions on Unix). This is a development convenience; production
/// deployments should use auto-unseal (Transit, cloud KMS, etc.).
#[derive(Debug, Clone, Serialize)]
pub struct OpenbaoSetupScore {
/// Kubernetes namespace where OpenBao is deployed.
#[serde(default = "default_namespace")]
pub namespace: String,
/// StatefulSet pod name to exec into.
#[serde(default = "default_pod")]
pub pod: String,
/// KV v2 mount path to enable.
#[serde(default = "default_kv_mount")]
pub kv_mount: String,
/// Policies to create.
#[serde(default)]
pub policies: Vec<OpenbaoPolicy>,
/// Userpass users to create.
#[serde(default)]
pub users: Vec<OpenbaoUser>,
/// Optional JWT auth configuration (e.g., for Zitadel OIDC).
#[serde(default)]
pub jwt_auth: Option<OpenbaoJwtAuth>,
}
fn default_namespace() -> String {
DEFAULT_NAMESPACE.to_string()
}
fn default_pod() -> String {
DEFAULT_POD.to_string()
}
fn default_kv_mount() -> String {
DEFAULT_KV_MOUNT.to_string()
}
impl Default for OpenbaoSetupScore {
fn default() -> Self {
Self {
namespace: default_namespace(),
pod: default_pod(),
kv_mount: default_kv_mount(),
policies: Vec::new(),
users: Vec::new(),
jwt_auth: None,
}
}
}
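The lifecycle documented above can be sketched as a usage example. This is a self-contained mirror of the types (simplified: serde derives and the `jwt_auth` field omitted); the policy HCL and user credentials are illustrative assumptions, not project defaults:

```rust
// Simplified local mirror of the OpenBao setup types, for illustration only.
#[derive(Debug, Clone)]
struct OpenbaoPolicy { name: String, hcl: String }

#[derive(Debug, Clone)]
struct OpenbaoUser { username: String, password: String, policies: Vec<String> }

#[derive(Debug, Clone)]
struct OpenbaoSetupScore {
    namespace: String,
    pod: String,
    kv_mount: String,
    policies: Vec<OpenbaoPolicy>,
    users: Vec<OpenbaoUser>,
}

impl Default for OpenbaoSetupScore {
    fn default() -> Self {
        Self {
            namespace: "openbao".into(),
            pod: "openbao-0".into(),
            kv_mount: "secret".into(),
            policies: Vec::new(),
            users: Vec::new(),
        }
    }
}

fn main() {
    // A dev-cluster setup: one read policy, one operator user bound to it.
    let setup = OpenbaoSetupScore {
        policies: vec![OpenbaoPolicy {
            name: "app-read".into(),
            hcl: r#"path "secret/data/app/*" { capabilities = ["read"] }"#.into(),
        }],
        users: vec![OpenbaoUser {
            username: "dev".into(),
            password: "dev-password".into(),
            policies: vec!["app-read".into()],
        }],
        ..Default::default()
    };
    assert_eq!(setup.namespace, "openbao");
    assert_eq!(setup.users[0].policies, vec!["app-read".to_string()]);
    println!("{setup:?}");
}
```

Re-running the real interpret with such a score is safe: each step checks current state (for example `bao status -format=json` before init) and skips completed work.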
impl<T: Topology + K8sclient> Score<T> for OpenbaoSetupScore {
fn name(&self) -> String {
"OpenbaoSetupScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<T>> {
Box::new(OpenbaoSetupInterpret {
score: self.clone(),
})
}
}
// ---------------------------------------------------------------------------
// Interpret
// ---------------------------------------------------------------------------
#[derive(Debug, Clone)]
struct OpenbaoSetupInterpret {
score: OpenbaoSetupScore,
}
#[derive(Debug, Serialize, Deserialize)]
struct InitOutput {
#[serde(rename = "unseal_keys_b64")]
keys: Vec<String>,
root_token: String,
}
fn keys_dir() -> PathBuf {
directories::BaseDirs::new()
.map(|dirs| dirs.data_dir().join("harmony").join("openbao"))
.unwrap_or_else(|| PathBuf::from("/tmp/harmony-openbao"))
}
fn keys_file() -> PathBuf {
keys_dir().join("unseal-keys.json")
}
impl OpenbaoSetupInterpret {
async fn exec(
&self,
k8s: &harmony_k8s::K8sClient,
command: Vec<&str>,
) -> Result<String, String> {
k8s.exec_pod_capture_output(&self.score.pod, Some(&self.score.namespace), command)
.await
}
async fn bao_command(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
shell_cmd: &str,
) -> Result<String, String> {
let full = format!("export VAULT_TOKEN={} && {}", root_token, shell_cmd);
self.exec(k8s, vec!["sh", "-c", &full]).await
}
async fn bao(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
args: &[&str],
) -> Result<String, String> {
self.bao_command(k8s, root_token, &args.join(" ")).await
}
// -- Step 1: Init ---------------------------------------------------------
async fn init(&self, k8s: &harmony_k8s::K8sClient) -> Result<String, InterpretError> {
let dir = keys_dir();
std::fs::create_dir_all(&dir).map_err(|e| {
InterpretError::new(format!("Failed to create keys directory {:?}: {}", dir, e))
})?;
let path = keys_file();
if path.exists() {
// Verify the vault is actually initialized before trusting cached keys.
// If the cluster was recreated, the vault has a fresh PVC but the local
// keys file is stale.
let status = self.exec(k8s, vec!["bao", "status", "-format=json"]).await;
// Parse the status JSON instead of string-matching: `-format=json`
// pretty-prints with a space after the colon, so a raw substring
// check for "initialized":false would never match.
let is_initialized = match &status {
Ok(stdout) => serde_json::from_str::<serde_json::Value>(stdout)
.ok()
.and_then(|v| v.get("initialized").and_then(|b| b.as_bool()))
.unwrap_or(true),
Err(e) => !e.contains("not initialized"),
};
if is_initialized {
info!("[OpenbaoSetup] Already initialized, loading existing keys");
let content = std::fs::read_to_string(&path)
.map_err(|e| InterpretError::new(format!("Failed to read keys: {e}")))?;
let init: InitOutput = serde_json::from_str(&content)
.map_err(|e| InterpretError::new(format!("Failed to parse keys: {e}")))?;
return Ok(init.root_token);
}
warn!(
"[OpenbaoSetup] Vault not initialized but stale keys file exists, re-initializing"
);
let _ = std::fs::remove_file(&path);
}
info!("[OpenbaoSetup] Initializing OpenBao...");
let output = self
.exec(k8s, vec!["bao", "operator", "init", "-format=json"])
.await;
match output {
Ok(stdout) => {
let init: InitOutput = serde_json::from_str(&stdout).map_err(|e| {
InterpretError::new(format!("Failed to parse init output: {e}"))
})?;
let json = serde_json::to_string_pretty(&init)
.map_err(|e| InterpretError::new(format!("Failed to serialize keys: {e}")))?;
std::fs::write(&path, json).map_err(|e| {
InterpretError::new(format!("Failed to write keys to {:?}: {e}", path))
})?;
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let _ = std::fs::set_permissions(&path, std::fs::Permissions::from_mode(0o600));
}
info!("[OpenbaoSetup] Initialized, keys saved to {:?}", path);
Ok(init.root_token)
}
Err(e) if e.contains("already initialized") => Err(InterpretError::new(format!(
"OpenBao already initialized but no local keys file at {:?}. \
Delete the cluster or restore the keys file.",
path
))),
Err(e) => Err(InterpretError::new(format!(
"OpenBao operator init failed: {e}"
))),
}
}
// -- Step 2: Unseal -------------------------------------------------------
async fn unseal(&self, k8s: &harmony_k8s::K8sClient) -> Result<(), InterpretError> {
#[derive(Deserialize)]
struct Status {
sealed: bool,
}
// bao status exits 2 when sealed — treat exec error as "sealed"
let sealed = match self.exec(k8s, vec!["bao", "status", "-format=json"]).await {
Ok(stdout) => serde_json::from_str::<Status>(&stdout)
.map(|s| s.sealed)
.unwrap_or(true),
Err(_) => true,
};
if !sealed {
info!("[OpenbaoSetup] Already unsealed");
return Ok(());
}
info!("[OpenbaoSetup] Unsealing...");
let path = keys_file();
let content = std::fs::read_to_string(&path)
.map_err(|e| InterpretError::new(format!("Failed to read keys: {e}")))?;
let init: InitOutput = serde_json::from_str(&content)
.map_err(|e| InterpretError::new(format!("Failed to parse keys: {e}")))?;
// Default `operator init` emits 5 key shares with a threshold of 3;
// only the first three are needed to unseal. `take(3)` avoids a panic
// if the init output ever contains fewer keys.
for key in init.keys.iter().take(3) {
self.exec(k8s, vec!["bao", "operator", "unseal", key])
.await
.map_err(|e| InterpretError::new(format!("Unseal failed: {e}")))?;
}
info!("[OpenbaoSetup] Unsealed successfully");
Ok(())
}
// -- Step 3: Enable KV v2 -------------------------------------------------
async fn enable_kv(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
) -> Result<(), InterpretError> {
let mount = &self.score.kv_mount;
let _ = self
.bao(
k8s,
root_token,
&[
"bao",
"secrets",
"enable",
&format!("-path={mount}"),
"kv-v2",
],
)
.await; // ignore "already enabled"
Ok(())
}
// -- Step 4: Enable userpass auth -----------------------------------------
async fn enable_userpass(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
) -> Result<(), InterpretError> {
let _ = self
.bao(k8s, root_token, &["bao", "auth", "enable", "userpass"])
.await;
Ok(())
}
// -- Step 5: Policies -----------------------------------------------------
async fn apply_policies(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
) -> Result<(), InterpretError> {
for policy in &self.score.policies {
let escaped_hcl = policy.hcl.replace('\'', "'\\''");
// Pass the HCL as an argument to '%s' so printf does not interpret
// '%' or '\' sequences inside the policy body.
let cmd = format!(
"printf '%s' '{}' | bao policy write {} -",
escaped_hcl, policy.name
);
self.bao_command(k8s, root_token, &cmd).await.map_err(|e| {
InterpretError::new(format!("Failed to create policy '{}': {e}", policy.name))
})?;
info!("[OpenbaoSetup] Policy '{}' applied", policy.name);
}
Ok(())
}
// -- Step 6: Users --------------------------------------------------------
async fn create_users(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
) -> Result<(), InterpretError> {
for user in &self.score.users {
let policies = user.policies.join(",");
self.bao(
k8s,
root_token,
&[
"bao",
"write",
&format!("auth/userpass/users/{}", user.username),
&format!("password={}", user.password),
&format!("policies={}", policies),
],
)
.await
.map_err(|e| {
InterpretError::new(format!("Failed to create user '{}': {e}", user.username))
})?;
info!(
"[OpenbaoSetup] User '{}' created (policies: {})",
user.username, policies
);
}
Ok(())
}
// -- Step 7: JWT auth -----------------------------------------------------
async fn configure_jwt(
&self,
k8s: &harmony_k8s::K8sClient,
root_token: &str,
) -> Result<(), InterpretError> {
let jwt = match &self.score.jwt_auth {
Some(j) => j,
None => return Ok(()),
};
let _ = self
.bao(k8s, root_token, &["bao", "auth", "enable", "jwt"])
.await;
// Configure JWT discovery. This may fail if the discovery URL is not
// reachable from inside the cluster (e.g., Zitadel's ExternalDomain
// isn't resolvable). Non-fatal — warn and continue.
let config_result = self
.bao(
k8s,
root_token,
&[
"bao",
"write",
"auth/jwt/config",
&format!("oidc_discovery_url={}", jwt.oidc_discovery_url),
&format!("bound_issuer={}", jwt.bound_issuer),
],
)
.await;
match config_result {
Ok(_) => {
info!(
"[OpenbaoSetup] JWT auth configured (issuer: {})",
jwt.bound_issuer
);
}
Err(e) => {
warn!(
"[OpenbaoSetup] JWT auth config failed (non-fatal): {}. \
Ensure '{}' resolves from inside the cluster.",
e, jwt.oidc_discovery_url
);
}
}
let policies = jwt.policies.join(",");
self.bao(
k8s,
root_token,
&[
"bao",
"write",
&format!("auth/jwt/role/{}", jwt.role_name),
"role_type=jwt",
&format!("bound_audiences={}", jwt.bound_audiences),
&format!("user_claim={}", jwt.user_claim),
&format!("policies={}", policies),
&format!("ttl={}", jwt.ttl),
&format!("max_ttl={}", jwt.max_ttl),
"token_type=service",
],
)
.await
.map_err(|e| {
InterpretError::new(format!(
"Failed to create JWT role '{}': {e}",
jwt.role_name
))
})?;
info!(
"[OpenbaoSetup] JWT role '{}' created (policies: {})",
jwt.role_name, policies
);
Ok(())
}
}
#[async_trait]
impl<T: Topology + K8sclient> Interpret<T> for OpenbaoSetupInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &T,
) -> Result<Outcome, InterpretError> {
let k8s = topology
.k8s_client()
.await
.map_err(|e| InterpretError::new(format!("Failed to get K8s client: {e}")))?;
// Wait for the pod to be running before attempting any operations.
k8s.wait_for_pod_ready(&self.score.pod, Some(&self.score.namespace))
.await
.map_err(|e| {
InterpretError::new(format!(
"Pod {}/{} not ready: {e}",
self.score.namespace, self.score.pod
))
})?;
let root_token = self.init(&k8s).await?;
self.unseal(&k8s).await?;
self.enable_kv(&k8s, &root_token).await?;
if !self.score.users.is_empty() {
self.enable_userpass(&k8s, &root_token).await?;
}
self.apply_policies(&k8s, &root_token).await?;
self.create_users(&k8s, &root_token).await?;
self.configure_jwt(&k8s, &root_token).await?;
let mut details = vec![
format!("root_token={}", root_token),
format!("kv_mount={}", self.score.kv_mount),
];
for user in &self.score.users {
details.push(format!("user={}", user.username));
}
Ok(Outcome {
status: InterpretStatus::SUCCESS,
message: "OpenBao initialized, unsealed, and configured".to_string(),
details,
})
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("OpenbaoSetup")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}
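The idempotent init step above reduces to a small decision table over two observable facts: whether a local keys file exists, and whether the vault reports itself initialized. A std-only sketch of that table (the `plan_init`/`InitAction` names are illustrative, not crate API):

```rust
// Decision table mirroring OpenbaoSetupInterpret::init. The (false, true)
// case corresponds to the "already initialized but no local keys" error
// surfaced when a fresh init attempt is rejected by the server.
#[derive(Debug, PartialEq)]
enum InitAction {
    ReuseCachedKeys,      // keys file present, vault initialized
    ReinitAfterStaleKeys, // keys file present, vault is fresh (cluster recreated)
    FreshInit,            // no keys file, vault uninitialized
    FailNoLocalKeys,      // vault initialized but no keys file: unrecoverable
}

fn plan_init(keys_file_exists: bool, vault_initialized: bool) -> InitAction {
    match (keys_file_exists, vault_initialized) {
        (true, true) => InitAction::ReuseCachedKeys,
        (true, false) => InitAction::ReinitAfterStaleKeys,
        (false, false) => InitAction::FreshInit,
        (false, true) => InitAction::FailNoLocalKeys,
    }
}

fn main() {
    assert_eq!(plan_init(true, true), InitAction::ReuseCachedKeys);
    assert_eq!(plan_init(true, false), InitAction::ReinitAfterStaleKeys);
    assert_eq!(plan_init(false, false), InitAction::FreshInit);
    assert_eq!(plan_init(false, true), InitAction::FailNoLocalKeys);
    println!("init decision table ok");
}
```

Keeping this logic as a pure function of two booleans makes the re-init path (stale keys after a cluster rebuild) easy to test without a live cluster.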


@@ -0,0 +1,593 @@
//! OPNsense first-boot automation via HTTP session auth.
//!
//! Automates the manual steps required after booting a fresh OPNsense
//! installation: login, abort the initial setup wizard, enable SSH,
//! change the web GUI port, and create API credentials.
//!
//! This module talks directly to OPNsense's web UI and API using HTTP
//! requests with session cookie authentication — no browser needed.
//!
//! # Typical usage
//!
//! ```rust,no_run
//! use harmony::modules::opnsense::bootstrap::OPNsenseBootstrap;
//!
//! # async fn example() -> Result<(), Box<dyn std::error::Error>> {
//! let bootstrap = OPNsenseBootstrap::new("https://192.168.1.1");
//! bootstrap.login("root", "opnsense").await?;
//! bootstrap.abort_wizard().await?;
//! bootstrap.enable_ssh(true, true).await?;
//! bootstrap.set_webgui_port(9443, "192.168.1.1", true).await?;
//! # Ok(())
//! # }
//! ```
use log::{debug, info, warn};
/// Errors from bootstrap operations.
#[derive(Debug, thiserror::Error)]
pub enum BootstrapError {
#[error("HTTP request failed: {0}")]
Http(#[from] reqwest::Error),
#[error("Login failed: {reason}")]
LoginFailed { reason: String },
#[error("CSRF token not found in page response")]
CsrfNotFound,
#[error("Unexpected response: {0}")]
UnexpectedResponse(String),
}
/// Automates OPNsense first-boot setup via HTTP session auth.
///
/// Maintains a session cookie jar across requests, allowing authenticated
/// access to legacy PHP pages and the MVC API.
pub struct OPNsenseBootstrap {
base_url: String,
client: reqwest::Client,
}
impl OPNsenseBootstrap {
/// Create a new bootstrap client for an OPNsense instance.
///
/// The `base_url` should be the root URL (e.g., `https://192.168.1.1`).
/// TLS certificate verification is disabled since fresh OPNsense
/// installations use self-signed certificates.
pub fn new(base_url: &str) -> Self {
let client = reqwest::Client::builder()
.cookie_store(true)
.danger_accept_invalid_certs(true)
.redirect(reqwest::redirect::Policy::none())
.build()
.expect("Failed to build HTTP client");
Self {
base_url: base_url.trim_end_matches('/').to_string(),
client,
}
}
/// Log in to the OPNsense web UI with username and password.
///
/// Fetches the login page to obtain a CSRF token, then POSTs the
/// credentials. On success, the session cookie is stored in the
/// client's cookie jar for subsequent requests.
pub async fn login(&self, username: &str, password: &str) -> Result<(), BootstrapError> {
info!("Logging in to {} as {}", self.base_url, username);
// Step 1: GET the login page to get CSRF token and session cookie
let login_url = format!("{}/", self.base_url);
let resp = self.client.get(&login_url).send().await?;
// Log response cookies for debugging
for cookie_header in resp.headers().get_all("set-cookie") {
debug!("Login page Set-Cookie: {:?}", cookie_header);
}
let body = resp.text().await?;
let (csrf_name, csrf_value) = extract_csrf_token(&body)?;
debug!(
"Got CSRF token: {}={}...",
csrf_name,
&csrf_value[..csrf_value.len().min(8)]
);
// Step 2: POST login form
let form = [
("usernamefld", username),
("passwordfld", password),
("login", "1"),
(&csrf_name, &csrf_value),
];
debug!("POSTing login form to {}", login_url);
let resp = self.client.post(&login_url).form(&form).send().await?;
let status = resp.status();
debug!("Login POST returned status: {}", status);
if status.is_redirection() {
// 302 redirect = successful login
info!("Login successful (redirect to dashboard)");
// Follow the redirect to establish the full session
if let Some(location) = resp.headers().get("location") {
let redirect_url = location.to_str().unwrap_or("/");
let full_url = if redirect_url.starts_with('/') {
format!("{}{}", self.base_url, redirect_url)
} else {
redirect_url.to_string()
};
let _ = self.client.get(&full_url).send().await?;
}
Ok(())
} else {
let body = resp.text().await?;
if body.contains("Wrong username or password") {
Err(BootstrapError::LoginFailed {
reason: "Wrong username or password".to_string(),
})
} else {
Err(BootstrapError::LoginFailed {
reason: format!("Unexpected status {} after login POST", status),
})
}
}
}
/// Abort the initial setup wizard.
///
/// Calls `POST /api/core/initial_setup/abort` which removes the
/// `trigger_initial_wizard` flag from config.xml. Requires an
/// authenticated session (call `login()` first).
///
/// Safe to call even if the wizard has already been completed — it
/// simply returns `{"result": "done"}`.
pub async fn abort_wizard(&self) -> Result<(), BootstrapError> {
info!("Aborting initial setup wizard");
let url = format!("{}/api/core/initial_setup/abort", self.base_url);
let resp = self.client.post(&url).send().await?;
let status = resp.status();
let body = resp.text().await?;
if status.is_success() {
debug!("Wizard abort response: {}", body);
info!("Initial setup wizard aborted");
Ok(())
} else {
warn!("Wizard abort returned {}: {}", status, body);
// Non-fatal — the wizard may already have been completed
Ok(())
}
}
/// Enable or disable SSH with root login and password authentication.
///
/// POSTs to the legacy `system_advanced_admin.php` form. This also
/// preserves existing webgui settings (protocol, port, cert).
pub async fn enable_ssh(
&self,
permit_root_login: bool,
permit_password_auth: bool,
) -> Result<(), BootstrapError> {
info!(
"Enabling SSH (root_login={}, password_auth={})",
permit_root_login, permit_password_auth
);
// GET the admin page to get current settings + CSRF token
let url = format!("{}/system_advanced_admin.php", self.base_url);
let resp = self.client.get(&url).send().await?;
let body = resp.text().await?;
let (csrf_name, csrf_value) = extract_csrf_token(&body)?;
// Extract current webgui settings so we don't clobber them.
// webguiproto and ssl-certref are <select> dropdowns, not <input> fields.
let current_proto =
extract_selected_option(&body, "webguiproto").unwrap_or_else(|| "https".to_string());
let current_port = extract_input_value(&body, "webguiport").unwrap_or_default();
let current_certref = extract_selected_option(&body, "ssl-certref").unwrap_or_default();
// Preserve the current HTTP redirect setting so we don't clobber it.
// Scope the check to the checkbox's own <input> tag so an unrelated
// checked element elsewhere on the page cannot produce a false positive.
let redirect_disabled = body.split("<input ").any(|seg| {
let tag = seg.split('>').next().unwrap_or("");
tag.contains("name=\"disablehttpredirect\"") && tag.contains("checked")
});
let mut form: Vec<(&str, String)> = vec![
("webguiproto", current_proto),
("webguiport", current_port),
("ssl-certref", current_certref),
("enablesshd", "yes".to_string()),
(&csrf_name, csrf_value),
("save", "Save".to_string()),
];
if redirect_disabled {
form.push(("disablehttpredirect", "yes".to_string()));
}
if permit_root_login {
form.push(("sshdpermitrootlogin", "yes".to_string()));
}
if permit_password_auth {
form.push(("sshpasswordauth", "yes".to_string()));
}
let resp = self.client.post(&url).form(&form).send().await?;
let status = resp.status();
if status.is_success() || status.is_redirection() {
info!("SSH enabled successfully");
Ok(())
} else {
let body = resp.text().await?;
Err(BootstrapError::UnexpectedResponse(format!(
"SSH enable failed with status {}: {}",
status,
&body[..body.len().min(200)]
)))
}
}
/// Change the web GUI port and optionally disable the HTTP redirect rule.
///
/// When `disable_http_redirect` is `true`, the automatic HTTP-to-HTTPS
/// redirect on port 80 is disabled. This is required when HAProxy needs
/// to bind on `0.0.0.0:80` (e.g., for CARP VIP setups).
///
/// After this call, the web UI will be available on the new port.
/// The bootstrap client's `base_url` is NOT updated — create a new
/// `OPNsenseBootstrap` or `OPNSenseFirewall` with the new port.
///
/// Because the lighttpd restart triggered by the form POST can be
/// unreliable (the server dies before the configd command executes),
/// this method also restarts the webgui via SSH as a safety net.
/// SSH must be enabled before calling this method.
pub async fn set_webgui_port(
&self,
port: u16,
ssh_ip: &str,
disable_http_redirect: bool,
) -> Result<(), BootstrapError> {
info!(
"Setting web GUI port to {} (disable_http_redirect={})",
port, disable_http_redirect
);
let url = format!("{}/system_advanced_admin.php", self.base_url);
let resp = self.client.get(&url).send().await?;
let body = resp.text().await?;
let (csrf_name, csrf_value) = extract_csrf_token(&body)?;
// ssl-certref is a <select> — extract the selected option's value
let current_certref = extract_selected_option(&body, "ssl-certref").unwrap_or_default();
debug!("Current ssl-certref: {:?}", current_certref);
// Check if SSH is currently enabled, scoped to the checkbox's own
// <input> tag rather than the whole page.
let ssh_enabled = body.split("<input ").any(|seg| {
let tag = seg.split('>').next().unwrap_or("");
tag.contains("name=\"enablesshd\"") && tag.contains("checked")
});
let mut form: Vec<(&str, String)> = vec![
("webguiproto", "https".to_string()),
("webguiport", port.to_string()),
("ssl-certref", current_certref),
(&csrf_name, csrf_value),
("save", "Save".to_string()),
];
if disable_http_redirect {
form.push(("disablehttpredirect", "yes".to_string()));
}
if ssh_enabled {
form.push(("enablesshd", "yes".to_string()));
form.push(("sshdpermitrootlogin", "yes".to_string()));
form.push(("sshpasswordauth", "yes".to_string()));
}
// The POST may fail/timeout because the webserver restarts on port change.
// That's expected — we fire-and-forget.
match self.client.post(&url).form(&form).send().await {
Ok(resp) => {
let status = resp.status();
if status.is_success() || status.is_redirection() {
info!("Web GUI port changed to {} (server restarting)", port);
} else {
warn!(
"Web GUI port change returned status {} (may still succeed after restart)",
status
);
}
}
Err(e) => {
// Connection reset is expected when the port changes
if e.is_connect() || e.is_timeout() || e.is_request() {
info!(
"Web GUI port change submitted (connection lost during restart — expected)"
);
} else {
return Err(e.into());
}
}
}
// Safety net: explicitly restart webgui via SSH to ensure lighttpd
// comes back on the new port. The PHP form handler's configd call
// can fail if lighttpd dies before executing it.
info!("Restarting webgui via SSH to ensure port change takes effect...");
tokio::time::sleep(std::time::Duration::from_secs(3)).await;
Self::restart_webgui_via_ssh(ssh_ip).await?;
Ok(())
}
/// Restart the OPNsense web GUI via SSH using configctl.
async fn restart_webgui_via_ssh(ssh_ip: &str) -> Result<(), BootstrapError> {
use opnsense_config::config::{OPNsenseShell, SshCredentials, SshOPNSenseShell};
use std::sync::Arc;
let ssh_config = Arc::new(russh::client::Config {
inactivity_timeout: None,
..<_>::default()
});
let credentials = SshCredentials::Password {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let ip: std::net::IpAddr = ssh_ip
.parse()
.map_err(|e| BootstrapError::UnexpectedResponse(format!("Invalid SSH IP: {e}")))?;
let shell = SshOPNSenseShell::new((ip, 22), credentials, ssh_config);
match shell.exec("configctl webgui restart").await {
Ok(output) => {
info!("webgui restart via SSH: {}", output.trim());
Ok(())
}
Err(e) => {
warn!("webgui restart via SSH failed: {e} (may still come up)");
Ok(()) // Non-fatal — the webgui may restart on its own
}
}
}
/// Run a diagnostic check via SSH and report webgui status.
///
/// Useful for troubleshooting when the web UI doesn't come up after
/// a port change or restart.
pub async fn diagnose_via_ssh(ssh_ip: &str) -> Result<String, BootstrapError> {
use opnsense_config::config::{OPNsenseShell, SshCredentials, SshOPNSenseShell};
use std::sync::Arc;
let ssh_config = Arc::new(russh::client::Config {
inactivity_timeout: None,
..<_>::default()
});
let credentials = SshCredentials::Password {
username: "root".to_string(),
password: "opnsense".to_string(),
};
let ip: std::net::IpAddr = ssh_ip
.parse()
.map_err(|e| BootstrapError::UnexpectedResponse(format!("Invalid SSH IP: {e}")))?;
let shell = SshOPNSenseShell::new((ip, 22), credentials, ssh_config);
let mut report = String::new();
let cmds = [
("webgui config", "grep -A5 '<webgui>' /conf/config.xml"),
(
"listening ports",
"sockstat -l | grep -E '443|9443|lighttpd|php'",
),
("lighttpd process", "ps aux | grep lighttpd"),
("webgui status", "configctl webgui status"),
];
for (label, cmd) in cmds {
report.push_str(&format!("=== {} ===\n", label));
match shell.exec(cmd).await {
Ok(output) => report.push_str(&output),
Err(e) => report.push_str(&format!("ERROR: {}\n", e)),
}
report.push('\n');
}
Ok(report)
}
/// Wait for the web UI to become available at a URL.
///
/// Polls with GET requests until the server responds or the timeout
/// is reached.
pub async fn wait_for_ready(
url: &str,
timeout: std::time::Duration,
) -> Result<(), BootstrapError> {
let client = reqwest::Client::builder()
.danger_accept_invalid_certs(true)
.timeout(std::time::Duration::from_secs(5))
.build()?;
let start = std::time::Instant::now();
let mut attempt = 0;
while start.elapsed() < timeout {
attempt += 1;
match client.get(url).send().await {
Ok(_) => {
info!("OPNsense is ready at {} (attempt {})", url, attempt);
return Ok(());
}
Err(_) => {
if attempt % 6 == 0 {
debug!(
"Waiting for OPNsense at {} ({:.0}s elapsed)",
url,
start.elapsed().as_secs_f64()
);
}
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
}
}
}
Err(BootstrapError::UnexpectedResponse(format!(
"OPNsense at {} did not respond within {}s",
url,
timeout.as_secs()
)))
}
}
/// Extract the CSRF token field name and value from an OPNsense HTML page.
///
/// OPNsense embeds CSRF tokens as hidden inputs with a dynamic field name.
/// The token appears as: `<input type="hidden" name="<key>" value="<token>" ... />`
/// where the name is stored in `$_SESSION['$PHALCON/CSRF/KEY$']`.
fn extract_csrf_token(html: &str) -> Result<(String, String), BootstrapError> {
// OPNsense CSRF tokens are hidden inputs with a random name and value.
// Pattern: <input type="hidden" name="TOKEN_NAME" value="TOKEN_VALUE" .../>
//
// The form tag and hidden input may be on the same line, so we extract
// individual <input> tags rather than parsing whole lines.
//
// Find all <input ...> tags with type="hidden"
let mut pos = 0;
while let Some(start) = html[pos..].find("<input ") {
let abs_start = pos + start;
let tag_end = match html[abs_start..].find("/>") {
Some(e) => abs_start + e + 2,
None => match html[abs_start..].find('>') {
Some(e) => abs_start + e + 1,
None => break,
},
};
let tag = &html[abs_start..tag_end];
pos = tag_end;
if !tag.contains("type=\"hidden\"") {
continue;
}
if let (Some(name), Some(value)) = (extract_attr(tag, "name"), extract_attr(tag, "value")) {
// Skip known non-CSRF fields
if name == "login" || name == "usernamefld" || name == "passwordfld" {
continue;
}
// CSRF tokens are typically long random strings
if !name.is_empty() && !value.is_empty() && value.len() > 10 {
return Ok((name, value));
}
}
}
Err(BootstrapError::CsrfNotFound)
}
/// Extract an HTML attribute value from a tag string.
fn extract_attr(tag: &str, attr: &str) -> Option<String> {
let needle = format!("{}=\"", attr);
let start = tag.find(&needle)? + needle.len();
let rest = &tag[start..];
let end = rest.find('"')?;
Some(rest[..end].to_string())
}
/// Extract the selected value from a `<select name="...">` dropdown.
///
/// Looks for `<select name="NAME">` followed by `<option ... selected ...>`.
fn extract_selected_option(html: &str, name: &str) -> Option<String> {
let select_needle = format!("name=\"{}\"", name);
let select_pos = html.find(&select_needle)?;
// Find the closing </select> after this select
let rest = &html[select_pos..];
let end_select = rest.find("</select>")?;
let select_html = &rest[..end_select];
// Find <option ... selected ...> within this select
for segment in select_html.split("<option") {
if segment.contains("selected") {
// Extract value="..." from this option
if let Some(val) = extract_attr(segment, "value") {
return Some(val);
}
}
}
None
}
/// Extract the value of a named input field from HTML.
fn extract_input_value(html: &str, name: &str) -> Option<String> {
let needle = format!("name=\"{}\"", name);
for line in html.lines() {
let line = line.trim();
if line.contains(&needle) && line.contains("value=\"") {
return extract_attr(line, "value");
}
}
// <select> dropdowns are handled by extract_selected_option instead.
None
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn extract_csrf_from_login_page() {
let html = r#"
<form method="post">
<input type="text" name="usernamefld" />
<input type="password" name="passwordfld" />
<input type="hidden" name="a1b2c3d4e5f6" value="xYzAbCdEfGhIjKlMnOpQrS" autocomplete="off" />
<input type="submit" name="login" value="Login" />
</form>
"#;
let (name, value) = extract_csrf_token(html).unwrap();
assert_eq!(name, "a1b2c3d4e5f6");
assert_eq!(value, "xYzAbCdEfGhIjKlMnOpQrS");
}
#[test]
fn extract_csrf_from_single_line_form() {
// Real OPNsense HTML has form + hidden input on the same line
let html = r#"<form class="clearfix" id="iform" name="iform" method="post" autocomplete="off"><input type="hidden" name="yHFq2qSSEsCK67ivz0WQkg" value="8BocJxqMV1771lHNa4CbJw" autocomplete="new-password" />"#;
let (name, value) = extract_csrf_token(html).unwrap();
assert_eq!(name, "yHFq2qSSEsCK67ivz0WQkg");
assert_eq!(value, "8BocJxqMV1771lHNa4CbJw");
}
#[test]
fn extract_csrf_not_found_in_empty_html() {
let result = extract_csrf_token("<html></html>");
assert!(matches!(result, Err(BootstrapError::CsrfNotFound)));
}
#[test]
fn extract_attr_works() {
let tag = r#"<input type="hidden" name="csrf_key" value="token123" />"#;
assert_eq!(extract_attr(tag, "name"), Some("csrf_key".to_string()));
assert_eq!(extract_attr(tag, "value"), Some("token123".to_string()));
assert_eq!(extract_attr(tag, "type"), Some("hidden".to_string()));
assert_eq!(extract_attr(tag, "missing"), None);
}
#[test]
fn extract_input_value_works() {
let html = r#"
<input name="webguiport" value="443" />
<input name="webguiproto" value="https" />
"#;
assert_eq!(
extract_input_value(html, "webguiport"),
Some("443".to_string())
);
assert_eq!(
extract_input_value(html, "webguiproto"),
Some("https".to_string())
);
assert_eq!(extract_input_value(html, "missing"), None);
}
}
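The `wait_for_ready` loop above is a bounded-polling pattern: probe, log every Nth attempt, sleep, and give up at a deadline. A std-only sketch of the same pattern with the HTTP GET replaced by a closure (the `wait_until` name is illustrative):

```rust
use std::time::{Duration, Instant};

// Poll `probe` until it succeeds or `timeout` elapses, logging every
// 6th attempt, as wait_for_ready does. Returns the succeeding attempt
// number so callers can report how long startup took.
fn wait_until(
    mut probe: impl FnMut() -> bool,
    timeout: Duration,
    interval: Duration,
) -> Result<u32, String> {
    let start = Instant::now();
    let mut attempt = 0;
    while start.elapsed() < timeout {
        attempt += 1;
        if probe() {
            return Ok(attempt);
        }
        if attempt % 6 == 0 {
            eprintln!("still waiting after {:.0}s", start.elapsed().as_secs_f64());
        }
        std::thread::sleep(interval);
    }
    Err(format!("no response within {}s", timeout.as_secs()))
}

fn main() {
    // Probe succeeds on the third attempt.
    let mut calls = 0;
    let attempts = wait_until(
        || {
            calls += 1;
            calls >= 3
        },
        Duration::from_secs(5),
        Duration::from_millis(1),
    )
    .unwrap();
    assert_eq!(attempts, 3);
    println!("succeeded on attempt {}", attempts);
}
```

Because the deadline is checked against `Instant::now()` rather than an attempt count, the loop stays correct even if individual probes (here, 5-second HTTP timeouts) take variable time.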


@@ -0,0 +1,113 @@
use async_trait::async_trait;
use harmony_types::id::Id;
use log::info;
use serde::Serialize;
use crate::{
data::Version,
executors::ExecutorError,
infra::opnsense::OPNSenseFirewall,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
topology::Topology,
};
/// Desired state for OPNsense destination NAT (port forwarding) rules.
#[derive(Debug, Clone, Serialize)]
pub struct DnatScore {
pub rules: Vec<DnatRuleDef>,
}
use harmony_types::firewall::{IpProtocol, NetworkProtocol};
/// A single destination NAT rule definition.
#[derive(Debug, Clone, Serialize)]
pub struct DnatRuleDef {
/// Interface(s) to apply the rule on (e.g. "wan")
pub interface: String,
/// IP protocol version
pub ip_protocol: IpProtocol,
/// Network protocol
pub protocol: NetworkProtocol,
/// Destination address to match (external/public IP or "wanip")
pub destination: String,
/// Destination port to match
pub destination_port: String,
/// Internal target IP
pub target: String,
/// Internal target port (if different from destination_port)
pub local_port: Option<String>,
/// Description (used for idempotency matching)
pub description: String,
pub log: bool,
/// Create automatic firewall rule
pub register_rule: bool,
}
impl Score<OPNSenseFirewall> for DnatScore {
fn name(&self) -> String {
"DnatScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(DnatInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct DnatInterpret {
score: DnatScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for DnatInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let dnat = topology.get_opnsense_config().dnat();
for rule in &self.score.rules {
info!("Ensuring DNAT rule: {}", rule.description);
dnat.ensure_dnat_rule(
&rule.interface,
&rule.ip_protocol,
&rule.protocol,
&rule.destination,
&rule.destination_port,
&rule.target,
rule.local_port.as_deref(),
&rule.description,
rule.register_rule,
rule.log,
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Ok(Outcome::success(format!(
"Configured {} DNAT rules",
self.score.rules.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("DnatScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}
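The `DnatRuleDef` doc comment notes that the description is used for idempotency matching: an `ensure_*` call replaces the rule with the same description rather than appending a duplicate on every run. A std-only sketch of that convention (the `Rule`/`ensure_rule` names are illustrative, not the opnsense-config API):

```rust
// Description-keyed upsert: re-applying a score must converge to the
// same rule set instead of accumulating duplicates.
#[derive(Debug, Clone, PartialEq)]
struct Rule {
    description: String,
    target: String,
}

fn ensure_rule(rules: &mut Vec<Rule>, desired: Rule) {
    if let Some(i) = rules
        .iter()
        .position(|r| r.description == desired.description)
    {
        rules[i] = desired; // same description: update in place
    } else {
        rules.push(desired); // first application: append
    }
}

fn main() {
    let mut rules = Vec::new();
    ensure_rule(&mut rules, Rule { description: "web".into(), target: "10.0.0.5".into() });
    // Re-applying with a changed target must not create a second rule.
    ensure_rule(&mut rules, Rule { description: "web".into(), target: "10.0.0.6".into() });
    assert_eq!(rules.len(), 1);
    assert_eq!(rules[0].target, "10.0.0.6");
    println!("rules={}", rules.len());
}
```

The trade-off is that the description becomes a stable identifier: renaming a rule's description orphans the old rule, so descriptions should be treated as keys, not free-form text.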


@@ -0,0 +1,299 @@
use async_trait::async_trait;
use harmony_types::id::Id;
use log::info;
use serde::Serialize;
use crate::{
data::Version,
executors::ExecutorError,
infra::opnsense::OPNSenseFirewall,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
topology::Topology,
};
use harmony_types::firewall::{Direction, FirewallAction, IpProtocol, NetworkProtocol};
// ── Filter Rule Score ───────────────────────────────────────────────
/// Desired state for OPNsense new-generation firewall filter rules.
#[derive(Debug, Clone, Serialize)]
pub struct FirewallRuleScore {
pub rules: Vec<FilterRuleDef>,
}
/// A single firewall filter rule definition.
#[derive(Debug, Clone, Serialize)]
pub struct FilterRuleDef {
pub action: FirewallAction,
pub direction: Direction,
pub interface: String,
pub ip_protocol: IpProtocol,
pub protocol: NetworkProtocol,
pub source_net: String,
pub destination_net: String,
pub destination_port: Option<String>,
pub gateway: Option<String>,
pub description: String,
pub log: bool,
}
impl Score<OPNSenseFirewall> for FirewallRuleScore {
fn name(&self) -> String {
"FirewallRuleScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(FirewallRuleInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct FirewallRuleInterpret {
score: FirewallRuleScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for FirewallRuleInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let fw = topology.get_opnsense_config().firewall();
for rule in &self.score.rules {
info!("Ensuring firewall rule: {}", rule.description);
fw.ensure_filter_rule(
&rule.action,
&rule.direction,
&rule.interface,
&rule.ip_protocol,
&rule.protocol,
&rule.source_net,
&rule.destination_net,
rule.destination_port.as_deref(),
rule.gateway.as_deref(),
&rule.description,
rule.log,
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
fw.apply()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(Outcome::success(format!(
"Configured {} firewall rules",
self.score.rules.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("FirewallRuleScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}
// ── Outbound NAT Score ──────────────────────────────────────────────
/// Desired state for OPNsense outbound NAT (SNAT) rules.
#[derive(Debug, Clone, Serialize)]
pub struct OutboundNatScore {
pub rules: Vec<SnatRuleDef>,
}
/// A single SNAT rule definition.
#[derive(Debug, Clone, Serialize)]
pub struct SnatRuleDef {
pub interface: String,
pub ip_protocol: IpProtocol,
pub protocol: NetworkProtocol,
pub source_net: String,
pub destination_net: String,
pub target: String,
pub description: String,
pub log: bool,
pub nonat: bool,
}
impl Score<OPNSenseFirewall> for OutboundNatScore {
fn name(&self) -> String {
"OutboundNatScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(OutboundNatInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct OutboundNatInterpret {
score: OutboundNatScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for OutboundNatInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let fw = topology.get_opnsense_config().firewall();
for rule in &self.score.rules {
info!("Ensuring SNAT rule: {}", rule.description);
fw.ensure_snat_rule_from(
&rule.interface,
&rule.ip_protocol,
&rule.protocol,
&rule.source_net,
&rule.destination_net,
&rule.target,
&rule.description,
rule.log,
rule.nonat,
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
fw.apply_snat()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(Outcome::success(format!(
"Configured {} SNAT rules",
self.score.rules.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("OutboundNatScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}
// ── BINAT Score ─────────────────────────────────────────────────────
/// Desired state for OPNsense 1:1 NAT (BINAT) rules.
#[derive(Debug, Clone, Serialize)]
pub struct BinatScore {
pub rules: Vec<BinatRuleDef>,
}
/// A single 1:1 NAT rule definition.
#[derive(Debug, Clone, Serialize)]
pub struct BinatRuleDef {
pub interface: String,
pub source_net: String,
pub external: String,
pub description: String,
pub log: bool,
}
impl Score<OPNSenseFirewall> for BinatScore {
fn name(&self) -> String {
"BinatScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(BinatInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct BinatInterpret {
score: BinatScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for BinatInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let fw = topology.get_opnsense_config().firewall();
for rule in &self.score.rules {
info!("Ensuring BINAT rule: {}", rule.description);
// FIXME: we should be using strong types here. The opnsense api and/or opnsense config
// crates need a major refactor to expose either generic functions that accept properly
// typed method params, or specific methods for each endpoint with proper types. Building
// raw JSON in the middle of a function stack is a major wtf (wtf/minute being one of the
// best code quality metrics).
// It breaks the flow and makes reading the code with find-references and goto-definition
// totally useless.
let body = serde_json::json!({
"rule": {
"enabled": "1",
"interface": &rule.interface,
"type": "binat",
"source_net": &rule.source_net,
"external": &rule.external,
"log": if rule.log { "1" } else { "0" },
"description": &rule.description,
}
});
fw.ensure_binat_rule(&body)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
fw.apply_binat()
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
Ok(Outcome::success(format!(
"Configured {} BINAT rules",
self.score.rules.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("BinatScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}
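The FIXME above about raw JSON can be addressed with a small typed wrapper. This is a minimal, self-contained sketch — `BinatRuleBody` and `to_params` are hypothetical names, not the actual opnsense_config API — showing how the stringly typed conversion could live in one place instead of mid-stack:

```rust
use std::collections::BTreeMap;

/// Hypothetical strongly typed BINAT rule body; the keys mirror the JSON
/// fields built above, but this type is illustrative, not the real API.
struct BinatRuleBody {
    interface: String,
    source_net: String,
    external: String,
    description: String,
    log: bool,
}

impl BinatRuleBody {
    /// Flatten into the string map the OPNsense endpoint ultimately expects,
    /// keeping all "1"/"0" string conversions in a single method.
    fn to_params(&self) -> BTreeMap<&'static str, String> {
        let mut m = BTreeMap::new();
        m.insert("enabled", "1".to_string());
        m.insert("type", "binat".to_string());
        m.insert("interface", self.interface.clone());
        m.insert("source_net", self.source_net.clone());
        m.insert("external", self.external.clone());
        m.insert("log", if self.log { "1" } else { "0" }.to_string());
        m.insert("description", self.description.clone());
        m
    }
}

fn main() {
    let rule = BinatRuleBody {
        interface: "wan".into(),
        source_net: "192.168.1.0/24".into(),
        external: "203.0.113.10".into(),
        description: "example".into(),
        log: false,
    };
    let params = rule.to_params();
    assert_eq!(params["type"], "binat");
    assert_eq!(params["log"], "0");
}
```

With such a wrapper, find-references and goto-definition work again on every field, and the JSON shape is checked in exactly one spot.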

View File

@@ -0,0 +1,286 @@
//! OPNsense image customization.
//!
//! OPNsense nano images ship with a default `config.xml` that references
//! e1000 interface names (`em0`, `em1`) and includes a
//! `trigger_initial_wizard` flag that blocks boot until manual console
//! interaction. For automated KVM deployments with virtio NICs (`vtnet0`,
//! `vtnet1`), we need to replace this config before first boot.
//!
//! # How this works
//!
//! The nano image is a raw disk with a nanobsd partition layout. The live
//! `config.xml` sits inside a UFS filesystem at a discoverable byte offset.
//! We locate it by scanning for `<opnsense>` preceded by `<?xml`, verify
//! the block boundaries, and overwrite in-place with a replacement config
//! that is null-padded to the exact same block size.
//!
//! # Supported images
//!
//! | Image | Verified offset | Block size |
//! |-------|----------------|------------|
//! | OPNsense-26.1-nano-amd64.img | 584962049 | 4095 |
//!
//! To add support for a new image version, run the `find_config_offset`
//! function against the raw image and verify the block boundaries.
use log::{debug, info};
use std::io::{Read, Seek, SeekFrom, Write};
use std::path::Path;
/// Known config.xml locations in OPNsense images.
struct KnownImage {
/// Substring that must appear in the image filename.
filename_pattern: &'static str,
/// Byte offset where the config.xml content starts (at the leading `<?xml`).
xml_offset: u64,
/// Total block size allocated for the file (content + null padding).
block_size: usize,
}
const KNOWN_IMAGES: &[KnownImage] = &[KnownImage {
filename_pattern: "OPNsense-26.1-nano-amd64",
xml_offset: 584962049,
block_size: 4095,
}];
/// Errors from image customization.
#[derive(Debug, thiserror::Error)]
pub enum ImageError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Config XML too large: {size} bytes, max {max} bytes")]
ConfigTooLarge { size: usize, max: usize },
#[error(
"Unknown image: {filename}. Run find_config_offset() to discover the offset for this image."
)]
UnknownImage { filename: String },
#[error("Verification failed: expected config.xml at offset {offset}, found: {found}")]
VerificationFailed { offset: u64, found: String },
}
/// Replace the config.xml in an OPNsense nano image file.
///
/// The image file is modified **in place**. The new config is written at
/// the known offset and null-padded to fill the original block.
///
/// If the image filename matches a known image, the hardcoded offset is used.
/// Otherwise, the image is scanned to discover the config.xml location.
///
/// # Arguments
///
/// * `image_path` — Path to the raw `.img` file (NOT qcow2).
/// * `new_config_xml` — Complete XML content to write.
pub fn replace_config_xml(image_path: &Path, new_config_xml: &str) -> Result<(), ImageError> {
let filename = image_path
.file_name()
.and_then(|n| n.to_str())
.unwrap_or("");
let (xml_offset, block_size) = if let Some(known) = KNOWN_IMAGES
.iter()
.find(|k| filename.contains(k.filename_pattern))
{
info!(
"Using known offset for {}: offset={}, block={}",
known.filename_pattern, known.xml_offset, known.block_size
);
(known.xml_offset, known.block_size)
} else {
info!(
"Unknown image filename '{}', scanning for config.xml...",
filename
);
let found = find_config_offset(image_path)?.ok_or_else(|| ImageError::UnknownImage {
filename: filename.to_string(),
})?;
info!(
"Found config.xml at offset {}, block size {}",
found.0, found.2
);
(found.0, found.2)
};
// Verify the existing config is where we expect it
let mut file = std::fs::OpenOptions::new()
.read(true)
.write(true)
.open(image_path)?;
verify_existing_config(&mut file, xml_offset)?;
// Check size
let config_bytes = new_config_xml.as_bytes();
if config_bytes.len() > block_size {
return Err(ImageError::ConfigTooLarge {
size: config_bytes.len(),
max: block_size,
});
}
// Build the replacement: config content + null padding to fill the block
let mut block = vec![0u8; block_size];
block[..config_bytes.len()].copy_from_slice(config_bytes);
// Write at the discovered offset
file.seek(SeekFrom::Start(xml_offset))?;
file.write_all(&block)?;
file.flush()?;
info!(
"Config.xml replaced ({} bytes content, {} bytes null-padded)",
config_bytes.len(),
block_size - config_bytes.len()
);
// Verify the write
verify_existing_config(&mut file, xml_offset)?;
debug!("Post-write verification passed");
Ok(())
}
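The null-padding step inside `replace_config_xml` can be isolated as a pure function. A minimal sketch (the `pad_to_block` name is an illustration, not part of the module):

```rust
/// Pad `content` with NUL bytes to exactly `block_size` bytes, as done when
/// overwriting config.xml in place. Returns None when the content does not
/// fit, mirroring the ConfigTooLarge error above.
fn pad_to_block(content: &[u8], block_size: usize) -> Option<Vec<u8>> {
    if content.len() > block_size {
        return None;
    }
    let mut block = vec![0u8; block_size];
    block[..content.len()].copy_from_slice(content);
    Some(block)
}

fn main() {
    let block = pad_to_block(b"<?xml?>", 16).unwrap();
    assert_eq!(block.len(), 16);
    assert_eq!(&block[..7], b"<?xml?>");
    assert!(block[7..].iter().all(|&b| b == 0));
    assert!(pad_to_block(&[0u8; 32], 16).is_none());
}
```

Writing exactly `block_size` bytes is what keeps the surrounding UFS metadata intact: the file never grows or shrinks, only its content changes.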
/// Verify that a config.xml exists at the expected offset.
fn verify_existing_config(file: &mut std::fs::File, offset: u64) -> Result<(), ImageError> {
file.seek(SeekFrom::Start(offset))?;
let mut header = [0u8; 30];
file.read_exact(&mut header)?;
let header_str = String::from_utf8_lossy(&header);
if !header_str.starts_with("<?xml") {
return Err(ImageError::VerificationFailed {
offset,
found: header_str.chars().take(20).collect(),
});
}
Ok(())
}
/// Generate a minimal OPNsense config.xml with WAN (DHCP) and LAN (static).
///
/// This config:
/// - Assigns WAN with DHCP (for internet access)
/// - Assigns LAN with a static IP (for management from host)
/// - Enables SSH with root login (default password: `opnsense`)
/// - Skips the initial wizard
pub fn minimal_config_xml(wan_if: &str, lan_if: &str, lan_ip: &str, lan_subnet: u8) -> String {
format!(
r#"<?xml version="1.0"?>
<opnsense>
<version>1</version>
<interfaces>
<wan>
<enable/>
<if>{wan_if}</if>
<descr>WAN</descr>
<ipaddr>dhcp</ipaddr>
<ipaddrv6>dhcp6</ipaddrv6>
<dhcp6-ia-pd-len>0</dhcp6-ia-pd-len>
<spoofmac/>
</wan>
<lan>
<enable>1</enable>
<if>{lan_if}</if>
<descr>LAN</descr>
<ipaddr>{lan_ip}</ipaddr>
<subnet>{lan_subnet}</subnet>
<ipaddrv6/>
<subnetv6/>
</lan>
</interfaces>
<system>
<hostname>opnsense</hostname>
<domain>localdomain</domain>
<dnsallowoverride>1</dnsallowoverride>
<group>
<name>admins</name>
<description>System Administrators</description>
<scope>system</scope>
<gid>1999</gid>
<member>0</member>
<priv>page-all</priv>
</group>
<user>
<name>root</name>
<descr>System Administrator</descr>
<scope>system</scope>
<groupname>admins</groupname>
<password>$2b$10$YRVoF4SgskIsrXOvOQjGieB9XqHPRra9R7d80B3BZdbY/j21TwBfS</password>
<uid>0</uid>
</user>
<ssh>
<enabled>enabled</enabled>
<permitrootlogin>1</permitrootlogin>
<passwordauth>1</passwordauth>
</ssh>
</system>
<filter>
<rule></rule>
</filter>
</opnsense>
"#
)
}
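Assuming the template above is reproduced verbatim, a trimmed-down stand-in shows how the virtio NIC names end up in the generated interfaces section (`interfaces_xml` is a simplified illustration, not the real `minimal_config_xml`):

```rust
/// Trimmed-down stand-in for minimal_config_xml, reduced to the
/// interface assignments for illustration.
fn interfaces_xml(wan_if: &str, lan_if: &str, lan_ip: &str, lan_subnet: u8) -> String {
    format!(
        "<interfaces><wan><if>{wan_if}</if><ipaddr>dhcp</ipaddr></wan>\
         <lan><if>{lan_if}</if><ipaddr>{lan_ip}</ipaddr><subnet>{lan_subnet}</subnet></lan></interfaces>"
    )
}

fn main() {
    // For KVM with virtio NICs, the interface names become vtnet0/vtnet1
    // instead of the e1000 em0/em1 the stock image references.
    let xml = interfaces_xml("vtnet0", "vtnet1", "192.168.1.1", 24);
    assert!(xml.contains("<if>vtnet0</if>"));
    assert!(xml.contains("<if>vtnet1</if>"));
    assert!(xml.contains("<subnet>24</subnet>"));
}
```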
/// Scan a raw OPNsense image to discover the config.xml offset and block size.
///
/// This is a diagnostic tool — use it to find the values for `KNOWN_IMAGES`
/// when adding support for a new image version.
///
/// Returns `(offset, content_size, block_size)`, or `None` if not found.
pub fn find_config_offset(image_path: &Path) -> Result<Option<(u64, usize, usize)>, ImageError> {
let mut file = std::fs::File::open(image_path)?;
let file_size = file.metadata()?.len();
let needle = b"<?xml version=\"1.0\"?>\n<opnsense>\n <version>";
let chunk_size: usize = 10 * 1024 * 1024; // 10MB chunks
let mut offset: u64 = 0;
info!("Scanning {} for config.xml...", image_path.display());
while offset < file_size {
file.seek(SeekFrom::Start(offset))?;
let read_size = chunk_size.min((file_size - offset) as usize) + needle.len();
let mut buf = vec![0u8; read_size];
let n = file.read(&mut buf)?;
buf.truncate(n);
if let Some(pos) = buf.windows(needle.len()).position(|w| w == needle) {
let abs_offset = offset + pos as u64;
info!("Found config.xml with <version> at offset {abs_offset}");
// Read the full content to find size
file.seek(SeekFrom::Start(abs_offset))?;
let mut content_buf = vec![0u8; 16384];
let n = file.read(&mut content_buf)?;
content_buf.truncate(n);
if let Some(end_pos) = content_buf
.windows(b"</opnsense>".len())
.position(|w| w == b"</opnsense>")
{
let content_size = end_pos + b"</opnsense>\n".len();
// Count null padding after
let mut null_count = 0;
for &b in &content_buf[content_size..] {
if b == 0 {
null_count += 1;
} else {
break;
}
}
let block_size = content_size + null_count;
info!(
"Config: offset={}, content={}B, block={}B ({}B null padding)",
abs_offset, content_size, block_size, null_count
);
return Ok(Some((abs_offset, content_size, block_size)));
}
}
offset += chunk_size as u64;
}
Ok(None)
}
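The chunked scan above reads `needle.len()` extra bytes per chunk so a match straddling a chunk boundary is never missed. A self-contained sketch of that overlap trick over an in-memory slice (`chunked_find` is an illustrative name):

```rust
/// Locate `needle` in `haystack` by scanning fixed-size chunks, extending
/// each window by `needle.len()` bytes so a match that straddles a chunk
/// boundary is still found -- the same overlap find_config_offset uses
/// against the raw image.
fn chunked_find(haystack: &[u8], needle: &[u8], chunk_size: usize) -> Option<usize> {
    let mut offset = 0;
    while offset < haystack.len() {
        let end = (offset + chunk_size + needle.len()).min(haystack.len());
        let window = &haystack[offset..end];
        if let Some(pos) = window.windows(needle.len()).position(|w| w == needle) {
            return Some(offset + pos);
        }
        offset += chunk_size;
    }
    None
}

fn main() {
    // Place the needle so it straddles the 8-byte chunk boundary.
    let mut data = vec![b'.'; 32];
    data[6..12].copy_from_slice(b"<?xml>");
    assert_eq!(chunked_find(&data, b"<?xml>", 8), Some(6));
    assert_eq!(chunked_find(&data, b"missing", 8), None);
}
```

Without the overlap, a needle starting at byte 6 of an 8-byte chunk would be split across two reads and silently missed.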

View File

@@ -0,0 +1,102 @@
use async_trait::async_trait;
use harmony_types::id::Id;
use log::info;
use serde::Serialize;
use crate::{
data::Version,
executors::ExecutorError,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
};
use crate::infra::opnsense::OPNSenseFirewall;
/// Desired state for link aggregation groups on an OPNsense firewall.
///
/// TODO: this score creates a new LAGG interface every time, even if one already
/// exists. Not idempotent.
#[derive(Debug, Clone, Serialize)]
pub struct LaggScore {
pub laggs: Vec<LaggDef>,
}
use harmony_types::firewall::LaggProtocol;
/// A single LAGG definition.
#[derive(Debug, Clone, Serialize)]
pub struct LaggDef {
pub members: Vec<String>,
pub protocol: LaggProtocol,
pub description: String,
pub mtu: Option<u16>,
pub lacp_fast_timeout: bool,
}
impl Score<OPNSenseFirewall> for LaggScore {
fn name(&self) -> String {
"LaggScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(LaggInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct LaggInterpret {
score: LaggScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for LaggInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let config = topology.get_opnsense_config();
for lagg in &self.score.laggs {
info!(
"Ensuring LAGG with members {:?}, protocol {}",
lagg.members, lagg.protocol
);
config
.lagg()
.ensure_lagg(
&lagg.members,
&lagg.protocol,
&lagg.description,
lagg.mtu,
lagg.lacp_fast_timeout,
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Ok(Outcome::success(format!(
"Configured {} LAGGs",
self.score.laggs.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("LaggScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}
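The TODO above notes that `ensure_lagg` is not idempotent. A minimal in-memory sketch of an ensure-style helper that matches an existing LAGG by its (sorted) member set before creating — the `Lagg` type and `Vec` store are stand-ins for the OPNsense config, not the real API:

```rust
/// Illustrative LAGG record; the in-memory Vec below stands in for the
/// OPNsense config store.
#[derive(Debug)]
struct Lagg {
    members: Vec<String>,
    description: String,
}

/// Returns true if a LAGG was created, false if an equivalent one already
/// existed. Sorting the members makes the match order-independent.
fn ensure_lagg(store: &mut Vec<Lagg>, mut members: Vec<String>, description: &str) -> bool {
    members.sort();
    if store.iter().any(|l| l.members == members) {
        return false; // already present: no-op instead of a duplicate
    }
    store.push(Lagg {
        members,
        description: description.to_string(),
    });
    true
}

fn main() {
    let mut store = Vec::new();
    assert!(ensure_lagg(&mut store, vec!["igb0".into(), "igb1".into()], "uplink"));
    // Re-running with the same members (any order) is a no-op.
    assert!(!ensure_lagg(&mut store, vec!["igb1".into(), "igb0".into()], "uplink"));
    assert_eq!(store.len(), 1);
}
```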

View File

@@ -1,5 +1,12 @@
pub mod bootstrap;
pub mod dnat;
pub mod firewall;
pub mod image;
pub mod lagg;
pub mod node_exporter;
mod shell;
mod upgrade;
pub mod vip;
pub mod vlan;
pub use shell::*;
pub use upgrade::*;

View File

@@ -2,7 +2,6 @@ use std::sync::Arc;
use async_trait::async_trait;
use serde::Serialize;
use tokio::sync::RwLock;
use crate::{
data::Version,
@@ -15,15 +14,9 @@ use harmony_types::id::Id;
#[derive(Debug, Clone)]
pub struct OPNsenseShellCommandScore {
// TODO I am pretty sure we should not hold a direct reference to the
// opnsense_config::Config here.
// This causes a problem with serialization but also could cause many more problems as this
// is mixing concerns of configuration (which is the Responsibility of Scores to define)
// and state/execution which is the responsibility of interprets via topologies to manage
//
// I feel like a better solution would be for this Score/Interpret to require
// Topology + OPNSenseShell trait bindings
pub opnsense: Arc<RwLock<opnsense_config::Config>>,
// TODO: This should use a Topology + OPNSenseShell trait binding instead
// of holding a direct reference to Config.
pub opnsense: Arc<opnsense_config::Config>,
pub command: String,
}
@@ -62,13 +55,7 @@ impl<T: Topology> Interpret<T> for OPNsenseShellInterpret {
_inventory: &Inventory,
_topology: &T,
) -> Result<Outcome, InterpretError> {
let output = self
.score
.opnsense
.read()
.await
.run_command(&self.score.command)
.await?;
let output = self.score.opnsense.run_command(&self.score.command).await?;
Ok(Outcome::success(format!(
"Command execution successful : {}\n\n{output}",

View File

@@ -1,7 +1,6 @@
use std::sync::Arc;
use serde::Serialize;
use tokio::sync::RwLock;
use crate::{
interpret::{Interpret, InterpretStatus},
@@ -13,7 +12,7 @@ use super::{OPNsenseShellCommandScore, OPNsenseShellInterpret};
#[derive(Debug, Clone)]
pub struct OPNSenseLaunchUpgrade {
pub opnsense: Arc<RwLock<opnsense_config::Config>>,
pub opnsense: Arc<opnsense_config::Config>,
}
impl Serialize for OPNSenseLaunchUpgrade {

View File

@@ -0,0 +1,115 @@
use async_trait::async_trait;
use harmony_types::id::Id;
use log::info;
use serde::Serialize;
use crate::{
data::Version,
executors::ExecutorError,
infra::opnsense::OPNSenseFirewall,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
topology::Topology,
};
use harmony_types::firewall::VipMode;
/// Desired state for Virtual IPs (CARP, IP alias, ProxyARP) on OPNsense.
#[derive(Debug, Clone, Serialize)]
pub struct VipScore {
pub vips: Vec<VipDef>,
}
/// A single Virtual IP definition.
#[derive(Debug, Clone, Serialize)]
pub struct VipDef {
/// VIP mode (CARP, IP alias, or ProxyARP)
pub mode: VipMode,
/// Interface to bind to (e.g. "lan", "wan", "opt1")
pub interface: String,
/// IP address of the virtual IP
pub subnet: String,
/// Subnet mask bits (e.g. 32 for a single IP, 24 for a /24)
pub subnet_bits: u8,
/// CARP VHID (1-255, required for CARP mode)
pub vhid: Option<u16>,
/// CARP advertisement base (1-254, default 1)
pub advbase: Option<u16>,
/// CARP advertisement skew (0-254, default 0, higher = lower priority)
pub advskew: Option<u16>,
/// CARP password (shared between primary and backup)
pub password: Option<String>,
/// Peer IP for CARP
pub peer: Option<String>,
}
impl Score<OPNSenseFirewall> for VipScore {
fn name(&self) -> String {
"VipScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(VipInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct VipInterpret {
score: VipScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for VipInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let vip_config = topology.get_opnsense_config().vip();
for vip in &self.score.vips {
info!(
"Ensuring VIP {} on {} ({})",
vip.subnet, vip.interface, vip.mode
);
vip_config
.ensure_vip_from(
&vip.mode,
&vip.interface,
&vip.subnet,
vip.subnet_bits,
vip.vhid,
vip.advbase,
vip.advskew,
vip.password.as_deref(),
vip.peer.as_deref(),
)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Ok(Outcome::success(format!(
"Configured {} VIPs",
self.score.vips.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("VipScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}

View File

@@ -0,0 +1,87 @@
use async_trait::async_trait;
use harmony_types::id::Id;
use log::info;
use serde::Serialize;
use crate::{
data::Version,
executors::ExecutorError,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
};
use crate::infra::opnsense::OPNSenseFirewall;
/// Desired state for VLANs on an OPNsense firewall.
/// FIXME: this is not idempotent.
#[derive(Debug, Clone, Serialize)]
pub struct VlanScore {
pub vlans: Vec<VlanDef>,
}
/// A single VLAN definition.
#[derive(Debug, Clone, Serialize)]
pub struct VlanDef {
pub parent_interface: String,
pub tag: u16,
pub description: String,
}
impl Score<OPNSenseFirewall> for VlanScore {
fn name(&self) -> String {
"VlanScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<OPNSenseFirewall>> {
Box::new(VlanInterpret {
score: self.clone(),
})
}
}
#[derive(Debug)]
struct VlanInterpret {
score: VlanScore,
}
#[async_trait]
impl Interpret<OPNSenseFirewall> for VlanInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &OPNSenseFirewall,
) -> Result<Outcome, InterpretError> {
let config = topology.get_opnsense_config();
for vlan in &self.score.vlans {
info!("Ensuring VLAN {} on {}", vlan.tag, vlan.parent_interface);
config
.vlan()
.ensure_vlan(&vlan.parent_interface, vlan.tag, &vlan.description)
.await
.map_err(|e| ExecutorError::UnexpectedError(e.to_string()))?;
}
Ok(Outcome::success(format!(
"Configured {} VLANs",
self.score.vlans.len()
)))
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("VlanScore")
}
fn get_version(&self) -> Version {
Version::from("1.0.0").unwrap()
}
fn get_status(&self) -> InterpretStatus {
InterpretStatus::QUEUED
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}

View File

@@ -36,6 +36,10 @@ pub struct PostgreSQLConfig {
/// **Note:** on OpenShift-based clusters, the namespace `default` has security
/// settings incompatible with the default CNPG behavior.
pub namespace: String,
/// When `true`, the interpret waits for the cluster's `-rw` service to
/// exist before returning. This ensures that callers like `get_endpoint()`
/// won't fail with "service not found" immediately after deployment.
pub wait_for_ready: bool,
}
impl PostgreSQLConfig {
@@ -54,6 +58,7 @@ impl Default for PostgreSQLConfig {
storage_size: StorageSize::gi(1),
role: PostgreSQLClusterRole::Primary,
namespace: "harmony".to_string(),
wait_for_ready: true,
}
}
}

View File

@@ -29,6 +29,7 @@ impl<T: PostgreSQL + TlsRouter> PostgreSQL for FailoverTopology<T> {
storage_size: config.storage_size.clone(),
role: PostgreSQLClusterRole::Primary,
namespace: config.namespace.clone(),
wait_for_ready: config.wait_for_ready,
};
info!(
@@ -145,6 +146,7 @@ impl<T: PostgreSQL + TlsRouter> PostgreSQL for FailoverTopology<T> {
storage_size: config.storage_size.clone(),
role: PostgreSQLClusterRole::Replica(replica_cluster_config),
namespace: config.namespace.clone(),
wait_for_ready: config.wait_for_ready,
};
info!(

View File

@@ -16,9 +16,10 @@ use async_trait::async_trait;
use harmony_k8s::KubernetesDistribution;
use harmony_types::id::Id;
use k8s_openapi::ByteString;
use k8s_openapi::api::core::v1::Service;
use k8s_openapi::api::core::v1::{Pod, Secret};
use k8s_openapi::apimachinery::pkg::apis::meta::v1::ObjectMeta;
use log::{info, warn};
use log::{debug, info, warn};
use serde::Serialize;
/// Deploys an opinionated, highly available PostgreSQL cluster managed by CNPG.
@@ -185,9 +186,78 @@ impl K8sPostgreSQLInterpret {
.await
.map_err(|e| InterpretError::new(format!("CNPG operator not ready: {}", e)))?;
// The deployment being ready doesn't mean the CRD is registered in the
// API server's discovery cache. Wait for it explicitly to avoid
// "Cannot resolve GVK" errors when applying Cluster resources.
k8s_client
.wait_for_crd(
"clusters.postgresql.cnpg.io",
Some(std::time::Duration::from_secs(60)),
)
.await
.map_err(|e| InterpretError::new(format!("CNPG Cluster CRD not registered: {}", e)))?;
// Invalidate the API discovery cache so the next apply() call sees the
// newly registered CRD. Without this, the cached discovery (populated
// before the CRD existed) would cause "Cannot resolve GVK" errors.
k8s_client.invalidate_discovery().await;
info!("CNPG operator is ready");
Ok(())
}
/// Waits for the cluster's `-rw` service to exist, indicating the primary
/// pod is running and the CNPG operator has created the service.
async fn wait_for_rw_service<T: Topology + K8sclient>(
&self,
topology: &T,
) -> Result<(), InterpretError> {
let k8s_client = topology
.k8s_client()
.await
.map_err(|e| InterpretError::new(format!("Failed to get k8s client: {}", e)))?;
let service_name = format!("{}-rw", self.config.cluster_name);
let namespace = &self.config.namespace;
let timeout = std::time::Duration::from_secs(120);
let start = std::time::Instant::now();
info!(
"Waiting for service '{}/{}' (up to {}s)...",
namespace,
service_name,
timeout.as_secs()
);
loop {
match k8s_client
.get_resource::<Service>(&service_name, Some(namespace))
.await
{
Ok(Some(_)) => {
info!("Service '{}/{}' is ready", namespace, service_name);
return Ok(());
}
Ok(None) => {
debug!("Service '{service_name}' not yet created");
}
Err(e) => {
debug!("Error checking service '{service_name}': {e}");
}
}
if start.elapsed() > timeout {
return Err(InterpretError::new(format!(
"Timed out waiting for service '{}/{}' after {}s",
namespace,
service_name,
timeout.as_secs()
)));
}
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
}
}
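The poll-until-ready pattern in `wait_for_rw_service` — check, back off, give up after a deadline — can be sketched synchronously with only the standard library (`wait_until` is a hypothetical helper, using `std::thread::sleep` in place of tokio's async sleep):

```rust
use std::time::{Duration, Instant};

/// Retry `check` at a fixed interval until it succeeds or `timeout` elapses,
/// mirroring the service-polling loop above.
fn wait_until<F: FnMut() -> bool>(
    mut check: F,
    timeout: Duration,
    interval: Duration,
) -> Result<(), String> {
    let start = Instant::now();
    loop {
        if check() {
            return Ok(());
        }
        if start.elapsed() > timeout {
            return Err(format!("timed out after {timeout:?}"));
        }
        std::thread::sleep(interval);
    }
}

fn main() {
    let mut attempts = 0;
    let result = wait_until(
        || {
            attempts += 1;
            attempts >= 3 // the "service" appears on the third poll
        },
        Duration::from_secs(1),
        Duration::from_millis(1),
    );
    assert!(result.is_ok());
    assert_eq!(attempts, 3);
}
```

Note the check runs once before the deadline test, so even a zero timeout gets one attempt — the same shape as the loop above, which only errors after a failed check.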
#[async_trait]
@@ -226,12 +296,17 @@ impl<T: Topology + K8sclient + HelmCommand + 'static> Interpret<T> for K8sPostgr
};
let cluster = Cluster { metadata, spec };
Ok(
let outcome =
K8sResourceScore::single(cluster, Some(self.config.namespace.clone()))
.create_interpret()
.execute(inventory, topology)
.await?,
)
.await?;
if self.config.wait_for_ready {
self.wait_for_rw_service(topology).await?;
}
Ok(outcome)
}
super::capability::PostgreSQLClusterRole::Replica(replica_config) => {
let metadata = ObjectMeta {

View File

@@ -1,3 +1,10 @@
pub mod setup;
pub use setup::{
ZitadelAppType, ZitadelApplication, ZitadelClientConfig, ZitadelMachineUser, ZitadelSetupScore,
};
use harmony_k8s::KubernetesDistribution;
use k8s_openapi::api::core::v1::Namespace;
use k8s_openapi::apimachinery::pkg::apis::meta::v1::ObjectMeta;
use k8s_openapi::{ByteString, api::core::v1::Secret};
@@ -48,7 +55,6 @@ const MASTERKEY_SECRET_NAME: &str = "zitadel-masterkey";
/// - NGINX: `nginx.ingress.kubernetes.io/backend-protocol: GRPC`
/// - OpenShift HAProxy: `route.openshift.io/termination: edge`
/// - AWS ALB: set `ingress.controller: aws`
///
/// # Database credentials
/// CNPG creates a `<cluster>-superuser` secret with key `password`. Because
@@ -63,6 +69,20 @@ pub struct ZitadelScore {
/// External domain (e.g. `"auth.example.com"`).
pub host: String,
pub zitadel_version: String,
/// Set to false for local k3d development (uses HTTP instead of HTTPS).
/// Defaults to true for production deployments.
#[serde(default)]
pub external_secure: bool,
}
impl Default for ZitadelScore {
fn default() -> Self {
Self {
host: Default::default(),
zitadel_version: "v4.12.1".to_string(),
external_secure: true,
}
}
}
impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Score<T> for ZitadelScore {
@@ -75,6 +95,7 @@ impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Score<T> for ZitadelSco
Box::new(ZitadelInterpret {
host: self.host.clone(),
zitadel_version: self.zitadel_version.clone(),
external_secure: self.external_secure,
})
}
}
@@ -85,6 +106,7 @@ impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Score<T> for ZitadelSco
struct ZitadelInterpret {
host: String,
zitadel_version: String,
external_secure: bool,
}
#[async_trait]
@@ -121,6 +143,7 @@ impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Interpret<T> for Zitade
storage_size: StorageSize::gi(10),
role: PostgreSQLClusterRole::Primary,
namespace: NAMESPACE.to_string(),
wait_for_ready: true,
};
debug!(
@@ -217,12 +240,25 @@ impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Interpret<T> for Zitade
let mut shuffled = chars;
shuffled.shuffle(&mut rng);
return shuffled.iter().collect();
shuffled.iter().collect()
}
let admin_password = generate_secure_password(16);
// --- Step 3: Create masterkey secret ------------------------------------
// --- Step 3: Get k8s client and detect distribution -------------------
let k8s_client = topology
.k8s_client()
.await
.map_err(|e| InterpretError::new(format!("Failed to get k8s client: {e}")))?;
let distro = k8s_client.get_k8s_distribution().await.map_err(|e| {
InterpretError::new(format!("Failed to detect k8s distribution: {}", e))
})?;
info!("[Zitadel] Detected Kubernetes distribution: {:?}", distro);
// --- Step 4: Create masterkey secret ------------------------------------
debug!(
"[Zitadel] Creating masterkey secret '{}' in namespace '{}'",
@@ -254,13 +290,7 @@ impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Interpret<T> for Zitade
..Secret::default()
};
match topology
.k8s_client()
.await
.map_err(|e| InterpretError::new(format!("Failed to get k8s client : {e}")))?
.create(&masterkey_secret, Some(NAMESPACE))
.await
{
match k8s_client.create(&masterkey_secret, Some(NAMESPACE)).await {
Ok(_) => {
info!(
"[Zitadel] Masterkey secret '{}' created",
@@ -288,16 +318,15 @@ impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Interpret<T> for Zitade
MASTERKEY_SECRET_NAME
);
// --- Step 4: Build Helm values ------------------------------------
// --- Step 5: Build Helm values ------------------------------------
warn!(
"[Zitadel] Applying TLS-enabled ingress defaults for OKD/OpenShift. \
cert-manager annotations are included as optional hints and are \
ignored on clusters without cert-manager."
);
let values_yaml = format!(
r#"image:
let values_yaml = match distro {
KubernetesDistribution::OpenshiftFamily => {
warn!(
"[Zitadel] Applying OpenShift-specific ingress with TLS and cert-manager annotations."
);
format!(
r#"image:
tag: {zitadel_version}
zitadel:
masterkeySecretName: "{MASTERKEY_SECRET_NAME}"
@@ -330,8 +359,6 @@ zitadel:
Username: postgres
SSL:
Mode: require
# Directly import credentials from the postgres secret
# TODO : use a less privileged postgres user
env:
- name: ZITADEL_DATABASE_POSTGRES_USER_USERNAME
valueFrom:
@@ -353,7 +380,6 @@ env:
secretKeyRef:
name: "{pg_superuser_secret}"
key: password
# Security context for OpenShift restricted PSA compliance
podSecurityContext:
runAsNonRoot: true
runAsUser: null
@@ -370,7 +396,6 @@ securityContext:
fsGroup: null
seccompProfile:
type: RuntimeDefault
# Init job security context (runs before main deployment)
initJob:
podSecurityContext:
runAsNonRoot: true
@@ -388,7 +413,6 @@ initJob:
fsGroup: null
seccompProfile:
type: RuntimeDefault
# Setup job security context
setupJob:
podSecurityContext:
runAsNonRoot: true
@@ -417,10 +441,9 @@ ingress:
- path: /
pathType: Prefix
tls:
- hosts:
- secretName: zitadel-tls
hosts:
- "{host}"
secretName: "{host}-tls"
login:
enabled: true
podSecurityContext:
@@ -450,15 +473,170 @@ login:
- path: /ui/v2/login
pathType: Prefix
tls:
- hosts:
- "{host}"
secretName: "{host}-tls""#,
zitadel_version = self.zitadel_version
);
- secretName: zitadel-login-tls
hosts:
- "{host}""#,
zitadel_version = self.zitadel_version,
host = host,
db_host = db_host,
db_port = db_port,
admin_password = admin_password,
pg_superuser_secret = pg_superuser_secret
)
}
KubernetesDistribution::K3sFamily | KubernetesDistribution::Default => {
warn!("[Zitadel] Applying k3d/generic ingress without TLS (HTTP only).");
// The Zitadel image defines User: "zitadel" (non-numeric).
// With runAsNonRoot: true, kubelet needs a numeric UID to verify
// the user is non-root. The "zitadel" user maps to UID 1000.
format!(
r#"image:
tag: {zitadel_version}
zitadel:
masterkeySecretName: "{MASTERKEY_SECRET_NAME}"
configmapConfig:
ExternalDomain: "{host}"
ExternalSecure: false
FirstInstance:
Org:
Human:
UserName: "admin"
Password: "{admin_password}"
FirstName: "Zitadel"
LastName: "Admin"
Email: "admin@zitadel.example.com"
PasswordChangeRequired: true
TLS:
Enabled: false
Database:
Postgres:
Host: "{db_host}"
Port: {db_port}
Database: zitadel
MaxOpenConns: 20
MaxIdleConns: 10
User:
Username: postgres
SSL:
Mode: require
Admin:
Username: postgres
SSL:
Mode: require
env:
- name: ZITADEL_DATABASE_POSTGRES_USER_USERNAME
valueFrom:
secretKeyRef:
name: "{pg_superuser_secret}"
key: user
- name: ZITADEL_DATABASE_POSTGRES_USER_PASSWORD
valueFrom:
secretKeyRef:
name: "{pg_superuser_secret}"
key: password
- name: ZITADEL_DATABASE_POSTGRES_ADMIN_USERNAME
valueFrom:
secretKeyRef:
name: "{pg_superuser_secret}"
key: user
- name: ZITADEL_DATABASE_POSTGRES_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
name: "{pg_superuser_secret}"
key: password
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
initJob:
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
setupJob:
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
ingress:
enabled: true
hosts:
- host: "{host}"
paths:
- path: /
pathType: Prefix
tls: []
login:
enabled: true
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
ingress:
enabled: true
hosts:
- host: "{host}"
paths:
- path: /ui/v2/login
pathType: Prefix
tls: []"#,
zitadel_version = self.zitadel_version,
host = host,
db_host = db_host,
db_port = db_port,
admin_password = admin_password,
pg_superuser_secret = pg_superuser_secret
)
}
};
trace!("[Zitadel] Helm values YAML:\n{values_yaml}");
// --- Step 6: Deploy Helm chart ------------------------------------
info!(
"[Zitadel] Deploying Helm chart 'zitadel/zitadel' as release 'zitadel' in namespace '{NAMESPACE}'"
@@ -482,17 +660,25 @@ login:
.interpret(inventory, topology)
.await;
let protocol = if self.external_secure {
"https"
} else {
"http"
};
match &result {
Ok(_) => info!(
"[Zitadel] Helm chart deployed successfully\n\n\
===== ZITADEL DEPLOYMENT COMPLETE =====\n\
Login URL: {protocol}://{host}\n\
Username: admin\n\
Password: {admin_password}\n\n\
IMPORTANT: The password is saved in ConfigMap 'zitadel-config-yaml'\n\
and must be changed on first login. Save the credentials in a\n\
secure location after changing them.\n\
=========================================",
protocol = protocol,
host = self.host,
admin_password = admin_password
),
Err(e) => error!("[Zitadel] Helm chart deployment failed: {e}"),
}


@@ -0,0 +1,478 @@
use std::path::PathBuf;
use async_trait::async_trait;
use log::{debug, info, warn};
use serde::{Deserialize, Serialize};
use crate::{
data::Version,
interpret::{Interpret, InterpretError, InterpretName, InterpretStatus, Outcome},
inventory::Inventory,
score::Score,
topology::{K8sclient, Topology},
};
use harmony_types::id::Id;
const ADMIN_PAT_SECRET: &str = "iam-admin-pat";
const ZITADEL_NAMESPACE: &str = "zitadel";
/// Type of OIDC application to create.
#[derive(Debug, Clone, Serialize)]
pub enum ZitadelAppType {
/// OAuth 2.0 Device Authorization Grant (RFC 8628).
/// For CLI tools, SSH sessions, containers, and headless environments.
DeviceCode,
}
/// An OIDC application to create in a Zitadel project.
#[derive(Debug, Clone, Serialize)]
pub struct ZitadelApplication {
pub project_name: String,
pub app_name: String,
pub app_type: ZitadelAppType,
}
/// A machine user for service-to-service automation.
#[derive(Debug, Clone, Serialize)]
pub struct ZitadelMachineUser {
pub username: String,
pub name: String,
/// If true, creates a Personal Access Token and includes it in the Outcome details.
pub create_pat: bool,
}
/// Score that provisions identity resources in a deployed Zitadel instance.
///
/// This is the "day two" counterpart to [`ZitadelScore`] (which handles Helm
/// deployment). It creates projects, OIDC applications, and machine users
/// via Zitadel's Management API, authenticated with the admin PAT from the
/// `iam-admin-pat` K8s secret (provisioned by the Helm chart).
///
/// All operations are idempotent: existing resources are detected and skipped.
/// The `client_id` for created applications is cached locally at
/// `~/.local/share/harmony/zitadel/client-config.json`.
#[derive(Debug, Clone, Serialize)]
pub struct ZitadelSetupScore {
/// Zitadel instance hostname (must match the ZitadelScore's `host`).
pub host: String,
/// HTTP port for the Zitadel API (default: 8080 for k3d).
pub port: u16,
/// Whether to skip TLS verification (default: true for local dev).
pub skip_tls: bool,
/// OIDC applications to create.
#[serde(default)]
pub applications: Vec<ZitadelApplication>,
/// Machine users to create.
#[serde(default)]
pub machine_users: Vec<ZitadelMachineUser>,
}
/// Cached Zitadel provisioning results.
#[derive(Debug, Serialize, Deserialize)]
pub struct ZitadelClientConfig {
pub project_id: Option<String>,
pub apps: std::collections::HashMap<String, String>, // app_name -> client_id
}
impl ZitadelClientConfig {
fn cache_path() -> PathBuf {
directories::BaseDirs::new()
.map(|dirs| {
dirs.data_dir()
.join("harmony")
.join("zitadel")
.join("client-config.json")
})
.unwrap_or_else(|| PathBuf::from("/tmp/harmony-zitadel-client-config.json"))
}
pub fn load() -> Option<Self> {
let path = Self::cache_path();
std::fs::read_to_string(&path)
.ok()
.and_then(|s| serde_json::from_str(&s).ok())
}
fn save(&self) -> Result<(), String> {
let path = Self::cache_path();
if let Some(parent) = path.parent() {
std::fs::create_dir_all(parent)
.map_err(|e| format!("Failed to create cache dir: {e}"))?;
}
let json = serde_json::to_string_pretty(self)
.map_err(|e| format!("Failed to serialize config: {e}"))?;
std::fs::write(&path, json).map_err(|e| format!("Failed to write cache: {e}"))?;
Ok(())
}
/// Get the client_id for a named application (from cache).
pub fn client_id(&self, app_name: &str) -> Option<&String> {
self.apps.get(app_name)
}
}
impl<T: Topology + K8sclient> Score<T> for ZitadelSetupScore {
fn name(&self) -> String {
"ZitadelSetupScore".to_string()
}
fn create_interpret(&self) -> Box<dyn Interpret<T>> {
Box::new(ZitadelSetupInterpret {
score: self.clone(),
})
}
}
// ---------------------------------------------------------------------------
// Interpret
// ---------------------------------------------------------------------------
#[derive(Debug, Clone)]
struct ZitadelSetupInterpret {
score: ZitadelSetupScore,
}
#[derive(Deserialize)]
struct ProjectResponse {
id: String,
}
#[derive(Deserialize)]
struct AppResponse {
#[serde(rename = "clientId")]
client_id: Option<String>,
}
#[derive(Deserialize)]
struct ProjectSearchResult {
result: Option<Vec<ProjectSearchEntry>>,
}
#[derive(Deserialize)]
struct ProjectSearchEntry {
id: String,
name: String,
}
#[derive(Deserialize)]
struct AppSearchResult {
result: Option<Vec<AppSearchEntry>>,
}
#[derive(Deserialize)]
struct AppSearchEntry {
#[allow(dead_code)]
id: String,
name: String,
#[serde(rename = "oidcConfig")]
oidc_config: Option<OidcConfig>,
}
#[derive(Deserialize)]
struct OidcConfig {
#[serde(rename = "clientId")]
client_id: Option<String>,
}
impl ZitadelSetupInterpret {
fn api_url(&self, path: &str) -> String {
format!("http://127.0.0.1:{}{}", self.score.port, path)
}
fn http_client(&self) -> Result<reqwest::Client, String> {
let mut builder = reqwest::Client::builder();
if self.score.skip_tls {
builder = builder.danger_accept_invalid_certs(true);
}
builder
.build()
.map_err(|e| format!("Failed to build HTTP client: {e}"))
}
async fn read_admin_pat(&self, k8s: &harmony_k8s::K8sClient) -> Result<String, InterpretError> {
use k8s_openapi::api::core::v1::Secret;
let secret = k8s
.get_resource::<Secret>(ADMIN_PAT_SECRET, Some(ZITADEL_NAMESPACE))
.await
.map_err(|e| InterpretError::new(format!("Failed to get {ADMIN_PAT_SECRET}: {e}")))?
.ok_or_else(|| {
InterpretError::new(format!(
"Secret '{ADMIN_PAT_SECRET}' not found in namespace '{ZITADEL_NAMESPACE}'"
))
})?;
let data = secret.data.ok_or_else(|| {
InterpretError::new(format!("Secret '{ADMIN_PAT_SECRET}' has no data"))
})?;
let pat_bytes = data.get("pat").ok_or_else(|| {
InterpretError::new(format!("Secret '{ADMIN_PAT_SECRET}' has no 'pat' key"))
})?;
let pat = String::from_utf8(pat_bytes.0.clone())
.map_err(|e| InterpretError::new(format!("PAT is not valid UTF-8: {e}")))?;
Ok(pat.trim().to_string())
}
async fn find_project(
&self,
client: &reqwest::Client,
pat: &str,
name: &str,
) -> Result<Option<String>, String> {
let resp = client
.post(self.api_url("/management/v1/projects/_search"))
.header("Host", &self.score.host)
.bearer_auth(pat)
.json(&serde_json::json!({}))
.send()
.await
.map_err(|e| format!("Failed to search projects: {e}"))?;
let result: ProjectSearchResult = resp
.json()
.await
.map_err(|e| format!("Failed to parse project search: {e}"))?;
Ok(result
.result
.unwrap_or_default()
.into_iter()
.find(|p| p.name == name)
.map(|p| p.id))
}
async fn create_project(
&self,
client: &reqwest::Client,
pat: &str,
name: &str,
) -> Result<String, String> {
let resp = client
.post(self.api_url("/management/v1/projects"))
.header("Host", &self.score.host)
.bearer_auth(pat)
.json(&serde_json::json!({
"name": name,
"projectRoleAssertion": true
}))
.send()
.await
.map_err(|e| format!("Failed to create project: {e}"))?;
if !resp.status().is_success() {
let body = resp.text().await.unwrap_or_default();
return Err(format!("Create project failed: {body}"));
}
let result: ProjectResponse =
serde_json::from_str(&resp.text().await.map_err(|e| format!("Read body: {e}"))?)
.map_err(|e| format!("Parse project response: {e}"))?;
Ok(result.id)
}
async fn find_app(
&self,
client: &reqwest::Client,
pat: &str,
project_id: &str,
app_name: &str,
) -> Result<Option<String>, String> {
let resp = client
.post(self.api_url(&format!(
"/management/v1/projects/{project_id}/apps/_search"
)))
.header("Host", &self.score.host)
.bearer_auth(pat)
.json(&serde_json::json!({}))
.send()
.await
.map_err(|e| format!("Failed to search apps: {e}"))?;
let result: AppSearchResult = resp
.json()
.await
.map_err(|e| format!("Failed to parse app search: {e}"))?;
Ok(result
.result
.unwrap_or_default()
.into_iter()
.find(|a| a.name == app_name)
.and_then(|a| a.oidc_config.and_then(|c| c.client_id)))
}
async fn create_device_code_app(
&self,
client: &reqwest::Client,
pat: &str,
project_id: &str,
app_name: &str,
) -> Result<String, String> {
let resp = client
.post(self.api_url(&format!("/management/v1/projects/{project_id}/apps/oidc")))
.header("Host", &self.score.host)
.bearer_auth(pat)
.json(&serde_json::json!({
"name": app_name,
"redirectUris": [],
"responseTypes": ["OIDC_RESPONSE_TYPE_CODE"],
"grantTypes": ["OIDC_GRANT_TYPE_DEVICE_CODE"],
"appType": "OIDC_APP_TYPE_NATIVE",
"authMethodType": "OIDC_AUTH_METHOD_TYPE_NONE"
}))
.send()
.await
.map_err(|e| format!("Failed to create app: {e}"))?;
if !resp.status().is_success() {
let body = resp.text().await.unwrap_or_default();
return Err(format!("Create app failed: {body}"));
}
let result: AppResponse =
serde_json::from_str(&resp.text().await.map_err(|e| format!("Read body: {e}"))?)
.map_err(|e| format!("Parse app response: {e}"))?;
result
.client_id
.ok_or_else(|| "No clientId in app response".to_string())
}
async fn ensure_app(
&self,
client: &reqwest::Client,
pat: &str,
app: &ZitadelApplication,
config: &mut ZitadelClientConfig,
) -> Result<String, InterpretError> {
// Check cache first
if let Some(client_id) = config.client_id(&app.app_name) {
debug!(
"[ZitadelSetup] App '{}' found in cache: {}",
app.app_name, client_id
);
return Ok(client_id.clone());
}
// Ensure project exists
let project_id = if let Some(id) = &config.project_id {
id.clone()
} else {
let id = match self.find_project(client, pat, &app.project_name).await {
Ok(Some(id)) => {
info!(
"[ZitadelSetup] Project '{}' already exists: {}",
app.project_name, id
);
id
}
Ok(None) => {
let id = self
.create_project(client, pat, &app.project_name)
.await
.map_err(InterpretError::new)?;
info!(
"[ZitadelSetup] Project '{}' created: {}",
app.project_name, id
);
id
}
Err(e) => return Err(InterpretError::new(e)),
};
config.project_id = Some(id.clone());
id
};
// Check if app already exists
if let Some(client_id) = self
.find_app(client, pat, &project_id, &app.app_name)
.await
.map_err(InterpretError::new)?
{
info!(
"[ZitadelSetup] App '{}' already exists: {}",
app.app_name, client_id
);
config.apps.insert(app.app_name.clone(), client_id.clone());
return Ok(client_id);
}
// Create app
let client_id = match &app.app_type {
ZitadelAppType::DeviceCode => self
.create_device_code_app(client, pat, &project_id, &app.app_name)
.await
.map_err(InterpretError::new)?,
};
info!(
"[ZitadelSetup] App '{}' created: {}",
app.app_name, client_id
);
config.apps.insert(app.app_name.clone(), client_id.clone());
Ok(client_id)
}
}
#[async_trait]
impl<T: Topology + K8sclient> Interpret<T> for ZitadelSetupInterpret {
async fn execute(
&self,
_inventory: &Inventory,
topology: &T,
) -> Result<Outcome, InterpretError> {
let k8s = topology
.k8s_client()
.await
.map_err(|e| InterpretError::new(format!("Failed to get K8s client: {e}")))?;
let pat = self.read_admin_pat(&k8s).await?;
debug!("[ZitadelSetup] Admin PAT loaded from secret");
let client = self.http_client().map_err(InterpretError::new)?;
let mut config = ZitadelClientConfig::load().unwrap_or(ZitadelClientConfig {
project_id: None,
apps: std::collections::HashMap::new(),
});
let mut details = Vec::new();
for app in &self.score.applications {
let client_id = self.ensure_app(&client, &pat, app, &mut config).await?;
details.push(format!("{}={}", app.app_name, client_id));
}
// TODO: machine user provisioning (future iteration)
if !self.score.machine_users.is_empty() {
warn!("[ZitadelSetup] Machine user provisioning not yet implemented");
}
config.save().map_err(InterpretError::new)?;
Ok(Outcome {
status: InterpretStatus::SUCCESS,
message: "Zitadel identity resources provisioned".to_string(),
details,
})
}
fn get_name(&self) -> InterpretName {
InterpretName::Custom("ZitadelSetup")
}
fn get_version(&self) -> Version {
todo!()
}
fn get_status(&self) -> InterpretStatus {
todo!()
}
fn get_children(&self) -> Vec<Id> {
vec![]
}
}


@@ -25,6 +25,7 @@ cli = [
"dep:clap",
"dep:indicatif",
"dep:inquire",
"dep:env_logger",
]
reqwest = ["dep:reqwest"]
@@ -41,9 +42,10 @@ async-trait.workspace = true
url.workspace = true
# CLI only
clap = { version = "4.5", features = ["derive", "env"], optional = true }
indicatif = { version = "0.18", optional = true }
inquire = { version = "0.7", optional = true }
env_logger = { version = "0.11", optional = true }
# S3 only
aws-sdk-s3 = { version = "1", optional = true }


@@ -78,3 +78,56 @@ pub struct StoredAsset {
pub size: u64,
pub key: String,
}
#[cfg(test)]
mod tests {
use super::*;
fn test_asset(checksum: &str) -> Asset {
Asset::new(
Url::parse("https://example.com/test.iso").unwrap(),
checksum.to_string(),
ChecksumAlgo::BLAKE3,
"test.iso".to_string(),
)
}
#[test]
fn asset_path_uses_checksum_prefix_and_filename() {
let cache = LocalCache::new(PathBuf::from("/tmp/test_cache"));
let asset = test_asset("abcdef1234567890abcdef1234567890");
let path = cache.path_for(&asset);
assert_eq!(
path,
PathBuf::from("/tmp/test_cache/abcdef1234567890/test.iso")
);
}
#[test]
fn asset_path_handles_short_checksum() {
let cache = LocalCache::new(PathBuf::from("/tmp/test_cache"));
let asset = test_asset("abc");
let path = cache.path_for(&asset);
assert_eq!(path, PathBuf::from("/tmp/test_cache/abc/test.iso"));
}
#[test]
fn cache_key_dir_is_prefix_only() {
let cache = LocalCache::new(PathBuf::from("/tmp/test_cache"));
let asset = test_asset("abcdef1234567890abcdef1234567890");
let dir = cache.cache_key_dir(&asset);
assert_eq!(dir, PathBuf::from("/tmp/test_cache/abcdef1234567890"));
}
#[test]
fn asset_with_size() {
let asset = test_asset("abc").with_size(1024);
assert_eq!(asset.size, Some(1024));
}
#[test]
fn formatted_checksum_includes_algo_prefix() {
let asset = test_asset("deadbeef");
assert_eq!(asset.formatted_checksum(), "blake3:deadbeef");
}
}


@@ -1,6 +1,7 @@
pub mod checksum;
pub mod download;
pub mod upload;
pub mod upload_folder;
pub mod verify;
use clap::{Parser, Subcommand};
@@ -18,15 +19,21 @@ pub struct Cli {
#[derive(Subcommand, Debug)]
pub enum Commands {
/// Upload a single file to S3
Upload(upload::UploadArgs),
/// Upload an entire directory to S3, preserving structure
UploadFolder(upload_folder::UploadFolderArgs),
/// Download a file with checksum verification
Download(download::DownloadArgs),
/// Compute checksum for a local file
Checksum(checksum::ChecksumArgs),
/// Verify a file against an expected checksum
Verify(verify::VerifyArgs),
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
env_logger::init();
log::info!("Starting harmony_assets CLI");
let cli = Cli::parse();
@@ -34,6 +41,9 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
Commands::Upload(args) => {
upload::execute(args).await?;
}
Commands::UploadFolder(args) => {
upload_folder::execute(args).await?;
}
Commands::Download(args) => {
download::execute(args).await?;
}


@@ -9,7 +9,8 @@ pub struct UploadArgs {
pub key: Option<String>,
#[arg(short, long)]
pub content_type: Option<String>,
/// Set uploaded object to public-read ACL (default: private)
#[arg(long, default_value_t = false)]
pub public_read: bool,
#[arg(short, long)]
pub endpoint: Option<String>,


@@ -0,0 +1,220 @@
use clap::Parser;
use harmony_assets::{S3Config, S3Store};
use indicatif::{ProgressBar, ProgressStyle};
use std::path::Path;
#[derive(Parser, Debug)]
pub struct UploadFolderArgs {
/// Local directory to upload
pub source: String,
/// S3 key prefix (folder path in bucket). Defaults to directory name.
#[arg(short, long)]
pub key_prefix: Option<String>,
/// Set uploaded objects to public-read ACL
#[arg(long, default_value_t = false)]
pub public_read: bool,
/// S3-compatible endpoint URL (e.g. https://s3.example.com)
#[arg(short, long, env = "S3_ENDPOINT")]
pub endpoint: Option<String>,
/// S3 bucket name
#[arg(short, long, env = "S3_BUCKET")]
pub bucket: Option<String>,
/// S3 region
#[arg(short, long, env = "S3_REGION")]
pub region: Option<String>,
/// AWS access key ID (falls back to standard AWS credential chain)
#[arg(long, env = "AWS_ACCESS_KEY_ID")]
pub access_key_id: Option<String>,
/// AWS secret access key (falls back to standard AWS credential chain)
#[arg(long, env = "AWS_SECRET_ACCESS_KEY")]
pub secret_access_key: Option<String>,
/// Skip confirmation prompt
#[arg(short, long, default_value_t = false)]
pub yes: bool,
}
fn guess_content_type(path: &Path) -> Option<String> {
match path.extension().and_then(|e| e.to_str()) {
Some("iso") => Some("application/x-iso9660-image".into()),
Some("img") | Some("raw") => Some("application/octet-stream".into()),
Some("gz") | Some("tgz") => Some("application/gzip".into()),
Some("xz") => Some("application/x-xz".into()),
Some("bz2") => Some("application/x-bzip2".into()),
Some("tar") => Some("application/x-tar".into()),
Some("zip") => Some("application/zip".into()),
Some("vmlinuz") | Some("initramfs") | Some("kernel") => {
Some("application/octet-stream".into())
}
Some("json") => Some("application/json".into()),
Some("yaml") | Some("yml") => Some("application/x-yaml".into()),
Some("txt") | Some("cfg") | Some("conf") => Some("text/plain".into()),
Some("html") | Some("htm") => Some("text/html".into()),
Some("ipxe") => Some("text/plain".into()),
_ => None,
}
}
/// Recursively count files in a directory, returning (file_count, total_bytes).
async fn count_files(dir: &Path) -> Result<(usize, u64), Box<dyn std::error::Error>> {
let mut count = 0usize;
let mut total_size = 0u64;
let mut stack = vec![dir.to_path_buf()];
while let Some(d) = stack.pop() {
let mut entries = tokio::fs::read_dir(&d).await?;
while let Some(entry) = entries.next_entry().await? {
let path = entry.path();
if path.is_dir() {
stack.push(path);
} else if path.is_file() {
count += 1;
total_size += tokio::fs::metadata(&path).await?.len();
}
}
}
Ok((count, total_size))
}
pub async fn execute(args: UploadFolderArgs) -> Result<(), Box<dyn std::error::Error>> {
let source_dir = Path::new(&args.source);
if !source_dir.is_dir() {
eprintln!("Error: Not a directory: {}", args.source);
std::process::exit(1);
}
let dir_name = source_dir
.file_name()
.and_then(|n| n.to_str())
.unwrap_or("upload");
let key_prefix = args.key_prefix.unwrap_or_else(|| dir_name.to_string());
let bucket = args.bucket.unwrap_or_else(|| {
inquire::Text::new("S3 Bucket name:")
.with_default("harmony-assets")
.prompt()
.unwrap()
});
let region = args.region.unwrap_or_else(|| {
inquire::Text::new("S3 Region:")
.with_default("us-east-1")
.prompt()
.unwrap()
});
let config = S3Config {
endpoint: args.endpoint.clone(),
bucket: bucket.clone(),
region: region.clone(),
access_key_id: args.access_key_id,
secret_access_key: args.secret_access_key,
public_read: args.public_read,
};
let (file_count, total_size) = count_files(source_dir).await?;
println!("Upload Folder Configuration:");
println!(" Source: {}", args.source);
println!(" Key prefix: {}", key_prefix);
println!(" Bucket: {}", bucket);
println!(" Region: {}", region);
if let Some(ref ep) = args.endpoint {
println!(" Endpoint: {}", ep);
}
println!(
" ACL: {}",
if args.public_read {
"public-read"
} else {
"private"
}
);
println!(
" Files: {} ({:.2} MB total)",
file_count,
total_size as f64 / 1024.0 / 1024.0
);
println!();
if file_count == 0 {
println!("No files to upload.");
return Ok(());
}
if !args.yes {
let confirm = inquire::Confirm::new("Proceed with upload?")
.with_default(true)
.prompt()?;
if !confirm {
println!("Upload cancelled.");
return Ok(());
}
}
let store = S3Store::new(config)
.await
.map_err(|e| format!("Failed to initialize S3 client: {}", e))?;
let pb = ProgressBar::new(file_count as u64);
pb.set_style(
ProgressStyle::default_bar()
.template("{spinner:.green} [{elapsed_precise}] [{bar:40}] {pos}/{len} files ({msg})")?
.progress_chars("=>-"),
);
let results = store
.store_folder(
source_dir,
&key_prefix,
Some(&|path: &Path| {
pb.set_message(
path.file_name()
.and_then(|n| n.to_str())
.unwrap_or("")
.to_string(),
);
pb.inc(1);
guess_content_type(path)
}),
)
.await;
pb.finish_with_message("done");
match results {
Ok(assets) => {
println!("\nUpload complete! {} files uploaded.\n", assets.len());
for asset in &assets {
println!(
" {} ({} bytes, {}:{})",
asset.key,
asset.size,
asset.checksum_algo.name(),
&asset.checksum[..16]
);
}
if let Some(first) = assets.first() {
let base_url = first
.url
.as_str()
.strip_suffix(&first.key)
.unwrap_or(first.url.as_str());
println!("\nBase URL: {}{}/", base_url, key_prefix);
}
Ok(())
}
Err(e) => {
eprintln!("\nUpload failed: {}", e);
std::process::exit(1);
}
}
}


@@ -135,3 +135,160 @@ impl AssetStore for LocalStore {
.map_err(|_| AssetError::StoreError("Could not convert path to file URL".to_string()))
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::hash::ChecksumAlgo;
#[test]
fn local_store_default_uses_cache_dir() {
let store = LocalStore::default();
assert!(!store.base_dir.to_string_lossy().is_empty());
}
#[test]
fn exists_returns_false_for_missing_key() {
let dir = tempfile::tempdir().unwrap();
let store = LocalStore::new(dir.path().to_path_buf());
let result = tokio_test::block_on(store.exists("nonexistent"));
assert!(!result.unwrap());
}
#[test]
fn exists_returns_true_for_present_key() {
let dir = tempfile::tempdir().unwrap();
std::fs::write(dir.path().join("file.txt"), b"hello").unwrap();
let store = LocalStore::new(dir.path().to_path_buf());
let result = tokio_test::block_on(store.exists("file.txt"));
assert!(result.unwrap());
}
#[test]
fn url_for_returns_file_url() {
let store = LocalStore::new(PathBuf::from("/tmp/test"));
let url = store.url_for("subdir/file.iso").unwrap();
assert_eq!(url.scheme(), "file");
assert!(url.path().ends_with("/tmp/test/subdir/file.iso"));
}
#[cfg(feature = "reqwest")]
mod download_tests {
use super::*;
use httptest::{Expectation, Server, matchers::request, responders::*};
fn test_asset_with_url(url: &str, checksum: &str) -> Asset {
Asset::new(
Url::parse(url).unwrap(),
checksum.to_string(),
ChecksumAlgo::BLAKE3,
"test.bin".to_string(),
)
}
#[tokio::test]
async fn download_and_cache_file() {
let server = Server::run();
let content = b"test file content for download";
server.expect(
Expectation::matching(request::method_path("GET", "/test.bin"))
.respond_with(status_code(200).body(content.as_ref())),
);
let cache_dir = tempfile::tempdir().unwrap();
let cache = LocalCache::new(cache_dir.path().to_path_buf());
let store = LocalStore::new(cache_dir.path().to_path_buf());
// Compute expected checksum
let checksum = {
let tmp = tempfile::NamedTempFile::new().unwrap();
std::io::Write::write_all(&mut tmp.as_file(), content).unwrap();
crate::checksum_for_path(tmp.path(), ChecksumAlgo::BLAKE3)
.await
.unwrap()
};
let url = server.url("/test.bin").to_string();
let asset = test_asset_with_url(&url, &checksum);
let result = store.fetch(&asset, &cache, None).await;
assert!(result.is_ok(), "fetch failed: {:?}", result.err());
let path = result.unwrap();
assert!(path.exists());
assert_eq!(tokio::fs::read(&path).await.unwrap(), content);
}
#[tokio::test]
async fn download_returns_cached_on_second_fetch() {
let server = Server::run();
let content = b"cached content";
// Only expect one request — second fetch should hit cache
server.expect(
Expectation::matching(request::method_path("GET", "/cached.bin"))
.times(1)
.respond_with(status_code(200).body(content.as_ref())),
);
let cache_dir = tempfile::tempdir().unwrap();
let cache = LocalCache::new(cache_dir.path().to_path_buf());
let store = LocalStore::new(cache_dir.path().to_path_buf());
let checksum = {
let tmp = tempfile::NamedTempFile::new().unwrap();
std::io::Write::write_all(&mut tmp.as_file(), content).unwrap();
crate::checksum_for_path(tmp.path(), ChecksumAlgo::BLAKE3)
.await
.unwrap()
};
let url = server.url("/cached.bin").to_string();
let asset = test_asset_with_url(&url, &checksum);
let path1 = store.fetch(&asset, &cache, None).await.unwrap();
let path2 = store.fetch(&asset, &cache, None).await.unwrap();
assert_eq!(path1, path2);
}
#[tokio::test]
async fn download_fails_on_checksum_mismatch() {
let server = Server::run();
server.expect(
Expectation::matching(request::method_path("GET", "/bad.bin"))
.respond_with(status_code(200).body("actual content")),
);
let cache_dir = tempfile::tempdir().unwrap();
let cache = LocalCache::new(cache_dir.path().to_path_buf());
let store = LocalStore::new(cache_dir.path().to_path_buf());
let url = server.url("/bad.bin").to_string();
let asset = test_asset_with_url(
&url,
"0000000000000000000000000000000000000000000000000000000000000000",
);
let result = store.fetch(&asset, &cache, None).await;
assert!(matches!(result, Err(AssetError::ChecksumMismatch { .. })));
}
#[tokio::test]
async fn download_fails_on_404() {
let server = Server::run();
server.expect(
Expectation::matching(request::method_path("GET", "/missing.bin"))
.respond_with(status_code(404)),
);
let cache_dir = tempfile::tempdir().unwrap();
let cache = LocalCache::new(cache_dir.path().to_path_buf());
let store = LocalStore::new(cache_dir.path().to_path_buf());
let url = server.url("/missing.bin").to_string();
let asset = test_asset_with_url(&url, "deadbeef");
let result = store.fetch(&asset, &cache, None).await;
assert!(matches!(result, Err(AssetError::DownloadFailed(_))));
}
}
}


@@ -39,14 +39,32 @@ pub struct S3Store {
impl S3Store {
pub async fn new(config: S3Config) -> Result<Self, AssetError> {
let mut cfg_builder = aws_config::defaults(aws_config::BehaviorVersion::latest())
.region(aws_config::Region::new(config.region.clone()));
if let Some(ref endpoint) = config.endpoint {
cfg_builder = cfg_builder.endpoint_url(endpoint);
}
if let (Some(key), Some(secret)) = (&config.access_key_id, &config.secret_access_key) {
cfg_builder = cfg_builder.credentials_provider(aws_sdk_s3::config::Credentials::new(
key,
secret,
None,
None,
"harmony_assets",
));
}
let cfg = cfg_builder.load().await;
let mut s3_config_builder = aws_sdk_s3::config::Builder::from(&cfg);
// For custom endpoints (Ceph, MinIO), force path-style addressing
if config.endpoint.is_some() {
s3_config_builder = s3_config_builder.force_path_style(true);
}
let client = S3Client::from_conf(s3_config_builder.build());
Ok(Self { client, config })
}
@@ -124,6 +142,65 @@ impl S3Store {
key: key.to_string(),
})
}
/// Upload an entire directory to S3, preserving relative paths as key prefixes.
///
/// Files are uploaded with their path relative to `source_dir` appended to `key_prefix`.
/// Returns a list of `StoredAsset` for each uploaded file.
pub async fn store_folder(
&self,
source_dir: &Path,
key_prefix: &str,
content_type_fn: Option<&dyn Fn(&Path) -> Option<String>>,
) -> Result<Vec<StoredAsset>, AssetError> {
if !source_dir.is_dir() {
return Err(AssetError::IoError(std::io::Error::new(
std::io::ErrorKind::NotADirectory,
format!("{} is not a directory", source_dir.display()),
)));
}
let mut results = Vec::new();
let mut stack = vec![source_dir.to_path_buf()];
while let Some(dir) = stack.pop() {
let mut entries = tokio::fs::read_dir(&dir)
.await
.map_err(AssetError::IoError)?;
while let Some(entry) = entries.next_entry().await.map_err(AssetError::IoError)? {
let path = entry.path();
if path.is_dir() {
stack.push(path);
} else if path.is_file() {
let relative = path
.strip_prefix(source_dir)
.map_err(|e| AssetError::StoreError(e.to_string()))?;
let key = if key_prefix.is_empty() {
relative.to_string_lossy().to_string()
} else {
format!(
"{}/{}",
key_prefix.trim_end_matches('/'),
relative.to_string_lossy()
)
};
let ct = content_type_fn.and_then(|f| f(&path));
log::info!(
"Uploading {} -> s3://{}/{}",
path.display(),
self.config.bucket,
key
);
let stored = self.store(&path, &key, ct.as_deref()).await?;
results.push(stored);
}
}
}
Ok(results)
}
}
use crate::store::AssetStore;
@@ -233,3 +310,72 @@ fn extract_s3_key(url: &Url, bucket: &str) -> Result<String, AssetError> {
Ok(path.to_string())
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn extract_key_strips_bucket_prefix() {
let url = Url::parse("https://s3.example.com/my-bucket/path/to/file.iso").unwrap();
let key = extract_s3_key(&url, "my-bucket").unwrap();
assert_eq!(key, "path/to/file.iso");
}
#[test]
fn extract_key_returns_full_path_when_no_bucket_prefix() {
let url = Url::parse("https://cdn.example.com/assets/file.iso").unwrap();
let key = extract_s3_key(&url, "other-bucket").unwrap();
assert_eq!(key, "assets/file.iso");
}
#[test]
fn extract_key_returns_empty_for_bucket_only() {
let url = Url::parse("https://s3.example.com/my-bucket").unwrap();
let key = extract_s3_key(&url, "my-bucket").unwrap();
assert_eq!(key, "");
}
#[test]
fn s3_config_default_region() {
let config = S3Config::default();
assert_eq!(config.region, "us-east-1");
assert!(config.endpoint.is_none());
assert!(config.access_key_id.is_none());
assert!(config.secret_access_key.is_none());
}
#[test]
fn public_url_with_custom_endpoint() {
// We can't call public_url without an S3Store, but we can test the logic
let config = S3Config {
endpoint: Some("https://s3.ceph.local".to_string()),
bucket: "assets".to_string(),
region: "us-east-1".to_string(),
..Default::default()
};
let expected = format!(
"{}/{}/{}",
"https://s3.ceph.local", config.bucket, "boot/kernel.img"
);
assert_eq!(expected, "https://s3.ceph.local/assets/boot/kernel.img");
}
#[test]
fn public_url_with_aws() {
let config = S3Config {
endpoint: None,
bucket: "my-assets".to_string(),
region: "eu-west-1".to_string(),
..Default::default()
};
let expected = format!(
"https://{}.s3.{}.amazonaws.com/{}",
config.bucket, config.region, "boot/kernel.img"
);
assert_eq!(
expected,
"https://my-assets.s3.eu-west-1.amazonaws.com/boot/kernel.img"
);
}
}


@@ -6,7 +6,7 @@ readme.workspace = true
license.workspace = true
[dependencies]
clap.workspace = true
tokio.workspace = true
env_logger.workspace = true
log.workspace = true


@@ -37,10 +37,7 @@ async fn main() -> anyhow::Result<()> {
env_logger::init();
let sqlite = SqliteSource::default().await?;
let manager = ConfigManager::new(vec![Arc::new(EnvSource), Arc::new(sqlite)]);
info!("1. Attempting to get TestConfig (expect NotFound on first run)...");
match manager.get::<TestConfig>().await {
@@ -74,7 +71,10 @@ async fn main() -> anyhow::Result<()> {
count: 99,
};
unsafe {
std::env::set_var(
"HARMONY_CONFIG_TestConfig",
serde_json::to_string(&env_config)?,
);
}
let from_env: TestConfig = manager.get().await?;
info!(" Got from env: {:?}", from_env);
