feat/opnsense-codegen #256

Merged
johnride merged 121 commits from feat/opnsense-codegen into master 2026-04-09 21:05:05 +00:00

121 Commits

Author SHA1 Message Date
6d4bd3ce8f chore: fix formatting
All checks were successful
Run Check Script / check (pull_request) Successful in 1m50s
2026-04-09 17:02:06 -04:00
5420732240 Merge remote-tracking branch 'origin/master' into feat/opnsense-codegen
Some checks failed
Run Check Script / check (pull_request) Failing after 34s
2026-04-09 16:59:03 -04:00
608e8b3b9d Merge pull request 'feat/opnsense-codegen-type-safe' (#257) from feat/opnsense-codegen-type-safe into feat/opnsense-codegen
All checks were successful
Run Check Script / check (pull_request) Successful in 1m56s
Reviewed-on: #257
2026-04-09 20:54:46 +00:00
ab78143a45 fix(ci): update submodules in CI using a shallow clone to save space and download time
All checks were successful
Run Check Script / check (pull_request) Successful in 1m52s
2026-04-09 16:42:04 -04:00
b18974ba6c chore: run clippy and format on network_stress_test crate
Some checks failed
Run Check Script / check (pull_request) Failing after 1m46s
2026-04-07 10:06:50 -04:00
f43620cff8 fix: eliminate shell injection vector in WebGuiConfigScore, fix unused var
Code review findings:
- WebGuiConfigScore: write PHP script to temp file via SSH instead of
  inline php -r with shell escaping. Eliminates the shell quoting
  attack surface entirely (port is u16 so was safe, but the pattern
  of format!() into shell commands is a code smell)
- parser.rs: prefix unused parameter with underscore
2026-04-07 08:04:16 -04:00
3fe50c8063 fix: add settle time after firmware update reboot
The web UI responds before the API backend (configd/PHP) is fully
ready after a firmware update reboot. Adding 10s settle time between
web UI detection and package install retry fixes the timeout.

Full --full cold start now completes in ~174 seconds:
  Boot + bootstrap: 48s
  Firmware update + reboot: 60s
  Package installs + 12 Scores x2 + idempotency check: 66s
2026-04-07 06:58:59 -04:00
ff47719f21 fix: use correct port after firmware update, set VM to 3 vCPU / 2GB
The firmware update path waited on port 443 after reboot, but the
webgui config persists across reboots (config.xml stays at 9443).
Changed to use OPN_API_PORT which goes through wait_for_https with
port fallback (tries 9443 first, then 443).

Also increase VM resources from 1 vCPU / 1GB to 3 vCPU / 2GB —
cold boot drops from >10 minutes to ~48 seconds.
2026-04-07 06:46:27 -04:00
02cb709888 chore: cargo fmt on wait_for_https port fallback
2026-04-06 21:49:50 -04:00
65367fd7a0 fix: increase boot timeout to 10min, add port fallback in wait_for_https
- Increase VM boot wait from 5 to 10 minutes (cold OPNsense nano first
  boot with filesystem expansion can exceed 5 minutes)
- wait_for_https now tries target port first, then falls back to 443
  on each attempt (handles both fresh VMs on port 443 and already-
  bootstrapped VMs on custom port)
- cargo fmt on network_stress_test and webgui.rs
2026-04-06 19:00:23 -04:00
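The port fallback described in 65367fd7a0 can be sketched roughly as follows. This is a minimal illustration, not the project's `wait_for_https`: it only checks that a TCP connection succeeds (no TLS or HTTP request), and the names `candidate_ports` and `wait_for_port`, as well as the `attempts` and `delay` parameters, are hypothetical.

```rust
use std::net::{IpAddr, SocketAddr, TcpStream};
use std::time::Duration;

/// Ports to probe on each attempt: the target port first, then 443.
/// The fallback is skipped when the target already is 443.
fn candidate_ports(target: u16) -> Vec<u16> {
    if target == 443 {
        vec![443]
    } else {
        vec![target, 443]
    }
}

/// Returns the first port that accepts a TCP connection, or None once all
/// attempts are used up. Assumption: a successful TCP connect stands in for
/// "the web UI answers"; a real check would also speak HTTPS.
fn wait_for_port(ip: IpAddr, target: u16, attempts: u32, delay: Duration) -> Option<u16> {
    for _ in 0..attempts {
        for port in candidate_ports(target) {
            let addr = SocketAddr::new(ip, port);
            if TcpStream::connect_timeout(&addr, Duration::from_secs(2)).is_ok() {
                return Some(port);
            }
        }
        std::thread::sleep(delay);
    }
    None
}
```

Trying both ports on every attempt, rather than exhausting the target port first, lets the same loop handle a fresh VM on 443 and an already-bootstrapped VM on the custom port.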
c7aead7532 feat: extract WebGuiConfigScore from bootstrap, document dependency use case
Split the OPNsense webgui port change from OPNsenseBootstrap into a
proper idempotent Score:

- WebGuiConfigScore reads current port via SSH before changing
- Returns NOOP if already on the target port
- Modifies config.xml via PHP and restarts webgui via configctl
- Runs before LoadBalancerScore to free port 443 for HAProxy

Also:
- Add Config::shell() accessor for SSH access from Scores
- Add WebGuiConfigScore to VM integration example (12 Scores now)
- Document the WebGuiConfig → LoadBalancer ordering dependency
  as a concrete use case in docs/architecture-challenges.md
  (ties into Challenge #2: Runtime Plan & Validation)

The implicit dependency (LoadBalancerScore needs 443 free, which
requires WebGuiConfigScore to run first) remains a convention-based
ordering. This is tracked in architecture-challenges.md alongside
the score_with_dep.rs design sketch.
2026-04-06 17:00:27 -04:00
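The read-before-write behaviour that c7aead7532 describes for WebGuiConfigScore might look like the sketch below. The `Outcome` enum, the `ensure_webgui_port` function, and the two closures are assumptions made for illustration; the real Score reads the port over SSH and edits config.xml via PHP, which is abstracted away here.

```rust
/// Hypothetical result type: either nothing had to change, or the port moved.
#[derive(Debug, PartialEq)]
enum Outcome {
    Noop,
    Changed { from: u16, to: u16 },
}

/// Read the current port first and only write when it differs from the target.
/// `read_port` and `write_port` stand in for the SSH / PHP steps.
fn ensure_webgui_port<R, W>(target: u16, read_port: R, mut write_port: W) -> Result<Outcome, String>
where
    R: FnOnce() -> Result<u16, String>,
    W: FnMut(u16) -> Result<(), String>,
{
    let current = read_port()?;
    if current == target {
        // Already in the desired state: report NOOP and touch nothing.
        return Ok(Outcome::Noop);
    }
    write_port(target)?;
    Ok(Outcome::Changed { from: current, to: target })
}
```

Running it a second time with the same target returns `Outcome::Noop` without calling the writer, which is the property the idempotency test relies on.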
35696c24b3 chore: cargo fmt, update OPNsense example documentation
- Run cargo fmt across opnsense-api, opnsense-config, opnsense-codegen
  (fixes formatting in generated files and hand-written modules)
- Update examples/opnsense/README.md: replace stale VirtualBox docs
  with current API key + cargo run instructions
- Update examples/opnsense_vm_integration/README.md: document
  idempotency test (run twice, assert zero duplicates), add
  build/opnsense-e2e.sh usage instructions
2026-04-06 15:56:59 -04:00
575bcdfba6 fix: surface OPNsense validation errors, fix all required fields
The E2E test revealed that OPNsense validation failures were being
silently swallowed: add/set operations returned {"result": "failed",
"validations": {...}} but the code treated them as success.

Critical fixes:
- add_item/set_item now return Error::Validation on failure instead
  of silently returning empty/failed responses
- VLAN: set pcp (PriorityCodePoint) — required in OPNsense 26.1
- Firewall filter: set sequence and statetype (KeepState)
- SNAT: set sequence
- BINAT: set sequence and destination_net ("any")
- DNAT: set sequence
- VIP: default advbase=1 and advskew=0 (required even for IP aliases)
- HAProxy backend: set mode, algorithm, persistence_cookiemode enums
- HAProxy frontend: set mode, connectionBehaviour enums

E2E test now passes: all 11 Scores run successfully against a real
OPNsense VM, and the idempotency test (run twice, verify counts
unchanged) confirms zero duplicates.
2026-04-06 15:34:49 -04:00
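To illustrate the fix in 575bcdfba6, here is a small sketch of turning a `{"result": "failed", "validations": {...}}` reply into an error instead of a silent success. The `MutationResponse` struct is a hypothetical, already-deserialized stand-in (the real code uses serde-based types), and `check` is not a function from the repository; the "saved" value in the usage note is likewise an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical, already-deserialized shape of an add/set response.
#[derive(Debug, Default)]
struct MutationResponse {
    result: String,
    uuid: String,
    validations: BTreeMap<String, String>,
}

#[derive(Debug, PartialEq)]
enum Error {
    /// Field name -> message, as reported by the server.
    Validation(BTreeMap<String, String>),
}

/// Surface validation failures as Err instead of returning an empty UUID.
fn check(resp: MutationResponse) -> Result<String, Error> {
    if resp.result == "failed" || !resp.validations.is_empty() {
        return Err(Error::Validation(resp.validations));
    }
    Ok(resp.uuid)
}
```

A caller that previously ignored the response body would now see an `Err(Error::Validation(..))` naming the offending fields, for example a missing `pcp` on a VLAN.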
1351c16861 fix: HAProxy backend/frontend enum fields, use rowCount for search verification
- Set mode, algorithm, persistence_cookiemode on HAProxy backend struct
  (OPNsense requires these fields, empty string causes validation failure)
- Set mode, connectionBehaviour on HAProxy frontend struct (same issue)
- Switch verify_state() to use rowCount from raw JSON search responses
  instead of typed SearchRow deserialization (more reliable with OPNsense
  search API pagination)

Found by running E2E tests against real OPNsense VM.
2026-04-06 15:27:39 -04:00
9c5d6d8005 fix: UuidResponse handles validation failures, fix HAProxy type mappings
- Make UuidResponse.uuid default to empty string so validation failures
  ({"result": "failed", "validations": {...}}) don't cause deserialization
  errors. Add is_failed() helper method.
- Fix HAProxy healthcheck construction: map check_type string to
  HealthcheckType enum (was sending empty string, OPNsense rejected it)
- Fix HAProxy server construction: set mode (ServerMode) and type
  (ServerType) enum fields (were defaulting to empty, OPNsense rejected)

Discovered by running E2E tests against real OPNsense VM — the typed
structs with ..Default::default() sent empty strings for required enum
fields, which OPNsense rejected as validation errors.

Still needed: HAProxy backend mode/algorithm and frontend mode/
connectionBehaviour enums, and fixing search API pagination for
filter/snat/vip verification counts.
2026-04-06 14:49:13 -04:00
499a0e32db feat: add idempotency testing and E2E test script
Add idempotency verification to the OPNsense VM integration example:
- Extract verify_state() that queries all entity counts via typed API
  (uses DNatApi, FilterApi, SourceNatApi, VipSettingsApi)
- Extract build_all_scores() for reuse across runs
- Run all Scores twice, assert entity counts are unchanged on 2nd run
- This catches duplicate creation in VLAN, LAGG, firewall rules, etc.

Add build/opnsense-e2e.sh — CI-friendly script that:
- Checks prerequisites (libvirtd, user groups)
- Boots OPNsense VM via KVM (idempotent — skips if running)
- Runs full Score suite with idempotency verification
- Supports --download, --clean flags
2026-04-06 13:40:00 -04:00
39e4dac27a fix: improve idempotency and type safety across OPNsense modules
- Remove stale FIXME/TODO comments on VlanScore and LaggScore — both
  already use idempotent ensure_* methods that check before creating
- Fix DNAT apply pattern: remove per-rule apply() from ensure_rule(),
  add single apply() call in DnatInterpret::execute() after all rules
  (matching the pattern used by FirewallRuleScore and OutboundNatScore)
- Make DnatConfig::apply() public so callers can batch applies
- Add typed ensure_binat_rule_from() to FirewallFilterConfig, removing
  the last json!() construction in the harmony Score layer
- BinatInterpret now uses typed method instead of manual json!()
2026-04-06 13:36:19 -04:00
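The apply-batching change in 39e4dac27a (no per-rule apply inside `ensure_rule`, one apply after all rules) can be sketched like this. The `RuleBackend` trait and the `InMemory` backend are hypothetical test doubles, not the project's DnatConfig API, and this sketch applies unconditionally even when nothing was created.

```rust
/// Hypothetical backend abstraction standing in for the OPNsense DNAT API.
trait RuleBackend {
    fn rule_exists(&self, description: &str) -> bool;
    fn add_rule(&mut self, description: &str);
    fn apply(&mut self);
}

/// Find-or-create a single rule. Deliberately does NOT call apply().
fn ensure_rule<B: RuleBackend>(backend: &mut B, description: &str) -> bool {
    if backend.rule_exists(description) {
        return false;
    }
    backend.add_rule(description);
    true
}

/// Ensure every rule, then apply once. Returns how many rules were created.
fn execute<B: RuleBackend>(backend: &mut B, rules: &[&str]) -> usize {
    let mut created = 0;
    for description in rules {
        if ensure_rule(backend, description) {
            created += 1;
        }
    }
    backend.apply(); // single batched apply after all rules
    created
}

/// In-memory double that counts apply() calls.
#[derive(Default)]
struct InMemory {
    rules: Vec<String>,
    applies: usize,
}

impl RuleBackend for InMemory {
    fn rule_exists(&self, description: &str) -> bool {
        self.rules.iter().any(|r| r == description)
    }
    fn add_rule(&mut self, description: &str) {
        self.rules.push(description.to_string());
    }
    fn apply(&mut self) {
        self.applies += 1;
    }
}
```

With two rules, the first `execute` creates both and applies once; a second run creates nothing, so the rule count stays at two.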
9753779274 refactor: replace json!() with typed structs in dnsmasq and load_balancer
Complete the refactoring of all opnsense-config modules:
- dnsmasq.rs: uses DnsmasqHost, DnsmasqDhcpRang, DhcpRangDomainType
  structs + SettingsApi typed client for all CRUD operations
- load_balancer.rs: uses OpNsenseHaProxyServersServer,
  OpNsenseHaProxyBackendsBackend, OpNsenseHaProxyFrontendsFrontend,
  OpNsenseHaProxyHealthchecksHealthcheck structs with correct field
  types (required String vs Option, u16 vs String, bool vs Option)

All 10 opnsense-config modules now have zero json!() in production
code. The only remaining json!() calls are in test mock responses.
2026-04-06 12:46:32 -04:00
5ef9f8b187 refactor: replace json!() with typed structs in vlan, lagg, vip, caddy, tftp, node_exporter
Six more modules migrated to typed APIs:
- vlan.rs: uses VlansVlan struct + VlanSettingsApi, reads VlansResponse
- lagg.rs: uses LaggsLagg struct + LaggSettingsApi, reads LaggsResponse
- vip.rs: uses VirtualipVip struct + VipSettingsApi with typed enums
- caddy.rs: typed CaddyGeneralSettings struct instead of json!()
- tftp.rs: typed TftpSettings struct instead of json!()
- node_exporter.rs: typed NodeExporterSettings struct instead of json!()

All six modules now have zero json!() in production code.
2026-04-06 12:33:48 -04:00
68773e1e1e refactor: redesign API codegen for end-to-end type safety
Rewrite api_codegen to generate proper envelope-wrapping methods that
accept model structs directly. Callers no longer need to manually
construct RuleBody wrappers or extract UUIDs from raw JSON.

Key changes:
- Generated API clients wrap request bodies internally via serde rename
  (e.g., add_rule(&my_rule) serializes as {"rule": {...}})
- Add shared SearchRow type to response.rs with label() and is_enabled()
  helpers, eliminating per-module RuleSearchRow type conflicts
- Extract body_key from PHP controller addBase/setBase calls
- Rewrite dnat.rs and firewall.rs to use the typed API end-to-end:
  search returns SearchResponse<SearchRow>, add returns UuidResponse,
  set/del return StatusResponse — zero raw JSON in production code
- Add EnsureApi trait in firewall.rs for generic find-or-create pattern

The only remaining json!() calls in dnat.rs and firewall.rs are in test
mock responses, which is expected.
2026-04-06 12:26:29 -04:00
b0ace7e0ca refactor: replace json!() with typed structs in firewall.rs
Replace manual json!() construction in ensure_filter_rule and
ensure_snat_rule_from with generated typed structs
(FirewallFilterRulesRule, FirewallFilterSnatrulesRule) and their
associated enums for action, direction, ip protocol.

Also generate typed API clients for FilterController, SourceNatController,
and OneToOneController. Add parse_controller_with_defaults for controllers
that inherit model fields from a parent class.

BINAT (one-to-one) ensure still uses json Value as the generated
OnetooneRule struct needs further validation against the wire format.
2026-04-06 11:51:20 -04:00
0201c510f0 refactor: replace json!() with typed structs in dnat.rs
Replace all manual json!() construction and raw Value parsing in
opnsense-config's dnat module with generated typed structs (NatRuleRule,
NatRuleRuleDestination, RuleIpprotocol, RulePass) and the typed DNatApi
client.

Also fix a regression in the parser where the custom *Field handler
trigger condition was too broad, causing InterfaceField/ProtocolField
to be incorrectly treated as ArrayField subclasses. Reverted the trigger
to only match children with type attributes.
2026-04-06 11:44:45 -04:00
ea5bf305b5 feat: add PHP controller parser and typed API client codegen
Add two new codegen modules:
- controller_parser: extracts module, controller path, model binding,
  and CRUD actions from OPNsense PHP API controller files using regex
- api_codegen: generates typed Rust API client wrappers from ControllerIR

Generated typed API clients for 6 controllers:
- DNatController → DNatApi (search, get, add, set, del, toggle)
- FilterBaseController → FilterBaseApi (apply, revert, savepoint)
- VlanSettingsController → VlanSettingsApi (CRUD + reconfigure)
- LaggSettingsController → LaggSettingsApi (CRUD + reconfigure)
- VipSettingsController → VipSettingsApi (CRUD + reconfigure)
- Dnsmasq SettingsController → SettingsApi (host/domain CRUD)

These typed APIs eliminate the need for manual json!() construction
and string-based URL paths in opnsense-config modules.
2026-04-06 11:21:46 -04:00
a032be6ce7 fix: codegen handles container elements in ArrayField and adds opn_map serializer
The XML parser silently skipped container elements (like <source>,
<destination>) nested inside ArrayField nodes because it only processed
children with a type attribute. This caused generated structs to be
missing nested fields, forcing opnsense-config to use json!() macros
instead of typed structs.

- Add container handling in ArrayField and custom *Field child loops
- Add serialize function to opn_map serde helper (was deserialize-only)
- Change opn_map serde attribute from deserialize_with to with
- Regenerate all 9 model files with the fixes

NatRuleRule now correctly has source/destination/created/updated
container structs with all child fields.
2026-04-06 11:14:46 -04:00
5abc1b217c wip fix load balancer idempotency
Some checks failed
Run Check Script / check (pull_request) Failing after 33s
2026-04-02 16:44:48 -04:00
a7d1abd0be wip fix load balancer idempotency
Some checks failed
Run Check Script / check (pull_request) Failing after 9s
2026-04-02 16:34:54 -04:00
92b46c5c08 fix: HAProxy listens on 0.0.0.0 and opens WAN in the firewall for OKD deployment; also disables the HTTP redirect rule for the OPNsense webgui, which stole HAProxy traffic
Some checks failed
Run Check Script / check (pull_request) Failing after 10s
2026-04-01 22:31:54 -04:00
d937813fd4 ignore stress test config and db files
Some checks failed
Run Check Script / check (pull_request) Failing after 10s
2026-04-01 21:12:08 -04:00
8b5ca51fba chore: Improve opnsense constructor signature
Some checks failed
Run Check Script / check (pull_request) Failing after 11s
2026-04-01 21:10:31 -04:00
3afaa38ba0 feat: network stress test utility that randomly flaps switch ports and reboots OPNsense firewalls while running iperf, reporting statistics and events in a simple, clean UI
2026-04-01 21:09:02 -04:00
1a0e754c7a chore: note some problems, improve variable naming around OPNsense automation
Some checks failed
Run Check Script / check (pull_request) Failing after 18s
2026-03-31 17:33:04 -04:00
0dc9b80010 chore: fix unused import and add TODO/doc comments from review
Some checks failed
Run Check Script / check (pull_request) Failing after 11s
- Remove unused `warn` import in pair integration example
- Add TODO comment for shared credentials limitation (ROADMAP/11)
- Add doc comments on DhcpServer::get_ip/get_host noting they return
  primary's address, not the CARP VIP
2026-03-31 13:17:44 -04:00
6554ac5341 docs: fix pair integration subnet in diagram, add to examples index
- Fixed network topology diagram in pair README: 192.168.10.x -> 192.168.1.x
  to match the actual code (OPNsense boots on .1 of 192.168.1.0/24)
- Added explanation of NIC juggling to the diagram section
- Updated single-VM "What's next" to link to pair example (was "in progress")
- Added opnsense_pair_integration to examples/README.md table and category
2026-03-31 12:29:35 -04:00
811c56086c fix(kvm): fix domiflist MAC parsing and pair test subnet
- Fixed VmInterface parsing: virsh domiflist has 5 columns (Interface,
  Type, Source, Model, MAC), not 4. MAC is at index 4, not 3.
- Changed pair integration subnet to 192.168.1.0/24 to match OPNsense's
  hard-coded default boot IP of .1.

Tested: full --full pair integration passes end-to-end with CARP VIP
configured on both firewalls (primary advskew=0, backup advskew=100).
2026-03-31 12:26:34 -04:00
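The column fix in 811c56086c (five columns, MAC at index 4) can be illustrated with a small parser over `virsh domiflist` output. The `VmInterface` field names and the `parse_domiflist` function are assumptions for this sketch, not the crate's actual definitions.

```rust
/// Hypothetical representation of one row of `virsh domiflist`.
#[derive(Debug, PartialEq)]
struct VmInterface {
    name: String,
    kind: String,
    source: String,
    model: String,
    mac: String,
}

/// Parse the table printed by `virsh domiflist`.
/// Columns are: Interface, Type, Source, Model, MAC (MAC is index 4).
fn parse_domiflist(output: &str) -> Vec<VmInterface> {
    output
        .lines()
        .map(|line| line.split_whitespace().collect::<Vec<_>>())
        // Skips the header row, the dashed separator, and blank lines.
        .filter(|cols| cols.len() == 5 && cols[0] != "Interface")
        .map(|cols| VmInterface {
            name: cols[0].to_string(),
            kind: cols[1].to_string(),
            source: cols[2].to_string(),
            model: cols[3].to_string(),
            mac: cols[4].to_string(),
        })
        .collect()
}
```

Requiring exactly five columns means a row in an unexpected layout is dropped rather than mis-parsed.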
34d02d7291 feat(opnsense): add firewall pair VM integration example
Boots two OPNsense VMs, bootstraps both with NIC juggling to handle
the .1 IP conflict, then applies FirewallPairTopology with CarpVipScore.

The bootstrap sequence:
1. Boot both VMs on shared LAN bridge
2. Disable backup's LAN NIC
3. Bootstrap primary on .1, change IP to .2
4. Swap NICs (disable primary, enable backup)
5. Bootstrap backup on .1, change IP to .3
6. Re-enable all NICs
7. Apply pair scores (CARP VIP, VLANs, firewall rules)
8. Verify via API on both firewalls

Supports --full flag for single-shot CI execution.
2026-03-31 12:07:40 -04:00
73785e7336 feat(kvm): add NIC link control for VM interface management
Adds set_interface_link() and list_interfaces() to KvmExecutor,
enabling programmatic up/down control of VM network interfaces by
MAC address.

This is essential for bootstrapping multiple VMs that boot with the
same default IP (e.g., OPNsense on 192.168.1.1) — disable all LAN
NICs, then enable and bootstrap one at a time.

Uses virsh domif-setlink and domiflist under the hood. Tested against
a live KVM VM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 12:02:09 -04:00
8abcb68865 docs: update OPNsense VM integration for fully automated bootstrap
Major rewrite of OPNsense documentation to reflect the new unattended
workflow — no manual browser interaction required.

- Rewrote examples/opnsense_vm_integration/README.md: highlights --full
  CI mode, documents OPNsenseBootstrap automated steps, lists system
  requirements by distro
- Rewrote docs/use-cases/opnsense-vm-integration.md: removed manual
  Step 3 (SSH/webgui), added Phase 2 bootstrap description, updated
  architecture diagram with OPNsenseBootstrap layer
- Added OPNsense VM Integration to docs/README.md (was missing)
- Added OPNsense VM Integration to docs/use-cases/README.md (was missing)
- Added opnsense_vm_integration to examples/README.md quick reference
  table and Infrastructure category (was missing, marked as recommended)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 10:16:42 -04:00
35aab0ecfb fix(opnsense): fix bootstrap webgui port change and add SSH diagnostics
Fixes:
- CSRF token parser now extracts <input> tags individually instead of
  parsing whole lines, fixing the bug where <form name="iform"> on the
  same line as the CSRF hidden input caused the wrong name to be extracted
- extract_selected_option() for <select> dropdowns (webguiproto,
  ssl-certref) which extract_input_value() couldn't handle
- After webgui port change, explicitly restart lighttpd via SSH
  (configctl webgui restart) as a safety net — the PHP configd call
  can fail if lighttpd dies before executing it

Adds:
- diagnose_via_ssh() reports webgui config, listening ports, lighttpd
  process status, and configctl status — invaluable for troubleshooting
- Diagnostic output is shown automatically when wait_for_ready() fails

Tested: full --boot + integration test passes end-to-end with zero
manual interaction on a fresh OPNsense 26.1 VM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 08:43:44 -04:00
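The CSRF parser fix in 35aab0ecfb (extract `<input>` tags individually rather than whole lines) might be sketched as below. This is a simplified string scan using only the standard library; `attr_value` and `hidden_inputs` are hypothetical helpers, and the sketch assumes double-quoted attributes, which may not cover every page OPNsense renders.

```rust
/// Return the value of ` attr="..."` inside a single tag, if present.
/// Assumption: attributes are double-quoted and preceded by a space.
fn attr_value<'a>(tag: &'a str, attr: &str) -> Option<&'a str> {
    let needle = format!(" {attr}=\"");
    let start = tag.find(&needle)? + needle.len();
    let end = tag[start..].find('"')? + start;
    Some(&tag[start..end])
}

/// Collect (name, value) pairs of hidden inputs, one <input> tag at a time,
/// so attributes of other tags on the same line (e.g. <form name="iform">)
/// can never be picked up by mistake.
fn hidden_inputs(html: &str) -> Vec<(String, String)> {
    let mut found = Vec::new();
    let mut rest = html;
    while let Some(start) = rest.find("<input") {
        let after = &rest[start..];
        let Some(end) = after.find('>') else { break };
        let tag = &after[..=end];
        if attr_value(tag, "type") == Some("hidden") {
            if let (Some(name), Some(value)) = (attr_value(tag, "name"), attr_value(tag, "value")) {
                found.push((name.to_string(), value.to_string()));
            }
        }
        rest = &after[end + 1..];
    }
    found
}
```

Because each lookup is confined to the slice between `<input` and the next `>`, the form's `name="iform"` on the same line cannot be picked up as the token name.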
ddab4d27eb feat(opnsense): integrate OPNsenseBootstrap into VM integration example
Replaces the manual browser steps (wizard, SSH, webgui port) with
automated OPNsenseBootstrap calls. Adds --full flag for CI-friendly
single-shot boot + test.

Working: login, wizard abort, SSH enable with root+password auth.
In progress: webgui port change (lighttpd falls back to port 80 —
needs fix for <select> dropdown extraction and CSRF token refresh).

Also adds:
- diagnose_via_ssh() for troubleshooting webgui status
- restart_webgui_via_ssh() safety net after port changes
- CSRF parser fix for same-line form+input HTML (real OPNsense layout)
- cookie_store(true) for reliable session management

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:59:04 -04:00
79d8aa39fc feat(opnsense): add OPNsenseBootstrap for unattended first-boot setup
Automates OPNsense initial setup via HTTP session authentication,
eliminating manual browser interaction. The module:

- Logs in with username/password (handles CSRF token extraction)
- Aborts the initial setup wizard via /api/core/initial_setup/abort
- Enables SSH with root login and password auth
- Changes the web GUI port (fire-and-forget, handles server restart)
- Provides wait_for_ready() polling helper

Uses reqwest with cookie jar for session management. No browser or
external dependencies needed — pure Rust HTTP client approach.

Includes unit tests for CSRF token extraction and HTML parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 06:09:30 -04:00
5e8e63ade7 test(opnsense): add unit tests for FirewallPairTopology
Tests cover:
- ensure_ready outcome merging (both Success)
- CarpVipScore applies VIPs to both firewalls with correct advskew
- CarpVipScore custom backup_advskew is respected
- CarpVipScore defaults backup_advskew to 100 when unset
- VlanScore uniform delegation applies to both firewalls

Uses httptest mock HTTP servers to intercept OPNsense API calls
without requiring real firewall devices. Adds httptest dev-dependency
to harmony crate and a #[cfg(test)] from_config constructor on
OPNSenseFirewall for test-friendly instantiation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 05:27:24 -04:00
cb2a650d8b feat(opnsense): add FirewallPairTopology for HA firewall pair management
Introduces a higher-order topology that wraps two OPNSenseFirewall
instances (primary + backup) and orchestrates score application across
both. CARP VIPs get differentiated advskew values (primary=0,
backup=configurable) while all other scores apply identically to both
firewalls.

Includes CarpVipScore, DhcpServer delegation, pair Score impls for all
existing OPNsense scores, and opnsense_from_config() factory method.

Also adds ROADMAP entries for generic firewall trait (10), delegation
macro, integration tests, and named config instances (11).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 05:12:03 -04:00
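The advskew differentiation described in cb2a650d8b (primary=0, backup configurable, defaulting to 100 per the unit tests in 5e8e63ade7) reduces to a small selection function. `Role`, `DEFAULT_BACKUP_ADVSKEW`, and `advskew_for` are illustrative names, not identifiers from the harmony crate.

```rust
/// Which member of the firewall pair a VIP is being applied to.
#[derive(Clone, Copy, Debug)]
enum Role {
    Primary,
    Backup,
}

/// Default used for the backup when no value is configured.
const DEFAULT_BACKUP_ADVSKEW: u8 = 100;

/// Primary always advertises with skew 0; the backup uses the configured
/// value or falls back to the default.
fn advskew_for(role: Role, backup_advskew: Option<u8>) -> u8 {
    match role {
        Role::Primary => 0,
        Role::Backup => backup_advskew.unwrap_or(DEFAULT_BACKUP_ADVSKEW),
    }
}
```

Everything else about the VIP stays identical on both firewalls; only this one value differs by role.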
466a8aafd1 feat(postgresql): add wait_for_ready option to PostgreSQLConfig
Some checks failed
Run Check Script / check (pull_request) Failing after 12s
Add wait_for_ready field (default: true) to PostgreSQLConfig. When
enabled, K8sPostgreSQLInterpret waits for the cluster's -rw service
to exist after applying the Cluster CR, ensuring callers like
get_endpoint() succeed immediately.

This eliminates the retry loop in the harmony_sso example's
deploy_zitadel() -- ZitadelScore now deploys in a single pass because
the PG service is guaranteed to exist before Zitadel's Helm chart
init job tries to connect.

The deploy_zitadel function shrinks from a 5-attempt retry loop to a
simple score.interpret() call.
2026-03-30 08:45:49 -04:00
fabec7ac11 refactor: extract CoreDNSRewriteScore from harmony_sso example
Some checks failed
Run Check Script / check (pull_request) Failing after 11m35s
Move CoreDNS rewrite logic into a reusable Score at
harmony/src/modules/k8s/coredns.rs. The Score patches CoreDNS on
K3sFamily clusters to add name rewrite rules (e.g., mapping
sso.harmony.local to the in-cluster service FQDN).

K3sFamily/Default only, no-op on OpenShift. Idempotent.

The harmony_sso example now uses CoreDNSRewriteScore.interpret()
instead of an inline function.
2026-03-30 07:56:44 -04:00
3e2b8423e8 chore: clean up clippy warnings in zitadel and openbao modules
- Remove unused serde default functions in ZitadelSetupScore
- Replace redundant closures with function references (InterpretError::new)
- Allow dead_code on AppSearchEntry.id (needed for deserialization)
- Fix empty line after doc comment in ZitadelScore
- Remove unneeded return statement in generate_secure_password
2026-03-30 07:46:47 -04:00
c687d4e6b3 docs: add Phase 9 (SSO + Config Hardening) to roadmap
New roadmap phase covering the hardening path for the SSO config
management stack: builder pattern for OpenbaoSecretStore, ZitadelScore
PG readiness fix, CoreDNSRewriteScore, integration tests, and future
capability traits.

Updates current state to reflect implemented Zitadel OIDC integration
and harmony_sso example.
2026-03-30 07:37:24 -04:00
cd48675027 chore: cargo fmt
2026-03-29 09:01:00 -04:00
8cb59cc029 fix: SSO end-to-end fixes for device flow
- OpenbaoSetupScore: verify vault init state before trusting cached
  keys (handles cluster recreation with stale local keys file)
- ZitadelSetupScore: trim PAT whitespace (K8s secret had trailing
  newline that corrupted the Authorization header)
- ZitadelOidcAuth: resolve SSO hostname to 127.0.0.1 via reqwest
  resolve() so device flow works without /etc/hosts entries
- Fix OIDC discovery URL to include port (Zitadel issuer is
  http://sso.harmony.local:8080, not http://sso.harmony.local)

The full SSO flow now works end-to-end: deploy, provision identity,
configure JWT auth, trigger device flow. User sees verification URL
and code in the terminal.
2026-03-29 08:54:28 -04:00
772fcad3d7 refactor(harmony-sso): full SSO flow as default deployment
The example now deploys the complete SSO stack and uses it:

Phase 1: Deploy OpenBao + basic setup (init, unseal, policies, users)
Phase 2: CoreDNS patch + Deploy Zitadel + ZitadelSetupScore (creates
  project + device-code app) + OpenBao JWT auth (with real client_id)
Phase 3: Store config via SSO-authenticated OpenBao (triggers device
  flow on first run, uses cached session on re-run)

Removed --demo and --sso-demo flags. The default run IS the demo.
Kept --skip-zitadel and --cleanup.

On re-run: all deployments are idempotent, cached OIDC session is
reused, config is loaded from OpenBao without login prompt.
2026-03-29 08:37:25 -04:00
80e512caf7 feat(harmony-secret): implement JWT exchange for Zitadel OIDC -> OpenBao
Fix the core SSO authentication flow: instead of storing the Zitadel
access_token as the OpenBao token (which OpenBao doesn't recognize),
exchange the id_token with OpenBao's JWT auth method via
POST /v1/auth/{mount}/login to get a real OpenBao client token.

Changes:
- ZitadelOidcAuth: add openbao_url, jwt_auth_mount, jwt_role fields
- New exchange_jwt_for_openbao_token() method using reqwest (vaultrs
  0.7.4 has no JWT auth module)
- process_token_response() now exchanges id_token when openbao_url is
  set, falls back to access_token for backward compat
- OpenbaoSecretStore::new() accepts optional jwt_role + jwt_auth_mount
- All callers updated (lib.rs, openbao_chain example, harmony_sso)

This implements ADR 020-1 Step 6 (OpenBao JWT exchange).
2026-03-29 08:35:43 -04:00
d0b7c03e12 feat(zitadel): add ZitadelSetupScore for identity provisioning
New Score that provisions identity resources in a deployed Zitadel
instance via the Management API v1:
- Create projects
- Create OIDC applications (device-code grant for CLI/headless)
- Machine user provisioning (stubbed for future iteration)

Authenticates using the admin PAT from the iam-admin-pat K8s secret
(provisioned automatically by the Zitadel Helm chart). No password
extraction or deprecated grant types needed.

All operations are idempotent: checks for existing resources before
creating. Results cached at ~/.local/share/harmony/zitadel/client-config.json.

This is the "day two" counterpart to ZitadelScore, enabling enterprise
automation of identity management (users, machines, applications, groups).
2026-03-29 08:31:49 -04:00
4a66880a84 fix(harmony-k8s): make API discovery cache invalidatable
Replace OnceCell<Discovery> with RwLock<Option<Arc<Discovery>>> so the
cache can be cleared after installing CRDs or operators that register
new API groups.

Add invalidate_discovery() method. Call it in ensure_cnpg_operator()
after confirming the Cluster CRD is registered, so the subsequent
apply() call sees the new CRD without needing a fresh client.

This eliminates the "Cannot resolve GVK" retry loop -- PostgreSQL
Cluster resources now apply on the first attempt after CNPG operator
installation.
2026-03-29 07:30:33 -04:00
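The cache change in 4a66880a84 (from `OnceCell<Discovery>` to `RwLock<Option<Arc<Discovery>>>`) follows a common pattern, sketched here with the standard library's `RwLock`. The `Discovery` struct below is a stand-in with a `generation` counter, `Client` is a hypothetical holder, and the real client presumably runs an asynchronous API discovery instead of incrementing a counter.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Stand-in for the real discovery result; `generation` only exists so the
/// sketch can show that a reload happened.
struct Discovery {
    generation: usize,
}

struct Client {
    discovery: RwLock<Option<Arc<Discovery>>>,
    loads: AtomicUsize,
}

impl Client {
    fn new() -> Self {
        Client { discovery: RwLock::new(None), loads: AtomicUsize::new(0) }
    }

    /// Return the cached discovery, loading it on first use or after invalidation.
    fn discovery(&self) -> Arc<Discovery> {
        {
            let guard = self.discovery.read().unwrap();
            if let Some(cached) = guard.as_ref() {
                return Arc::clone(cached);
            }
        } // read guard dropped here, before taking the write lock
        let mut slot = self.discovery.write().unwrap();
        // Re-check: another thread may have filled the slot in the meantime.
        if let Some(cached) = slot.as_ref() {
            return Arc::clone(cached);
        }
        let generation = self.loads.fetch_add(1, Ordering::SeqCst) + 1;
        let fresh = Arc::new(Discovery { generation });
        *slot = Some(Arc::clone(&fresh));
        fresh
    }

    /// Clear the cache so the next call to discovery() reloads it.
    fn invalidate_discovery(&self) {
        *self.discovery.write().unwrap() = None;
    }
}
```

Unlike a `OnceCell`, the `Option` inside the lock can be reset to `None`, which is what makes the cache clearable after a CRD or operator registers a new API group.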
ec1bdbab73 feat(harmony-sso): add CoreDNS rewrite for in-cluster hostname resolution
Patch CoreDNS on K3sFamily to add rewrite rules that map external
hostnames (sso.harmony.local, bao.harmony.local) to cluster service
FQDNs. This allows OpenBao's JWT auth to fetch Zitadel's JWKS from
inside the cluster, where Zitadel validates Host headers against its
ExternalDomain.

Uses apply_dynamic with force_conflicts since the CoreDNS ConfigMap
is owned by the k3d deployer. Restarts CoreDNS pods after patching.
No-op on non-K3sFamily distributions (OpenShift, etc.).

Idempotent: skips patching if rewrite rules already present.
2026-03-29 07:22:54 -04:00
09b704e9cf fix(postgresql): wait for CNPG CRD registration after operator install
The CNPG operator deployment being ready does not guarantee that the
Cluster CRD is registered in the API server's discovery cache. This
caused intermittent "Cannot resolve GVK: postgresql.cnpg.io/v1/Cluster"
errors when applying PostgreSQL Cluster resources immediately after
operator installation.

Add wait_for_crd() to harmony-k8s that polls has_crd() until the CRD
appears (2s interval, 60s timeout). Call it in ensure_cnpg_operator()
after the deployment readiness check.

This eliminates the need for retry loops in callers like harmony_sso.
2026-03-29 07:11:34 -04:00
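The polling behaviour of `wait_for_crd()` in 09b704e9cf (check, sleep for an interval, give up after a timeout) can be sketched generically. This version is synchronous and uses `std::thread::sleep`, whereas the real method is presumably async; `wait_until` and its signature are assumptions for illustration.

```rust
use std::time::{Duration, Instant};

/// Poll `check` until it returns true or `timeout` elapses.
/// Returns the number of polls performed on success.
fn wait_until<F>(mut check: F, interval: Duration, timeout: Duration) -> Result<u32, String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    let mut polls = 0;
    loop {
        polls += 1;
        if check() {
            return Ok(polls);
        }
        // Stop if the next sleep would carry us past the deadline.
        if Instant::now() + interval > deadline {
            return Err(format!("condition not met after {polls} polls"));
        }
        std::thread::sleep(interval);
    }
}
```

With the values from the commit message, a call would pass `Duration::from_secs(2)` as the interval and `Duration::from_secs(60)` as the timeout, with a closure wrapping the `has_crd()` check.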
8e3e935459 refactor(harmony-sso): use OpenbaoSetupScore instead of imperative orchestration
Replace ~200 lines of manual init/unseal/configure/jwt-auth code with
a single OpenbaoSetupScore invocation. The deployment path is now:

1. OpenbaoScore (Helm deploy)
2. OpenbaoSetupScore (init, unseal, policies, users, JWT auth)
3. ZitadelScore (CNPG + Helm, with retry)

The example main.rs goes from ~800 lines to ~370 lines. The removed
imperative logic now lives in the reusable OpenbaoSetupScore which can
be tested against any topology.
2026-03-29 06:45:36 -04:00
c388d5234f feat(openbao): add OpenbaoSetupScore for post-deployment lifecycle
New Score that handles the operational complexity of making a deployed
OpenBao instance operational:
- Init (operator init) with local key storage (~/.local/share/harmony/openbao/)
- Unseal (3 of 5 keys)
- Enable KV v2 secrets engine
- Create configurable policies (HCL)
- Enable userpass auth and create users
- Optional JWT auth configuration for OIDC integration

All steps are idempotent. Requires T: Topology + K8sclient.

This encapsulates the tribal knowledge of OpenBao lifecycle management
into a compiled, type-checked Score that can be tested against any
topology (k3d, OpenShift, kubeadm, bare metal).
2026-03-28 23:51:57 -04:00
d9d5ea718f docs: add Score design principles and capability architecture rules
docs/guides/writing-a-score.md:
- Add Design Principles section: capabilities are industry concepts not
  tools, Scores encapsulate operational complexity, idempotency rules,
  no execution order dependencies

CLAUDE.md:
- Add Capability and Score Design Rules section with the swap test:
  if swapping the underlying tool breaks Scores, the capability
  boundary is wrong
2026-03-28 23:48:12 -04:00
5415452f15 refactor(harmony-sso): replace kubectl with typed K8s APIs, add Zitadel deployment
Replace all Command::new("kubectl") calls with harmony-k8s K8sClient
methods:
- wait_for_pod_ready() instead of kubectl get pod jsonpath
- exec_pod_capture_output() for OpenBao init/unseal/configure
- delete_resource<MutatingWebhookConfiguration>() for webhook cleanup
- port_forward() instead of kubectl port-forward subprocess

Thread K3d and K8sClient through all functions instead of
reconstructing context strings. Consolidate path helpers into
harmony_data_dir().

Add Zitadel deployment via ZitadelScore with retry logic for CNPG CRD
registration race and PostgreSQL cluster readiness timing.
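
A minimal sketch of the kind of retry wrapper such logic could use, assuming a fixed delay between attempts. The `retry` helper, its parameters, and the error strings in the usage note are hypothetical illustrations, not the code in this commit:

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: run `op` until it succeeds or `max_attempts` is reached.
/// `op` receives the 1-based attempt number and always runs at least once.
pub fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Transient failure (e.g. a CRD not registered yet): wait, then retry.
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

A caller would wrap the deployment step in the closure and let the first few "CRD not registered" style failures be absorbed by the loop.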

Add CLI flags: --demo, --sso-demo, --skip-zitadel, --cleanup.
Add --demo mode: ConfigManager with EnvSource + StoreSource<OpenbaoSecretStore>.
Configure OpenBao with harmony-dev policy, userpass auth, and JWT auth.
2026-03-28 23:48:00 -04:00
b05a341a80 feat(harmony-k8s, k3d): add exec_pod, delete_resource, port_forward, and k3d getters
harmony-k8s:
- exec_pod() and exec_pod_capture_output(): exec commands in pods by
  name (not just label), with proper stdout/stderr capture
- delete_resource<K>(): generic typed delete using ScopeResolver,
  idempotent (404 = Ok)
- port_forward(): native port forwarding via kube-rs Portforwarder +
  tokio TcpListener, replacing kubectl subprocess. Returns
  PortForwardHandle that auto-aborts on drop.
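
The "404 = Ok" rule for the idempotent delete above can be sketched as follows. `ApiError` and `delete_idempotent` are hypothetical stand-ins for illustration, not the harmony-k8s or kube-rs types:

```rust
/// Hypothetical error type standing in for a Kubernetes client error.
#[derive(Debug, PartialEq)]
pub enum ApiError {
    Status(u16),
    Transport(String),
}

/// Map the raw delete result to an idempotent one.
/// Returns Ok(true) if something was deleted, Ok(false) if it was already gone.
pub fn delete_idempotent(raw: Result<(), ApiError>) -> Result<bool, ApiError> {
    match raw {
        Ok(()) => Ok(true),
        // Resource not found: the desired state (absent) already holds.
        Err(ApiError::Status(404)) => Ok(false),
        // Anything else is a real failure and must be reported.
        Err(other) => Err(other),
    }
}
```

Treating only 404 as success keeps permission and transport errors visible while letting cleanup code run repeatedly.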

k3d:
- base_dir(), cluster_name(), context_name() public getters

Also adds tokio "net" feature to workspace for TcpListener.
2026-03-28 23:47:42 -04:00
d0252bf1dc wip: harmony_sso example deploying zitadel and openbao seems to be working for config backend!
Some checks failed
Run Check Script / check (pull_request) Failing after 15s
2026-03-28 18:20:01 -04:00
f33d730645 fix(opnsense): improve idempotency in VIP, LAGG, and firewall modules
VIP: Fix subnet matching from starts_with() to exact equality. Previously
"192.168.1.10" would wrongly match a request for "192.168.1.100".

LAGG: Add config diff detection when updating existing LAGGs. Logs a
warning with previous config when protocol, description, or MTU differs
from desired state.

Firewall: Detect duplicate rules with same description and warn. When
multiple rules share a description, updates the first one and logs a
warning suggesting unique descriptions.
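
A rough sketch of the match-by-description behaviour described for firewall rules, assuming rules are identified by a UUID. `Rule`, `Plan`, and `plan_rule` are hypothetical names used only for illustration:

```rust
/// Hypothetical view of an existing firewall rule.
#[derive(Debug, Clone, PartialEq)]
pub struct Rule {
    pub uuid: String,
    pub description: String,
}

/// What an ensure-style call would do for a desired rule.
#[derive(Debug, PartialEq)]
pub enum Plan {
    Create,
    /// Update the first match; `duplicates` counts the extra rules
    /// sharing the description, which would trigger a warning.
    Update { uuid: String, duplicates: usize },
}

pub fn plan_rule(existing: &[Rule], description: &str) -> Plan {
    let mut matches = existing.iter().filter(|r| r.description == description);
    match matches.next() {
        None => Plan::Create,
        Some(first) => Plan::Update {
            uuid: first.uuid.clone(),
            // Whatever is left in the iterator are the duplicates.
            duplicates: matches.count(),
        },
    }
}
```

A non-zero `duplicates` count is the point where a caller would log the warning suggesting unique descriptions.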

7 new tests proving:
- VIP exact subnet match (rejects prefix match, finds exact, mode check)
- Firewall create/update/duplicate/different-description scenarios
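
The VIP fix in this commit can be illustrated with a small comparison of prefix and exact matching. The `Vip` struct and both functions are hypothetical, and the direction of the original `starts_with()` call is an assumption inferred from the example addresses in the message:

```rust
/// Hypothetical view of an existing VIP entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Vip {
    pub subnet: String,
}

/// Buggy lookup (assumed shape): an existing "192.168.1.10" is treated
/// as a match for a requested "192.168.1.100".
pub fn find_vip_prefix<'a>(vips: &'a [Vip], wanted: &str) -> Option<&'a Vip> {
    vips.iter().find(|v| wanted.starts_with(v.subnet.as_str()))
}

/// Fixed lookup: only an identical subnet string counts as a match.
pub fn find_vip_exact<'a>(vips: &'a [Vip], wanted: &str) -> Option<&'a Vip> {
    vips.iter().find(|v| v.subnet == wanted)
}
```

With a single existing VIP at 192.168.1.10, the prefix version finds a match for 192.168.1.100 while the exact version does not, so a new VIP would correctly be created.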
2026-03-28 13:48:29 -04:00
6040e2394e add claude.md
Some checks failed
Run Check Script / check (pull_request) Failing after 16s
2026-03-28 13:33:04 -04:00
a7f9b1037a refactor: push harmony_types enums all the way down to opnsense-api
Some checks failed
Run Check Script / check (pull_request) Failing after 19s
Move vendor-neutral IaC enums to harmony_types::firewall. Add From impls
in opnsense-api::wire converting harmony_types to generated OPNsense
types. Add typed methods in opnsense-config that accept harmony_types
enums and handle wire conversion internally.

Score layer no longer builds serde_json::json!() bodies — it passes
harmony_types enums directly to opnsense-config typed methods:
  ensure_filter_rule(&FirewallAction, &Direction, &IpProtocol, ...)
  ensure_snat_rule_from(&IpProtocol, &NetworkProtocol, ...)
  ensure_dnat_rule(&IpProtocol, &NetworkProtocol, ...)
  ensure_vip_from(&VipMode, ...)
  ensure_lagg(..., &LaggProtocol, ...)

Type flow: harmony_types → Score → opnsense-config → From<> → generated → wire
No strings cross layer boundaries for typed fields.
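
A simplified sketch of that type flow, using one enum. `GeneratedRuleAction`, its wire strings, and `ensure_filter_rule_sketch` are hypothetical stand-ins for the generated OPNsense types and the opnsense-config methods:

```rust
/// Vendor-neutral enum, as it might live in harmony_types::firewall.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FirewallAction {
    Pass,
    Block,
    Reject,
}

/// Stand-in for a codegen-produced OPNsense enum (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GeneratedRuleAction {
    Pass,
    Block,
    Reject,
}

impl GeneratedRuleAction {
    /// Wire strings are assumed here for illustration.
    pub fn wire(self) -> &'static str {
        match self {
            GeneratedRuleAction::Pass => "pass",
            GeneratedRuleAction::Block => "block",
            GeneratedRuleAction::Reject => "reject",
        }
    }
}

/// The conversion lives in the infrastructure layer, not in the Score.
impl From<FirewallAction> for GeneratedRuleAction {
    fn from(action: FirewallAction) -> Self {
        match action {
            FirewallAction::Pass => GeneratedRuleAction::Pass,
            FirewallAction::Block => GeneratedRuleAction::Block,
            FirewallAction::Reject => GeneratedRuleAction::Reject,
        }
    }
}

/// Typed entry point: callers pass the enum, never a string.
pub fn ensure_filter_rule_sketch(action: &FirewallAction) -> &'static str {
    GeneratedRuleAction::from(*action).wire()
}
```

Because the match in the `From` impl is exhaustive, adding a variant to the vendor-neutral enum forces the conversion to be updated at compile time.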
2026-03-26 11:07:49 -04:00
b98b2aa3f7 refactor: move IaC enums to harmony_types, translate in opnsense-api
Move vendor-neutral firewall and network types (FirewallAction, Direction,
IpProtocol, NetworkProtocol, VipMode, LaggProtocol) from harmony Score
modules to harmony_types::firewall as industry-standard IaC types.

Display impls use human-readable names (IPv4, CARP, LACP) — not wire
format. OPNsense-specific wire translations live in opnsense-api::wire
via the ToOPNsenseValue trait ("inet", "carp", "lacp").
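
The split between display names and wire values might look roughly like this. The trait name `ToOpnsenseValueSketch` and its method are hypothetical, and the IPv6 strings are assumptions; only "IPv4" and "inet" come from the message above:

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum IpProtocol {
    Inet,
    Inet6,
}

/// Human-readable name, independent of any vendor.
impl fmt::Display for IpProtocol {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            IpProtocol::Inet => "IPv4",
            IpProtocol::Inet6 => "IPv6", // assumed
        };
        f.write_str(name)
    }
}

/// Hypothetical counterpart of the wire-translation trait.
pub trait ToOpnsenseValueSketch {
    fn to_opnsense_value(&self) -> &'static str;
}

/// OPNsense-specific wire format, kept out of the vendor-neutral crate.
impl ToOpnsenseValueSketch for IpProtocol {
    fn to_opnsense_value(&self) -> &'static str {
        match self {
            IpProtocol::Inet => "inet",
            IpProtocol::Inet6 => "inet6", // assumed
        }
    }
}
```

Keeping `Display` free of wire concerns means logs and UIs stay readable even if a different firewall backend needs different strings.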

Dependency chain: harmony_types → opnsense-api → opnsense-config → harmony.
Users import types from harmony_types, translations happen transparently
in the infrastructure layer.

Includes 6 new tests verifying all wire value translations.
2026-03-26 10:11:53 -04:00
1b86c895a5 refactor(opnsense): replace stringly-typed fields with enums across Scores
Some checks failed
Run Check Script / check (pull_request) Failing after 19s
Add shared enums for firewall, NAT, and LAGG Score definitions:
- FirewallAction (Pass, Block, Reject)
- Direction (In, Out)
- IpProtocol (Inet, Inet6) — shared across filter, SNAT, DNAT
- NetworkProtocol (Tcp, Udp, TcpUdp, Icmp, Any) — shared across all rule types
- LaggProtocol (Lacp, Failover, LoadBalance, RoundRobin, None)

Combined with the VipMode enum from the previous commit, all OPNsense
Score definitions now use proper types instead of raw strings. Typos in
mode/action/direction/protocol fields are now compile-time errors.
2026-03-26 00:06:40 -04:00
2a15a0d10b refactor(opnsense): use VipMode enum instead of string for VIP mode
Replace the stringly-typed mode field in VipDef with a VipMode enum
(IpAlias, Carp, ProxyArp). Prevents typos and makes the API discoverable
through IDE autocompletion. The as_api_str() method converts to the wire
format expected by OPNsense.
2026-03-25 23:58:01 -04:00
da90dc55ad chore: cargo fmt across workspace
Some checks failed
Run Check Script / check (pull_request) Failing after 19s
2026-03-25 23:20:57 -04:00
516626a0ce docs: add OPNsense VM integration tutorial and architecture challenges
New use-case tutorial walking newcomers through the full OPNsense VM
integration test: system setup, VM boot, SSH config, running all 11
Scores, and understanding the three-layer architecture.

Add architecture-challenges.md analyzing topology evolution during
deployment, runtime plan/validation phase, and TUI as primary interface.
2026-03-25 23:20:45 -04:00
6c664e9f34 docs(roadmap): add phases 7-8 for OPNsense and HA OKD production
Add Phase 7 (OPNsense & Bare-Metal Network Automation) tracking current
progress on OPNsense Scores, codegen, and Brocade integration. Details
the UpdateHostScore requirement and HostNetworkConfigurationScore rework
needed for LAGG LACP 802.3ad.

Add Phase 8 (HA OKD Production Deployment) describing the target
architecture with LAGG/CARP/multi-WAN/BINAT and validation checklist.

Update current state section to reflect opnsense-codegen branch progress.
2026-03-25 23:20:35 -04:00
082ea8a666 feat(harmony): add duration timing to Score::interpret
Every Score execution now logs its status and elapsed time after
completion. The timing is measured in Score::interpret (the central
execution path) so it applies to all Scores automatically.

Example output:
  [VlanScore] SUCCESS in 0.9s — Created 2 VLANs
  [DhcpScore] SUCCESS in 1.8s — Dhcp execution successful
  [LoadBalancerScore] FAILED after 45.3s — connection refused
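
A minimal sketch of how such a timing wrapper could be written, covering only the status prefix of the lines above. `format_status` and `timed` are hypothetical helpers, not the actual Score::interpret code:

```rust
use std::time::{Duration, Instant};

/// Build the status prefix, e.g. "[VlanScore] SUCCESS in 0.9s".
pub fn format_status(score: &str, ok: bool, elapsed: Duration) -> String {
    let secs = elapsed.as_secs_f64();
    if ok {
        format!("[{score}] SUCCESS in {secs:.1}s")
    } else {
        format!("[{score}] FAILED after {secs:.1}s")
    }
}

/// Run `run`, measure its wall-clock time, and return the result
/// together with the status line a logger would print.
pub fn timed<T, E>(score: &str, run: impl FnOnce() -> Result<T, E>) -> (Result<T, E>, String) {
    let start = Instant::now();
    let result = run();
    let line = format_status(score, result.is_ok(), start.elapsed());
    (result, line)
}
```

Measuring in one central wrapper is what lets every Score get timing without per-Score changes.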
2026-03-25 23:20:24 -04:00
d33125bba8 feat(okd): automate SCP uploads, implement wait_for_bootstrap_complete
Replace manual scp prompts in bootstrap_02 and ipxe with automated
StaticFilesHttpScore uploads. SCOS installer images and HTTP boot files
now upload via SFTP without operator intervention.

Implement wait_for_bootstrap_complete by shelling out to
openshift-install wait-for bootstrap-complete with stdout/stderr logging.
Previously this was a todo!() that would panic and crash mid-deployment.

Add [Stage 02/Bootstrap] prefixes to all bootstrap_02 log messages.
Improve bootstrap_okd_node outcome to include per-host details with
MAC addresses.
2026-03-25 23:20:16 -04:00
1f0a7ed5a5 feat(opnsense): implement Url::Url support in HTTP and TFTP infra
Replace todo!() in OPNSenseFirewall HTTP and TFTP serve_files with
download-then-upload logic. When a Url::Url is provided, download the
remote file to a temp directory via reqwest, then upload to OPNsense
via the existing SFTP path.

Enables StaticFilesHttpScore and TftpScore to serve files from remote
URLs (e.g. S3) in addition to local folders.
2026-03-25 23:20:07 -04:00
c24fa9315b feat(harmony_assets): S3 credentials, folder upload, 19 tests
Fix S3Store to actually wire access_key_id/secret_access_key from config
into the AWS SDK credential provider. Add force_path_style for custom
endpoints (Ceph, MinIO). Add store_folder() for recursive directory upload.

New CLI command: upload-folder with --public-read/private ACL, env var
fallback for credentials, content-type auto-detection, progress bar.

Fix single-file upload --public-read default (was always true, now false).

Add 19 tests: Asset path computation, LocalStore fetch/cache/404/checksum
with httptest mocks, S3 key extraction, URL generation for custom/AWS
endpoints.
2026-03-25 23:19:58 -04:00
7475e7b75e feat(opnsense): implement remove_static_mapping and list_static_mappings
Wire the existing dnsmasq remove_static_mapping through the OPNSenseFirewall
infra layer. Add list_static_mappings at both config and infra layers for
querying current DHCP host entries. Includes 6 new unit tests with httptest
mocks covering empty, single/multi-MAC, multiple hosts, and skip edge cases.

Foundation for the upcoming UpdateHostScore.
2026-03-25 23:19:47 -04:00
d75ebcbb74 feat(opnsense): VipScore, DnatScore, LaggScore tested with 4-NIC VM
Some checks failed
Run Check Script / check (pull_request) Failing after 16s
Add VIP (IP alias / CARP) and destination NAT (port forwarding) Scores.
Update VM to 4 NICs (LAN, WAN, LAGG member 1, LAGG member 2) so LAGG
can be tested with failover protocol on vtnet2+vtnet3.

All 11 Scores pass end-to-end against OPNsense VM:
- LoadBalancerScore, DhcpScore, TftpScore, NodeExporterScore
- VlanScore (2 VLANs on vtnet0)
- FirewallRuleScore (filter rule with gateway support)
- OutboundNatScore (SNAT), BinatScore (1:1 NAT)
- VipScore (IP alias on LAN)
- DnatScore (port forward 8443→192.168.1.50:443)
- LaggScore (failover LAGG on vtnet2+vtnet3)
2026-03-25 16:59:52 -04:00
cea008e9c9 feat(opnsense): FirewallRuleScore, OutboundNatScore, BinatScore
Add Scores for managing OPNsense new-generation firewall filter rules,
outbound NAT (SNAT), and 1:1 NAT (BINAT) via the REST API.

- opnsense-config: firewall.rs module with idempotent CRUD for filter
  rules, SNAT rules, and BINAT rules (match by description)
- harmony: FirewallRuleScore (with gateway support for multi-WAN),
  OutboundNatScore, BinatScore
- All 3 tested end-to-end against OPNsense VM, idempotent on re-run
- Integration test now exercises 8 Scores total
2026-03-25 16:18:25 -04:00
ac9320fca4 feat(opnsense-codegen): expand custom ArrayField subclasses into full structs
Fix codegen to handle FilterRuleField, SourceNatRuleField, and other
custom *Field types that extend ArrayField. When an XML element has
a custom type AND child elements with type attributes, recursively
parse children into struct fields instead of falling back to
Option<String> stubs.

Also fix hyphenated field names (state-policy → state_policy with
serde rename) and avoid enum name collisions by using the full struct
name as prefix for custom *Field enums.
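
The hyphenated-name handling could be sketched like this, assuming the generator only needs to replace hyphens and remember the original name for a serde rename. `FieldName`, `rust_field_name`, and `render_field` are hypothetical, not the codegen's real functions:

```rust
/// Rust identifier plus the original XML name when they differ.
#[derive(Debug, PartialEq)]
pub struct FieldName {
    pub ident: String,
    pub serde_rename: Option<String>,
}

/// Turn an XML element name into a valid Rust field identifier.
pub fn rust_field_name(xml_name: &str) -> FieldName {
    let ident = xml_name.replace('-', "_");
    // Only emit a rename attribute when the identifier had to change.
    let serde_rename = if ident != xml_name {
        Some(xml_name.to_string())
    } else {
        None
    };
    FieldName { ident, serde_rename }
}

/// Render the field as source text (type fixed to Option<String> for brevity).
pub fn render_field(xml_name: &str) -> String {
    let name = rust_field_name(xml_name);
    let attr = match &name.serde_rename {
        Some(orig) => format!("#[serde(rename = \"{orig}\")]\n"),
        None => String::new(),
    };
    format!("{attr}pub {}: Option<String>,", name.ident)
}
```

For `state-policy` this yields a `state_policy` field carrying a rename attribute, so the JSON key on the wire stays unchanged.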

Regenerated firewall_filter.rs: now has full FirewallFilterRulesRule
(60+ fields including action, direction, gateway, source/dest nets),
FirewallFilterSnatrulesRule, FirewallFilterNptRule,
FirewallFilterOnetooneRule.

New generated modules:
- vip.rs — Virtual IPs (CARP, IP aliases, ProxyARP)
- firewall_alias.rs — Firewall aliases (host, network, port, URL, GeoIP)
- firewall_dnat.rs — Destination NAT / port forwarding rules
2026-03-25 16:00:35 -04:00
2b4c9ac3fb feat(opnsense): VlanScore and LaggScore for network infrastructure
Add VLAN and LAGG management via the OPNsense REST API:

- opnsense-config: vlan.rs and lagg.rs modules with idempotent CRUD
- harmony: VlanScore and LaggScore with OPNSenseFirewall integration
- VlanScore tested end-to-end against OPNsense VM (2 VLANs on vtnet0)
- LaggScore implemented but not VM-testable (needs physical NICs)
- Handle OPNsense select widget fields in VLAN interface responses
- Use direct post_typed calls (addItem/setItem/delItem/reconfigure)
2026-03-25 14:39:30 -04:00
fe22c50122 feat(opnsense): end-to-end validation of all OPNsense Scores
Run LoadBalancerScore, DhcpScore, TftpScore, and NodeExporterScore
against a real OPNsense VM to prove the XML→API migration works.

- Add Router impl for OPNSenseFirewall (gateway + /24 CIDR)
- Fix TFTP/NodeExporter API controller paths (general, not settings)
- Fix TFTP/NodeExporter body wrapper key (general, not module name)
- Fix dnsmasq DHCP range API endpoint (Range, not DhcpRang)
- Fix dnsmasq deserialization for OPNsense select widgets and empty []
- Fix DhcpHostBindingInterpret error propagation (was todo!())
- Expand VM integration example with all 4 Scores + API verification
2026-03-25 14:04:44 -04:00
f8d1f858d0 feat(opnsense): configurable API port, move web GUI to 9443
Add Config::from_credentials_with_api_port() and
OPNSenseFirewall::with_api_port() so the API port is not hardcoded
to 443. This allows running HAProxy on standard ports without
conflicting with the OPNsense web UI.

The integration example now instructs users to change the web GUI
port to 9443 (System > Settings > Administration > TCP Port) as
part of the manual setup, alongside enabling SSH.

The --status command detects whether the API is on 443 or 9443
and advises accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:17:34 -04:00
8a435d2769 docs(opnsense-vm-integration): update README with current status
Document the full workflow, network architecture, manual SSH step,
Docker compatibility, known issues, and future improvements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:07:35 -04:00
095801ac4d fix(opnsense-vm-integration): handle firmware update before package install
When OPNsense is on a base version that needs updating before packages
can install, attempt a firmware update and retry. Use high ports
(16443/18443) for test HAProxy services to avoid conflicting with
the OPNsense web UI on port 443.

Known issue: firmware update on a fresh 26.1 nano image may need
a manual reboot cycle before packages install successfully.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:04:50 -04:00
777213288e fix(opnsense-config): use serde_json::Value for HAProxy config traversal
The hand-written HaproxyGetResponse structs used HashMap which fails
when OPNsense returns [] for empty collections. The generated types
in opnsense-api handle this via opn_map, but opnsense-config had
duplicated structs without that fix.

Replace all hand-written HAProxy response types with serde_json::Value
traversal. This avoids the duplication and handles the []/{} duality.

Also fix integration example:
- Use high ports (16443, 18443) to avoid conflicting with web UI on 443
- Skip package install if already installed
- Use harmony_cli::cli_logger::init() instead of env_logger (safe to
  call multiple times)
- Increase verification timeout to 60s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 11:42:35 -04:00
3fd333caa3 fix(opnsense-vm-integration): detect and fix Docker+libvirt FORWARD conflict
Docker sets iptables FORWARD policy to DROP, which blocks libvirt's
NAT networking (libvirt defaults to nftables which doesn't interact
with Docker's iptables chain).

Fix: setup-libvirt.sh now detects Docker and offers to switch libvirt
to the iptables firewall backend, so both sets of rules coexist.
The --check command warns about this mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 11:08:03 -04:00
c2d817180b refactor(opnsense-vm-integration): clean two-phase workflow
Restructure the example into two clear phases:

Phase 1 (--boot): creates KVM network + VM, waits for web UI,
prints instructions for enabling SSH via the OPNsense GUI.

Phase 2 (default run): checks SSH is reachable, creates API key,
installs HAProxy, runs LoadBalancerScore, verifies via API.

The config.xml injection sets vtnet0=LAN (192.168.1.1) and
vtnet1=WAN (DHCP). SSH must be enabled manually in the web UI
because OPNsense has no REST API for SSH management and the
config.xml injection doesn't reliably enable sshd.

Future: use a pre-customized OPNsense image on S3 for CI.

Also add show_ssh_config example to opnsense-api crate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 10:26:23 -04:00
31c3a52750 feat(opnsense): config.xml injection for nano image + dual NIC setup
Add opnsense::image module for customizing OPNsense nano disk images:
- find_config_offset(): scans raw image for config.xml location
- replace_config_xml(): overwrites config with null-padded replacement
- minimal_config_xml(): generates WAN+LAN config for virtio NICs
- Supports auto-scanning for unknown images
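
The scan-and-overwrite idea behind the functions listed above can be sketched on an in-memory byte buffer. `find_offset` and `replace_padded` are hypothetical simplifications; the real module works on disk images, and the marker used to locate config.xml is an assumption:

```rust
/// Return the byte offset of the first occurrence of `marker` in `image`.
pub fn find_offset(image: &[u8], marker: &[u8]) -> Option<usize> {
    if marker.is_empty() {
        return None; // windows(0) would panic
    }
    image.windows(marker.len()).position(|w| w == marker)
}

/// Overwrite `region_len` bytes at `offset` with `replacement`,
/// filling the unused tail with NUL bytes so the image size is unchanged.
pub fn replace_padded(
    image: &mut [u8],
    offset: usize,
    region_len: usize,
    replacement: &[u8],
) -> Result<(), String> {
    let end = offset
        .checked_add(region_len)
        .filter(|&e| e <= image.len())
        .ok_or("region out of bounds")?;
    if replacement.len() > region_len {
        return Err("replacement larger than original region".to_string());
    }
    let region = &mut image[offset..end];
    region[..replacement.len()].copy_from_slice(replacement);
    region[replacement.len()..].fill(0);
    Ok(())
}
```

Padding instead of resizing keeps every other structure in the image at its original offset, which is why the replacement must not exceed the original region.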

KVM improvements:
- disk_from_path(): attach existing disk images (not just new volumes)
- start_vm() now idempotent (skips if already running)
- cdrom uses SATA bus instead of IDE (q35 compatibility)

Integration example updates:
- LAN on 192.168.1.0/24 (matches OPNsense defaults, host reachable)
- WAN on libvirt default network (internet access)
- Config.xml injection replaces em0/em1 with vtnet0/vtnet1
- API key creation via PHP script (writes to file, avoids escaping)

Status: VM boots, web UI responds at 192.168.1.1, interfaces assigned.
Remaining: SSH enablement in config.xml, API key creation, WAN subnet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 09:30:36 -04:00
2e3af21b61 chore(opnsense-vm-integration): add setup-libvirt.sh script
Interactive script that installs packages, adds user to libvirt group,
starts libvirtd, and creates the default storage pool. Asks before
each step (or run with --yes for non-interactive).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:52:10 -04:00
bc1f8e8a9d feat(opnsense-vm-integration): add --check, --setup, --download subcommands
Add prerequisite checking (libvirtd, group membership, storage pool,
bunzip2) with clear error messages and fix suggestions.

Add --setup to print the exact sudo commands needed for initial setup.
Add --download to pre-fetch and decompress the OPNsense nano image.

Full flow: download image → create network with DHCP → boot VM →
discover IP via libvirt lease → wait for API → create API key via
SSH → install HAProxy + Caddy → run LoadBalancerScore → verify.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:45:41 -04:00
7eef3115e9 feat(kvm): add VM IP discovery, DHCP networks, and OPNsense integration example
KVM module enhancements:
- Add vm_ip() and wait_for_ip() to KvmExecutor using
  Domain::interface_addresses() for DHCP IP discovery
- Add DHCP range and static host entries to NetworkConfig/NetworkConfigBuilder
- Generate DHCP XML in network definitions for libvirt's built-in DHCP
- Export DhcpHost type

OPNsense VM integration example (opnsense-vm-integration):
- Boots OPNsense nano VM via KVM
- Discovers IP via libvirt DHCP lease query
- Creates API key via SSH
- Installs HAProxy + Caddy via firmware API
- Runs LoadBalancerScore (2 services: K8s API + HTTPS)
- Verifies HAProxy configuration via API

22 KVM unit tests pass (3 new DHCP tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:37:13 -04:00
d48200b3d5 docs(kvm): document XML template decision and upstream tracking
Explain why we use string templates for libvirt XML generation and
what the path to typed structs looks like. The best candidate is
libvirt-rust-xml (gen branch) which generates Rust structs from
libvirt's RelaxNG schemas via relaxng-gen, but it doesn't compile
yet (virtxml-domain has 6 errors as of baca481).

Also fix dead code in format_cdrom (redundant device_type branch).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:04:53 -04:00
b18c8d534a feat(kvm): add 17 unit tests and VM examples for all infrastructure patterns
Add comprehensive XML generation tests covering: multi-disk VMs,
multi-NIC configurations, MAC addresses, boot order, memory conversion,
sequential disk naming, custom storage pools, NAT/route/isolated
networks, volume sizing, builder defaults, q35 machine type, and
serial console.

Add kvm-vm-examples binary with 5 scenarios:
- alpine: minimal 512MB VM, fast boot for testing
- ubuntu: standard server with 25GB disk
- worker: multi-disk (60G OS + 2x100G Ceph OSD) for storage nodes
- gateway: dual-NIC (WAN NAT + LAN isolated) for firewall/router
- ha-cluster: full 7-VM deployment (gateway + 3 CP + 3 workers)

Each scenario has clean and status subcommands.

19 KVM unit tests pass (17 new + 2 existing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 21:16:08 -04:00
474e5a8dd2 test(opnsense-api): add 11 e2e tests against real OPNsense instance
Add integration tests that verify the full stack against a real OPNsense
VM. Tests are #[ignore]d by default — run with:

  OPNSENSE_TEST_URL=https://10.99.99.1/api \
  OPNSENSE_TEST_KEY=key OPNSENSE_TEST_SECRET=secret \
  cargo test -p opnsense-api --test e2e_test -- --ignored

Tests cover:
- Firmware: status, package list
- Dnsmasq: settings/get, CRUD host lifecycle, add_static_mapping via config
- HAProxy: settings/get, CRUD server, configure_service + idempotency
- VLAN, WireGuard, Firewall: settings/get

Each test cleans up after itself. Do NOT run against production.

Also make DhcpConfigDnsMasq::new and LoadBalancerConfig::new pub for
external test usage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 20:39:54 -04:00
dd92e15f96 test(opnsense-config): restore unit tests with httptest mocks
Add 14 unit tests covering the critical business logic:

Dnsmasq (11 tests):
- add_static_mapping: create new, update by IP, update by hostname,
  hostname/domain splitting, duplicate MAC handling
- Conflict detection: IP/hostname in different entries, multiple matches
- remove_static_mapping: partial remove, full delete, case insensitivity

Load balancer (3 tests):
- configure_service creates all components (healthcheck→server→backend→frontend)
- Idempotent replacement on same bind address (cascade delete then re-create)
- Isolation between services on different bind addresses

Tests use httptest to mock the OPNsense API — no VM or real firewall needed.
All 100 tests pass across the workspace (0 failures).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 20:33:31 -04:00
c608975d30 feat(opnsense-config): replace XML backend with REST API
Replace opnsense-config-xml dependency with opnsense-api. All
configuration CRUD now goes through the OPNsense REST API instead
of SSH + XML editing of /conf/config.xml.

Key changes:
- Config struct holds OpnsenseClient + SSH shell (for file ops only)
- Module handlers (dnsmasq, haproxy, caddy, tftp, node_exporter) are
  now API-backed with async methods
- apply()/save() are no-ops — each module calls reconfigure after mutations
- install_package uses firmware API with polling
- LoadBalancer uses new domain types (LbFrontend, LbBackend, LbServer,
  LbHealthCheck) instead of XML types, with UUID chaining via API
- Dnsmasq conflict detection logic preserved, adapted for API HashMap
- RwLock<Config> replaced with Arc<Config> — Config is now stateless

Benefits over XML approach:
- Per-module soft reload instead of "reload all services"
- Server-side validation of all changes
- No more hash-based race condition detection
- No more fragile XML schema coupling

SSH retained for: file uploads, PXE config writing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 19:35:40 -04:00
6c9472212c docs(opnsense-api): add README with example usage
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 19:07:32 -04:00
bc4dcdf942 feat(opnsense): upgrade to 26.1.5, handle array select widgets
- Pin vendor/core submodule to 26.1.5 tag (matches running firewall)
- Regenerate dnsmasq from model v1.0.9 (migrated during firmware upgrade)
- Handle array-style select widgets in enum deserialization: OPNsense
  sometimes returns [{value, selected}, ...] instead of {key: {value, selected}}
- Add firmware_upgrade and reboot examples for managing OPNsense updates
- All 7 modules validated against live OPNsense 26.1.5:
  dnsmasq, haproxy, caddy, vlan, lagg, wireguard, firewall

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 19:03:04 -04:00
8a7cbf4836 fix(opnsense-codegen): preserve unknown enum values with Other(String)
Replace lossy enum deserialization (unknown variants → None) with
Other(String) catch-all variant. This ensures unknown wire values
survive round-trips: reading an object and POSTing it back will not
silently destroy field values that the codegen doesn't recognize.

This is critical for data integrity — in a read-modify-write cycle,
dropping an unknown enum value would overwrite it with empty on the
next POST.
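
A hand-written sketch of the catch-all pattern, without serde, to show why the round trip stays lossless. The `Mode` enum, its methods, and the "ipalias" wire string are hypothetical; the generated code differs:

```rust
/// Hypothetical generated enum with a catch-all variant.
#[derive(Debug, Clone, PartialEq)]
pub enum Mode {
    Carp,
    IpAlias,
    /// Any wire value the codegen does not know about.
    Other(String),
}

impl Mode {
    /// Parse a wire value; unknown values are preserved, not dropped.
    pub fn from_wire(value: &str) -> Self {
        match value {
            "carp" => Mode::Carp,
            "ipalias" => Mode::IpAlias, // assumed wire string
            unknown => Mode::Other(unknown.to_string()),
        }
    }

    /// Serialize back to the wire value, echoing unknown values verbatim.
    pub fn to_wire(&self) -> &str {
        match self {
            Mode::Carp => "carp",
            Mode::IpAlias => "ipalias",
            Mode::Other(raw) => raw.as_str(),
        }
    }
}
```

With an `Option`-based mapping the unknown value would become `None` on read and an empty field on write; here the original string is sent back unchanged.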

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 18:17:32 -04:00
4af5e7ac19 feat(opnsense): generate types for all 7 modules with codegen fixes
Generate typed API models for HAProxy, Caddy, Firewall, VLAN, LAGG,
WireGuard (client/server/general), and regenerate Dnsmasq. All core
modules validated against a live OPNsense 26.1.2 instance.

Codegen improvements:
- Add --module-name and --api-key CLI flags for controlling output
  filenames and API response envelope keys
- Fix enum variant names starting with digits (prefix with V)
- Use value="" XML attribute for wire values instead of element names
- Handle unknown *Field types as opn_string (select widget safe)
- Forgiving enum deserialization (warn instead of error on unknown)
- Handle empty arrays in opn_string deserializer

Add per-module examples (list_haproxy, list_caddy, list_vlan, etc.)
and utility examples (raw_get, check_package, install_and_wait).
Extract shared client setup into examples/common/mod.rs.

Fix post_typed sending empty JSON body ({}) instead of no body,
which was causing 400 errors on firmware endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 18:11:02 -04:00
0dc2f94b06 feat(opnsense-api): add CRUD methods and common response types
Add entity-level CRUD operations (get_item, add_item, set_item,
del_item, search_items) and service management (reconfigure,
service_status) to OpnsenseClient. These map directly to OPNsense's
MVC controller patterns.

Add response module with UuidResponse, StatusResponse, and
SearchResponse<T> covering the standard OPNsense API response shapes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:35:40 -04:00
eff75f4118 misc: Add test dnsmasq end to end codegen 2026-03-24 15:29:01 -04:00
f28edb3134 feat(opnsense-codegen): codegen now works for dnsmasq end to end from the model to the api 2026-03-24 15:28:00 -04:00
88e6990051 feat(opnsense-api): examples to list packages and dnsmasq settings now working 2026-03-24 14:07:47 -04:00
8e9f8ce405 wip: opnsense-api crate to replace opnsense-config-xml 2026-03-24 13:26:36 -04:00
d87aa3c7e9 fix opnsense submodule url 2026-03-24 10:51:38 -04:00
90ec2b524a wip(codegen): generates ir and rust code successfully but not really tested yet 2026-03-24 10:23:52 -04:00
5572f98d5f wip(opnsense-codegen): Can now create IR that looks good from example, successfully parses real models too 2026-03-24 09:32:21 -04:00
8024e0d5c3 wip: opnsense codegen 2026-03-24 07:13:53 -04:00
238e7da175 feat: opnsense codegen basic example scaffolded, now we can start implementing real models 2026-03-23 23:27:40 -04:00
bf84bffd57 wip: config + secret merge with e2e sso examples incoming 2026-03-23 23:26:42 -04:00
d4613e42d3 wip: openbao + zitadel e2e setup and test for harmony_config 2026-03-22 21:27:06 -04:00
6a57361356 chore: Update config roadmap
Some checks failed
Run Check Script / check (pull_request) Failing after 12s
2026-03-22 19:04:16 -04:00
d0d4f15122 feat(config): Example prompting
Some checks failed
Run Check Script / check (pull_request) Failing after 14s
2026-03-22 18:18:57 -04:00
93b83b8161 feat(config): Sqlite storage and example 2026-03-22 17:43:12 -04:00
6ca8663422 wip: Roadmap for config 2026-03-22 16:57:36 -04:00
f6ce0c6d4f chore: Harmony short term roadmap 2026-03-22 11:43:43 -04:00
8a1eca21f7 Merge branch 'feat/harmony_assets' into feature/kvm-module 2026-03-22 11:26:04 -04:00
9d2308eca6 Merge remote-tracking branch 'origin/master' into feature/kvm-module
All checks were successful
Run Check Script / check (pull_request) Successful in 1m48s
2026-03-22 10:02:10 -04:00
ccc26e07eb feat: harmony_assets crate to manage assets, local, s3, http urls, etc
Some checks failed
Run Check Script / check (pull_request) Failing after 17s
2026-03-21 11:10:51 -04:00
8798110bf3 feat: linux vm example with cdrom boot and iso download features
All checks were successful
Run Check Script / check (pull_request) Successful in 1m32s
2026-03-08 21:48:04 -04:00
1508d431c0 refactor: kvm module now efficiently encapsulates libvirt complexity behind builder patterns, no more xml 2026-03-08 12:08:19 -04:00
caf6f0c67b Add KVM module for managing virtual machines
- KVM module with connection configuration (local/SSH)
- VM lifecycle management (create/start/stop/destroy/delete)
- Network management (create/delete isolated virtual networks)
- Volume management (create/delete storage volumes)
- Example: OKD HA cluster deployment with OPNsense firewall
- All VMs configured for PXE boot with isolated network

The KVM module uses virsh command-line tools for management and is fully integrated with Harmony's architecture. It provides a clean Rust API for defining VMs, networks, and volumes. The example demonstrates deploying a complete OKD high-availability cluster (3 control planes, 3 workers) plus OPNsense firewall on an isolated network.
2026-03-08 08:06:10 -04:00