feat/opnsense-codegen-type-safe #257

Merged
johnride merged 21 commits from feat/opnsense-codegen-type-safe into feat/opnsense-codegen 2026-04-09 20:54:49 +00:00
Owner
No description provided.
johnride added 20 commits 2026-04-07 14:26:35 +00:00
The XML parser silently skipped container elements (like <source>,
<destination>) nested inside ArrayField nodes because it only processed
children with a type attribute. This caused generated structs to be
missing nested fields, forcing opnsense-config to use json!() macros
instead of typed structs.

- Add container handling in ArrayField and custom *Field child loops
- Add serialize function to opn_map serde helper (was deserialize-only)
- Change opn_map serde attribute from deserialize_with to with
- Regenerate all 9 model files with the fixes

NatRuleRule now correctly has source/destination/created/updated
container structs with all child fields.
Add two new codegen modules:
- controller_parser: extracts module, controller path, model binding,
  and CRUD actions from OPNsense PHP API controller files using regex
- api_codegen: generates typed Rust API client wrappers from ControllerIR

Generated typed API clients for 6 controllers:
- DNatController → DNatApi (search, get, add, set, del, toggle)
- FilterBaseController → FilterBaseApi (apply, revert, savepoint)
- VlanSettingsController → VlanSettingsApi (CRUD + reconfigure)
- LaggSettingsController → LaggSettingsApi (CRUD + reconfigure)
- VipSettingsController → VipSettingsApi (CRUD + reconfigure)
- Dnsmasq SettingsController → SettingsApi (host/domain CRUD)

These typed APIs eliminate the need for manual json!() construction
and string-based URL paths in opnsense-config modules.
Replace all manual json!() construction and raw Value parsing in
opnsense-config's dnat module with generated typed structs (NatRuleRule,
NatRuleRuleDestination, RuleIpprotocol, RulePass) and the typed DNatApi
client.

Also fix a regression in the parser where the custom *Field handler
trigger condition was too broad, causing InterfaceField/ProtocolField
to be incorrectly treated as ArrayField subclasses. Reverted the trigger
to only match children with type attributes.
Replace manual json!() construction in ensure_filter_rule and
ensure_snat_rule_from with generated typed structs
(FirewallFilterRulesRule, FirewallFilterSnatrulesRule) and their
associated enums for action, direction, ip protocol.

Also generate typed API clients for FilterController, SourceNatController,
and OneToOneController. Add parse_controller_with_defaults for controllers
that inherit model fields from a parent class.

BINAT (one-to-one) ensure still uses json Value as the generated
OnetooneRule struct needs further validation against the wire format.
Rewrite api_codegen to generate proper envelope-wrapping methods that
accept model structs directly. Callers no longer need to manually
construct RuleBody wrappers or extract UUIDs from raw JSON.

Key changes:
- Generated API clients wrap request bodies internally via serde rename
  (e.g., add_rule(&my_rule) serializes as {"rule": {...}})
- Add shared SearchRow type to response.rs with label() and is_enabled()
  helpers, eliminating per-module RuleSearchRow type conflicts
- Extract body_key from PHP controller addBase/setBase calls
- Rewrite dnat.rs and firewall.rs to use the typed API end-to-end:
  search returns SearchResponse<SearchRow>, add returns UuidResponse,
  set/del return StatusResponse — zero raw JSON in production code
- Add EnsureApi trait in firewall.rs for generic find-or-create pattern

The only remaining json!() calls in dnat.rs and firewall.rs are in test
mock responses, which is expected.
Six more modules migrated to typed APIs:
- vlan.rs: uses VlansVlan struct + VlanSettingsApi, reads VlansResponse
- lagg.rs: uses LaggsLagg struct + LaggSettingsApi, reads LaggsResponse
- vip.rs: uses VirtualipVip struct + VipSettingsApi with typed enums
- caddy.rs: typed CaddyGeneralSettings struct instead of json!()
- tftp.rs: typed TftpSettings struct instead of json!()
- node_exporter.rs: typed NodeExporterSettings struct instead of json!()

All six modules now have zero json!() in production code.
Complete the refactoring of all opnsense-config modules:
- dnsmasq.rs: uses DnsmasqHost, DnsmasqDhcpRang, DhcpRangDomainType
  structs + SettingsApi typed client for all CRUD operations
- load_balancer.rs: uses OpNsenseHaProxyServersServer,
  OpNsenseHaProxyBackendsBackend, OpNsenseHaProxyFrontendsFrontend,
  OpNsenseHaProxyHealthchecksHealthcheck structs with correct field
  types (required String vs Option, u16 vs String, bool vs Option)

All 10 opnsense-config modules now have zero json!() in production
code. The only remaining json!() calls are in test mock responses.
- Remove stale FIXME/TODO comments on VlanScore and LaggScore — both
  already use idempotent ensure_* methods that check before creating
- Fix DNAT apply pattern: remove per-rule apply() from ensure_rule(),
  add single apply() call in DnatInterpret::execute() after all rules
  (matching the pattern used by FirewallRuleScore and OutboundNatScore)
- Make DnatConfig::apply() public so callers can batch applies
- Add typed ensure_binat_rule_from() to FirewallFilterConfig, removing
  the last json!() construction in the harmony Score layer
- BinatInterpret now uses typed method instead of manual json!()
Add idempotency verification to the OPNsense VM integration example:
- Extract verify_state() that queries all entity counts via typed API
  (uses DNatApi, FilterApi, SourceNatApi, VipSettingsApi)
- Extract build_all_scores() for reuse across runs
- Run all Scores twice, assert entity counts are unchanged on 2nd run
- This catches duplicate creation in VLAN, LAGG, firewall rules, etc.

Add build/opnsense-e2e.sh — CI-friendly script that:
- Checks prerequisites (libvirtd, user groups)
- Boots OPNsense VM via KVM (idempotent — skips if running)
- Runs full Score suite with idempotency verification
- Supports --download, --clean flags
- Make UuidResponse.uuid default to empty string so validation failures
  ({"result": "failed", "validations": {...}}) don't cause deserialization
  errors. Add is_failed() helper method.
- Fix HAProxy healthcheck construction: map check_type string to
  HealthcheckType enum (was sending empty string, OPNsense rejected it)
- Fix HAProxy server construction: set mode (ServerMode) and type
  (ServerType) enum fields (were defaulting to empty, OPNsense rejected)

Discovered by running E2E tests against real OPNsense VM — the typed
structs with ..Default::default() sent empty strings for required enum
fields, which OPNsense rejected as validation errors.

Still needed: HAProxy backend mode/algorithm and frontend mode/
connectionBehaviour enums, and fixing search API pagination for
filter/snat/vip verification counts.
- Set mode, algorithm, persistence_cookiemode on HAProxy backend struct
  (OPNsense requires these fields, empty string causes validation failure)
- Set mode, connectionBehaviour on HAProxy frontend struct (same issue)
- Switch verify_state() to use rowCount from raw JSON search responses
  instead of typed SearchRow deserialization (more reliable with OPNsense
  search API pagination)

Found by running E2E tests against real OPNsense VM.
The E2E test revealed that OPNsense validation failures were being
silently swallowed: add/set operations returned {"result": "failed",
"validations": {...}} but the code treated them as success.

Critical fixes:
- add_item/set_item now return Error::Validation on failure instead
  of silently returning empty/failed responses
- VLAN: set pcp (PriorityCodePoint) — required in OPNsense 26.1
- Firewall filter: set sequence and statetype (KeepState)
- SNAT: set sequence
- BINAT: set sequence and destination_net ("any")
- DNAT: set sequence
- VIP: default advbase=1 and advskew=0 (required even for IP aliases)
- HAProxy backend: set mode, algorithm, persistence_cookiemode enums
- HAProxy frontend: set mode, connectionBehaviour enums

E2E test now passes: all 11 Scores run successfully against a real
OPNsense VM, and the idempotency test (run twice, verify counts
unchanged) confirms zero duplicates.
- Run cargo fmt across opnsense-api, opnsense-config, opnsense-codegen
  (fixes formatting in generated files and hand-written modules)
- Update examples/opnsense/README.md: replace stale VirtualBox docs
  with current API key + cargo run instructions
- Update examples/opnsense_vm_integration/README.md: document
  idempotency test (run twice, assert zero duplicates), add
  build/opnsense-e2e.sh usage instructions
Split the OPNsense webgui port change from OPNsenseBootstrap into a
proper idempotent Score:

- WebGuiConfigScore reads current port via SSH before changing
- Returns NOOP if already on the target port
- Modifies config.xml via PHP and restarts webgui via configctl
- Runs before LoadBalancerScore to free port 443 for HAProxy

Also:
- Add Config::shell() accessor for SSH access from Scores
- Add WebGuiConfigScore to VM integration example (12 Scores now)
- Document the WebGuiConfig → LoadBalancer ordering dependency
  as a concrete use case in docs/architecture-challenges.md
  (ties into Challenge #2: Runtime Plan & Validation)

The implicit dependency (LoadBalancerScore needs 443 free, which
requires WebGuiConfigScore to run first) remains a convention-based
ordering. This is tracked in architecture-challenges.md alongside
the score_with_dep.rs design sketch.
- Increase VM boot wait from 5 to 10 minutes (cold OPNsense nano first
  boot with filesystem expansion can exceed 5 minutes)
- wait_for_https now tries target port first, then falls back to 443
  on each attempt (handles both fresh VMs on port 443 and already-
  bootstrapped VMs on custom port)
- cargo fmt on network_stress_test and webgui.rs
The firmware update path waited on port 443 after reboot, but the
webgui config persists across reboots (config.xml stays at 9443).
Changed to use OPN_API_PORT which goes through wait_for_https with
port fallback (tries 9443 first, then 443).

Also increase VM resources from 1 vCPU / 1GB to 3 vCPU / 2GB —
cold boot drops from >10 minutes to ~48 seconds.
The web UI responds before the API backend (configd/PHP) is fully
ready after a firmware update reboot. Adding 10s settle time between
web UI detection and package install retry fixes the timeout.

Full --full cold start now completes in ~174 seconds:
  Boot + bootstrap: 48s
  Firmware update + reboot: 60s
  Package installs + 12 Scores x2 + idempotency check: 66s
Code review findings:
- WebGuiConfigScore: write PHP script to temp file via SSH instead of
  inline php -r with shell escaping. Eliminates the shell quoting
  attack surface entirely (port is u16 so was safe, but the pattern
  of format!() into shell commands is a code smell)
- parser.rs: prefix unused parameter with underscore
chore: run clippy and format on network_stress_test crate
Some checks failed
Run Check Script / check (pull_request) Failing after 1m46s
b18974ba6c
johnride added 1 commit 2026-04-09 20:42:11 +00:00
fix(ci): update submodules in ci doing a shallow clone to save some space and download time
All checks were successful
Run Check Script / check (pull_request) Successful in 1m52s
ab78143a45
johnride merged commit 608e8b3b9d into feat/opnsense-codegen 2026-04-09 20:54:49 +00:00
johnride deleted branch feat/opnsense-codegen-type-safe 2026-04-09 20:54:50 +00:00
Sign in to join this conversation.
No Reviewers
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: NationTech/harmony#257
No description provided.