fix(opnsense): valid HAProxy config + From<&str> codegen cleanup #273

Merged
stremblay merged 7 commits from fix/haproxy-issues into master 2026-04-22 17:01:54 +00:00
Owner

Summary

Three fixes and a codegen refactor. Together they take LoadBalancerScore (and OKDLoadBalancerScore) from "silently generates a broken HAProxy
config" to "deploys a working HAProxy end-to-end on OPNsense" — and cleans up the call-site-by-call-site string-to-enum boilerplate the fixes
originally added.

1. is_installed no longer swallows errors (83d9af2)

LoadBalancerConfig::is_installed returned bool via .is_ok(), so a timeout, DNS failure, or auth rejection all collapsed to "not installed"
— the score would then try to install os-haproxy on top of an unreachable firewall and fail in cascade further down the pipeline.

Now returns Result<bool, Error>, treating only HTTP 404 ("controller not found") as "not installed" and propagating every other error so the
score fails fast with a clear message.

2. HAProxy healthcheck/server fields set explicitly (21035d2)

configure_service used ..Default::default() for most fields of the generated HAProxy structs, letting OPNsense's model defaults leak
through:

  • http_hostlocalhost (every check sent Host: localhost)
  • http_methodoptions (OPTIONS instead of the declared method)
  • http_versionhttp10
  • sslVerify on real servers → 1 (broke self-signed backends)
  • Healthcheck ssl was never propagated, so SSL-required checks like kube-apiserver /readyz on 6443 stayed plain HTTP and never reached the
    TLS listener.

Every field is now mapped from LbHealthCheck / LbServer explicitly: method via HealthcheckHttpMethod, version pinned to None (wire "" =
NONE), http_host cleared to an empty string, ssl propagated through HealthcheckSsl, ssl/sslVerify pinned to false at the call site.

3. From<&str> for every generated enum (fc16e9f, ead76e7)

The fix in §2 introduced eight string-match blocks in configure_service that duplicated wire-value knowledge the codegen already has. Addressed in review.

  • opnsense-codegen/src/codegen.rs::generate_enum now emits impl From<&str> and impl From<String> for every generated enum, right after the existing serde module. Lowercase-matches wire values (both sides lowercased so mixed-case wire
    values like "SSLv3", "CONNECT" work); unknown inputs fall through to the existing Other(String) variant that the codegen emits for forward-compat round-tripping.
  • opnsense-api/src/generated/haproxy.rs regenerated — 153 enums × 2 = 306 new impl blocks. No hand edits.
    cargo run -p opnsense-codegen -- generate
    --xml opnsense-codegen/vendor/plugins/net/haproxy/src/opnsense/mvc/app/models/OPNsense/HAProxy/HAProxy.xml
    --output-dir opnsense-api/src/generated
    --module-name haproxy
    --api-key haproxy
    cargo fmt -p opnsense-api
  • opnsense-config/src/modules/load_balancer.rs::configure_service drops eight string-match blocks in favor of one-liners: HealthcheckType::from(hc.check_type.as_str()), hc.http_method.as_deref().map(HealthcheckHttpMethod::from), etc.
  • Drive-by typo fix in harmony/src/infra/opnsense/load_balancer.rsSSL::SNI was mapped to wire value "sslni", but OPNsense's wire value is "sslsni" (both forward and reverse directions fixed, lines 149 and 185). Before this refactor
    the typo silently hit HealthcheckSsl::Other("sslni"); the cleaner conversion made the bug obvious.

Environment note: HAProxy fails to start on firewalls with the HTTP→HTTPS redirect enabled

If HAProxy refuses to start with cannot bind socket (Address already in use) for [0.0.0.0:80], check whether OPNsense's built-in HTTP→HTTPS redirect is enabled (System → Settings → Administration). It runs lighttpd dual-stacked on
[::]:80, which via v4-mapped addresses also owns IPv4 port 80 — invisible to sockstat -l4 but enough to reject HAProxy's bind. Either disable the redirect or make sure nothing else holds *:80 before applying the score. Not fixed in Harmony;
it's an OPNsense-side knob.

An earlier attempt (5e72777) switched the bind to the firewall's LAN IP to work around this, but was reverted in a196268 after confirming the root cause was environment-specific rather than a general incompatibility. 0.0.0.0 binds work
fine on any OPNsense where nothing else is squatting on the ports.

## Summary Three fixes and a codegen refactor. Together they take `LoadBalancerScore` (and `OKDLoadBalancerScore`) from "silently generates a broken HAProxy config" to "deploys a working HAProxy end-to-end on OPNsense" — and cleans up the call-site-by-call-site string-to-enum boilerplate the fixes originally added. ### 1. `is_installed` no longer swallows errors (`83d9af2`) `LoadBalancerConfig::is_installed` returned `bool` via `.is_ok()`, so a timeout, DNS failure, or auth rejection all collapsed to "not installed" — the score would then try to install `os-haproxy` on top of an unreachable firewall and fail in cascade further down the pipeline. Now returns `Result<bool, Error>`, treating only HTTP 404 ("controller not found") as "not installed" and propagating every other error so the score fails fast with a clear message. ### 2. HAProxy healthcheck/server fields set explicitly (`21035d2`) `configure_service` used `..Default::default()` for most fields of the generated HAProxy structs, letting OPNsense's *model defaults* leak through: - `http_host` → `localhost` (every check sent `Host: localhost`) - `http_method` → `options` (OPTIONS instead of the declared method) - `http_version` → `http10` - `sslVerify` on real servers → `1` (broke self-signed backends) - Healthcheck `ssl` was never propagated, so SSL-required checks like kube-apiserver `/readyz` on 6443 stayed plain HTTP and never reached the TLS listener. Every field is now mapped from `LbHealthCheck` / `LbServer` explicitly: method via `HealthcheckHttpMethod`, version pinned to `None` (wire `""` = NONE), `http_host` cleared to an empty string, `ssl` propagated through `HealthcheckSsl`, `ssl`/`sslVerify` pinned to `false` at the call site. ### 3. `From<&str>` for every generated enum (`fc16e9f`, `ead76e7`) The fix in §2 introduced eight string-match blocks in `configure_service` that duplicated wire-value knowledge the codegen already has. Addressed in review. - **`opnsense-codegen/src/codegen.rs::generate_enum`** now emits `impl From<&str>` and `impl From<String>` for every generated enum, right after the existing serde module. Lowercase-matches wire values (both sides lowercased so mixed-case wire values like `"SSLv3"`, `"CONNECT"` work); unknown inputs fall through to the existing `Other(String)` variant that the codegen emits for forward-compat round-tripping. - **`opnsense-api/src/generated/haproxy.rs`** regenerated — 153 enums × 2 = 306 new impl blocks. No hand edits. cargo run -p opnsense-codegen -- generate --xml opnsense-codegen/vendor/plugins/net/haproxy/src/opnsense/mvc/app/models/OPNsense/HAProxy/HAProxy.xml --output-dir opnsense-api/src/generated --module-name haproxy --api-key haproxy cargo fmt -p opnsense-api - **`opnsense-config/src/modules/load_balancer.rs::configure_service`** drops eight string-match blocks in favor of one-liners: `HealthcheckType::from(hc.check_type.as_str())`, `hc.http_method.as_deref().map(HealthcheckHttpMethod::from)`, etc. - **Drive-by typo fix** in `harmony/src/infra/opnsense/load_balancer.rs` — `SSL::SNI` was mapped to wire value `"sslni"`, but OPNsense's wire value is `"sslsni"` (both forward and reverse directions fixed, lines 149 and 185). Before this refactor the typo silently hit `HealthcheckSsl::Other("sslni")`; the cleaner conversion made the bug obvious. ## Environment note: HAProxy fails to start on firewalls with the HTTP→HTTPS redirect enabled If HAProxy refuses to start with `cannot bind socket (Address already in use) for [0.0.0.0:80]`, check whether **OPNsense's built-in HTTP→HTTPS redirect** is enabled (*System → Settings → Administration*). It runs lighttpd dual-stacked on `[::]:80`, which via v4-mapped addresses also owns IPv4 port 80 — invisible to `sockstat -l4` but enough to reject HAProxy's bind. Either disable the redirect or make sure nothing else holds `*:80` before applying the score. Not fixed in Harmony; it's an OPNsense-side knob. An earlier attempt (`5e72777`) switched the bind to the firewall's LAN IP to work around this, but was **reverted** in `a196268` after confirming the root cause was environment-specific rather than a general incompatibility. `0.0.0.0` binds work fine on any OPNsense where nothing else is squatting on the ports.
stremblay added 3 commits 2026-04-22 15:15:20 +00:00
`LoadBalancerConfig::is_installed` previously collapsed every error from
the settings endpoint into `false`, so a timeout, DNS failure, or auth
rejection all looked identical to "os-haproxy not installed" — the
`LoadBalancer` score would then attempt to install the plugin on top of
an unreachable firewall and fail in cascade further down the pipeline.

Return `Result<bool, Error>` and treat only HTTP 404 (controller not
found) as "not installed". Every other error is propagated so
`ensure_initialized` fails the score immediately with a message pointing
at the real problem.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`configure_service` was relying on `..Default::default()` for most fields
of the generated HAProxy structs. That leaked OPNsense's *model defaults*
into the wire payload for fields Harmony never meant to default:

- `http_host` → `localhost` (sent `Host: localhost` on every check)
- `http_method` → `options` (sent OPTIONS instead of the declared method)
- `http_version` → `http10` (wanted NONE)
- `sslVerify` on real servers → `1` (broke self-signed backends)
- Healthcheck `ssl` was never propagated, so SSL-required checks like
  kube-apiserver `/readyz` on 6443 stayed plain HTTP and never succeeded

Set every field explicitly from `LbHealthCheck`/`LbServer`: map
`http_method` through `HealthcheckHttpMethod`, pass `None` for
`http_version` (serializes as `""` = NONE), clear `http_host` to an empty
string, propagate `hc.ssl` through `HealthcheckSsl`, and pin
`ssl`/`sslVerify` to `false` on the server struct so intent is declared
at the call site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(okd): bind load balancer services on firewall IP, not 0.0.0.0
Some checks failed
Run Check Script / check (pull_request) Failing after 56s
5e72777c15
Binding HAProxy on 0.0.0.0 collided with OPNsense's own listeners
(HTTP→HTTPS redirect on :80, WebUI, etc.), preventing the HAProxy
service from starting once the LoadBalancer score was applied.

Use `topology.load_balancer.get_ip()` to bind each frontend on the
firewall's LAN interface IP instead. The `LoadBalancer` capability was
already in scope, so no new trait imports are needed.

The previous `0.0.0.0` rationale (avoiding CARP VIP rebind races) is
noted in a comment: HA CARP setups still need OPNsense's
`net.inet.ip.nonlocal_bind` or HAProxy `transparent` bind — not
addressed here.

Test module: added an inline `DummyLoadBalancer` stub (mirrors the
existing `DummyRouter` pattern) so `OKDLoadBalancerScore::new` no longer
hits `DummyInfra::get_ip`'s `unimplemented!()` panic. Renamed
`test_all_services_bind_on_unspecified_address` →
`test_all_services_bind_on_firewall_ip`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stremblay added 1 commit 2026-04-22 15:29:38 +00:00
fix: formatting
All checks were successful
Run Check Script / check (pull_request) Successful in 2m19s
5a17bc229e
stremblay added 1 commit 2026-04-22 16:11:01 +00:00
revert(okd): bind load balancer on 0.0.0.0 again
All checks were successful
Run Check Script / check (pull_request) Successful in 2m17s
a196268c1e
Reverting 5e72777. The HAProxy startup failure that motivated the
bind-to-FW-IP change was environment-specific on the sttest basement
firewall: OPNsense's "HTTP → HTTPS redirect" service (lighttpd bound to
`[::]:80`, dual-stack) was holding IPv4 port 80 via v4-mapped addresses
— invisible in `sockstat -l4` but still enough to make `0.0.0.0:80`
return EADDRINUSE to HAProxy.

Disabling the HTTP redirect on that firewall resolves the conflict.
Other OPNsense deployments already ship with the redirect off (or
HAProxy on non-conflicting ports), so `0.0.0.0` remains the correct
default.

This reverts commit 5e72777.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
johnride requested changes 2026-04-22 16:14:43 +00:00
@@ -170,0 +183,4 @@
.http_method
.as_deref()
.map(|m| match m.to_lowercase().as_str() {
"options" => HealthcheckHttpMethod::OptionsDefault,
Owner

Instead of doing those ugly matches that depend on strings that we don't control here, implement From &str traits for those structs HealthCheckHttpMethod and the other similar above and below HealthcheckType and HealthcheckSsl and probably others around here. This is poor rust, feels like crappy python by a first year programmer. Do better.

Instead of doing those ugly matches that depend on strings that we don't control here, implement `From &str` traits for those structs HealthCheckHttpMethod and the other similar above and below HealthcheckType and HealthcheckSsl and probably others around here. This is poor rust, feels like crappy python by a first year programmer. Do better.
stremblay marked this conversation as resolved
stremblay added 1 commit 2026-04-22 16:33:23 +00:00
refactor(opnsense): use From<&str> for wire-value conversions
Some checks failed
Run Check Script / check (pull_request) Failing after 54s
fc16e9fac9
Addresses review feedback on the previous HAProxy field-default fixes:
the eight match blocks in `configure_service` that mapped loose strings
("get", "tcp", "roundrobin", ...) to generated OPNsense enum variants
were poor Rust — they duplicated the wire-value knowledge that the
codegen already has, and any new enum variant in OPNsense meant editing
every call site by hand.

- `opnsense-codegen/src/codegen.rs::generate_enum` now emits
  `impl From<&str>` and `impl From<String>` for every generated enum,
  right after the existing serde module. Lowercase-matches wire values;
  unknown inputs fall through to the `Other(String)` variant the codegen
  already emits for forward-compat round-tripping.
- `opnsense-api/src/generated/haproxy.rs` regenerated — 153 enums, 306
  new impl blocks. No hand edits; re-run via
  `cargo run -p opnsense-codegen -- generate --xml
  opnsense-codegen/vendor/plugins/net/haproxy/src/opnsense/mvc/app/models/OPNsense/HAProxy/HAProxy.xml
  --output-dir opnsense-api/src/generated --module-name haproxy`.
- `opnsense-config/src/modules/load_balancer.rs::configure_service`
  replaces eight string-match blocks with one-liners:
  `HealthcheckType::from(hc.check_type.as_str())` etc.
- Drive-by: fixed a pre-existing typo at
  `harmony/src/infra/opnsense/load_balancer.rs:185` and the matching
  reverse at `:149` — `SSL::SNI` was mapped to `"sslni"`, but the
  OPNsense wire value is `"sslsni"`. Before this refactor the typo
  silently hit `HealthcheckSsl::Other("sslni")`; the cleaner conversion
  made the bug obvious so it's fixed here rather than left for a
  follow-up.

Verification:
- `cargo check -p harmony -p opnsense-config -p opnsense-api` clean
- `cargo test -p harmony --lib okd::load_balancer` 6/6 pass
- `cargo test -p opnsense-codegen` 22/22 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stremblay added 1 commit 2026-04-22 16:54:33 +00:00
fix(opnsense): lowercase match arms in generated From<&str>
All checks were successful
Run Check Script / check (pull_request) Successful in 2m6s
ead76e710f
Two regressions from fc16e9f that ./build/check.sh catches:

1. `opnsense-api`'s `test_haproxy_deser` example references
   `resp.haproxy` on the response wrapper. The regen auto-derived the
   field name as `op_nsenseha_proxy` from the struct name. Need to pass
   `--api-key haproxy` to keep the wrapper key stable.

2. For enums whose wire values aren't all-lowercase (e.g. `"SSLv3"`,
   `"CONNECT"`), the emitted `From<&str>` matched `s.to_lowercase()`
   against the original-case wire value, which clippy flags as
   unreachable ("match arm has differing case"). Lowercase the wire
   value in the emitted match arm so case-insensitive matching actually
   works; serialization still emits the original-case wire value
   because the serde module is unaffected.

Regenerated `haproxy.rs` via
`cargo run -p opnsense-codegen -- generate --xml ... --module-name haproxy --api-key haproxy`.

`./build/check.sh` now passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stremblay changed title from fix(haproxy): make LoadBalancer score work end-to-end on OPNsense to fix(opnsense): valid HAProxy config + From<&str> codegen cleanup 2026-04-22 16:58:16 +00:00
stremblay merged commit be4b9acaad into master 2026-04-22 17:01:54 +00:00
stremblay deleted branch fix/haproxy-issues 2026-04-22 17:01:54 +00:00
johnride reviewed 2026-04-22 17:38:26 +00:00
@@ -249,3 +238,1 @@
"uri" => BackendAlgorithm::UriHashOnlyHttpMode,
other => BackendAlgorithm::Other(other.to_string()),
}),
mode: Some(BackendMode::from(backend.mode.as_str())),
Owner

MUCH better but still dumb. Why do we have to do mode from mode.as_str() ???? Two different types? There should be only one type, I'd rather pull the type from opnsense-api than have two types for no reason. If there is a major reason then implement a type1::from(type2) so we are type safe, just boilerplate, but safe boilerplate is better than passing strings around.

MUCH better but still dumb. Why do we have to do mode from mode.as_str() ???? Two different types? There should be only one type, I'd rather pull the type from opnsense-api than have two types for no reason. If there is a major reason then implement a type1::from(type2) so we are type safe, just boilerplate, but safe boilerplate is better than passing strings around.
johnride reviewed 2026-04-22 17:41:02 +00:00
johnride left a comment
Owner

types??

types??
Sign in to join this conversation.
No Reviewers
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: NationTech/harmony#273
No description provided.