Operator restart + aggregator recovery (v0.3 plan Ch2).

The aggregator already cold-rebuilds from NATS KV + CR watches; this makes recovery observable, closes an orphan gap, and pins each failure shape with a regression test.

- OperatorLiveness: a shared in-process latch (Recovering → Converged) the aggregator sets once all three cold-start sources replay (Deployment/Device watcher InitDone, device-state KV seen_current; empty-bucket short-circuit). The in-process dashboard reads it and shows a self-clearing banner via an HTMX self-poll (/__recovery), so the customer sees progress, not a blank.
- gc_orphaned_desired_state: at convergence, purge desired-state whose Deployment CR no longer exists (force-deleted while the operator was down, finalizer bypassed). Belt-and-suspenders with the controller finalizer.
- run() now owns its watchers in a JoinSet, so cancelling the aggregator aborts its children: no orphan tasks outlive a restart (matters for the restart-simulation tests and clean process teardown). Also made run() Send (hoisted a .await out of a tracing macro) so it can be spawned.
- docs/fleet-operator-recovery-scenarios.md enumerates the failure shapes and maps each to its test.
- harmony-fleet-e2e/tests/operator_recovery.rs: regression test per scenario (cold restart converges from KV; orphan GC; two operators write identical bytes; chaos kill under write load converges <30s) + AdminKv::put_device_state.

Writes stay idempotent + byte-deterministic, so two operators racing agree without leader election (operator HA = D3, deferred).
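As a rough illustration of the Recovering → Converged latch described above, here is a minimal std-only sketch. It is not the actual OperatorLiveness code: the names `Liveness`, `Source`, `Phase` and `mark_replayed` are hypothetical, and the real type may track its sources differently. The sketch only shows the shape of the idea: each cold-start source sets its own bit, and the phase flips once all three bits are set.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;

/// Hypothetical labels for the three cold-start sources; one bit each.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    DeploymentWatcher = 0b001,
    DeviceWatcher = 0b010,
    DeviceStateKv = 0b100,
}

/// Bitmask value meaning "every source has replayed".
const ALL_SOURCES: u8 = 0b111;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Phase {
    Recovering,
    Converged,
}

/// Cheaply cloneable handle: the aggregator writes, a dashboard reads.
#[derive(Clone, Default)]
pub struct Liveness {
    replayed: Arc<AtomicU8>,
}

impl Liveness {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record that `source` finished its initial replay. Idempotent.
    /// Returns true only for the single call that completes the set,
    /// which is a natural place to run one-shot convergence work.
    pub fn mark_replayed(&self, source: Source) -> bool {
        let bit = source as u8;
        let prev = self.replayed.fetch_or(bit, Ordering::AcqRel);
        prev != ALL_SOURCES && (prev | bit) == ALL_SOURCES
    }

    /// Current phase; bits are never cleared, so Converged is sticky.
    pub fn phase(&self) -> Phase {
        if self.replayed.load(Ordering::Acquire) == ALL_SOURCES {
            Phase::Converged
        } else {
            Phase::Recovering
        }
    }
}
```

In this sketch a reader such as the recovery banner would hold a clone of the handle and call `phase()` on each poll, while the aggregator calls `mark_replayed` as each source finishes. An empty bucket can simply mark its source immediately, which corresponds to the short-circuit mentioned above.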
64 lines
1.9 KiB
TOML
[package]
name = "harmony-fleet-e2e"
edition = "2024"
version.workspace = true
license.workspace = true
description = "In-cluster e2e harness for the fleet stack: brings up NATS + agent (and later: callout + operator + mock-OIDC) in a fresh k3d namespace per test run."

# Library + integration tests. No bin. Consumers are the integration
# tests in `tests/` plus future callers (the slimmed-down fleet_e2e_demo).
[lib]
path = "src/lib.rs"

[[test]]
name = "ping"
path = "tests/ping.rs"

[[test]]
name = "operator"
path = "tests/operator.rs"

[[test]]
name = "vm_ping"
path = "tests/vm_ping.rs"

[[test]]
name = "vm_isolation"
path = "tests/vm_isolation.rs"

[[test]]
name = "vm_deploy_lifecycle"
path = "tests/vm_deploy_lifecycle.rs"

[dependencies]
# `kvm` for the VM-side harness (ProvisionVmScore + KvmVirtualMachineHost),
# `podman` for `ReconcileScore`/`PodmanV0Score` the kv_admin helpers
# serialize into the desired-state bucket.
harmony = { path = "../../harmony", features = ["kvm", "podman"] }
harmony-fleet-auth = { path = "../harmony-fleet-auth" }
harmony-fleet-deploy = { path = "../harmony-fleet-deploy" }
harmony-fleet-operator = { path = "../harmony-fleet-operator" }
harmony-reconciler-contracts = { path = "../../harmony-reconciler-contracts" }
harmony_types = { path = "../../harmony_types" }
k3d-rs = { path = "../../k3d" }

anyhow = { workspace = true }
async-nats = { workspace = true }
async-trait = { workspace = true }
futures-util = { workspace = true }
k8s-openapi = { workspace = true }
kube = { workspace = true, features = ["runtime", "derive"] }
serde = { workspace = true }
serde_json = { workspace = true }
tempfile = "3"
thiserror = { workspace = true }
tokio = { workspace = true, features = ["full"] }
tracing = { workspace = true }
tracing-subscriber = { workspace = true }
directories = "6.0.0"
uuid = { version = "1", features = ["v4"] }

[dev-dependencies]
tokio = { workspace = true, features = ["full"] }
chrono = { workspace = true }