RFC: Harmony agent versioning strategy #206


johnride · 2026-01-08T19:22:31Z


While working on the design of the automated failover topology, we hit an interesting question: how do we manage configuration?

So far, one of Harmony's main goals has been to do everything as code, with an IDE, in Rust.

Configuration such as which site is the primary and which is the replica should be managed by Harmony itself, since Harmony deploys both sites and configures everything. This means we want to use code to define which is which.

Then, at runtime, we have to manage updates to the configuration. What if we have to use a new replica site? What if the TTL used to demote the primary in an outage event has to change?

I see two ways to handle this right now:

- The usual way: the configuration is a value stored in the database (NATS) that can be updated and is propagated automatically in real time.
- The everything-as-code way: NATS stores only the version of the deployment binary running on the primary/replica, and the binary itself contains the configuration.
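
To make the second option more concrete, here is a rough sketch of what it could look like. Everything in it is an assumption for illustration: `FailoverConfig`, `BINARY_VERSION`, the site names, the TTL value, and `publish_running_version` do not exist in Harmony today, and a plain `HashMap` stands in for NATS so the sketch does not depend on any client API.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Role a site plays in the failover topology.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Primary,
    Replica,
}

/// Failover configuration compiled into the deployment binary.
/// Hypothetical shape, not an existing Harmony type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FailoverConfig {
    pub primary_site: &'static str,
    pub replica_site: &'static str,
    /// How long the primary may be unreachable before it is demoted.
    pub demote_ttl: Duration,
}

/// Version baked into the binary at build time (placeholder value).
pub const BINARY_VERSION: &str = "1.4.0";

/// The configuration lives in code: changing it means shipping a new binary.
/// Site names and TTL are placeholders.
pub const CONFIG: FailoverConfig = FailoverConfig {
    primary_site: "site-a",
    replica_site: "site-b",
    demote_ttl: Duration::from_secs(30),
};

impl FailoverConfig {
    /// Returns the role of `site`, or `None` if the site is not part of the topology.
    pub fn role_of(&self, site: &str) -> Option<Role> {
        if site == self.primary_site {
            Some(Role::Primary)
        } else if site == self.replica_site {
            Some(Role::Replica)
        } else {
            None
        }
    }
}

/// Stand-in for the NATS store: it maps a site name to the running binary
/// version and holds no configuration values at all.
pub type VersionStore = HashMap<String, String>;

/// Records which binary version is running on `site`.
pub fn publish_running_version(store: &mut VersionStore, site: &str) {
    store.insert(site.to_string(), BINARY_VERSION.to_string());
}
```

With this shape, changing the replica site or the demotion TTL is a code change followed by a new binary version, and the store only ever answers the question "which version is running where?".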

I guess a third way would be a hybrid approach, where some parameters are stored in the database and others in the code.

As of now, I am leaning towards the second approach as the most robust, and really not that hard to maintain. However, to manage this efficiently, I think it would require running a Harmony operator that watches either CRDs or NATS to always keep the Harmony deployment binaries up to date.
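
As a rough idea of what such an operator would do on each pass, here is a minimal sketch of the reconcile step. It assumes the desired version comes from a CRD or a NATS key and the running versions from what each site reports; `Action`, `reconcile`, and `plan` are hypothetical names, and the watch mechanism itself is left out.

```rust
use std::collections::HashMap;

/// What the operator should do for one site (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// The site already runs the desired binary.
    UpToDate,
    /// The site runs a different version and must be moved to the desired one.
    Upgrade { from: String, to: String },
    /// The site reports no running binary yet.
    Install { to: String },
}

/// Compares the desired version with what a single site reports.
pub fn reconcile(desired: &str, running: Option<&str>) -> Action {
    match running {
        Some(v) if v == desired => Action::UpToDate,
        Some(v) => Action::Upgrade {
            from: v.to_string(),
            to: desired.to_string(),
        },
        None => Action::Install {
            to: desired.to_string(),
        },
    }
}

/// Builds the list of actions for every known site, in the order given.
/// `running` maps a site name to the version it reports.
pub fn plan(
    desired: &str,
    sites: &[&str],
    running: &HashMap<String, String>,
) -> Vec<(String, Action)> {
    sites
        .iter()
        .map(|site| {
            let current = running.get(*site).map(String::as_str);
            (site.to_string(), reconcile(desired, current))
        })
        .collect()
}
```

The comparison is plain string equality on purpose: the operator does not need to interpret versions, only to notice that a site is not running the binary it should.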

These are my initial thoughts on the topic; any comments or additional ideas are appreciated!



Reference: NationTech/harmony#206