diff --git a/adr/010-monitoring-and-alerting.md b/adr/010-monitoring-and-alerting.md
new file mode 100644
index 0000000..d5ebb10
--- /dev/null
+++ b/adr/010-monitoring-and-alerting.md
@@ -0,0 +1,46 @@
+# Architecture Decision Record: Monitoring and Alerting
+
+Proposed by: Willem Rolleman
+Date: April 28 2025
+
+## Status
+
+Proposed
+
+## Context
+
+Currently our monitoring and alerting is done using Grafana and Prometheus Alertmanager, deployed via Helm in k8s. We need to implement a monitoring and alerting solution that is managed by Harmony. A decision needs to be made as to how this should be implemented within Harmony.
+
+## Decision
+
+Use the existing HelmScore and pass the Grafana and Prometheus scores to each individual project.
+
+## Rationale
+
+This lets end users opt in to the monitoring and alerting stack for both local and dev/prod projects. Grafana and Prometheus are installed via Helm, which is consistent with OKD, Helm, and other design choices. It also allows reuse of already defined Scores.
+
+## Alternatives considered
+
+- ### Implement alerting and monitoring stack using existing HelmScore for each project
+  - **Pros**:
+    - Each project can choose the monitoring and alerting stack it uses
+    - Less overhead in terms of core Harmony code
+    - Can add `Box::new(grafana::grafanascore(namespace))`
+  - **Cons**:
+    - No default solution implemented
+    - Devs need to choose what they use
+    - Increases complexity of score projects
+
+- ### Use OKD Grafana and Prometheus
+  - **Pros**:
+    - Minimal config to do in Harmony
+  - **Cons**:
+    - Relies on OKD, so it will not work for local testing via k3d
+
+- ### Create a monitoring and alerting crate similar to harmony tui
+  - **Pros**:
+    - Creates a default solution that can be implemented or not depending on user choice
+  - **Cons**:
+    - More complex than using a Helm score
+
+
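Review note: the proposed decision (each project opts in by adding the Grafana and Prometheus scores itself) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Score` trait, the `HelmScore` fields, the helper functions `grafana_score`, `prometheus_score`, and `project_scores`, and the chart names are hypothetical stand-ins, not Harmony's actual API.

```rust
// Hypothetical stand-in for Harmony's Score abstraction; the real trait
// almost certainly has a different shape.
trait Score {
    fn name(&self) -> String;
}

// Assumed shape of a Helm-backed score: a release, a chart, and a namespace.
struct HelmScore {
    release: String,
    chart: String,
    namespace: String,
}

impl Score for HelmScore {
    fn name(&self) -> String {
        format!("{} ({}) in {}", self.release, self.chart, self.namespace)
    }
}

// Hypothetical helper that builds the Grafana score for one namespace.
fn grafana_score(namespace: &str) -> HelmScore {
    HelmScore {
        release: "grafana".to_string(),
        chart: "grafana/grafana".to_string(), // illustrative chart name
        namespace: namespace.to_string(),
    }
}

// Hypothetical helper that builds the Prometheus score for one namespace.
fn prometheus_score(namespace: &str) -> HelmScore {
    HelmScore {
        release: "prometheus".to_string(),
        chart: "prometheus-community/prometheus".to_string(), // illustrative
        namespace: namespace.to_string(),
    }
}

// Each project decides whether to include the monitoring scores in its list.
fn project_scores(namespace: &str, with_monitoring: bool) -> Vec<Box<dyn Score>> {
    let mut scores: Vec<Box<dyn Score>> = Vec::new();
    if with_monitoring {
        scores.push(Box::new(grafana_score(namespace)));
        scores.push(Box::new(prometheus_score(namespace)));
    }
    scores
}
```

Under these assumptions, a project that wants monitoring would call `project_scores("demo", true)` and one that does not would pass `false`. That matches the listed trade-off: no default is imposed, but every project has to wire the scores in itself.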