update adr

This commit is contained in:
Willem 2025-04-28 15:09:11 -04:00
parent 20551b4a80
commit db9c8d83e6

View File

@ -9,15 +9,16 @@ Proposed
## Context
Currently our monitoring and alerting is done using grafana and prometheus alert manager, deployed via helm in k8s. We need to implement a monitoring and alerting solution that is managed by Harmony. A decision needs to be made as to how this should be implemented within Harmony.
A harmony user should be able to initialize a monitoring stack easily, either at the first run of Harmony, or that integrates with existing proects and infra without creating multiple instances of the monitoring stack or overwriting existing alerts/configurations.The user also needs a simple way to configure the stack so that it watches the projects. There should be reasonable defaults configured that are easily customizable for each project
## Decision
use existing HelmScore and pass the scores for grafana and prometheus for each individual project
Create MonitoringStack score that creates a maestro to launch the monitoring stack or not if it is already present.
The MonitoringStack score can be passed to the maestro in the vec! scores list
## Rationale
This will allow the end user to choose to use the monitoring and alerting stack if they choose for both local as well as dev/prod projects. Grafana and Prometheus are installed via helm which is consitent with OKD, helm and other design choices. Allows the use of already defined Scores.
Having the score launch a maestro will allow the user to easily create a new monitoring stack and keeps composants grouped together. The MonitoringScore can handle all the logic for adding alerts, ensuring that the stack is running etc.
## Alerternatives considered
@ -30,6 +31,8 @@ This will allow the end user to choose to use the monitoring and alerting stack
- No default solution implemented
- Dev needs to chose what they use
- Increases complexity of score projects
- Each project will create a new monitoring and alerting instance rather than joining the existing one
- ### Use OKD grafana and prometheus
- **Pros**:
@ -39,8 +42,27 @@ This will allow the end user to choose to use the monitoring and alerting stack
- ### Create a monitoring and alerting crate similar to harmony tui
- **Pros**:
- Creates a default solution that can be implemented or not depending on user choice
- Creates a default solution that can be implemented once by harmony
- can create a join function that will allow a project to connect to the existing solution
- eliminates risk of creating multiple instances of grafana or prometheus
- **Cons**:
- more complex than using a helm score
- management of values files for individual functions becomes more complicated, ie how do you create alerts for one project via helm install that doesnt overwrite the other alerts
- ### Add monitoring to Maestro struct so whether the monitoring stack is used must be defined
- **Pros**:
- less for the user to define
- may be easier to set defaults
- **Cons**:
- feels counterintuitive
- would need to modify the structure of the maestro and how it operates which seems like a bad idea
- unclear how to allow user to pass custom values/configs to the monitoring stack for subsequent projects
- ### Create MonitoringStack score to add to scores vec! which loads a maestro to install stack if not ready or add custom endpoints/alerts to existing stack
- **Pros**:
- Maestro already accepts a list of scores to initialize
- leaving out the monitoring score simply means the user does not want monitoring
- if the monitoring stack is already created, the MonitoringStack score doesn't necessarily need to be added to each project
- composants of the monitoring stack are bundled together and can be expaned or modified from the same place
- **Cons**:
- maybe need to create