# Architecture Decision Record: Monitoring and Alerting

Initial Author: Willem Rolleman

Date: April 28 2025

## Status

Proposed

## Context

A Harmony user should be able to initialize a monitoring stack easily, either on the first run of Harmony or when integrating with existing projects and infrastructure, without creating multiple instances of the monitoring stack or overwriting existing alerts and configurations. The user also needs a simple way to configure the stack so that it watches their projects. Reasonable defaults should be configured, and they should be easy to customize for each project.

## Decision

Create a MonitoringStack score that creates a maestro to launch the monitoring stack, or does nothing if the stack is already present.
The MonitoringStack score can be passed to the maestro in its vec! list of scores.

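The opt-in nature of this decision can be sketched as follows. This is only an illustration under stated assumptions: the `Score` trait, `Maestro`, `AppScore`, and `MonitoringStackScore` below are hypothetical stand-ins written for this example, not Harmony's actual types or signatures.

```rust
// Hypothetical stand-ins for Harmony's types; names and signatures are assumptions.

/// Something the maestro can apply to a target environment.
pub trait Score {
    fn name(&self) -> String;
}

/// Assumed shape of the proposed monitoring score.
pub struct MonitoringStackScore {
    pub namespace: String,
}

impl Score for MonitoringStackScore {
    fn name(&self) -> String {
        format!("MonitoringStack({})", self.namespace)
    }
}

/// Placeholder for any other score a project already registers.
pub struct AppScore;

impl Score for AppScore {
    fn name(&self) -> String {
        "App".to_string()
    }
}

/// Minimal maestro that only records the scores it was given.
pub struct Maestro {
    scores: Vec<Box<dyn Score>>,
}

impl Maestro {
    pub fn new(scores: Vec<Box<dyn Score>>) -> Self {
        Maestro { scores }
    }

    /// Names of the registered scores, in registration order.
    pub fn score_names(&self) -> Vec<String> {
        self.scores.iter().map(|s| s.name()).collect()
    }
}

/// Opting in to monitoring is just one more entry in the scores list.
pub fn with_monitoring() -> Maestro {
    let scores: Vec<Box<dyn Score>> = vec![
        Box::new(AppScore),
        Box::new(MonitoringStackScore {
            namespace: "monitoring".to_string(),
        }),
    ];
    Maestro::new(scores)
}
```

In this sketch, a project that does not want monitoring would simply leave the `MonitoringStackScore` entry out of the list; nothing else about the maestro changes.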
## Rationale

Having the score launch a maestro lets the user easily create a new monitoring stack and keeps its components grouped together. The MonitoringStack score can handle all the logic for adding alerts, ensuring that the stack is running, and so on.

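The "ensure that the stack is running" part of that logic might look roughly like the sketch below. It assumes a hypothetical `Cluster` struct that stands in for whatever the score would really query (for example Kubernetes or Helm); `Outcome` and `ensure_running` are likewise illustrative names, not part of Harmony.

```rust
// Hypothetical model of the "install only if absent" check; not Harmony's API.

/// What the score did when it ran.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Installed,
    AlreadyPresent,
}

/// Stand-in for the target cluster; a real score would query the environment.
#[derive(Default)]
pub struct Cluster {
    monitoring_installed: bool,
    install_count: u32,
}

impl Cluster {
    /// How many times an install was actually performed.
    pub fn install_count(&self) -> u32 {
        self.install_count
    }
}

pub struct MonitoringStackScore;

impl MonitoringStackScore {
    /// Installs the stack on first use and leaves an existing one untouched.
    pub fn ensure_running(&self, cluster: &mut Cluster) -> Outcome {
        if cluster.monitoring_installed {
            return Outcome::AlreadyPresent;
        }
        cluster.monitoring_installed = true;
        cluster.install_count += 1;
        Outcome::Installed
    }
}
```

Running the score twice against the same `Cluster` performs a single install, which is the property that prevents duplicate monitoring instances.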
## Alternatives considered

- ### Implement alerting and monitoring stack using existing HelmScore for each project
  - **Pros**:
    - Each project can choose the monitoring and alerting stack it wants to use
    - Less overhead in terms of core Harmony code
    - Can add Box::new(grafana::grafanascore(namespace))
  - **Cons**:
    - No default solution implemented
    - Devs need to choose what they use
    - Increases complexity of score projects
    - Each project will create a new monitoring and alerting instance rather than joining the existing one

- ### Use OKD Grafana and Prometheus
  - **Pros**:
    - Minimal config to do in Harmony
  - **Cons**:
    - Relies on OKD, so it will not work for local testing via k3d

- ### Create a monitoring and alerting crate similar to harmony tui
  - **Pros**:
    - Creates a default solution that can be implemented once by Harmony
    - Can create a join function that will allow a project to connect to the existing solution
    - Eliminates the risk of creating multiple instances of Grafana or Prometheus
  - **Cons**:
    - More complex than using a Helm score
    - Management of values files for individual functions becomes more complicated, i.e. how do you create alerts for one project via helm install without overwriting the other projects' alerts?

- ### Add monitoring to the Maestro struct, so that whether the monitoring stack is used must always be defined
  - **Pros**:
    - Less for the user to define
    - May be easier to set defaults
  - **Cons**:
    - Feels counterintuitive
    - Would need to modify the structure of the maestro and how it operates, which seems like a bad idea
    - Unclear how to allow the user to pass custom values/configs to the monitoring stack for subsequent projects

- ### Create MonitoringStack score to add to scores vec!, which loads a maestro to install the stack if not ready or add custom endpoints/alerts to the existing stack
  - **Pros**:
    - Maestro already accepts a list of scores to initialize
    - Leaving out the monitoring score simply means the user does not want monitoring
    - If the monitoring stack is already created, the MonitoringStack score doesn't necessarily need to be added to each project
    - Components of the monitoring stack are bundled together and can be expanded or modified from the same place
  - **Cons**:
    - maybe need to create
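
One way to think about adding a project's alerts to an existing stack without overwriting another project's alerts (the concern raised for the crate alternative above) is to key the rules by project. The sketch below is an assumption made for illustration only: this record does not specify how alerts are stored, and `AlertRegistry` and its methods are hypothetical names.

```rust
use std::collections::BTreeMap;

// Hypothetical registry; the ADR does not specify how alerts are stored.

/// Alert rules grouped by the project that owns them.
#[derive(Default)]
pub struct AlertRegistry {
    by_project: BTreeMap<String, Vec<String>>,
}

impl AlertRegistry {
    /// Replaces only the named project's rules; other projects are untouched.
    pub fn set_project_alerts(&mut self, project: &str, rules: Vec<String>) {
        self.by_project.insert(project.to_string(), rules);
    }

    /// Flattens every project's rules into the single list one install would receive,
    /// prefixing each rule with its project name. Projects are visited in sorted order.
    pub fn combined(&self) -> Vec<String> {
        self.by_project
            .iter()
            .flat_map(|(project, rules)| {
                rules.iter().map(move |rule| format!("{}/{}", project, rule))
            })
            .collect()
    }
}
```

With this shape, updating one project's rules rewrites only that project's entry, and the combined list handed to the stack always contains every project's current rules.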