update adr

2025-04-28 15:09:11 -04:00
parent 20551b4a80
commit db9c8d83e6
1 changed files with 26 additions and 4 deletions
--- a/adr/010-monitoring-and-alerting.md
+++ b/adr/010-monitoring-and-alerting.md
@@ -9,15 +9,16 @@ Proposed

 ## Context

-Currently our monitoring and alerting is done using grafana and prometheus alert manager, deployed via helm in k8s. We need to implement a monitoring and alerting solution that is managed by Harmony. A decision needs to be made as to how this should be implemented within Harmony.
+A Harmony user should be able to initialize a monitoring stack easily, either on the first run of Harmony or by integrating with existing projects and infrastructure, without creating multiple instances of the monitoring stack or overwriting existing alerts and configurations. The user also needs a simple way to configure the stack so that it watches their projects. Reasonable defaults should be provided and be easy to customize for each project.

 ## Decision

-use existing HelmScore and pass the scores for grafana and prometheus for each individual project
+Create a MonitoringStack score that creates a maestro to launch the monitoring stack, or skips the launch if the stack is already present.
+The MonitoringStack score can be passed to the maestro in the vec! scores list.

 ## Rationale

-This will allow the end user to choose to use the monitoring and alerting stack if they choose for both local as well as dev/prod projects. Grafana and Prometheus are installed via helm which is consitent with OKD, helm and other design choices. Allows the use of already defined Scores.
+Having the score launch a maestro allows the user to easily create a new monitoring stack and keeps its components grouped together. The MonitoringStack score can handle all the logic for adding alerts, ensuring that the stack is running, etc.

 ## Alerternatives considered

@@ -30,6 +31,8 @@ This will allow the end user to choose to use the monitoring and alerting stack
        - No default solution implemented
        - Dev needs to chose what they use
        - Increases complexity of score projects
+        - Each project will create a new monitoring and alerting instance rather than joining the existing one
+

 - ### Use OKD grafana and prometheus
    - **Pros**:
@@ -39,8 +42,27 @@ This will allow the end user to choose to use the monitoring and alerting stack

 - ### Create a monitoring and alerting crate similar to harmony tui
    - **Pros**:
-        - Creates a default solution that can be implemented or not depending on user choice
+        - Creates a default solution that can be implemented once by Harmony
+        - can create a join function that will allow a project to connect to the existing solution
+        - eliminates risk of creating multiple instances of grafana or prometheus
    - **Cons**:
        - more complex than using a helm score
+        - management of values files for individual functions becomes more complicated, i.e. how do you create alerts for one project via helm install without overwriting the other projects' alerts

+- ### Add monitoring to the Maestro struct, so that whether the monitoring stack is used must be defined
+    - **Pros**:
+        - less for the user to define
+        - may be easier to set defaults 
+    - **Cons**:
+        - feels counterintuitive
+        - would need to modify the structure of the maestro and how it operates, which seems like a bad idea
+        - unclear how to allow user to pass custom values/configs to the monitoring stack for subsequent projects

+- ### Create a MonitoringStack score to add to the scores vec!, which loads a maestro to install the stack if it is not ready, or adds custom endpoints/alerts to the existing stack
+    - **Pros**:
+        - Maestro already accepts a list of scores to initialize
+        - leaving out the monitoring score simply means the user does not want monitoring
+        - if the monitoring stack is already created, the MonitoringStack score doesn't necessarily need to be added to each project
+        - components of the monitoring stack are bundled together and can be expanded or modified from the same place
+    - **Cons**:
+        - maybe need to create
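
The chosen option (a MonitoringStack score that launches a maestro only when the stack is missing, and otherwise just adds the project's alerts) could look roughly like the sketch below. This is a minimal, self-contained illustration and not Harmony's actual API: the `Cluster` state, the `Score` trait signature, the `Maestro` shape, the `InstallComponent` helper, and the project and alert names are all hypothetical stand-ins invented for the example.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for shared infrastructure state (not a Harmony type).
#[derive(Default)]
struct Cluster {
    installed: Vec<String>,   // monitoring components installed so far
    alerts: BTreeSet<String>, // alert rules, namespaced by project
}

/// Hypothetical, simplified Score trait; the real trait may differ.
trait Score {
    fn apply(&self, cluster: &mut Cluster);
}

/// Hypothetical, simplified maestro: runs its scores in order.
struct Maestro {
    scores: Vec<Box<dyn Score>>,
}

impl Maestro {
    fn run(&self, cluster: &mut Cluster) {
        for score in &self.scores {
            score.apply(cluster);
        }
    }
}

/// Hypothetical score that installs one monitoring component.
struct InstallComponent(&'static str);

impl Score for InstallComponent {
    fn apply(&self, cluster: &mut Cluster) {
        cluster.installed.push(self.0.to_string());
    }
}

/// The score proposed in the ADR, reduced to its control flow.
struct MonitoringStack {
    project: String,
    alerts: Vec<String>,
}

impl Score for MonitoringStack {
    fn apply(&self, cluster: &mut Cluster) {
        // Launch an inner maestro only if no stack is present yet.
        if cluster.installed.is_empty() {
            let scores: Vec<Box<dyn Score>> = vec![
                Box::new(InstallComponent("prometheus")),
                Box::new(InstallComponent("grafana")),
            ];
            Maestro { scores }.run(cluster);
        }
        // Always add this project's alerts without touching other projects'.
        for alert in &self.alerts {
            cluster.alerts.insert(format!("{}/{}", self.project, alert));
        }
    }
}

/// Two projects each pass a MonitoringStack score to their own maestro.
fn demo() -> (Vec<String>, Vec<String>) {
    let mut cluster = Cluster::default();
    for (project, alert) in [("project-a", "HighErrorRate"), ("project-b", "PodCrashLoop")] {
        let scores: Vec<Box<dyn Score>> = vec![Box::new(MonitoringStack {
            project: project.to_string(),
            alerts: vec![alert.to_string()],
        })];
        Maestro { scores }.run(&mut cluster);
    }
    (cluster.installed, cluster.alerts.into_iter().collect())
}

fn main() {
    let (installed, alerts) = demo();
    println!("installed: {installed:?}");
    println!("alerts: {alerts:?}");
}
```

In this sketch the second project joins the existing stack instead of installing a new one, and namespacing alerts by project name is one possible way to avoid the overwrite problem raised under the crate alternative.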