# Architecture Decision Record: Monitoring and Alerting

Proposed by: Willem Rolleman
Date: April 28 2025

## Status

Proposed

## Context

A Harmony user should be able to initialize a monitoring stack easily, either on the first run of Harmony or by integrating with existing projects and infrastructure, without creating multiple instances of the monitoring stack or overwriting existing alerts and configurations. The user also needs a simple way to configure the stack so that it watches their projects. Reasonable defaults should be configured, and they should be easy to customize for each project.

## Decision

Create a MonitoringStack score that creates a maestro to launch the monitoring stack, or does nothing if the stack is already present.
The MonitoringStack score can be passed to the maestro in the `vec!` scores list.

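To make the decision concrete, here is a minimal sketch of the idea rather than Harmony's actual API. `Cluster`, `Outcome`, `Component`, and the simplified `Score` and `Maestro` shapes are all hypothetical stand-ins introduced for illustration. The sketch assumes a score can start an inner maestro for the stack's components and that applying it a second time changes nothing.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the target cluster: the set of installed components.
#[derive(Default)]
pub struct Cluster {
    pub installed: BTreeSet<String>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Installed,
    AlreadyPresent,
}

/// Simplified score trait, assumed for illustration only.
pub trait Score {
    fn name(&self) -> String;
    fn apply(&self, cluster: &mut Cluster) -> Outcome;
}

/// Simplified maestro: holds a list of scores and applies them in order.
pub struct Maestro {
    scores: Vec<Box<dyn Score>>,
}

impl Maestro {
    pub fn new(scores: Vec<Box<dyn Score>>) -> Self {
        Self { scores }
    }

    pub fn run(&self, cluster: &mut Cluster) -> Vec<(String, Outcome)> {
        self.scores
            .iter()
            .map(|score| (score.name(), score.apply(cluster)))
            .collect()
    }
}

/// One component of the stack (hypothetical; for example Prometheus or Grafana).
pub struct Component(pub &'static str);

impl Score for Component {
    fn name(&self) -> String {
        self.0.to_string()
    }

    fn apply(&self, cluster: &mut Cluster) -> Outcome {
        // `insert` returns false when the component was already recorded.
        if cluster.installed.insert(self.0.to_string()) {
            Outcome::Installed
        } else {
            Outcome::AlreadyPresent
        }
    }
}

/// The score from the decision: it launches an inner maestro for the stack.
pub struct MonitoringStack;

impl Score for MonitoringStack {
    fn name(&self) -> String {
        "MonitoringStack".to_string()
    }

    fn apply(&self, cluster: &mut Cluster) -> Outcome {
        let components: Vec<Box<dyn Score>> = vec![
            Box::new(Component("prometheus")),
            Box::new(Component("grafana")),
        ];
        let results = Maestro::new(components).run(cluster);
        // Report `Installed` if at least one component was missing.
        if results.iter().any(|(_, outcome)| *outcome == Outcome::Installed) {
            Outcome::Installed
        } else {
            Outcome::AlreadyPresent
        }
    }
}
```

In this sketch a project opts in by adding `Box::new(MonitoringStack)` to the scores list it hands to `Maestro::new`, and opts out by leaving it off. Running the same maestro twice against the same `Cluster` reports `Installed` the first time and `AlreadyPresent` the second.
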
## Rationale

Having the score launch a maestro allows the user to easily create a new monitoring stack and keeps components grouped together. The MonitoringStack score can handle all the logic for adding alerts, ensuring that the stack is running, etc.

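One way the alert-handling logic could avoid overwriting existing alerts is to key each rule by project and rule name and only insert keys that are missing. The following is a sketch under that assumption. `AlertRule`, `AlertRegistry`, and the example rule contents are hypothetical and do not come from Harmony or any monitoring tool.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical alert rule: a name plus an opaque expression string.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AlertRule {
    pub name: String,
    pub expr: String,
}

/// Hypothetical registry of alerts shared by every project on one stack.
#[derive(Default)]
pub struct AlertRegistry {
    // Keyed by (project, rule name) so projects cannot clobber each other.
    rules: BTreeMap<(String, String), AlertRule>,
}

impl AlertRegistry {
    /// Adds `rules` for `project`, leaving rules that already exist untouched.
    /// Returns how many rules were newly added.
    pub fn add_alerts(&mut self, project: &str, rules: Vec<AlertRule>) -> usize {
        let mut added = 0;
        for rule in rules {
            let key = (project.to_string(), rule.name.clone());
            if let Entry::Vacant(slot) = self.rules.entry(key) {
                slot.insert(rule);
                added += 1;
            }
        }
        added
    }

    /// Lists the rules registered for one project, ordered by rule name.
    pub fn alerts_for(&self, project: &str) -> Vec<&AlertRule> {
        self.rules
            .iter()
            .filter(|(key, _)| key.0.as_str() == project)
            .map(|(_, rule)| rule)
            .collect()
    }
}
```

With this shape, a second project adding a rule named like one from the first project gets its own entry, and re-adding a rule under the same project and name is a no-op. Whether existing rules should instead be updated in place is a design choice this sketch does not settle.
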
## Alternatives considered

- ### Implement alerting and monitoring stack using existing HelmScore for each project
    - **Pros**:
        - Each project can choose its own monitoring and alerting stack
        - Less overhead in terms of core Harmony code
        - Can add `Box::new(grafana::grafanascore(namespace))`
    - **Cons**:
        - No default solution implemented
        - Developers need to choose what they use
        - Increases the complexity of score projects
        - Each project will create a new monitoring and alerting instance rather than joining the existing one

- ### Use OKD Grafana and Prometheus
    - **Pros**:
        - Minimal config to do in Harmony
    - **Cons**:
        - Relies on OKD, so it will not work for local testing via k3d

- ### Create a monitoring and alerting crate similar to harmony tui
    - **Pros**:
        - Creates a default solution that can be implemented once by Harmony
        - Can create a join function that will allow a project to connect to the existing solution
        - Eliminates the risk of creating multiple instances of Grafana or Prometheus
    - **Cons**:
        - More complex than using a helm score
        - Management of values files for individual functions becomes more complicated, i.e. how do you create alerts for one project via `helm install` without overwriting the other alerts?

- ### Add monitoring to the Maestro struct, so that whether the monitoring stack is used must be defined
    - **Pros**:
        - Less for the user to define
        - May be easier to set defaults
    - **Cons**:
        - Feels counterintuitive
        - Would need to modify the structure of the maestro and how it operates, which seems like a bad idea
        - Unclear how to allow the user to pass custom values/configs to the monitoring stack for subsequent projects

- ### Create a MonitoringStack score to add to the scores `vec!`, which loads a maestro to install the stack if it is not ready, or adds custom endpoints/alerts to the existing stack
    - **Pros**:
        - Maestro already accepts a list of scores to initialize
        - Leaving out the monitoring score simply means the user does not want monitoring
        - If the monitoring stack is already created, the MonitoringStack score doesn't necessarily need to be added to each project
        - Components of the monitoring stack are bundled together and can be expanded or modified from the same place
    - **Cons**:
        - maybe need to create