feat: Initial setup for monitoring and alerting #48

Merged
wjro merged 7 commits from feat/monitor into master 2025-06-03 18:17:17 +00:00
Owner
No description provided.
wjro added 1 commit 2025-05-30 17:26:54 +00:00
feat: Initial setup for monitoring and alerting
Some checks failed
Run Check Script / check (push) Failing after 43s
Run Check Script / check (pull_request) Failing after 45s
b631e8ccbb
wjro added 1 commit 2025-05-30 17:59:46 +00:00
fix:cargo fmt
All checks were successful
Run Check Script / check (push) Successful in 1m45s
Run Check Script / check (pull_request) Successful in 1m45s
7e3f1b1830
johnride requested changes 2025-06-02 12:54:30 +00:00
@@ -265,1 +266,4 @@
}
#[async_trait]
impl NotificationAdapterDeployer for K8sAnywhereTopology {
Owner

I feel like this is not a correct abstraction. NotificationAdapter has a responsibility and Deployer has another. This is a strong smell of broken Single Responsibility principle, probably interface segregation too.

I feel like this is not a correct abstraction. NotificationAdapter has a responsibility and Deployer has another. This is a strong smell of broken Single Responsibility principle, probably interface segregation too.
@@ -0,0 +3,4 @@
pub mod monitoring;
pub mod notification_adapter_deployer;
pub enum MonitoringBackendType {
Owner

Violates OCP : we want to allow external contributors to provide new backend implementations (that is, if we want to introduce the concept of monitoring backend right now or wait until the needs arises?)

Violates OCP : we want to allow external contributors to provide new backend implementations (that is, if we want to introduce the concept of monitoring backend right now or wait until the needs arises?)
@@ -0,0 +31,4 @@
monitor_config: &MonitorConfig,
) -> Result<Outcome, InterpretError>;
async fn ensure_alert_channel_dependencies(
Owner

SRP problem

SRP problem
@@ -0,0 +50,4 @@
}
}
pub trait AlertChannelConfig: Debug + DynClone + Send + Sync {
Owner

I think we can solve the dependency issue by introducing a T here :

pub trait AlertChannelConfig<T> {
 // only alertChannel related stuff, be careful about SRP here, no dependency management at this level other than the T itself
}

impl <T: DiscordWebhookSender> AlertChannelConfig<T> for DiscordWebhookConfig { // The dependency on the DiscordWebHookSender is declared here
 // implement functions
}

pub trait Monitor<T> {
 config: Vec<Box<dyn AlertChannelConfig<T>>> // The T from alertChannelConfig here will force the Monitor to provide a T that implements DiscordWebhookSender
}

impl DiscordWebhookSender for K8sAnywhereTopology { // Satisfy here the trait bound on the DiscordWehbookConfig 
 // implement functions that make sure DiscordAlertManager is installed, use the same pattern as for k8sClient that autoinstalls k3d if no k8s cluster is available
}
I think we can solve the dependency issue by introducing a T here : ```rust pub trait AlertChannelConfig<T> { // only alertChannel related stuff, be careful about SRP here, no dependency management at this level other than the T itself } impl <T: DiscordWebhookSender> AlertChannelConfig<T> for DiscordWebhookConfig { // The dependency on the DiscordWebHookSender is declared here // implement functions } pub trait Monitor<T> { config: Vec<Box<dyn AlertChannelConfig<T>>> // The T from alertChannelConfig here will force the Monitor to provide a T that implements DiscordWebhookSender } impl DiscordWebhookSender for K8sAnywhereTopology { // Satisfy here the trait bound on the DiscordWehbookConfig // implement functions that make sure DiscordAlertManager is installed, use the same pattern as for k8sClient that autoinstalls k3d if no k8s cluster is available } ```
wjro added 1 commit 2025-06-02 15:42:55 +00:00
wip: modified initial monitoring architecture based on pr review
Some checks failed
Run Check Script / check (push) Failing after 46s
Run Check Script / check (pull_request) Failing after 43s
691540fe64
wjro added 1 commit 2025-06-02 15:47:38 +00:00
fix: modified files in mod
All checks were successful
Run Check Script / check (push) Successful in 1m48s
Run Check Script / check (pull_request) Successful in 1m46s
56dc1e93c1
johnride requested changes 2025-06-02 17:30:26 +00:00
@@ -0,0 +9,4 @@
use crate::{interpret::Outcome, topology::Topology};
#[async_trait]
pub trait Monitor<T: Topology>: Debug + Send + Sync {
Owner

A bit of rustdoc would be relevant here to explain the intended use of this trait.

A bit of rustdoc would be relevant here to explain the intended use of this trait.
johnride marked this conversation as resolved
@@ -0,0 +24,4 @@
}
#[async_trait]
pub trait AlertChannelConfig<T>: Debug + DynClone + Send + Sync {
Owner

I think this trait is correct but the way it is implemented feels wrong. I understand that you currently only have the discord implementation and you interact directly with it. But I feel like there will be a use case where we will want to have simple getter and setters for common attributes of AlertChannelConfig. I don't see a 100% clear use case right now though.

Food for thought.

Which makes me think : is this trait required at all? Probably not. Each AlertChannel implementation will have its own config and either the Monitor or AlertChannel trait will provide standardization for identification and compilation safety. But it is probably not necessary to try to expose a config specific trait.

I think this trait is correct but the way it is implemented feels wrong. I understand that you currently only have the discord implementation and you interact directly with it. But I feel like there will be a use case where we will want to have simple getter and setters for common attributes of AlertChannelConfig. I don't see a 100% clear use case right now though. Food for thought. Which makes me think : is this trait required at all? Probably not. Each AlertChannel implementation will have its own config and either the Monitor or AlertChannel trait will provide standardization for identification and compilation safety. But it is probably not necessary to try to expose a **config** specific trait.
johnride marked this conversation as resolved
@@ -0,0 +27,4 @@
pub trait AlertChannelConfig<T>: Debug + DynClone + Send + Sync {
fn channel_identifier(&self) -> String;
fn webhook_url(&self) -> Option<Url>;
Owner

I don't think webhook url deserves to be in this abstract trait. It is possible that the AlertChannel does not have a webhook, such as SMS, phone calls, SMTP.

I don't think webhook url deserves to be in this abstract trait. It is possible that the AlertChannel does not have a webhook, such as SMS, phone calls, SMTP.
johnride marked this conversation as resolved
@@ -0,0 +26,4 @@
}
#[async_trait]
impl<T: Topology + DiscordWebhookSender> AlertChannelConfig<T> for DiscordWebhookConfig {
Owner

You don't have to specify Topology here. I think it's better to keep the trait bindings to the minimum. DiscordWebhookSender should be enough.

You don't have to specify Topology here. I think it's better to keep the trait bindings to the minimum. DiscordWebhookSender should be enough.
johnride marked this conversation as resolved
@@ -0,0 +40,4 @@
}
#[async_trait]
impl DiscordWebhookSender for K8sAnywhereTopology {
Owner

Nice 👍

We probably will have to add a mechanism for Topologies so that dependencies can register their installation mechanisms themselves to make sure they are initialized properly when maestro initializes the topology. But that should not be too hard and is out of scope of this p-r.

Nice 👍 We probably will have to add a mechanism for Topologies so that dependencies can register their installation mechanisms themselves to make sure they are initialized properly when maestro initializes the topology. But that should not be too hard and is out of scope of this p-r.
johnride marked this conversation as resolved
wjro added 1 commit 2025-06-02 18:45:01 +00:00
wip: applied comments in pr, changed naming of AlertChannel to AlertReceiver and added rust doc to Monitor for clarity
All checks were successful
Run Check Script / check (push) Successful in 1m49s
Run Check Script / check (pull_request) Successful in 1m47s
0d56fbc09d
wjro added 1 commit 2025-06-02 20:11:52 +00:00
wip: removed AlertReceiverConfig
Some checks failed
Run Check Script / check (push) Failing after 44s
Run Check Script / check (pull_request) Failing after 44s
a2be9457b9
wjro added 1 commit 2025-06-02 20:21:02 +00:00
fix: cargo fmt
All checks were successful
Run Check Script / check (push) Successful in 1m47s
Run Check Script / check (pull_request) Successful in 1m47s
12eb4ae31f
johnride approved these changes 2025-06-02 20:59:15 +00:00
@@ -0,0 +10,4 @@
/// Represents an entity responsible for collecting and organizing observability data
/// from various telemetry sources
/// A `Monitor` abstracts the logic required to scrape, aggregate, and structure
/// monitoring data, enabling consistent processing regardless of the underlying data source.
Owner

This rustdoc is pretty good to explain the abstraction, but giving a concrete example such as Prometheus would make a complete understanding much easier. Even I still have a bit of trouble to figure it out as-is.

This rustdoc is pretty good to explain the abstraction, but giving a concrete example such as Prometheus would make a complete understanding much easier. Even I still have a bit of trouble to figure it out as-is.
wjro merged commit 31e59937dc into master 2025-06-03 18:17:17 +00:00
wjro deleted branch feat/monitor 2025-06-03 18:17:18 +00:00
Sign in to join this conversation.
No Reviewers
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: NationTech/harmony#48
No description provided.