
# Writing a Score

A `Score` declares _what_ you want to achieve. It is decoupled from _how_ it is achieved — that logic lives in an `Interpret`.

## The Pattern

A Score consists of two parts:

1. **A struct** — holds the configuration for your desired state
2. **A `Score<T>` implementation** — returns an `Interpret` that knows how to execute

An `Interpret` contains the actual execution logic and connects your Score to the capabilities exposed by a `Topology`.

## Example: A Simple Score

Here's a simplified version of `NtfyScore` from the `ntfy` module:

```rust
use async_trait::async_trait;
use harmony::{
    interpret::{Interpret, InterpretError, Outcome},
    inventory::Inventory,
    score::Score,
    topology::{HelmCommand, K8sclient, Topology},
};

/// MyScore declares "I want to install the ntfy server"
#[derive(Debug, Clone)]
pub struct MyScore {
    pub namespace: String,
    pub host: String,
}

impl<T: Topology + HelmCommand + K8sclient> Score<T> for MyScore {
    fn create_interpret(&self) -> Box<dyn Interpret<T>> {
        Box::new(MyInterpret { score: self.clone() })
    }

    fn name(&self) -> String {
        "ntfy [MyScore]".into()
    }
}

/// MyInterpret knows _how_ to install ntfy using the Topology's capabilities
#[derive(Debug)]
pub struct MyInterpret {
    pub score: MyScore,
}

#[async_trait]
impl<T: Topology + HelmCommand + K8sclient> Interpret<T> for MyInterpret {
    async fn execute(
        &self,
        _inventory: &Inventory, // unused in this simplified example
        topology: &T,
    ) -> Result<Outcome, InterpretError> {
        // 1. Get a Kubernetes client from the Topology
        let client = topology.k8s_client().await?;

        // 2. Use Helm to install the ntfy chart
        // (via topology's HelmCommand capability)

        // 3. Wait for the deployment to be ready
        client
            .wait_until_deployment_ready("ntfy", Some(&self.score.namespace), None)
            .await?;

        Ok(Outcome::success("ntfy installed".to_string()))
    }
}
```

## The Compile-Time Safety Check

The `Score<T>` trait is generic over `T: Topology`, and your `impl` block adds the capability bounds your Interpret needs. The compiler therefore only lets your Score run on Topologies that expose every one of those capabilities:

```rust
// This only compiles if K8sAnywhereTopology (or any T)
// implements HelmCommand and K8sclient
impl<T: Topology + HelmCommand + K8sclient> Score<T> for MyScore { /* ... */ }
```

If you try to run this Score against a Topology that doesn't expose `HelmCommand`, you get a compile error — before any code runs.
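To see the check in isolation, here is a minimal, dependency-free sketch. The traits `Topology`, `HelmCommand`, and `K8sclient` below are simplified stand-ins declared locally, not Harmony's real definitions, and `FullTopology`, `BareTopology`, `describe_install`, and `install_on_full` are hypothetical names used only for illustration.

```rust
// Simplified stand-ins for illustration only; these are NOT Harmony's real traits.
trait Topology {
    fn name(&self) -> &'static str;
}
trait HelmCommand {}
trait K8sclient {}

/// Hypothetical topology that exposes both capabilities.
struct FullTopology;
impl Topology for FullTopology {
    fn name(&self) -> &'static str {
        "full"
    }
}
impl HelmCommand for FullTopology {}
impl K8sclient for FullTopology {}

/// Hypothetical topology that exposes no capabilities at all.
#[allow(dead_code)]
struct BareTopology;
impl Topology for BareTopology {
    fn name(&self) -> &'static str {
        "bare"
    }
}

/// Accepts only topologies that satisfy every bound,
/// mirroring the bounds on `impl<T: ...> Score<T> for MyScore`.
fn describe_install<T: Topology + HelmCommand + K8sclient>(topology: &T) -> String {
    format!("installing on {}", topology.name())
}

fn install_on_full() -> String {
    // Compiles: FullTopology implements all three traits.
    describe_install(&FullTopology)
    // `describe_install(&BareTopology)` would be rejected by the compiler (E0277),
    // because BareTopology implements neither HelmCommand nor K8sclient.
}
```

The rejected call never reaches a binary, which is the property the trait bounds are meant to give you.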

## Using Your Score

Once defined, your Score integrates with the Harmony CLI:

```rust
use harmony::{
    inventory::Inventory,
    topology::K8sAnywhereTopology,
};

#[tokio::main]
async fn main() {
    let my_score = MyScore {
        namespace: "monitoring".to_string(),
        host: "ntfy.example.com".to_string(),
    };

    harmony_cli::run(
        Inventory::autoload(),
        K8sAnywhereTopology::from_env(),
        vec![Box::new(my_score)],
        None,
    )
    .await
    .unwrap();
}
```

## Key Patterns

### Composing Scores

Scores can include other Scores via features:

```rust
// `app` is the application definition created earlier; each feature gets its own clone.
let app_score = ApplicationScore {
    features: vec![
        Box::new(PackagingDeployment { application: app.clone() }),
        Box::new(Monitoring { application: app.clone(), alert_receiver: vec![] }),
    ],
    application: app,
};
```

### Reusing Interpret Logic

Many Scores delegate to shared `Interpret` implementations. For example, `HelmChartScore` provides a reusable Interpret for any Helm-based deployment. Your Score can wrap it:

```rust
impl<T: Topology + HelmCommand> Score<T> for MyScore {
    fn create_interpret(&self) -> Box<dyn Interpret<T>> {
        Box::new(HelmChartInterpret { /* your config */ })
    }

    // `name()` omitted for brevity
}
```

### Accessing Topology Capabilities

Your Interpret accesses infrastructure through Capabilities exposed by the Topology:

```rust
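// Inside `execute`, call capability methods on the topology value
let k8s_client = topology.k8s_client().await?;
let helm = topology.get_helm_command();

// Those methods are available because of the Capability bounds on the impl
#[async_trait]
impl<T: Topology + K8sclient> Interpret<T> for MyInterpret {
    async fn execute(
        &self,
        _inventory: &Inventory,
        topology: &T,
    ) -> Result<Outcome, InterpretError> {
        let client = topology.k8s_client().await?;
        // use client...
        Ok(Outcome::success("done".to_string()))
    }
}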
```

## Design Principles

### Capabilities are industry concepts, not tools

A capability trait must represent a **standard infrastructure need** that could be fulfilled by multiple tools. The developer who writes a Score should not need to know which product provides the capability.

Good capabilities include `DnsServer`, `LoadBalancer`, `DhcpServer`, `CertificateManagement`, and `Router`. These are industry-standard concepts: OPNsense provides `DnsServer` via Unbound, while a future topology could provide it via CoreDNS or AWS Route53. The Score doesn't care.

The one exception is when the developer fundamentally needs to know the implementation: `PostgreSQL` is a capability (not `Database`) because the developer writes PostgreSQL-specific SQL, replication configs, and connection strings. Swapping it for MariaDB would break the application, not just the infrastructure.

**Test:** If you could swap the underlying tool without breaking any Score that uses the capability, you've drawn the boundary correctly. If swapping would require rewriting Scores, the capability is too tool-specific.
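The swap test can be sketched without any dependencies. Everything below is an assumption made for illustration: the `DnsServer` trait here is a one-method stand-in rather than Harmony's real capability, and `UnboundTopology`, `CoreDnsTopology`, and `expose_service` are hypothetical names.

```rust
// Simplified stand-in; Harmony's real `DnsServer` capability may look quite different.
trait DnsServer {
    /// Registers an A record and reports which backend handled it.
    fn register_a_record(&self, host: &str, ip: &str) -> String;
}

/// Hypothetical topology whose DNS is served by Unbound.
struct UnboundTopology;

/// Hypothetical topology whose DNS is served by CoreDNS.
struct CoreDnsTopology;

impl DnsServer for UnboundTopology {
    fn register_a_record(&self, host: &str, ip: &str) -> String {
        format!("unbound: {} -> {}", host, ip)
    }
}

impl DnsServer for CoreDnsTopology {
    fn register_a_record(&self, host: &str, ip: &str) -> String {
        format!("coredns: {} -> {}", host, ip)
    }
}

/// The "Score" side: written once against the capability, never against a product.
fn expose_service<T: DnsServer>(topology: &T) -> String {
    topology.register_a_record("app.example.com", "10.0.0.5")
}
```

`expose_service` compiles unchanged for both topologies, so the boundary passes the test. If it had to mention Unbound-specific settings, the capability would be too tool-specific.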

### One Score per concern, one capability per concern

A Score should express a single infrastructure intent. A capability should expose a single infrastructure concept.

If you're building a deployment that combines multiple concerns (e.g., "deploy Zitadel" requires PostgreSQL + Helm + K8s + Ingress), the Score **declares all of them as trait bounds** and the Topology provides them:

```rust
impl<T: Topology + K8sclient + HelmCommand + PostgreSQL> Score<T> for ZitadelScore { /* ... */ }
```

If you're building a tool that provides multiple capabilities (e.g., OpenBao provides secret storage, KV versioning, JWT auth, policy management), each capability should be a **separate trait** that can be implemented independently. This way, a Score that only needs secret storage doesn't pull in JWT auth machinery.
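As a rough sketch of that split, the example below uses two hypothetical capability traits, `SecretStore` and `JwtAuth`. The trait names, their methods, and both topology types are assumptions for illustration and do not come from Harmony or OpenBao.

```rust
use std::collections::HashMap;

// Hypothetical capability traits; names and methods are illustrative assumptions.
trait SecretStore {
    fn put_secret(&mut self, key: &str, value: &str);
    fn get_secret(&self, key: &str) -> Option<String>;
}

trait JwtAuth {
    fn issue_token(&self, subject: &str) -> String;
}

/// Stand-in for a full-featured tool that provides both capabilities.
#[derive(Default)]
struct VaultLikeTopology {
    secrets: HashMap<String, String>,
}

impl SecretStore for VaultLikeTopology {
    fn put_secret(&mut self, key: &str, value: &str) {
        self.secrets.insert(key.to_string(), value.to_string());
    }
    fn get_secret(&self, key: &str) -> Option<String> {
        self.secrets.get(key).cloned()
    }
}

impl JwtAuth for VaultLikeTopology {
    fn issue_token(&self, subject: &str) -> String {
        format!("token-for-{}", subject)
    }
}

/// Stand-in for a minimal backend that only stores secrets.
#[derive(Default)]
struct InMemoryTopology {
    secrets: HashMap<String, String>,
}

impl SecretStore for InMemoryTopology {
    fn put_secret(&mut self, key: &str, value: &str) {
        self.secrets.insert(key.to_string(), value.to_string());
    }
    fn get_secret(&self, key: &str) -> Option<String> {
        self.secrets.get(key).cloned()
    }
}

/// Needs secret storage only, so it works with either topology.
fn store_db_password<T: SecretStore>(topology: &mut T) -> Option<String> {
    topology.put_secret("db/password", "s3cr3t");
    topology.get_secret("db/password")
}

/// Needs JWT auth, so only a topology implementing `JwtAuth` qualifies.
fn issue_service_token<T: JwtAuth>(topology: &T) -> String {
    topology.issue_token("ci")
}
```

Because the traits are separate, `InMemoryTopology` can satisfy `store_db_password` without implementing any token logic.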

### Scores encapsulate operational complexity

The value of a Score is turning tribal knowledge into compiled, type-checked infrastructure. The `ZitadelScore` knows that you need to:

1. Create a namespace
2. Deploy a PostgreSQL cluster via CNPG
3. Wait for the cluster to be ready
4. Create a masterkey secret
5. Generate a secure admin password
6. Detect the K8s distribution
7. Build distribution-specific Helm values
8. Deploy the chart

A developer using it writes:

```rust
let zitadel = ZitadelScore { host: "sso.example.com".to_string(), ..Default::default() };
```

Move procedural complexity into opinionated Scores. This makes them easy to test against various topologies (k3d, OpenShift, kubeadm, bare metal) and easy to compose in high-level examples.

### Scores must be idempotent

Running a Score twice should produce the same result as running it once. Use create-or-update semantics, check for existing state before acting, and handle "already exists" responses gracefully.
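A minimal sketch of create-or-update semantics, using an in-memory map in place of a real cluster. `FakeCluster`, `Change`, and `ensure_namespace` are hypothetical names; a real Interpret would make the same decision against the Kubernetes API or another backend.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for cluster state: namespace name -> owner label.
#[derive(Default)]
struct FakeCluster {
    namespaces: HashMap<String, String>,
}

/// What a single run actually had to do.
#[derive(Debug, PartialEq)]
enum Change {
    Created,
    Updated,
    Unchanged,
}

/// Create-or-update: inspect existing state first and write only when needed.
fn ensure_namespace(cluster: &mut FakeCluster, name: &str, owner: &str) -> Change {
    // Copy the current value out so the map can be mutated in the match arms.
    let existing: Option<String> = cluster.namespaces.get(name).cloned();
    match existing.as_deref() {
        // Already in the desired state: do nothing.
        Some(current) if current == owner => Change::Unchanged,
        // Exists but differs: converge it to the desired state.
        Some(_) => {
            cluster.namespaces.insert(name.to_string(), owner.to_string());
            Change::Updated
        }
        // Missing: create it.
        None => {
            cluster.namespaces.insert(name.to_string(), owner.to_string());
            Change::Created
        }
    }
}
```

Calling `ensure_namespace` a second time with the same arguments reports `Change::Unchanged` and leaves the state untouched, which is the behaviour an idempotent Score should have.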

### Scores must not depend on other Scores running first

A Score declares its capability requirements via trait bounds. It does **not** assume that another Score has run before it. If your Score needs PostgreSQL, it declares `T: PostgreSQL` and lets the Topology handle whether PostgreSQL needs to be installed first.

If you find yourself writing "run Score A, then run Score B", consider whether Score B should declare the capability that Score A provides, or whether both should be orchestrated by a higher-level Score that composes them.
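The higher-level option might look roughly like the sketch below. `MiniScore` is a deliberately tiny stand-in for a Score, not Harmony's `Score<T>` trait, and `DatabaseScore`, `AppScore`, `StackScore`, and `full_stack` are hypothetical names.

```rust
/// Simplified stand-in for a Score: a named unit of desired state.
/// This is NOT Harmony's real `Score<T>` trait.
trait MiniScore {
    fn name(&self) -> String;
    /// Applies the desired state and records what was done.
    fn apply(&self, log: &mut Vec<String>) {
        log.push(format!("{}: applied", self.name()));
    }
}

struct DatabaseScore;
struct AppScore;

impl MiniScore for DatabaseScore {
    fn name(&self) -> String {
        "database".into()
    }
}

impl MiniScore for AppScore {
    fn name(&self) -> String {
        "app".into()
    }
}

/// Hypothetical higher-level Score that owns the ordering of its children,
/// so callers never have to remember "run A, then run B".
struct StackScore {
    children: Vec<Box<dyn MiniScore>>,
}

impl MiniScore for StackScore {
    fn name(&self) -> String {
        "stack".into()
    }
    fn apply(&self, log: &mut Vec<String>) {
        // Children run in the order the composing Score declares.
        for child in &self.children {
            child.apply(log);
        }
        log.push(format!("{}: done", self.name()));
    }
}

fn full_stack() -> StackScore {
    StackScore {
        children: vec![Box::new(DatabaseScore), Box::new(AppScore)],
    }
}
```

The ordering lives in one place, `full_stack`, instead of in operator instructions, and each child Score stays usable on its own.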

## Best Practices

- **Keep Scores focused** — one Score per concern (deployment, monitoring, networking)
- **Use `..Default::default()`** for optional fields so callers only need to specify what they care about
- **Return `Outcome`** — use `Outcome::success`, `Outcome::failure`, or `Outcome::success_with_details` to communicate results clearly
- **Handle errors gracefully** — return meaningful `InterpretError` messages that help operators debug issues
- **Design capabilities around the developer's need** — not around the tool that fulfills it. Ask: "what is the core need that leads a developer to use this tool?"
- **Don't name capabilities after tools** — `SecretVault` not `OpenbaoStore`, `IdentityProvider` not `ZitadelAuth`
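
Two of these practices, defaults for optional fields and meaningful error messages, can be sketched with plain Rust. `ExampleScore`, its fields, `score_for`, and `validate` are hypothetical, and a plain `String` stands in for `InterpretError` to keep the example dependency-free.

```rust
/// Hypothetical Score configuration; the field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct ExampleScore {
    host: String,
    namespace: String,
    replicas: u32,
}

impl Default for ExampleScore {
    fn default() -> Self {
        Self {
            host: "localhost".to_string(),
            namespace: "default".to_string(),
            replicas: 1,
        }
    }
}

/// Callers set only what they care about; struct update syntax fills in the rest.
fn score_for(host: &str) -> ExampleScore {
    ExampleScore {
        host: host.to_string(),
        ..Default::default()
    }
}

/// An error message that names the Score, the field, and the offending value.
fn validate(score: &ExampleScore) -> Result<(), String> {
    if score.replicas == 0 {
        return Err(format!(
            "ExampleScore for host '{}': replicas must be at least 1, got {}",
            score.host, score.replicas
        ));
    }
    Ok(())
}
```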