# Running Evaluations

> Operational guide for launching, monitoring, and rerunning evaluations

## Run Workflow

```mermaid
flowchart LR
    A[Open Experiment] --> B[Run Evaluation]
    B --> C[Select Model]
    C --> D[Select Traits]
    D --> E[Select Datasets]
    E --> F[Submit]
    F --> G[Monitor Status]
    G --> H[Inspect Scores]
```

## Before You Run

* Confirm that the model target is valid for your environment.
* Confirm that the selected traits align with the decision you need to make.
* Use a focused dataset scope for faster feedback loops.
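
The checks above can be sketched as a small preflight function. This is a minimal illustration, not the product's API: `RunConfig`, `preflight`, `available_models`, and the `max_datasets` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RunConfig:
    """Hypothetical stand-in for the choices made in the run workflow."""
    model: str
    traits: list = field(default_factory=list)
    datasets: list = field(default_factory=list)


def preflight(config, available_models, max_datasets=3):
    """Return a list of problems; an empty list means the run looks ready.

    `max_datasets` is an assumed threshold for a "focused" scope.
    """
    problems = []
    if config.model not in available_models:
        problems.append(f"model '{config.model}' is not available in this environment")
    if not config.traits:
        problems.append("no traits selected")
    if not config.datasets:
        problems.append("no datasets selected")
    elif len(config.datasets) > max_datasets:
        problems.append(
            f"{len(config.datasets)} datasets selected; "
            f"consider {max_datasets} or fewer for faster feedback"
        )
    return problems
```

Returning a list of problems rather than raising on the first one lets you fix every configuration issue in a single pass before submitting.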

## During Execution

* Watch the status and timestamps in the experiment detail view.
* Track the total number of evaluations and the cumulative duration.
* Note failed runs and rerun them after correcting the configuration.
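
If you keep your own record of runs alongside the experiment view, the figures above can be tallied as in the following sketch. The `RunRecord` fields and the status strings (`"completed"`, `"failed"`, `"running"`) are assumptions made for illustration, not the platform's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical local record of one evaluation run."""
    name: str
    status: str  # assumed values: "completed", "failed", "running"
    started_at: datetime
    finished_at: Optional[datetime] = None  # None while the run is in progress


def summarize(runs):
    """Tally total runs, cumulative duration of finished runs, and failures."""
    finished = [r for r in runs if r.finished_at is not None]
    duration = sum((r.finished_at - r.started_at for r in finished), timedelta())
    failed = [r.name for r in runs if r.status == "failed"]
    return {
        "total": len(runs),
        "cumulative_duration": duration,
        "failed": failed,
    }
```

The `failed` list gives you the runs to revisit once the configuration has been corrected.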

## After Execution

1. Check the benchmark summary for the aggregate score and duration.
2. Review the current metrics for trait-level performance.
3. Open the dataset detail page for leaderboard and explorer evidence.
4. Export the results if review or compliance requires an artifact.

## Rerun Strategies

| Strategy              | When to Use          | Benefit                      |
| --------------------- | -------------------- | ---------------------------- |
| **Same config rerun** | Validate consistency | Detect noisy results         |
| **Model swap**        | Compare candidates   | Faster selection             |
| **Trait subset run**  | Isolate regressions  | Focused debugging            |
| **Dataset expansion** | Increase confidence  | Better generalization signal |
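
Each strategy in the table changes one part of a baseline configuration and leaves the rest untouched. The sketch below expresses that idea with an immutable config object; `RunConfig`, the helper names, and `score_spread` are hypothetical and do not correspond to a documented API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical, immutable description of one evaluation run."""
    model: str
    traits: tuple
    datasets: tuple


def same_config_rerun(base):
    """Identical copy, used to check that results are consistent."""
    return replace(base)


def model_swap(base, model):
    """Same traits and datasets, different candidate model."""
    return replace(base, model=model)


def trait_subset(base, keep):
    """Keep only the traits under investigation."""
    subset = tuple(t for t in base.traits if t in keep)
    if not subset:
        raise ValueError("trait subset is empty")
    return replace(base, traits=subset)


def dataset_expansion(base, extra):
    """Add datasets without duplicating ones already selected."""
    merged = base.datasets + tuple(d for d in extra if d not in base.datasets)
    return replace(base, datasets=merged)


def score_spread(scores):
    """Range of aggregate scores across same-config reruns.

    A large spread suggests noisy results; what counts as "large"
    is a judgment call for your use case.
    """
    return max(scores) - min(scores)
```

Because every helper starts from the same frozen baseline, any difference in scores can be attributed to the single field that changed.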

## Best Practices

<Check>
  Keep experiment names outcome-focused (for example: "May release quality gate").
</Check>

<Check>
  Use tags to separate baseline, canary, and production candidates.
</Check>

<Check>
  Store notes on why a rerun was triggered.
</Check>
