# Evaluations Reference

> Reference for evaluation objects, views, and common fields

## Object Relationships

```mermaid
flowchart TB
    T[Trait] --> D[Dataset]
    D --> X[Evaluation Detail]
    X --> L[Leaderboard]
    X --> E[Explorer]
    D --> R[Run]
    R --> EX[Experiment]
```
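
The relationships in the diagram can also be read as a small directed graph. The following is a minimal sketch, not part of the product: the edges are copied from the diagram, while the `RELATIONSHIPS` mapping and the `downstream` helper are hypothetical names introduced only for illustration.

```python
from collections import deque

# Edges copied from the diagram above; the dictionary layout is illustrative only.
RELATIONSHIPS = {
    "Trait": ["Dataset"],
    "Dataset": ["Evaluation Detail", "Run"],
    "Evaluation Detail": ["Leaderboard", "Explorer"],
    "Run": ["Experiment"],
}


def downstream(start: str) -> list[str]:
    """Return every object reachable from `start`, in breadth-first order."""
    seen: list[str] = []
    queue = deque(RELATIONSHIPS.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.append(node)
            # Follow the outgoing arrows of the node just visited.
            queue.extend(RELATIONSHIPS.get(node, []))
    return seen


print(downstream("Dataset"))
```

Starting from `Dataset`, the traversal reaches both the Evaluation Detail views and the Run and Experiment objects, which mirrors the two branches in the diagram.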

## Key Views

### Evaluations Hub

| Element        | Description                                          |
| -------------- | ---------------------------------------------------- |
| Search bar     | Filters datasets by name/query text                  |
| Trait filters  | Multi-select capability/domain filters               |
| Dataset cards  | Show modalities, traits, description, metadata links |
| Result counter | Shows filtered count vs total                        |
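
How the search bar, trait filters, and result counter could combine is sketched below. This is an assumption-laden illustration rather than the hub's actual logic: `DatasetCard`, `filter_datasets`, and `result_counter` are hypothetical names, the sample datasets are made up, and the sketch assumes a card matches when it carries at least one of the selected traits.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetCard:
    """Hypothetical stand-in for a dataset card shown in the hub."""
    name: str
    traits: set[str] = field(default_factory=set)


def filter_datasets(cards, query="", selected_traits=()):
    """Keep cards whose name contains the query and that match the trait filter."""
    needle = query.strip().lower()
    wanted = set(selected_traits)
    matches = []
    for card in cards:
        text_ok = not needle or needle in card.name.lower()
        # Assumption: multi-select means "any selected trait", not "all of them".
        trait_ok = not wanted or bool(wanted & card.traits)
        if text_ok and trait_ok:
            matches.append(card)
    return matches


def result_counter(filtered, total):
    """Format the filtered count against the total count."""
    return f"{len(filtered)} of {len(total)}"


# Made-up example data.
cards = [
    DatasetCard("math-word-problems", {"reasoning"}),
    DatasetCard("code-completion", {"coding"}),
    DatasetCard("math-proofs", {"reasoning", "coding"}),
]
shown = filter_datasets(cards, query="math", selected_traits=["coding"])
print([c.name for c in shown], result_counter(shown, cards))
```

An empty query or an empty trait selection leaves that filter inactive, so the counter then reports the full total.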

### Evaluation Detail

| Tab                  | What It Shows                                     |
| -------------------- | ------------------------------------------------- |
| Details              | Dataset description, context, and expectations    |
| Leaderboard          | Ranked model results for the dataset              |
| Evaluations Explorer | Sample-level prompt, response, and metric details |
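
As a rough illustration of what "ranked model results" means, the sketch below orders models by score. It is not the product's ranking logic: `rank_models` is a hypothetical helper, the model names and scores are made up, and it assumes that a higher score is better and that ties are broken alphabetically by model name.

```python
def rank_models(scores: dict[str, float]) -> list[tuple[int, str, float]]:
    """Return (position, model, score) rows, best score first."""
    # Assumption: higher is better; ties fall back to alphabetical model name.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(pos, model, score) for pos, (model, score) in enumerate(ordered, start=1)]


# Made-up scores for a single dataset.
for row in rank_models({"model-a": 0.71, "model-b": 0.84, "model-c": 0.71}):
    print(row)
```

For metrics where lower is better (for example an error rate), the sort key would need to be inverted.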

### Experiments

| Field        | Meaning                                |
| ------------ | -------------------------------------- |
| Name         | Experiment identifier                  |
| Models       | Model(s) included in runs              |
| Traits       | Traits selected in runs                |
| Status       | Current state of experiment activities |
| Tags         | Operator-defined grouping labels       |
| Created date | Experiment creation timestamp          |

## Common Run States

* `Running`
* `Completed`
* `Failed`
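
Client code that consumes these states might model them as an enumeration. The sketch below is illustrative only: `RunState` and `is_finished` are hypothetical names, and treating `Completed` and `Failed` as terminal states is an assumption. Because the list above covers common states, the platform may report others, in which case the lookup raises `ValueError`.

```python
from enum import Enum


class RunState(Enum):
    """The three run states listed above, keyed by their display labels."""
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Assumption: a run that has completed or failed will not change state again.
TERMINAL_STATES = {RunState.COMPLETED, RunState.FAILED}


def is_finished(label: str) -> bool:
    """Return True if the labelled state is terminal; raise ValueError if unknown."""
    return RunState(label) in TERMINAL_STATES


print(is_finished("Running"), is_finished("Failed"))
```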

## Common Metadata and Signals

* Modalities (text, image, video, actions, embeddings)
* Trait associations
* Estimated input/output tokens
* Run timing and duration
* Trait-level benchmark score
