> ## Documentation Index
> Fetch the complete documentation index at: https://docs.budecosystem.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Introduction to Evaluations

> Benchmark model quality with curated datasets, traits, and experiment workflows

## Overview

Bud AI Foundry Evaluations help teams measure model quality before rollout. You can benchmark Bud-hosted and cloud-connected models using curated datasets, trait filters, and repeatable experiment runs.

```mermaid theme={null}
flowchart LR
    A[Evaluations Hub] --> B[Review Dataset + Traits]
    B --> C[Create Experiment]
    C --> D[Run Evaluation]
    D --> E[Review Scores + Explorer]
    E --> F[Promote or Iterate]
```

## What You Can Do

**Discover Benchmarks** Use Evaluations Hub search and trait filters to find relevant datasets quickly.

**Run Structured Experiments** Group one or more runs under an experiment, then compare model results over time.

**Inspect Results Deeply** Use Details, Leaderboard and Evaluations Explorer tabs to move from summary scores to row-level evidence.

**Export for Governance** Export evaluation outcomes as CSV for audits, reviews, and reporting.

## Core Screens

| Screen                | Purpose                                                     |
| --------------------- | ----------------------------------------------------------- |
| **Evaluations Hub**   | Browse datasets with modalities, traits, and metadata links |
| **Evaluation Detail** | Inspect details, leaderboard and sample-level explorer      |
| **Experiments**       | Track evaluation studies with status, tags, and timestamps  |
| **Experiment Detail** | Review run history, benchmark details, and current metrics  |

<img src="https://mintcdn.com/budecosystem-b7b14df4/FOwUHwUgDoHMCLof/images/image-52.png?fit=max&auto=format&n=FOwUHwUgDoHMCLof&q=85&s=af4f2cfeecd6c83edd6a78b8c2ffef59" alt="Image" width="1920" height="877" data-path="images/image-52.png" />

## Evaluation Lifecycle

1. Discover a dataset in **Evaluations Hub**.
2. Create or open an **Experiment**.
3. Launch a **Run Evaluation** workflow.
4. Monitor run status and per-trait scores.
5. Compare outcomes and rerun with updated configuration.

## Next Steps

<CardGroup cols={3}>
  <Card title="Quick Start" icon="play" href="/evaluations/quickstart">
    Run your first evaluation in minutes
  </Card>

  <Card title="Evaluation Concepts" icon="book" href="/evaluations/evaluation-concepts">
    Learn entities, states, and score interpretation
  </Card>

  <Card title="First Evaluation" icon="graduation-cap" href="/evaluations/creating-first-evaluation">
    Follow a full end-to-end walkthrough
  </Card>
</CardGroup>
