# Model Comparison Workflows

> Design fair and repeatable side-by-side model evaluations

## Overview

Use this guide to compare models fairly and produce decisions that are easy to defend with product and engineering teams.

```mermaid
flowchart TD
    A[Define Task + Rubric] --> B[Prepare Prompt Set]
    B --> C[Run Prompt Set on Model A/B/C]
    C --> D[Score Outputs]
    D --> E[Review Latency + Consistency]
    E --> F[Select Preferred Model]
```

## Best-Practice Comparison Setup

1. Use the same prompt set for all models.
2. Keep parameter settings identical across models unless you are intentionally comparing each model's defaults.
3. Score outputs with a fixed rubric.
4. Run each test at least twice so you can check that results are consistent.
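
The setup steps above can be sketched as a small harness. This is a minimal illustration, not a prescribed implementation: `ModelFn`, `RunRecord`, `run_comparison`, `summarize`, the stand-in models, and the parameter names are all hypothetical, and exact-match output comparison is only one (strict) way to check consistency.

```python
import time
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Callable, Dict, List

# Hypothetical interface: a "model" is any callable taking (prompt, params) -> text.
ModelFn = Callable[[str, dict], str]


@dataclass
class RunRecord:
    model: str
    prompt: str
    repeat: int
    output: str
    latency_s: float


def run_comparison(models: Dict[str, ModelFn], prompts: List[str],
                   params: dict, repeats: int = 2) -> List[RunRecord]:
    """Run the same prompts with the same params on every model, `repeats` times."""
    records = []
    for name, model_fn in models.items():
        for prompt in prompts:
            for repeat in range(repeats):
                start = time.perf_counter()
                output = model_fn(prompt, dict(params))  # copy so models cannot mutate shared params
                latency = time.perf_counter() - start
                records.append(RunRecord(name, prompt, repeat, output, latency))
    return records


def summarize(records: List[RunRecord]) -> Dict[str, dict]:
    """Per model: median latency and the fraction of prompts whose repeats matched exactly."""
    latencies = defaultdict(list)
    outputs = defaultdict(set)  # (model, prompt) -> distinct outputs seen across repeats
    for rec in records:
        latencies[rec.model].append(rec.latency_s)
        outputs[(rec.model, rec.prompt)].add(rec.output)
    summary = {}
    for model, values in latencies.items():
        per_prompt = [outs for (m, _), outs in outputs.items() if m == model]
        stable = sum(1 for outs in per_prompt if len(outs) == 1)
        summary[model] = {
            "median_latency_s": median(values),
            "stable_prompt_fraction": stable / len(per_prompt),
        }
    return summary


# Stand-in models for illustration only; replace with real client calls.
def model_a(prompt: str, params: dict) -> str:
    return prompt.upper()


def model_b(prompt: str, params: dict) -> str:
    return prompt[::-1]


prompts = ["Summarize the release notes.", "List three risks of the migration."]
params = {"temperature": 0.0, "max_tokens": 256}  # assumed parameter names
records = run_comparison({"model-a": model_a, "model-b": model_b}, prompts, params)
summary = summarize(records)
```

Keeping every run as a record makes it straightforward to score outputs later with the rubric and to rerun the same prompt set when a new model is added.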

## Suggested Rubric

| Criterion             | Question                                         |
| --------------------- | ------------------------------------------------ |
| Correctness           | Is the response factually and logically correct? |
| Instruction Following | Did it obey format and constraints?              |
| Tone                  | Does it match the desired style/voice?           |
| Latency               | Is response speed acceptable for UX goals?       |
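
One way to keep the rubric fixed is to encode it and combine per-criterion ratings into a single number. The sketch below assumes a 1 to 5 rating scale and optional weights; the scale, the weights, and the `rubric_score` helper are illustrative assumptions, not part of the rubric itself.

```python
from typing import Dict, Optional

# The four criteria from the table above, as fixed keys.
CRITERIA = ("correctness", "instruction_following", "tone", "latency")


def rubric_score(scores: Dict[str, int],
                 weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of 1-5 ratings; criteria without a weight default to 1.0."""
    weights = weights or {}
    weighted_sum = 0.0
    total_weight = 0.0
    for criterion in CRITERIA:
        if criterion not in scores:
            raise ValueError(f"missing score for {criterion!r}")
        value = scores[criterion]
        if not 1 <= value <= 5:
            raise ValueError(f"{criterion!r} must be rated 1-5, got {value}")
        weight = weights.get(criterion, 1.0)
        weighted_sum += weight * value
        total_weight += weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return weighted_sum / total_weight


# Made-up ratings for a single output, for illustration only.
example = {"correctness": 5, "instruction_following": 4, "tone": 3, "latency": 4}
equal_weight = rubric_score(example)                                    # (5+4+3+4)/4
correctness_heavy = rubric_score(example, {"correctness": 2.0})         # (10+4+3+4)/5
```

If you use weights, decide on them before running the comparison so they cannot be tuned toward a preferred model.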

## Anti-Patterns to Avoid

* Comparing different prompts across models.
* Changing multiple parameters at once.
* Choosing a model solely because its output is the most verbose.
* Ignoring latency when UX is real-time.
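
The "multiple parameters at once" anti-pattern can be caught mechanically before a run. A minimal sketch, assuming run configurations are plain dictionaries; `changed_params` and the parameter names are hypothetical.

```python
_MISSING = object()  # sentinel so a key present in only one config counts as a change


def changed_params(baseline: dict, variant: dict) -> list:
    """Return the sorted names of parameters that differ between two run configs."""
    keys = set(baseline) | set(variant)
    return sorted(
        key for key in keys
        if baseline.get(key, _MISSING) != variant.get(key, _MISSING)
    )


baseline = {"temperature": 0.2, "max_tokens": 256, "top_p": 1.0}
one_change = {"temperature": 0.7, "max_tokens": 256, "top_p": 1.0}
two_changes = {"temperature": 0.7, "max_tokens": 256, "top_p": 0.9}

diff_ok = changed_params(baseline, one_change)     # ["temperature"]
diff_bad = changed_params(baseline, two_changes)   # ["temperature", "top_p"]

# A guard like this can run before each experiment to keep results attributable.
is_single_change = len(diff_ok) <= 1
```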

## Deliverable Template

Conclude with:

* Winning model
* Why it won (rubric summary)
* Trade-offs and fallback option
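
If the conclusion is generated from comparison results, a small formatter keeps the deliverable consistent across evaluations. This is a sketch only: `render_summary` is a hypothetical helper, and the model names and scores below are placeholders, not real results.

```python
from typing import Dict, List


def render_summary(winner: str, rubric_summary: Dict[str, float],
                   trade_offs: List[str], fallback: str) -> str:
    """Render the three deliverable items as a Markdown bullet list."""
    lines = [f"* Winning model: {winner}", "* Why it won (rubric summary):"]
    for criterion, score in rubric_summary.items():
        lines.append(f"    * {criterion}: {score:.1f}")
    lines.append("* Trade-offs and fallback option:")
    for item in trade_offs:
        lines.append(f"    * {item}")
    lines.append(f"    * Fallback: {fallback}")
    return "\n".join(lines)


# Placeholder values for illustration only.
report = render_summary(
    winner="model-a",
    rubric_summary={"Correctness": 4.5, "Instruction Following": 4.0,
                    "Tone": 3.5, "Latency": 4.0},
    trade_offs=["Slower than model-b on long prompts"],
    fallback="model-b",
)
print(report)
```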
