

Use this guide to quickly diagnose issues while running evaluation experiments.

Quick Triage Flow

Work through the sections below in order: dataset discovery → run launch → result interpretation → experiment management. If none of them resolves the issue, use the escalation checklist at the end.

Dataset Discovery Issues

No datasets shown in Evaluations Hub

Possible causes
  • Search query is too restrictive.
  • Trait filters exclude all results.
Fixes
  1. Clear search input.
  2. Remove all trait filters.
  3. Reapply filters one by one.
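
Reapplying filters one by one amounts to finding the first filter that empties the result set. A minimal sketch of that idea, using made-up dataset and trait names (the real Evaluations Hub applies these filters in the UI):

```python
# Hypothetical datasets with trait tags; names are illustrative only.
datasets = [
    {"name": "qa-bench", "traits": {"accuracy", "robustness"}},
    {"name": "summ-eval", "traits": {"faithfulness"}},
]

def first_empty_filter(datasets, trait_filters):
    """Apply trait filters cumulatively; return the first trait that
    leaves zero matching datasets, or None if every filter is fine."""
    remaining = datasets
    for trait in trait_filters:
        remaining = [d for d in remaining if trait in d["traits"]]
        if not remaining:
            return trait
    return None

print(first_empty_filter(datasets, ["accuracy", "faithfulness"]))
# prints: faithfulness  (no dataset carries both traits)
```

The trait returned is the filter to relax first.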

Run Launch Issues

Clicking Run Evaluation does not complete a run

Possible causes
  • Required model or dataset selection is missing.
  • Selected configuration is invalid for the chosen scope.
Fixes
  1. Reopen run form and verify all selections.
  2. Start with one trait and one dataset.
  3. Retry with a known-good model target.
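
Verifying the run form is essentially a completeness check on three selections. A sketch of that check, with hypothetical field names ("model", "traits", "datasets") that are not the actual run-form schema:

```python
def validate_run_config(config):
    """Return a list of missing selections; an empty list means the
    form has everything a run needs."""
    problems = []
    for field in ("model", "traits", "datasets"):
        if not config.get(field):
            problems.append(f"missing {field}")
    return problems

print(validate_run_config({"model": "my-model", "traits": [], "datasets": ["qa-bench"]}))
# prints: ['missing traits']
```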

Result Interpretation Issues

Leaderboard has no useful comparison

Possible causes
  • Too few completed runs.
  • Models were evaluated on different scopes.
Fixes
  1. Rerun candidates on the same traits/datasets.
  2. Keep all comparisons in one experiment.
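
Before comparing runs on a leaderboard, it helps to confirm they actually share a scope. A minimal sketch, assuming each run record exposes its traits and datasets (field names are illustrative):

```python
def same_scope(runs):
    """True only if every run was evaluated on the same traits and datasets."""
    scopes = {(frozenset(r["traits"]), frozenset(r["datasets"])) for r in runs}
    return len(scopes) <= 1

runs = [
    {"traits": ["accuracy"], "datasets": ["qa-bench"]},
    {"traits": ["accuracy"], "datasets": ["summ-eval"]},
]
print(same_scope(runs))  # prints: False — these runs are not comparable
```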

Explorer data appears inconsistent with the score

Possible causes
  • Sampling differences across runs.
  • The score is an aggregate, while Explorer shows row-level results.
Fixes
  1. Review multiple rows, not a single sample.
  2. Rerun to confirm consistency.
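
Why a single Explorer row can look "inconsistent" with the score: the score is a mean over all rows, so individual rows far from it are normal. A toy illustration with invented per-row scores:

```python
# Hypothetical per-row scores as they might appear in Explorer.
row_scores = [0.9, 0.2, 0.8, 0.9]

aggregate = sum(row_scores) / len(row_scores)
print(round(aggregate, 3))  # prints: 0.7
```

An aggregate of 0.7 coexists with a row scoring 0.2; judging from one sampled row is misleading, which is why step 1 says to review multiple rows.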

Experiment Management Issues

Hard to locate the right experiment

Fixes
  • Use standardized tags and naming.
  • Sort by creation date and filter by status/model.

Too many failed runs

Fixes
  • Reduce scope (fewer traits/datasets) to isolate failure.
  • Rerun incrementally after each configuration change.
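
Reducing scope to isolate a failure can be thought of as trying each (trait, dataset) pair as its own minimal run. A hypothetical sketch where `run_eval` stands in for launching a real run (here it is a stub that fails on one pair):

```python
def isolate_failures(traits, datasets, run_eval):
    """Try every (trait, dataset) pair as its own minimal run and
    collect the pairs that fail."""
    return [(t, d) for t in traits for d in datasets if not run_eval(t, d)]

def run_eval(trait, dataset):  # stub: pretend exactly one pair fails
    return not (trait == "robustness" and dataset == "summ-eval")

print(isolate_failures(["accuracy", "robustness"], ["qa-bench", "summ-eval"], run_eval))
# prints: [('robustness', 'summ-eval')]
```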

Escalation Checklist

Before escalating internally, collect:
  • Experiment name and run timestamp.
  • Model, traits, and datasets selected.
  • Observed status and screenshots of key tabs.
  • Whether issue reproduces after rerun.
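
The checklist above can be bundled into a single payload to attach alongside the screenshots. A sketch under the assumption that plain JSON is acceptable; field names are illustrative, not a required format:

```python
import json
from datetime import datetime, timezone

def escalation_bundle(experiment, model, traits, datasets, status, reproduces):
    """Collect the escalation-checklist fields into one JSON document."""
    return json.dumps({
        "experiment": experiment,
        "run_timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "traits": traits,
        "datasets": datasets,
        "observed_status": status,
        "reproduces_after_rerun": reproduces,
    }, indent=2)

print(escalation_bundle("exp-1", "my-model", ["accuracy"], ["qa-bench"], "failed", True))
```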