# Introduction to Clusters

> Register, operate, and govern clusters for Bud workloads

## Overview

The **Clusters** module is Bud's infrastructure control plane for model-serving and evaluation workloads. It gives platform teams a single place to onboard clusters, track health, manage node capacity, and apply runtime defaults.

Whether you run GPU-heavy production inference or mixed CPU/GPU environments, Clusters helps you keep operations reliable and auditable.

<img src="https://mintcdn.com/budecosystem-b7b14df4/VWVW0RGNFnJu1JHC/images/image-29.png?fit=max&auto=format&n=VWVW0RGNFnJu1JHC&q=85&s=525feb7bbe120add2cda589331b923ee" alt="Image" width="1920" height="877" data-path="images/image-29.png" />

## Why Clusters Matter

**Unified infrastructure visibility:** Track capacity, status, and deployments across all connected clusters.

**Safe cluster lifecycle operations:** Add, edit, and remove clusters with guardrails for active deployments.

**Hardware-aware planning:** Understand CPU/GPU/HPU/TPU availability, worker utilization, and scaling readiness.

**Operational defaults at cluster level:** Configure storage classes and access modes once and reuse them across deployments.
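
To make the idea of a guardrail for active deployments concrete, here is a minimal Python sketch of how a removal check might behave. It is not Bud's implementation or API: the `Cluster` class, the `remove_cluster` function, the `ActiveDeploymentsError` exception, and the in-memory `registry` dictionary are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class ActiveDeploymentsError(RuntimeError):
    """Raised when a cluster still hosts deployments (hypothetical error type)."""


@dataclass
class Cluster:
    # Hypothetical model: a cluster name plus the deployments running on it.
    name: str
    active_deployments: List[str] = field(default_factory=list)


def remove_cluster(registry: Dict[str, Cluster], name: str) -> None:
    """Remove a cluster only if nothing is currently deployed on it."""
    cluster = registry[name]
    if cluster.active_deployments:
        # Guardrail: refuse to remove a cluster that is still serving workloads.
        raise ActiveDeploymentsError(
            f"{name} has {len(cluster.active_deployments)} active deployment(s)"
        )
    del registry[name]
```

In this sketch, the caller must first drain or move deployments before the removal succeeds. How the product actually surfaces this check is not specified here.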

## Cluster Lifecycle in Bud

```mermaid theme={null}
flowchart LR
    A[Register or Connect Cluster] --> B[Validate Health and Inventory]
    B --> C[Run Deployments]
    C --> D[Monitor Nodes and Metrics]
    D --> E[Tune Settings and Storage Defaults]
    E --> F[Scale, Maintain, or Decommission]
```
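
The flow above is linear, so one way to reason about it is as an ordered sequence of stages. The Python sketch below mirrors the diagram's six steps. The `Stage` enum and `next_stage` helper are illustrative names only, not part of any Bud SDK.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    # Values copy the labels from the lifecycle diagram.
    REGISTER = "Register or Connect Cluster"
    VALIDATE = "Validate Health and Inventory"
    DEPLOY = "Run Deployments"
    MONITOR = "Monitor Nodes and Metrics"
    TUNE = "Tune Settings and Storage Defaults"
    SCALE_OR_RETIRE = "Scale, Maintain, or Decommission"


# Enum members iterate in definition order, which matches the diagram.
ORDER = list(Stage)


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None at the end of the flow."""
    index = ORDER.index(stage)
    return ORDER[index + 1] if index + 1 < len(ORDER) else None
```

For example, `next_stage(Stage.REGISTER)` yields `Stage.VALIDATE`, reflecting that a newly connected cluster is validated before it runs deployments.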

## Core Areas in the Clusters Module

| Area                | What you can do                                            |
| ------------------- | ---------------------------------------------------------- |
| **Cluster List**    | View all clusters, hardware profile, endpoints, and status |
| **General Tab**     | Review node and resource summaries with utilization trends |
| **Deployments Tab** | Inspect deployments running on the selected cluster        |
| **Nodes Tab**       | Analyze per-node status, capacity, and events              |
| **Analytics Tab**   | Review cluster-level metrics and usage views               |
| **Settings Tab**    | Configure default storage class and access mode            |
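
The Settings Tab row describes cluster-level defaults for storage class and access mode. The following Python sketch shows one way such defaults could be resolved for a deployment. It rests on assumptions not stated on this page: that a deployment may optionally override a default, and that access modes use Kubernetes-style names such as `ReadWriteOnce`. The `StorageSettings` class, the `resolve_storage` function, and the `fast-ssd` storage class name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageSettings:
    # None means "not set at this level" (hypothetical convention).
    storage_class: Optional[str] = None
    access_mode: Optional[str] = None


def resolve_storage(
    cluster_defaults: StorageSettings,
    deployment_override: Optional[StorageSettings] = None,
) -> StorageSettings:
    """Use deployment values where given, otherwise fall back to cluster defaults."""
    override = deployment_override or StorageSettings()
    return StorageSettings(
        storage_class=override.storage_class or cluster_defaults.storage_class,
        access_mode=override.access_mode or cluster_defaults.access_mode,
    )


# Assumed example values: configure once at the cluster level, reuse everywhere.
defaults = StorageSettings(storage_class="fast-ssd", access_mode="ReadWriteOnce")
shared = resolve_storage(defaults, StorageSettings(access_mode="ReadWriteMany"))
```

Here `shared` keeps the cluster's storage class but swaps in its own access mode, while a deployment with no override simply inherits both defaults.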

## Who Uses This Module

* **Platform / Infra teams** managing capacity and reliability.
* **MLOps teams** validating where models should run.
* **Security and governance leads** auditing cluster operations and permissions.

## Getting Started

<CardGroup cols={3}>
  <Card icon="play" href="/clusters/quickstart" title="Quick Start">
    Register your first cluster and verify readiness
  </Card>

  <Card icon="book" href="/clusters/cluster-concepts" title="Cluster Concepts">
    Learn module structure, tabs, and lifecycle concepts
  </Card>

  <Card icon="graduation-cap" href="/clusters/creating-first-cluster" title="Step-by-Step Tutorial">
    Walk through creation, validation, and operations
  </Card>
</CardGroup>
