
What is a Cluster in Bud?

A cluster is a registered Kubernetes environment where Bud can schedule and run inference workloads, evaluations, and supporting services. Clusters expose compute inventory (CPU/GPU/HPU/TPU), endpoint capacity, and runtime settings for deployments.

Cluster Lifecycle

Cluster Types

  • Cloud-managed clusters from supported providers.
  • Self-managed / existing Kubernetes clusters connected through configuration and ingress details.
  • Mixed hardware clusters with CPU and accelerator resources.

Detail Tabs and Their Purpose

Tab          Purpose
General      View resource summaries, node counts, and high-level utilization
Deployments  Track endpoints and model workloads running on this cluster
Nodes        Inspect node-level status, allocatable resources, and event history
Analytics    Analyze broader utilization and operational KPIs
Settings     Define default storage class and access mode for deployments
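
To illustrate how the Settings defaults might be applied at deployment time, here is a minimal sketch. The `ClusterSettings` class and `resolve_storage` function are hypothetical names invented for this example and are not part of Bud's API; only the four access-mode strings are standard Kubernetes values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Standard Kubernetes PersistentVolume access modes.
K8S_ACCESS_MODES = {
    "ReadWriteOnce",
    "ReadOnlyMany",
    "ReadWriteMany",
    "ReadWriteOncePod",
}


@dataclass(frozen=True)
class ClusterSettings:
    """Hypothetical shape for the values held in a cluster's Settings tab."""

    default_storage_class: str
    default_access_mode: str

    def __post_init__(self) -> None:
        # Reject unknown access modes when the settings are saved.
        if self.default_access_mode not in K8S_ACCESS_MODES:
            raise ValueError(f"unknown access mode: {self.default_access_mode}")


def resolve_storage(
    settings: ClusterSettings,
    storage_class: Optional[str] = None,
    access_mode: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (storage_class, access_mode), falling back to cluster defaults."""
    mode = access_mode or settings.default_access_mode
    if mode not in K8S_ACCESS_MODES:
        raise ValueError(f"unknown access mode: {mode}")
    return (storage_class or settings.default_storage_class, mode)
```

A deployment that specifies nothing inherits both defaults, while an explicit value overrides only the field it sets. Validating the access mode in both places means a typo is caught before anything reaches the cluster.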

Governance and Permissions

Cluster actions align with role-based access control:
  • cluster:view for read-only operations.
  • cluster:manage for add/edit/delete and settings changes.
Every critical action (registration, updates, deletion) should be traceable through audit and activity trails.
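
The permission model above can be sketched as a lookup from action to required permission, paired with an audit record for each attempt. The permission strings `cluster:view` and `cluster:manage` come from this page; the action names, `is_allowed`, and `audit_entry` are assumptions made for illustration and do not describe Bud's actual implementation. The sketch also does not assume that `cluster:manage` implies `cluster:view`, since this page does not say so.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import AbstractSet, Dict


class ClusterAction(Enum):
    # Hypothetical action names, for illustration only.
    VIEW = "view"
    REGISTER = "register"
    EDIT = "edit"
    DELETE = "delete"
    UPDATE_SETTINGS = "update_settings"


# Read-only actions need cluster:view; everything else needs cluster:manage.
REQUIRED_PERMISSION: Dict[ClusterAction, str] = {
    ClusterAction.VIEW: "cluster:view",
    ClusterAction.REGISTER: "cluster:manage",
    ClusterAction.EDIT: "cluster:manage",
    ClusterAction.DELETE: "cluster:manage",
    ClusterAction.UPDATE_SETTINGS: "cluster:manage",
}


def is_allowed(granted: AbstractSet[str], action: ClusterAction) -> bool:
    """Check whether the granted permissions cover the requested action."""
    return REQUIRED_PERMISSION[action] in granted


def audit_entry(user: str, action: ClusterAction, cluster_id: str, allowed: bool) -> dict:
    """Build a record of the attempt so the action stays traceable."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action.value,
        "cluster_id": cluster_id,
        "allowed": allowed,
    }
```

Recording denied attempts as well as successful ones keeps the trail useful for investigating access problems, not only for reconstructing changes.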

Capacity and Reliability Principles

  • Prioritize healthy clusters before scheduling new workloads.
  • Use node-level events to diagnose instability quickly.
  • Apply storage defaults to reduce deployment-time misconfiguration.
  • Avoid deleting clusters with active endpoints unless migration is complete.
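
Two of these principles, scheduling only on healthy clusters and guarding deletion, can be expressed as simple checks. The following is a sketch under stated assumptions: `ClusterState`, its fields, and both functions are hypothetical and stand in for whatever health and endpoint data Bud actually exposes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical snapshot of a cluster, for illustration only."""

    name: str
    healthy: bool
    active_endpoints: int


def schedulable(clusters: List[ClusterState]) -> List[ClusterState]:
    """Keep only healthy clusters as candidates for new workloads."""
    return [c for c in clusters if c.healthy]


def can_delete(cluster: ClusterState, migration_complete: bool) -> bool:
    """Allow deletion when no endpoints are active or migration is finished."""
    return cluster.active_endpoints == 0 or migration_complete
```

For example, a cluster with three active endpoints is blocked from deletion until `migration_complete` is true, while an idle cluster can be removed at any time.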

Conceptual Data Flow

Next Steps

  • Quick Start: Register a cluster and verify readiness.
  • Cluster Tabs Reference: Review each cluster tab and key actions.