Overview
Use this guide to diagnose common problems in cluster onboarding, health monitoring, and settings management.
Troubleshooting Decision Tree
Cluster does not appear after onboarding
Possible causes
- Incomplete onboarding form data.
- Invalid kube configuration or provider credentials.
- Backend workflow failed during registration.
Resolution steps
- Re-run onboarding with validated configuration inputs.
- Confirm ingress and API connectivity.
- Check platform logs for onboarding workflow errors.
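
Before re-running onboarding, it can help to sanity-check the kube configuration locally. The following is a minimal sketch, not the platform's own validation: it assumes the kubeconfig has already been parsed into a dictionary (for example with a YAML parser), and the function name `validate_kubeconfig` is hypothetical. It only checks the standard kubeconfig sections (`clusters`, `users`, `contexts`, `current-context`) and that the current context points at entries that exist.

```python
from typing import Any


def validate_kubeconfig(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a parsed kubeconfig (empty if none)."""
    problems: list[str] = []

    # Top-level sections a usable kubeconfig is expected to carry.
    for key in ("clusters", "users", "contexts", "current-context"):
        if not cfg.get(key):
            problems.append(f"missing or empty '{key}'")
    if problems:
        return problems

    contexts = {c.get("name"): c.get("context") for c in cfg["contexts"]}
    clusters = {c.get("name"): c.get("cluster") for c in cfg["clusters"]}
    users = {u.get("name") for u in cfg["users"]}

    current = cfg["current-context"]
    ctx = contexts.get(current)
    if ctx is None:
        return [f"current-context '{current}' is not defined in contexts"]

    # The context must reference a known cluster that has a server address.
    cluster = clusters.get(ctx.get("cluster"))
    if cluster is None:
        problems.append(f"context '{current}' references an unknown cluster")
    elif not cluster.get("server"):
        problems.append("cluster entry has no server address")

    # The context must also reference a known user entry.
    if ctx.get("user") not in users:
        problems.append(f"context '{current}' references an unknown user")

    return problems
```

An empty result means only that the file is structurally consistent. It does not prove that the credentials are accepted or that the API server is reachable, so the connectivity and log checks above still apply.
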
Cluster cannot be deleted
Possible causes
- Active deployments still attached to the cluster.
- Insufficient permissions to perform delete.
Resolution steps
- Review the Deployments tab and drain or migrate active endpoints.
- Confirm the user has the cluster:manage permission.
- Retry deletion after dependencies are cleared.
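
The two causes above can be checked together before retrying the delete. This is an illustrative sketch only: the `Deployment` record and the `deletion_blockers` function are hypothetical and do not reflect the platform's actual data model. The only name taken from this guide is the `cluster:manage` permission.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical view of a deployment as shown in the Deployments tab."""
    name: str
    cluster_id: str
    active: bool


def deletion_blockers(
    cluster_id: str,
    deployments: list[Deployment],
    user_permissions: set[str],
) -> list[str]:
    """List the reasons a cluster delete would be expected to fail."""
    blockers: list[str] = []

    if "cluster:manage" not in user_permissions:
        blockers.append("user lacks the cluster:manage permission")

    # Only active deployments on this cluster count as dependencies.
    attached = sorted(
        d.name for d in deployments if d.cluster_id == cluster_id and d.active
    )
    if attached:
        blockers.append("active deployments still attached: " + ", ".join(attached))

    return blockers
```

An empty list suggests the known dependencies are cleared and the deletion can be retried.
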
General tab shows degraded or missing metrics
Possible causes
- Metrics pipeline latency or outage.
- Node exporter/connectivity issues.
Resolution steps
- Compare with Nodes tab readiness and event data.
- Validate monitoring integration health.
- Check if issue is cluster-local or platform-wide.
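
One way to reason about the last step is to compare the affected cluster with its peers. The sketch below assumes a hypothetical mapping from cluster ID to whether metrics are currently arriving. How that mapping is obtained depends on the monitoring integration and is not covered here.

```python
def classify_metrics_scope(cluster_id: str, metrics_ok: dict[str, bool]) -> str:
    """Guess whether a metrics problem is local to one cluster or platform-wide."""
    if metrics_ok[cluster_id]:
        return "healthy"

    # Health of every other cluster on the platform.
    others = [ok for cid, ok in metrics_ok.items() if cid != cluster_id]
    if not others:
        return "unknown"  # nothing to compare against
    if any(others):
        return "cluster-local"  # the pipeline works for at least one peer
    return "platform-wide"  # no cluster is reporting metrics
```

A "cluster-local" result points toward node exporter or connectivity issues on that cluster, while "platform-wide" points toward the metrics pipeline itself.
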
Node events show repeated scheduling failures
Possible causes
- Insufficient allocatable CPU/GPU/memory.
- Taints/affinity mismatch.
- Storage constraints.
Resolution steps
- Inspect request-vs-allocatable values on affected nodes.
- Validate scheduling constraints in deployment configs.
- Scale capacity or rebalance workloads.
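
The request-vs-allocatable comparison can be illustrated with a small calculation. This sketch assumes all quantities have already been converted to plain numbers (CPU in millicores, memory in MiB, GPUs as a count). The function name and the sample values are hypothetical, and the check ignores taints, affinity, and storage.

```python
def unschedulable_reasons(
    requests: dict[str, int],
    allocatable: dict[str, int],
    allocated: dict[str, int],
) -> list[str]:
    """Compare a workload's requests with the free capacity left on one node."""
    reasons: list[str] = []
    for resource, wanted in requests.items():
        # Free capacity = allocatable minus what running workloads already request.
        free = allocatable.get(resource, 0) - allocated.get(resource, 0)
        if wanted > free:
            reasons.append(f"insufficient {resource}: requested {wanted}, free {free}")
    return reasons


# Example node: 8 CPUs, 32 GiB of memory, and 1 GPU allocatable.
node_allocatable = {"cpu": 8000, "memory": 32768, "nvidia.com/gpu": 1}
node_allocated = {"cpu": 6500, "memory": 20000, "nvidia.com/gpu": 1}
pod_requests = {"cpu": 2000, "memory": 4096, "nvidia.com/gpu": 1}

for reason in unschedulable_reasons(pod_requests, node_allocatable, node_allocated):
    print(reason)
# prints:
# insufficient cpu: requested 2000, free 1500
# insufficient nvidia.com/gpu: requested 1, free 0
```

In this example memory fits, so only CPU and GPU capacity would need to be scaled or rebalanced.
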
Unable to save storage settings
Possible causes
- Storage classes unavailable from cluster API.
- Access mode incompatible with selected storage class.
- API or permission errors.
Resolution steps
- Reload settings and confirm storage class discovery works.
- Select recommended access mode.
- Verify user has manage permission and retry.
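
The first two checks can be expressed as a small pre-save validation. This is a sketch under stated assumptions: the access mode names are the standard Kubernetes ones, but the `discovered` mapping from storage class to supported access modes is hypothetical. A Kubernetes StorageClass object does not itself list supported access modes, so that information would have to come from the platform or the provisioner's documentation.

```python
from typing import Optional

# Standard Kubernetes persistent volume access modes.
ACCESS_MODES = {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany", "ReadWriteOncePod"}


def check_storage_settings(
    storage_class: str,
    access_mode: str,
    discovered: dict[str, set[str]],
) -> Optional[str]:
    """Return a description of the problem, or None if the selection looks valid."""
    if not discovered:
        return "no storage classes were discovered from the cluster API"
    if storage_class not in discovered:
        return f"storage class '{storage_class}' was not discovered"
    if access_mode not in ACCESS_MODES:
        return f"'{access_mode}' is not a known access mode"

    supported = discovered[storage_class]
    if access_mode not in supported:
        return (
            f"access mode '{access_mode}' is not supported by '{storage_class}'; "
            f"supported: {', '.join(sorted(supported))}"
        )
    return None
```

A `None` result rules out the first two causes. A save that still fails is then more likely an API or permission error.
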
Escalation Data to Capture
- Cluster ID and environment.
- Timestamp and user action attempted.
- Screenshot or export of relevant tab state.
- Node event snippets and affected workloads.
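
To keep escalations consistent, the items above could be collected in a single structured record. The following is a minimal sketch. The `EscalationReport` class and its field names are hypothetical and are not a format the platform requires.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EscalationReport:
    """Hypothetical container for the escalation data listed above."""
    cluster_id: str
    environment: str
    timestamp: str  # when the action was attempted, e.g. ISO 8601
    user_action: str
    tab_state_attachment: str  # path or link to a screenshot or export
    node_events: list[str] = field(default_factory=list)
    affected_workloads: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize the report so it can be attached to a ticket."""
        return json.dumps(asdict(self), indent=2)
```

Calling `missing_fields()` before filing gives a quick check that nothing from the list was left out.
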
Next Steps
- Cluster Concepts: Revisit lifecycle and tab responsibilities.
- Cluster Operations Guide: Strengthen day-2 operational practices.