- onboard models from multiple sources,
- apply governance and verification checks,
- compare performance and evaluation outcomes, and
- route workloads to the best fit for cost, latency, and compliance.
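The routing step above can be sketched as a constraint-then-cost selection: filter out models that miss the latency SLO or the compliance region, then pick the cheapest survivor. The class, field names, and pricing units here are illustrative assumptions, not the module's actual API.

```python
from dataclasses import dataclass

@dataclass
class ModelOption:
    # Hypothetical catalog fields; the real module's schema may differ.
    name: str
    cost_per_1k_tokens: float   # USD, assumed pricing unit
    p95_latency_ms: float
    regions: frozenset          # regions where the model may serve data

def route(options, latency_slo_ms, required_region):
    """Pick the cheapest model that meets the latency SLO and compliance region."""
    eligible = [
        m for m in options
        if m.p95_latency_ms <= latency_slo_ms and required_region in m.regions
    ]
    if not eligible:
        raise ValueError("no model satisfies the constraints")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)

options = [
    ModelOption("fast-large", 0.60, 120, frozenset({"us", "eu"})),
    ModelOption("cheap-small", 0.05, 300, frozenset({"us"})),
    ModelOption("eu-medium", 0.20, 180, frozenset({"eu"})),
]
print(route(options, latency_slo_ms=250, required_region="eu").name)  # eu-medium
```

Treating compliance and latency as hard constraints and cost as the objective keeps the decision auditable: a model is never chosen on price if it violates a policy.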
Why the Models module matters
As teams scale GenAI usage, model operations become fragmented across providers, file stores, and environments. The Models module gives you a single operating model for:
- Catalog consistency: one inventory for cloud, Hugging Face, URL, and disk-based models.
- Operational confidence: scan and verification states tracked per model.
- Deployment readiness: model metadata aligned with modality and endpoint capabilities.
- Decision support: benchmark and evaluation views to choose the right model for production.
Core capabilities
Unified catalog
Keep cloud and local models in one searchable repository with metadata, tags, and ownership context.
Source flexibility
Add models from cloud providers, Hugging Face, signed URLs, or mounted disk paths.
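One way to picture multi-source onboarding is a classifier that maps a model reference onto the four source types named above. The prefixes and heuristics below are assumptions for illustration only; they are not the product's actual detection rules.

```python
from urllib.parse import urlparse

def classify_source(ref: str) -> str:
    """Map a model reference to a catalog source type.

    Heuristics are illustrative assumptions:
    - cloud object-store schemes (s3://, gs://, azureml://) -> "cloud"
    - hf:// prefix or a bare "org/name" repo id -> "huggingface"
    - http(s) links (e.g. signed URLs) -> "url"
    - anything else (e.g. a mounted path) -> "disk"
    """
    if ref.startswith(("s3://", "gs://", "azureml://")):
        return "cloud"
    if ref.startswith("hf://") or (ref.count("/") == 1 and not urlparse(ref).scheme):
        return "huggingface"
    if urlparse(ref).scheme in ("http", "https"):
        return "url"
    return "disk"
```

For example, `classify_source("s3://bucket/model")` yields `"cloud"`, while `/mnt/models/llama` falls through to `"disk"`.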
Security and trust
Track scan and verification signals before exposing models for downstream use.
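The gating idea can be sketched as two independent check states with a single readiness predicate. The state names and the "both must pass" rule are assumptions for illustration; the module's actual states and policy may differ.

```python
from enum import Enum

class CheckState(Enum):
    # Hypothetical per-model check states.
    PENDING = "pending"
    PASSED = "passed"
    FAILED = "failed"

def is_servable(scan: CheckState, verification: CheckState) -> bool:
    """Expose a model downstream only after both checks pass (assumed policy)."""
    return scan is CheckState.PASSED and verification is CheckState.PASSED
```

Tracking scan and verification separately means a model can be cataloged immediately but stays unservable until both signals turn green.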
Performance visibility
Use benchmark history and evaluation outputs to compare throughput, latency, and quality trade-offs.
High-level lifecycle
At a high level, a model moves through the module in stages: it is onboarded from a source, passes scan and verification checks, is benchmarked and evaluated against alternatives, and is then made available so workloads can be routed to it.
Who uses this module
- Platform teams to standardize onboarding, safety checks, and governance.
- ML engineers to compare variants and move approved versions toward production.
- Application teams to pick ready-to-use models with clear capabilities and constraints.