🔒 Permission Denied — Role Viewer has limited access. Some actions are disabled.
Inference Services
Deploy and manage model inference endpoints.
🌐 Total: 9,677
✅ Healthy: 4,658
⚠️ Degraded: 9,914
❌ Down: 1,063

| Name | Status | Model | Replicas | RPS | Latency P99 | GPU Type | Owner |
|---|---|---|---|---|---|---|---|
| speech-prod-5 | Scaling | stable-diffusion-xl | 2/7 | 1,973 | 2090ms | A100-80G | |
| embed-canary-15 | Stopped | whisper-large | 1/4 | 2,558 | 2187ms | A10-24G | |
| vision-prod-5 | Running | stable-diffusion-xl | 3/6 | 3,312 | 2442ms | H100-80G | |
| embed-staging-19 | Error | whisper-large | 2/7 | 3,258 | 2724ms | H100-80G | |
| multimodal-canary-13 | Deploying | stable-diffusion-xl | 4/3 | 2,552 | 1378ms | A100-80G | |
| llm-staging-8 | Deploying | whisper-large | 4/4 | 2,330 | 1125ms | A100-80G | |
| vision-canary-11 | Stopped | stable-diffusion-xl | 4/3 | 1,614 | 2545ms | A100-80G | |
| multimodal-prod-4 | Running | stable-diffusion-xl | 1/3 | 2,026 | 1058ms | A100-80G | |
| multimodal-prod-19 | Scaling | whisper-large | 4/5 | 2,960 | 1396ms | H100-80G | |
| multimodal-prod-11 | Error | llama-3.1-70b | 3/6 | 2,925 | 2959ms | H100-80G | |