Inference Services
Deploy and manage model inference endpoints.
🌐 Total: 6,945 · ✅ Healthy: 3,600 · ⚠️ Degraded: 3,040 · ❌ Down: 2,779

| Name | Status | Model | Replicas | RPS | Latency P99 | GPU Type | Owner |
|---|---|---|---|---|---|---|---|
| llm-prod-13 | Running | stable-diffusion-xl | 2/4 | 1,858 | 286ms | A100-80G | |
| llm-canary-5 | Running | mistral-7b | 4/7 | 4,492 | 1689ms | A100-80G | |
| llm-prod-16 | Error | mistral-7b | 3/8 | 1,598 | 619ms | A10-24G | |
| multimodal-prod-11 | Deploying | llama-3.1-70b | 3/6 | 374 | 1589ms | A100-80G | |
| speech-canary-11 | Error | stable-diffusion-xl | 1/8 | 1,700 | 2004ms | A100-80G | |
| embed-staging-15 | Deploying | whisper-large | 4/4 | 3,886 | 1182ms | A10-24G | |
| vision-canary-15 | Scaling | whisper-large | 4/7 | 3,797 | 504ms | A10-24G | |
| multimodal-prod-18 | Running | whisper-large | 3/3 | 2,393 | 2338ms | A100-80G | |
| llm-staging-16 | Stopped | whisper-large | 2/5 | 3,403 | 1684ms | A10-24G | |
| llm-canary-10 | Stopped | stable-diffusion-xl | 3/8 | 1,046 | 1445ms | H100-80G | |