🔒 Permission denied: the Viewer role has limited access. Some actions are disabled.
Inference Services
Deploy and manage model inference endpoints.
| Total | Healthy | Degraded | Down |
|---|---|---|---|
| 🌐 3,164 | ✅ 279 | ⚠️ 7,239 | ❌ 9,702 |
| Name | Status | Model | Replicas | RPS | Latency P99 | GPU Type | Owner |
|---|---|---|---|---|---|---|---|
| embed-prod-17 | Deploying | mistral-7b | 1/7 | 212 | 734ms | A10-24G | |
| llm-canary-14 | Deploying | llama-3.1-70b | 1/6 | 2,169 | 210ms | A100-80G | |
| vision-canary-7 | Running | mistral-7b | 2/6 | 3,540 | 592ms | A100-80G | |
| vision-canary-10 | Stopped | stable-diffusion-xl | 1/7 | 1,133 | 3000ms | H100-80G | |
| llm-staging-12 | Error | mistral-7b | 2/3 | 4,709 | 1725ms | A10-24G | |
| multimodal-canary-8 | Running | llama-3.1-70b | 4/1 | 4,891 | 2733ms | A10-24G | |
| llm-prod-6 | Stopped | stable-diffusion-xl | 1/3 | 667 | 1090ms | H100-80G | |
| vision-canary-10 | Scaling | mistral-7b | 4/8 | 296 | 1217ms | A10-24G | |
| vision-staging-9 | Deploying | stable-diffusion-xl | 3/8 | 711 | 388ms | A10-24G | |
| llm-canary-17 | Stopped | whisper-large | 3/4 | 2,762 | 1812ms | A10-24G | |