🔒 Permission Denied — Role Viewer has limited access. Some actions are disabled.
Inference Services
Deploy and manage model inference endpoints.
🌐 Total: 3,254
✅ Healthy: 5,414
⚠️ Degraded: 2,109
❌ Down: 8,763
| Name | Status | Model | Replicas | RPS | Latency P99 | GPU Type | Owner | |
|---|---|---|---|---|---|---|---|---|
| speech-prod-5 | Running | whisper-large | 1/4 | 1,104 | 2359ms | H100-80G | ||
| embed-canary-20 | Deploying | mistral-7b | 3/7 | 3,340 | 1326ms | A10-24G | ||
| llm-prod-6 | Running | llama-3.1-70b | 3/8 | 3,996 | 2098ms | A10-24G | ||
| llm-prod-3 | Running | whisper-large | 4/3 | 3,239 | 1989ms | A10-24G | ||
| llm-staging-12 | Scaling | llama-3.1-70b | 4/8 | 2,846 | 581ms | A10-24G | ||
| speech-canary-12 | Stopped | llama-3.1-70b | 4/8 | 3,455 | 2433ms | H100-80G | ||
| llm-canary-16 | Running | stable-diffusion-xl | 3/6 | 4,816 | 1538ms | H100-80G | ||
| speech-canary-9 | Deploying | whisper-large | 1/1 | 432 | 2469ms | H100-80G | ||
| speech-canary-5 | Running | stable-diffusion-xl | 1/2 | 1,802 | 2164ms | A100-80G | ||
| llm-prod-12 | Error | whisper-large | 1/4 | 4,086 | 1460ms | H100-80G | ||