🔒 Permission denied: the Viewer role has limited access. Some actions are disabled.
Endpoints
Manage deployed model inference endpoints.
| 🌐 Total | ✅ Healthy | ⚠️ Degraded | ❌ Down |
|---|---|---|---|
| 3,704 | 3,172 | 2,826 | 449 |

| Name | Status | Model | Replicas | RPS | Latency P99 | GPU Type | |
|---|---|---|---|---|---|---|---|
| multimodal-staging-v4 | Scaling | llama-3.1-70b | 2 | 239 | 411ms | A10-24G | |
| llm-canary-v4 | Active | stable-diffusion-xl | 4 | 1,690 | 1582ms | H100-80G | |
| embedding-staging-v4 | Error | whisper-large | 3 | 3,149 | 1850ms | A100-80G | |
| audio-prod-v5 | Active | mistral-7b | 8 | 769 | 947ms | H100-80G | |
| vision-staging-v4 | Stopped | mistral-7b | 7 | 3,455 | 1659ms | A10-24G | |
| embedding-staging-v1 | Stopped | stable-diffusion-xl | 8 | 4,711 | 283ms | A10-24G | |
| multimodal-prod-v2 | Error | whisper-large | 2 | 2,767 | 1284ms | A100-80G | |
| llm-staging-v4 | Scaling | stable-diffusion-xl | 8 | 4,784 | 805ms | A10-24G | |
| llm-staging-v5 | Active | mistral-7b | 5 | 3,351 | 118ms | A10-24G | |
| vision-prod-v5 | Stopped | llama-3.1-70b | 1 | 3,468 | 340ms | A10-24G | |