ML model monitoring
Monitor inference volume, latency, and drift for each deployed ML model.
Before you start
- A deployed model with an API key generated
- Some metrics appear only after the first predict call (from the UI or the API)
Per-model usage (recommended)
- Open Machine Learning and click your model.
- Generate an API key if you have not already (see the deployment guide).
- Scroll to the Usage section on the model detail page.
You will see:
- Summary cards — total inferences, requests, average response time, endpoints used
- Charts — daily inference volume and response time, split by Heimdall UI vs API
- Request log — sortable table with endpoint, inference count, response time, channel, user agent, and drift % when available
- Filters — 7 / 30 / 90 day windows and endpoint filter
Use the request log to debug integration issues such as wrong features, auth errors that show up as zero traffic, and latency spikes on specific routes.
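As a rough illustration of the latency-spike check, the sketch below groups request log rows by endpoint and flags any endpoint whose slowest call is far above its median. It assumes you have copied the rows out of the request log by hand into dictionaries. The key names `endpoint` and `response_time_ms`, the `predict_batch` route, and the `factor` threshold are all hypothetical and do not come from a documented Heimdall export format.

```python
from collections import defaultdict
from statistics import median


def slow_endpoints(rows, factor=3.0):
    """Return {endpoint: (median_ms, worst_ms)} for endpoints with a latency spike.

    An endpoint is flagged when its slowest request took more than
    `factor` times its median response time.
    """
    times_by_endpoint = defaultdict(list)
    for row in rows:
        # Key names are assumptions for this sketch, not a documented schema.
        times_by_endpoint[row["endpoint"]].append(row["response_time_ms"])

    flagged = {}
    for endpoint, times in times_by_endpoint.items():
        typical = median(times)
        worst = max(times)
        if typical > 0 and worst > factor * typical:
            flagged[endpoint] = (typical, worst)
    return flagged


# Hand-made sample rows; "predict_batch" is a hypothetical second route.
rows = [
    {"endpoint": "predict", "response_time_ms": 40},
    {"endpoint": "predict", "response_time_ms": 42},
    {"endpoint": "predict", "response_time_ms": 310},
    {"endpoint": "predict_batch", "response_time_ms": 120},
    {"endpoint": "predict_batch", "response_time_ms": 130},
]

print(slow_endpoints(rows))
```

With the sample rows, only `predict` is flagged, because one call took 310 ms against a 42 ms median. The median is used instead of the mean so that the spike itself does not inflate the baseline.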
Account-level monitoring
Open Usage (/usage) → Data Intelligence for workspace-wide trends:
- Daily volume stacked by ML, Forecast, Loop, Read, and Vision
- Channel mix (in-app tests vs production API)
- Top assets ranked by call volume across all deployed models
See Production monitoring for the full Usage page walkthrough.
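To make the channel mix and top-assets views concrete, here is a minimal sketch that derives both from a list of call records. The record shape (`asset` and `channel` keys), the asset names, and the function names are hypothetical. The Usage page computes these views for you, and this only mirrors the idea.

```python
from collections import Counter


def channel_mix(calls):
    """Return the share of calls per channel as fractions that sum to 1."""
    counts = Counter(call["channel"] for call in calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {channel: n / total for channel, n in counts.items()}


def top_assets(calls, n=3):
    """Return the n assets with the most calls as (asset, call_count) pairs."""
    return Counter(call["asset"] for call in calls).most_common(n)


# Hand-made sample; asset names are invented for illustration.
calls = [
    {"asset": "churn-model", "channel": "API"},
    {"asset": "churn-model", "channel": "API"},
    {"asset": "churn-model", "channel": "Heimdall UI"},
    {"asset": "price-forecast", "channel": "API"},
]

print(channel_mix(calls))
print(top_assets(calls))
```

A high Heimdall UI share on an asset that should be serving production traffic can suggest that the API integration is not yet sending requests.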
What gets recorded
| Field | Meaning |
|---|---|
| Endpoint | REST path called (typically predict) |
| Inference count | Number of predictions in one request |
| Response time | Time to complete the request, in milliseconds |
| Channel | Heimdall UI or API |
| Drift % | Performance drift indicator when enabled |
| User agent | Client string when present |
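The fields above can be modeled as a small record type, and the summary cards can then be seen as simple aggregates over those records. The sketch below is an assumption-laden illustration: the class name, attribute names, and the choice of a per-request mean for average response time are hypothetical and may differ from how the platform computes its cards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestRecord:
    """One recorded request; attribute names are illustrative only."""
    endpoint: str                      # REST path called, typically "predict"
    inference_count: int               # predictions in this one request
    response_time_ms: float            # milliseconds to complete
    channel: str                       # "Heimdall UI" or "API"
    drift_pct: Optional[float] = None  # present only when drift is enabled
    user_agent: Optional[str] = None   # client string when present


def summarize(records):
    """Compute summary-card style totals from a list of RequestRecord."""
    requests = len(records)
    total_time = sum(r.response_time_ms for r in records)
    return {
        "total_inferences": sum(r.inference_count for r in records),
        "requests": requests,
        # Assumed to be a plain mean over requests.
        "avg_response_time_ms": total_time / requests if requests else 0.0,
        "endpoints_used": sorted({r.endpoint for r in records}),
    }


records = [
    RequestRecord("predict", 1, 40.0, "Heimdall UI"),
    RequestRecord("predict", 25, 120.0, "API", user_agent="python-requests"),
    RequestRecord("predict", 4, 80.0, "API", drift_pct=1.5),
]

print(summarize(records))
```

Because one request can carry several predictions, total inferences (30 in this sample) can be much larger than the request count (3).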
Next steps
- API integration guide — send production traffic
- Production monitoring — account-wide charts and platform health
- Deploy your model — generate keys if you have not deployed yet