Telemetry & Observability

Monitor request flow, model performance, and infrastructure health inside your deployment boundary.

Last updated on February 16, 2026

Atlas ships with structured logs, metrics, and optional tracing so your team can operate reliably at scale. Telemetry is generated locally and stays within your infrastructure unless you explicitly export it.

Signals You Can Track

Request Metrics

Latency, throughput, token counts, and routing decisions.

Infrastructure Health

GPU/CPU utilization, memory pressure, queue depth, and errors.

Model Behavior

Completion length, streaming usage, cache hits, and fallbacks.

Activity Journaling

The infrastructure activity journal records API usage, configuration changes, and operational events. Journaling is optional and retention is controlled by your policies.

Enable journaling for production audits, incident response, or cost attribution. Disable it for strict zero‑retention environments.

Exporting Metrics

Metrics and logs can be routed to your observability stack through deployment configuration. Export destinations vary by environment and security posture.

```yaml
telemetry:
  metrics:
    enabled: true
    sink: "internal"
  logs:
    enabled: true
    retention_days: 14
  tracing:
    enabled: false
```
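A pre-deployment sanity check on this configuration could look like the following minimal sketch. It assumes the YAML has already been loaded into a plain dictionary; the `check_telemetry` helper and its validation rule are illustrative assumptions, not part of Atlas.

```python
# Dictionary mirroring the telemetry configuration shown above,
# as a YAML parser would load it.
config = {
    "telemetry": {
        "metrics": {"enabled": True, "sink": "internal"},
        "logs": {"enabled": True, "retention_days": 14},
        "tracing": {"enabled": False},
    }
}

def check_telemetry(config):
    """Return (enabled_signals, problems) for a telemetry config dict.

    Hypothetical helper: the rule below is an assumed sanity check,
    not a documented Atlas constraint.
    """
    telemetry = config.get("telemetry", {})
    enabled = [
        name
        for name in ("metrics", "logs", "tracing")
        if telemetry.get(name, {}).get("enabled", False)
    ]
    problems = []
    logs = telemetry.get("logs", {})
    if logs.get("enabled"):
        days = logs.get("retention_days")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(days, bool) or not isinstance(days, int) or days <= 0:
            problems.append("logs.retention_days must be a positive integer")
    return enabled, problems

enabled, problems = check_telemetry(config)
print("enabled signals:", enabled)
print("problems:", problems)
```

Running a check like this in CI can catch a missing or malformed retention window before the configuration reaches a deployment.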

Operational Dashboards

SLO Monitoring

  • Track p50/p95 latency per model
  • Monitor queue depth and backpressure
  • Detect abnormal error spikes
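As an illustration of the first item, p50/p95 latency per model can be derived from raw request records. The sketch below assumes each record is a `(model, latency_ms)` pair; that shape and the model names are hypothetical and do not reflect Atlas's actual log schema.

```python
from collections import defaultdict
from statistics import quantiles

def latency_percentiles(records):
    """Group latencies by model and return p50/p95 for each.

    `records` is an iterable of (model, latency_ms) pairs; this shape
    is an assumption made for the example.
    """
    by_model = defaultdict(list)
    for model, latency_ms in records:
        by_model[model].append(latency_ms)

    result = {}
    for model, values in by_model.items():
        if len(values) < 2:
            # statistics.quantiles needs at least two data points.
            result[model] = {"p50": values[0], "p95": values[0]}
            continue
        # n=100 yields 99 cut points; index 49 is p50, index 94 is p95.
        cuts = quantiles(values, n=100, method="inclusive")
        result[model] = {"p50": cuts[49], "p95": cuts[94]}
    return result

# Hypothetical sample: latencies of 1..101 ms for one model, one request for another.
sample = [("model-a", ms) for ms in range(1, 102)] + [("model-b", 7)]
print(latency_percentiles(sample))
```

In practice a metrics backend would usually compute these percentiles from histograms; the sketch only shows the calculation that an SLO panel is built on.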

Cost & Efficiency

  • Token usage by team or API key
  • Routing distribution across models
  • Cache hit rate and batching efficiency
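The cost and efficiency figures above are simple aggregations over request events. The following sketch assumes each event is a dictionary with `api_key`, `model`, `tokens`, and `cache_hit` fields; these field names are illustrative assumptions, not a documented Atlas schema.

```python
from collections import Counter

def summarize_usage(events):
    """Aggregate token usage, routing share, and cache hit rate.

    Each event is assumed to be a dict with the hypothetical keys
    "api_key", "model", "tokens", and "cache_hit".
    """
    tokens_by_key = Counter()
    requests_by_model = Counter()
    cache_hits = 0
    for event in events:
        tokens_by_key[event["api_key"]] += event["tokens"]
        requests_by_model[event["model"]] += 1
        if event["cache_hit"]:
            cache_hits += 1

    total = len(events)
    return {
        "tokens_by_key": dict(tokens_by_key),
        # Fraction of requests routed to each model.
        "routing_share": {m: n / total for m, n in requests_by_model.items()},
        "cache_hit_rate": cache_hits / total if total else 0.0,
    }

# Hypothetical events for illustration only.
events = [
    {"api_key": "team-a", "model": "small", "tokens": 100, "cache_hit": True},
    {"api_key": "team-a", "model": "large", "tokens": 300, "cache_hit": False},
    {"api_key": "team-b", "model": "small", "tokens": 50, "cache_hit": False},
    {"api_key": "team-b", "model": "small", "tokens": 150, "cache_hit": True},
]
print(summarize_usage(events))
```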

Data Retention

Telemetry data stays inside your infrastructure by default. Retention windows and export rules are set by your security and operations teams.
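To make the idea of a retention window concrete, here is a minimal sketch of pruning records older than a configured number of days. The record shape (a dictionary with a timezone-aware `timestamp`) and the `prune_expired` helper are assumptions made for illustration; Atlas's own retention mechanism may work differently.

```python
from datetime import datetime, timedelta, timezone

def prune_expired(records, retention_days, now=None):
    """Keep only records whose timestamp falls inside the retention window.

    `records` are dicts with a timezone-aware "timestamp" key
    (a hypothetical shape chosen for this example).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["timestamp"] >= cutoff]

# Fixed reference time so the example is reproducible.
now = datetime(2026, 2, 16, tzinfo=timezone.utc)
records = [
    {"id": 1, "timestamp": now - timedelta(days=1)},
    {"id": 2, "timestamp": now - timedelta(days=14)},
    {"id": 3, "timestamp": now - timedelta(days=15)},
]
kept = prune_expired(records, retention_days=14, now=now)
print([r["id"] for r in kept])
```

With a 14-day window, a record exactly 14 days old sits on the cutoff and is kept, while anything older is dropped. Whether the boundary is inclusive is a policy choice to settle with your security team.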