We are now part of the NVIDIA Inception Program.
Atlas Core API

The secure gateway for Sovereign AI

OpenAI-compatible API gateway for sovereign models. Route, secure, and observe every request from one endpoint.

POST /v1/chat/completions
HTTP
Host: api.your-domain
Authorization: Bearer mx4_sk_•••
Content-Type: application/json
{
"model": "auto-router",
"messages": [{"role": "user", "content": "..." }],
"routing": {"profile": "balanced"}
}
Response: 200 OK (~220ms)
{
"id": "req_7f1a",
"model": "routed-model",
"choices": [{"message": {"role": "assistant", "content": "..."}}]
}

CORE API

Built for sovereign production

Routing and observability baked in for enterprise deployments.

Unified Interface

One endpoint for Arabic-native models, multilingual models, and your fine-tuned checkpoints.

Predictable Costs

Per-team budgets, token caps, and attribution you can review.

Activity Journal

Local infrastructure activity journal with residency evidence.

Low Latency

On-prem inference next to your data, not across oceans.

GET STARTED

Launch in 3 steps

Your first request in a few minutes.

Step 1 (Quick)

Provision an API key

Create team-scoped keys, rotate on schedule, and set IP allowlists.

Step 2 (Quick)

Point your base URL

Use the OpenAI SDKs and switch `baseURL` to your Atlas endpoint.

Step 3 (Quick)

Send your first prompt

Pick a model or let routing handle cost, latency, and sovereignty.

INTEGRATION

Drop-in compatibility

Switch in minutes. Point `baseURL` to Atlas and keep the same OpenAI tooling.

Client Integration (Node.js / Python) [TypeScript]
import OpenAI from 'openai';

// Example: point the OpenAI client to Atlas
const client = new OpenAI({
  apiKey: 'mx4_sk_live_...',
  baseURL: 'https://api.your-domain/v1',
});

// Use standard Chat Completions
const completion = await client.chat.completions.create({
  model: 'auto-router',
  messages: [
    { role: 'user', content: 'لخص هذا التقرير المالي' }
  ],
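  // Optional Atlas routing hint. The Node SDK sends unknown top-level fields
  // in the request body as-is (`extra_body` is the Python SDK's option).
  // @ts-expect-error routing_preference is not part of the OpenAI types
  routing_preference: 'cost', // cost | performance | balanced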
});

console.log(completion.choices[0].message);

Supported Frameworks

LangChain
LlamaIndex
Semantic Kernel
AutoGen
Vercel AI SDK
Flowise

Supported Models

Falcon family | Varies | Arabic Native
JAIS family | Varies | Arabic Native
Llama family | Varies | English / Code
Qwen family | Varies | Multilingual

DEVELOPER TOOLS

Official SDKs

Start fast with officially supported client libraries.

Python SDK (Latest)
pip install mx4-atlas

Node.js SDK (Latest)
npm install @mx4/atlas

Go SDK (Latest)
go get github.com/mx4/atlas-go

Java SDK (Latest)
maven: com.mx4.atlas

ROUTING

Intelligent model routing

Rule-driven routing across cost, performance, language, and residency requirements.

  • Cost Optimization

    Route simple queries to smaller, cost-efficient models.

  • Data Sovereignty

    Ensure sensitive data stays on-prem by forcing local routes.

  • Language Specialization

    Automatically route Arabic prompts to specialized Arabic models.

routing_config.yaml [YAML]
# Example routing policy (illustrative)
routes:
  # Route Arabic to specialized models
  - name: "arabic-native"
    condition: "language == 'ar'"
    model: "arabic-large"
    
  # Cost optimization for simple queries
  - name: "fast-path"
    condition: "prompt_tokens < 100"
    model: "general-small"
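
The policy above can be read as an ordered rule list. The following is a minimal sketch of that idea, modeling the two rules as TypeScript predicates instead of parsing the condition strings. It assumes rules are evaluated top to bottom with the first match winning, and that the caller supplies a fallback model; neither behavior is confirmed by the example policy. `RequestFeatures`, `selectModel`, and the `general-default` model name are hypothetical.

```typescript
// Hypothetical request features a router might inspect; not an Atlas type.
interface RequestFeatures {
  language: string;     // e.g. 'ar', 'en'
  promptTokens: number; // token count of the prompt
}

interface Route {
  name: string;
  model: string;
  matches: (features: RequestFeatures) => boolean;
}

// Mirrors the two rules in routing_config.yaml, with conditions as predicates.
const routes: Route[] = [
  { name: 'arabic-native', model: 'arabic-large', matches: (f) => f.language === 'ar' },
  { name: 'fast-path', model: 'general-small', matches: (f) => f.promptTokens < 100 },
];

// Assumption: rules are checked in order and the first match wins.
function selectModel(features: RequestFeatures, fallbackModel: string): string {
  const hit = routes.find((route) => route.matches(features));
  return hit ? hit.model : fallbackModel;
}

// A short Arabic prompt matches both rules; the earlier rule decides.
console.log(selectModel({ language: 'ar', promptTokens: 50 }, 'general-default'));
```

Under this reading, rule order matters: placing `fast-path` first would send short Arabic prompts to `general-small` instead.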

API REFERENCE

Core endpoints

Representative endpoints with OpenAI-compatible semantics (may vary by deployment).

POST /v1/chat/completions
Standard chat generation with intelligent routing
POST /v1/embeddings
Vector embeddings (multilingual + Arabic-ready)
POST /v1/documents/upload
Ingest PDF/DOCX into sovereign vector store
POST /v1/rag/query
Retrieve context with citations and confidence scores
GET /v1/models
List available sovereign models
POST /v1/activity/logs
Retrieve infrastructure activity journal entries

Security & sovereignty headers

Header | Description
X-Sovereign-ID | Unique identifier for the sovereign enclave processing the request.
X-Audit-Trace-ID | Trace ID linking the request to the infrastructure activity journal.
X-Route-Policy | Routing policy applied (e.g., local-only, latency-optimized).
X-Data-Residency | Confirmed location of data processing (e.g., "region-1").
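
A client can read these headers from each response and refuse to use a result processed outside an approved region. The sketch below is one possible way to do that. `HeaderReader` matches the `get` method of the Fetch API `Headers` object; `readSovereigntyHeaders`, `isAllowedResidency`, and the fail-closed behavior are illustrative assumptions, not part of Atlas.

```typescript
// Minimal shape shared by the Fetch API `Headers` object and test doubles.
interface HeaderReader {
  get(name: string): string | null;
}

interface SovereigntyInfo {
  sovereignId: string | null;
  auditTraceId: string | null;
  routePolicy: string | null;
  dataResidency: string | null;
}

// Collects the four documented headers; a missing header becomes null.
function readSovereigntyHeaders(headers: HeaderReader): SovereigntyInfo {
  return {
    sovereignId: headers.get('X-Sovereign-ID'),
    auditTraceId: headers.get('X-Audit-Trace-ID'),
    routePolicy: headers.get('X-Route-Policy'),
    dataResidency: headers.get('X-Data-Residency'),
  };
}

// Hypothetical client-side check: fail closed when the residency header
// is absent or names a region that is not on the allowlist.
function isAllowedResidency(info: SovereigntyInfo, allowedRegions: string[]): boolean {
  return info.dataResidency !== null && allowedRegions.includes(info.dataResidency);
}
```

With `fetch`, the call would look like `isAllowedResidency(readSovereigntyHeaders(response.headers), ['region-1'])`.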

Standard Error Codes

Code | Meaning
400 Bad Request | Invalid input or malformed JSON.
401 Unauthorized | Invalid or missing API key.
403 Forbidden | Insufficient permissions (RBAC) or data sovereignty rule violation.
429 Too Many Requests | Rate limit exceeded (per-key or per-IP).
451 Unavailable For Legal Reasons | Blocked by Sovereignty Guard (e.g., data residency violation).
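
Client code usually needs to decide which of these errors are worth retrying. The sketch below maps the codes in the table to a suggested client action. The action names and the choice to treat only 429 as retryable are assumptions made for illustration; adjust them to your deployment's behavior.

```typescript
// Hypothetical action labels for the status codes listed in the table.
type ErrorAction =
  | 'fix_request'       // 400: the payload itself is wrong
  | 'check_credentials' // 401: key missing or invalid
  | 'check_policy'      // 403 / 451: permissions or sovereignty rules
  | 'retry_later'       // 429: rate limited
  | 'unknown';

function classifyStatus(status: number): ErrorAction {
  switch (status) {
    case 400:
      return 'fix_request';
    case 401:
      return 'check_credentials';
    case 403:
    case 451:
      return 'check_policy';
    case 429:
      return 'retry_later';
    default:
      return 'unknown';
  }
}

// Assumption: only rate-limit errors are safe to retry without changes.
function isRetryable(status: number): boolean {
  return classifyStatus(status) === 'retry_later';
}
```

Retrying a 403 or 451 unchanged is unlikely to help, since both indicate a policy decision rather than a transient failure.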

SECURITY

Authentication

API Key Authentication

All requests must include a valid API key. Keys are scoped by team and tracked for usage and quotas.

Authentication Example [Bash]
curl -X POST https://api.your-domain/v1/chat/completions \
  -H "Authorization: Bearer mx4_sk_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model_chat",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Key Management

  • Create multiple keys per team for rotation
  • Set expiration dates and IP allowlists
  • Monitor usage per key in real-time
  • Revoke compromised keys instantly

mTLS (Enterprise)

  • Mutual TLS for zero-trust environments
  • Certificate-based authentication
  • Integrates with internal PKI
  • Required for air-gapped deployments
Sovereignty-grade logging

OBSERVABILITY

Infrastructure activity journal

Requests can be logged locally with a cryptographic hash chain and residency evidence.

Append-only

Entries are stored locally with hash-chaining for integrity.

Local retention

You control retention, access, and export on your infrastructure.

{
  "timestamp": "2026-02-04T14:23:45Z",
  "request_id": "req-uuid-12345",
  "actor_id": "user-xyz",
  "action": "chat.completions",
  "model": "routed-model",
  "prompt_hash": "sha256:a3f7d2...",
  "routing_policy": "sovereign_enforcement",
  "residency_boundary": "region-1",
  "journal_hash": "sha256:8b12c4...",
  "status": "success"
}
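
To illustrate how hash-chaining makes an append-only journal tamper-evident, here is a minimal sketch of a chain verifier. The hashing scheme is an assumption: each `journal_hash` is taken to be SHA-256 over the previous entry's hash plus the entry's other fields serialized with sorted keys. The actual Atlas scheme is not documented here and may differ. `GENESIS_HASH`, `seal`, and `firstBrokenIndex` are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

type JournalEntry = Record<string, string>;

// Hypothetical seed used as the "previous hash" of the first entry.
const GENESIS_HASH = 'sha256:genesis';

// Serialize every field except journal_hash, with keys in sorted order,
// so the same entry always produces the same string.
function canonicalize(entry: JournalEntry): string {
  const body: JournalEntry = {};
  for (const key of Object.keys(entry).sort()) {
    if (key !== 'journal_hash') body[key] = entry[key];
  }
  return JSON.stringify(body);
}

// Assumed scheme: hash = SHA-256(previous hash + newline + canonical entry).
function chainHash(prevHash: string, entry: JournalEntry): string {
  const digest = createHash('sha256')
    .update(prevHash)
    .update('\n')
    .update(canonicalize(entry))
    .digest('hex');
  return `sha256:${digest}`;
}

// Returns a copy of the entry with its journal_hash filled in.
function seal(prevHash: string, entry: JournalEntry): JournalEntry {
  return { ...entry, journal_hash: chainHash(prevHash, entry) };
}

// Returns the index of the first entry whose hash does not match,
// or -1 when the whole chain is intact.
function firstBrokenIndex(entries: JournalEntry[]): number {
  let prevHash = GENESIS_HASH;
  for (let i = 0; i < entries.length; i++) {
    const entry = entries[i];
    if (entry.journal_hash !== chainHash(prevHash, entry)) return i;
    prevHash = entry.journal_hash;
  }
  return -1;
}
```

Because each hash covers the previous one, editing any field of an earlier entry invalidates that entry's own hash, and recomputing it would in turn break every later link.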

RATE LIMITS

Rate limits & quotas

Illustrative limits; configured per deployment.

Plan | Requests/min | Tokens/day | Burst
Development | Low | Low | Small
Professional | Medium | Medium | Moderate
Enterprise | Custom | Custom | Custom

Rate Limit Headers

Every response includes headers showing your current limits and remaining quota:

X-RateLimit-Limit: <limit>
X-RateLimit-Remaining: <remaining>
X-RateLimit-Reset: <unix_ts>
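
These headers let a client pause until the quota window resets instead of retrying blindly. The sketch below parses the three headers and computes a wait time. It assumes `X-RateLimit-Reset` is a Unix timestamp in seconds; confirm the unit for your deployment. `HeaderReader`, `parseRateLimit`, and `msUntilRetry` are hypothetical helper names.

```typescript
// Minimal shape shared by the Fetch API `Headers` object and test doubles.
interface HeaderReader {
  get(name: string): string | null;
}

interface RateLimitState {
  limit: number;
  remaining: number;
  resetAt: number; // assumed to be Unix time in seconds
}

// Converts a header value to a number, or null if absent or not numeric.
function toNumber(value: string | null): number | null {
  if (value === null || value.trim() === '') return null;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

// Returns null unless all three headers are present and numeric.
function parseRateLimit(headers: HeaderReader): RateLimitState | null {
  const limit = toNumber(headers.get('X-RateLimit-Limit'));
  const remaining = toNumber(headers.get('X-RateLimit-Remaining'));
  const resetAt = toNumber(headers.get('X-RateLimit-Reset'));
  if (limit === null || remaining === null || resetAt === null) return null;
  return { limit, remaining, resetAt };
}

// Milliseconds to wait before the next request; 0 if quota remains
// or the reset time has already passed.
function msUntilRetry(state: RateLimitState, nowMs: number): number {
  if (state.remaining > 0) return 0;
  return Math.max(0, state.resetAt * 1000 - nowMs);
}
```

After a 429 response, a caller could sleep for `msUntilRetry(state, Date.now())` milliseconds before sending the request again.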

WEBHOOKS

Webhooks (beta)

Real-time notifications for async operations, routing rules, and cost thresholds.

Event Types

request.completed
Long-running request finished
routing.rule
Routing rule triggered
cost.threshold
Budget limit approached
model.fallback
Primary model failed, fallback used

Webhook Payload

Example Payload [JSON]
{
  "event": "guardrail.violation",
  "timestamp": "2026-02-04T14:30:00Z",
  "request_id": "req_abc123",
  "details": {
    "violation_type": "pii_detected",
    "entities": ["email", "phone"],
    "action_taken": "request_blocked"
  },
  "metadata": {
    "team_id": "team_xyz",
    "user_id": "user_789"
  }
}
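
A receiver can validate the payload envelope and then dispatch on the `event` field. The sketch below assumes the envelope fields shown in the example payload (`event`, `timestamp`, `request_id`, `details`, `metadata`) are common to all events, which the page does not state. `parseWebhook` and `describeEvent` are hypothetical names, and event names outside the four listed types (such as the `guardrail.violation` in the example) fall through to a default branch.

```typescript
interface WebhookEvent {
  event: string;
  timestamp: string;
  request_id: string;
  details?: Record<string, unknown>;
  metadata?: Record<string, unknown>;
}

// Parses the raw request body and checks the required string fields.
function parseWebhook(body: string): WebhookEvent {
  const data: unknown = JSON.parse(body);
  if (typeof data !== 'object' || data === null) {
    throw new Error('payload must be a JSON object');
  }
  const record = data as Record<string, unknown>;
  for (const field of ['event', 'timestamp', 'request_id']) {
    if (typeof record[field] !== 'string') {
      throw new Error(`missing or invalid field: ${field}`);
    }
  }
  return record as unknown as WebhookEvent;
}

// Dispatches on the event types listed above; anything else is reported
// as unhandled rather than silently dropped.
function describeEvent(e: WebhookEvent): string {
  switch (e.event) {
    case 'request.completed':
      return `request ${e.request_id} finished`;
    case 'routing.rule':
      return `routing rule triggered for ${e.request_id}`;
    case 'cost.threshold':
      return `budget limit approached (request ${e.request_id})`;
    case 'model.fallback':
      return `fallback model used for ${e.request_id}`;
    default:
      return `unhandled event ${e.event} for ${e.request_id}`;
  }
}
```

Since webhooks are in beta, keeping a default branch avoids breaking the receiver when new event types appear.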

VERSIONING

Versioning & changelog

Current Version: v1 (stable)

Atlas follows semantic versioning. Exact versions are pinned per deployment.

Stable endpoints for production use

Recent highlights (example)

v1.x

Recent

  • Webhooks beta
  • Activity journal export improvements
  • Routing policy refinements

v1.x

Recent

  • Streaming support
  • Expanded routing controls
  • Performance optimizations

v1.x

Recent

  • Chat completions API
  • Document ingestion pipeline
  • mTLS authentication

FAQ

API FAQs

Common questions about integrating with Atlas.

Can I use the OpenAI Python library directly?

Yes. Set `base_url` to your Atlas endpoint and use your Atlas API key. The API is compatible with OpenAI's Chat Completions format.

How do I handle streaming responses?

Set `stream: true` in your request. Atlas returns Server-Sent Events (SSE) compatible with OpenAI's streaming format.
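
For clients that read the stream without an SDK, the sketch below shows how text can be reassembled from OpenAI-style SSE data. It relies on the OpenAI streaming conventions (`data:` lines carrying JSON chunks with `choices[0].delta.content`, ended by `data: [DONE]`) and assumes the input already contains whole lines; a real client must buffer partial lines across network reads. `extractStreamText` is a hypothetical helper name.

```typescript
// Only the parts of an OpenAI-style stream chunk that this sketch reads.
interface StreamChunk {
  choices?: Array<{ delta?: { content?: string } }>;
}

// Concatenates the content deltas found in a block of SSE text.
function extractStreamText(sse: string): string {
  let text = '';
  for (const rawLine of sse.split('\n')) {
    const line = rawLine.trim();
    if (!line.startsWith('data:')) continue; // skip blank lines and comments
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    text += chunk.choices?.[0]?.delta?.content ?? '';
  }
  return text;
}
```

When using the OpenAI SDKs with `baseURL` pointed at Atlas, this parsing is handled by the SDK and the helper is unnecessary.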

What happens if my quota is exceeded?

You'll receive a 429 error with headers indicating when your quota resets. Enterprise plans can include burst allowances.

Can I test the API without deploying on-prem?

We can provision a short-lived evaluation environment through the Test Access Program. Contact us for access.

How are embeddings priced differently from chat?

Pricing varies by model and volume. Contact us for the latest pricing guidance.