Atlas Core API v1.0

The secure gateway for Sovereign AI

OpenAI-compatible API gateway for sovereign models. Route, secure, and observe every request from one endpoint.

POST /v1/chat/completions HTTP/1.1
Host: api.internal.mx4
Authorization: Bearer mx4_sk_...
Content-Type: application/json

{
  "model": "auto-router",
  "messages": [{"role": "user", "content": "..."}]
}

CORE API

Built for sovereign production

Routing and observability baked in for enterprise deployments.

Unified Interface

One endpoint for Falcon, Jais, Llama, Qwen, and your fine-tuned checkpoints.

Predictable Costs

Per-team budgets, token caps, and attribution you can review.

Activity Journal

Local infrastructure activity journal with residency evidence.

Low Latency

On-prem inference next to your data, not across oceans.

GET STARTED

Launch in 3 steps

Your first request in under five minutes.

Step 1 (1 min)

Provision an API key

Create team-scoped keys, rotate on schedule, and set IP allowlists.

Step 2 (2 min)

Point your base URL

Use the OpenAI SDKs and switch `baseURL` to your Atlas endpoint.

Step 3 (2 min)

Send your first prompt

Pick a model or let routing handle cost, latency, and sovereignty.
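
The three steps above can be sketched without any SDK, using only the Python standard library. This is a minimal illustration, not an official client: the helper name `build_chat_request` is hypothetical, and the host, key placeholder, and `auto-router` model are taken from the sample request at the top of this page. The request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, prompt: str,
                       model: str = "auto-router") -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request for an Atlas endpoint."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Step 1: your API key
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Step 2: point at your Atlas base URL. Step 3: send a prompt.
request = build_chat_request("https://api.internal.mx4/v1", "mx4_sk_...", "Hello")
# To send it: urllib.request.urlopen(request), then JSON-decode the response body.
```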

INTEGRATION

Drop-in compatibility

Switch in minutes. Point `baseURL` to Atlas and keep the same OpenAI tooling.

Client Integration (Node.js / Python): TypeScript

import OpenAI from 'openai';

// 1) Point the OpenAI client to Atlas
const client = new OpenAI({
  apiKey: 'mx4_sk_live_...',
  baseURL: 'https://api.internal.mx4/v1',
});

// 2) Use standard Chat Completions
const completion = await client.chat.completions.create({
  model: 'mx4-atlas-core',
  messages: [
    // Arabic prompt: "Summarize this financial report"
    { role: 'user', content: 'لخص هذا التقرير المالي' }
  ],
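  // Optional Atlas routing hint. `extra_body` is the Python SDK's option;
  // the Node SDK sends extra fields as top-level body parameters instead.
  // @ts-expect-error routing_preference is not part of the OpenAI types
  routing_preference: 'cost', // cost | performance | balanced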
});

console.log(completion.choices[0].message);

Supported Frameworks

LangChain

LlamaIndex

Semantic Kernel

AutoGen

Vercel AI SDK

Flowise

Supported Models

Model	Sizes	Strengths
Falcon	7B, 40B	Arabic Native
Jais	13B	Arabic Native
Llama 3.1	8B, 70B	English / Code
Qwen 1.5	14B, 72B	Multilingual

DEVELOPER TOOLS

Official SDKs

Start fast with officially supported client libraries.

🐍

Python SDK

pip install mx4-atlas (v1.2.4)

📦

Node.js SDK

npm install @mx4/atlas (v1.1.0)

🐹

Go SDK

go get github.com/mx4/atlas-go (v0.9.8)

☕

Java SDK

maven: com.mx4.atlas (v1.0.1)

ROUTING

Intelligent model routing

Rule-driven routing across cost, performance, language, and residency requirements.

Cost Optimization

Route simple queries to smaller, cheaper models like Llama 3 8B.

Data Sovereignty

Ensure sensitive data stays on-prem by forcing local routes.

Language Specialization

Automatically route Arabic prompts to Jais or Falcon-7B.

routing_config.yaml (YAML)

routes:
  # Rule 1: Route Arabic to specialized model
  - name: "arabic-native"
    condition: "language == 'ar'"
    model: "jais-13b"
    
  # Rule 2: Cost optimization for simple queries
  - name: "fast-path"
    condition: "prompt_tokens < 100"
    model: "llama-3-8b-quantized"

API REFERENCE

Core endpoints

Production-ready endpoints with OpenAI-compatible semantics.

POST /v1/chat/completions

Standard chat generation with intelligent routing

POST /v1/embeddings

Vector embeddings (Multilingual-e5, Jais-embed)

POST /v1/documents/upload

Ingest PDF/DOCX into sovereign vector store

POST /v1/rag/query

Retrieve context with citations and confidence scores

GET /v1/models

List available sovereign models (Falcon, Jais, Llama)

POST /v1/activity/logs

Retrieve infrastructure activity journal entries

Security & sovereignty headers

Header	Description
X-Sovereign-ID	Unique identifier for the sovereign enclave processing the request.
X-Audit-Trace-ID	Trace ID linking the request to the infrastructure activity journal.
X-Route-Policy	Routing policy applied (e.g., local-only, latency-optimized).
X-Data-Residency	Confirmed location of data processing (e.g., "KSA-Riyadh-Local").
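
A client may want to read these headers and check the reported residency against what it expects. The sketch below is one way to do that; the `SovereigntyInfo` class and both function names are hypothetical, and the enclave and trace values in the usage lines are made up for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SovereigntyInfo:
    sovereign_id: Optional[str]
    audit_trace_id: Optional[str]
    route_policy: Optional[str]
    data_residency: Optional[str]


def read_sovereignty_headers(headers: Mapping[str, str]) -> SovereigntyInfo:
    """Extract the four sovereignty headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return SovereigntyInfo(
        sovereign_id=lowered.get("x-sovereign-id"),
        audit_trace_id=lowered.get("x-audit-trace-id"),
        route_policy=lowered.get("x-route-policy"),
        data_residency=lowered.get("x-data-residency"),
    )


def residency_matches(info: SovereigntyInfo, expected: str) -> bool:
    """True only when the response reports exactly the expected location."""
    return info.data_residency == expected


info = read_sovereignty_headers({
    "X-Sovereign-ID": "enclave-01",      # illustrative value
    "X-Audit-Trace-ID": "trace-0001",    # illustrative value
    "X-Route-Policy": "local-only",
    "X-Data-Residency": "KSA-Riyadh-Local",
})
```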

Standard Error Codes

Code	Meaning
400 Bad Request	Invalid input or malformed JSON.
401 Unauthorized	Invalid or missing API key.
403 Forbidden	Insufficient permissions (RBAC) or data sovereignty rule violation.
429 Too Many Requests	Rate limit exceeded (per-key or per-IP).
451 Unavailable For Legal Reasons	Blocked by Sovereignty Guard (e.g., data residency violation).
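
One possible client-side handling of these codes is sketched below. The retry policy is an assumption, not documented Atlas behaviour: only 429 is treated as retryable, since the other codes indicate a problem that resending the same request would not fix. The function names are hypothetical.

```python
ERROR_MEANINGS = {
    400: "Invalid input or malformed JSON.",
    401: "Invalid or missing API key.",
    403: "Insufficient permissions (RBAC) or data sovereignty rule violation.",
    429: "Rate limit exceeded (per-key or per-IP).",
    451: "Blocked by Sovereignty Guard (e.g., data residency violation).",
}


def describe_error(status: int) -> str:
    """Return the documented meaning for a status code, if there is one."""
    return ERROR_MEANINGS.get(status, "Unknown error")


def should_retry(status: int) -> bool:
    """Assumed policy: retry only after a rate limit, never on 4xx policy errors."""
    return status == 429
```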

SECURITY

Authentication

API Key Authentication

All requests must include a valid API key. Keys are scoped by team and tracked for usage and quotas.

Authentication Example (Bash)

curl -X POST https://api.mx4.ai/v1/chat/completions \
  -H "Authorization: Bearer mx4_sk_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "falcon-h1-7b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Key Management

• Create multiple keys per team for rotation
• Set expiration dates and IP allowlists
• Monitor usage per key in real-time
• Revoke compromised keys instantly
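
To illustrate how expiration dates and IP allowlists combine, here is a sketch of the kind of check a gateway could run before accepting a key. It is not Atlas's implementation; the function name, the CIDR-style allowlist format, and the sample addresses are assumptions.

```python
from datetime import datetime, timezone
import ipaddress


def key_is_usable(client_ip: str, allowlist: list[str],
                  expires_at: datetime, now: datetime) -> bool:
    """A key is usable only if it has not expired and the caller's IP is allowed."""
    if now >= expires_at:
        return False
    address = ipaddress.ip_address(client_ip)
    # Each allowlist entry is assumed to be a CIDR block such as "10.0.0.0/8".
    return any(address in ipaddress.ip_network(cidr) for cidr in allowlist)


expires = datetime(2026, 12, 31, tzinfo=timezone.utc)
today = datetime(2026, 2, 4, tzinfo=timezone.utc)
allowed = key_is_usable("10.1.2.3", ["10.0.0.0/8"], expires, today)
```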

mTLS (Enterprise)

• Mutual TLS for zero-trust environments
• Certificate-based authentication
• Integrates with internal PKI
• Required for air-gapped deployments

Sovereignty-grade logging

OBSERVABILITY

Infrastructure activity journal

Requests can be logged locally with a cryptographic hash chain and residency evidence.

Append-only

Entries are stored locally with hash-chaining for integrity.

Local retention

You control retention, access, and export on your infrastructure.

{
  "timestamp": "2026-02-04T14:23:45Z",
  "request_id": "req-uuid-12345",
  "actor_id": "user-xyz",
  "action": "chat.completions",
  "model": "falcon-h1-7b",
  "prompt_hash": "sha256:a3f7d2...",
  "routing_policy": "sovereign_enforcement",
  "residency_boundary": "UAE On-Prem",
  "journal_hash": "sha256:8b12c4...",
  "status": "success"
}
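
The idea behind hash-chaining can be shown with a short sketch. The exact scheme Atlas uses is not specified here, so this assumes each entry's `journal_hash` is the SHA-256 of the previous entry's hash concatenated with a canonical JSON encoding of the current entry. The `GENESIS` value and all function names are hypothetical.

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # hypothetical starting value for the first entry


def entry_hash(entry: dict, previous_hash: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = {k: v for k, v in entry.items() if k != "journal_hash"}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((previous_hash + canonical).encode("utf-8")).hexdigest()
    return "sha256:" + digest


def append_entry(journal: list, entry: dict) -> None:
    """Append an entry, linking it to the current tail of the journal."""
    previous = journal[-1]["journal_hash"] if journal else GENESIS
    journal.append({**entry, "journal_hash": entry_hash(entry, previous)})


def verify_journal(journal: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    previous = GENESIS
    for entry in journal:
        if entry.get("journal_hash") != entry_hash(entry, previous):
            return False
        previous = entry["journal_hash"]
    return True
```

Because each hash covers the one before it, changing any stored field in an earlier entry causes verification to fail from that entry onward.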

RATE LIMITS

Rate limits & quotas

Control spend with flexible limits per API key, team, or endpoint.

Plan	Requests/min	Tokens/day	Burst
Development	60	100K	2x for 10s
Professional	600	10M	5x for 30s
Enterprise	Custom	Custom	Custom

Rate Limit Headers

Every response includes headers showing your current limits and remaining quota:

X-RateLimit-Limit: 600

X-RateLimit-Remaining: 542

X-RateLimit-Reset: 1675432800
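
A client can use these headers to decide how long to pause before retrying. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this page does not confirm. Both function names are hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Seconds to wait until the limit resets (never negative)."""
    now = time.time() if now is None else now
    reset_at = float(headers["X-RateLimit-Reset"])  # assumed Unix seconds
    return max(0.0, reset_at - now)


def remaining_fraction(headers: Mapping[str, str]) -> float:
    """Share of the current window's quota that is still available."""
    return int(headers["X-RateLimit-Remaining"]) / int(headers["X-RateLimit-Limit"])


sample_headers = {
    "X-RateLimit-Limit": "600",
    "X-RateLimit-Remaining": "542",
    "X-RateLimit-Reset": "1675432800",
}
```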

WEBHOOKS

Webhooks (beta)

Real-time notifications for async operations, routing rules, and cost thresholds.

Event Types

request.completed

Long-running request finished

routing.rule

Routing rule triggered

cost.threshold

Budget limit approached

model.fallback

Primary model failed, fallback used

Webhook Payload

Example Payload (JSON)

{
  "event": "guardrail.violation",
  "timestamp": "2026-02-04T14:30:00Z",
  "request_id": "req_abc123",
  "details": {
    "violation_type": "pii_detected",
    "entities": ["email", "phone"],
    "action_taken": "request_blocked"
  },
  "metadata": {
    "team_id": "team_xyz",
    "user_id": "user_789"
  }
}
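
A receiving service might parse the payload and branch on the `event` field. The sketch below only uses field names that appear in the example payload and the event types listed above; the function name and the summary format are hypothetical, and no signature verification is shown because none is documented here.

```python
import json

KNOWN_EVENTS = {"request.completed", "routing.rule", "cost.threshold", "model.fallback"}


def summarize_event(raw_body: str) -> str:
    """Turn a webhook body into a one-line summary suitable for logging."""
    event = json.loads(raw_body)
    kind = event.get("event", "unknown")
    request_id = event.get("request_id", "unknown")
    details = event.get("details", {})
    if kind == "guardrail.violation":
        entities = ", ".join(details.get("entities", []))
        violation = details.get("violation_type")
        action = details.get("action_taken")
        return f"{request_id}: {violation} ({entities}) -> {action}"
    if kind in KNOWN_EVENTS:
        return f"{request_id}: {kind}"
    return f"{request_id}: unhandled event {kind}"
```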

VERSIONING

Versioning & changelog

Current Version: v1.0

Atlas follows semantic versioning. Minor updates are backwards compatible; major versions require migration.

All core endpoints are stable and production-ready; Webhooks are in beta.
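
The versioning rule can be expressed as a small check: an upgrade needs a migration only when the major number increases. This is a sketch of that rule, with hypothetical function names, not a tool shipped with Atlas.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'v1.0.3' or 'v1.0' into (major, minor, patch), padding with zeros."""
    parts = version.lstrip("v").split(".")
    numbers = [int(part) for part in parts] + [0, 0, 0]
    return numbers[0], numbers[1], numbers[2]


def requires_migration(current: str, target: str) -> bool:
    """Only a major-version increase is treated as a breaking upgrade."""
    return parse_version(target)[0] > parse_version(current)[0]
```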

Recent Changes

v1.0.3

2026-02-01

• Added Webhooks beta
• Improved activity journal export
• Fixed routing preference header edge case

v1.0.2

2026-01-15

• New models: Qwen 2.5 72B
• Streaming support for RAG endpoints
• Performance improvements

v1.0.1

2026-01-01

• Launch: Chat completions API
• Document ingestion pipeline
• mTLS authentication

FAQ

API FAQs

Common questions about integrating with Atlas.

Can I use the OpenAI Python library directly?

Yes! Just change `base_url` to point at your Atlas instance and use your Atlas API key. Our API is 100% compatible with OpenAI's chat completions format.

How do I handle streaming responses?

Set `stream: true` in your request. Atlas returns Server-Sent Events (SSE) compatible with OpenAI's streaming format.
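
If you consume the stream without an SDK, the events can be parsed by hand. The sketch below assumes the stream follows OpenAI's chat streaming format: each event is a `data:` line carrying a JSON chunk with `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments in order reconstructs the full assistant message.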

What happens if my quota is exceeded?

You'll receive a 429 error with an `X-RateLimit-Reset` header indicating when your quota resets. Burst allowances depend on your plan.

Can I test the API without deploying on-prem?

Yes. We offer a managed sandbox environment for development and testing. Contact sales for access.

How are embeddings priced differently from chat?

Embeddings are charged per million tokens at 1/10th the cost of chat completions for the same model size.
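
As a worked example of that ratio, the sketch below computes an embedding bill from a chat price. The chat price used in the usage line is a made-up figure for illustration, not an Atlas rate, and the function name is hypothetical.

```python
def embedding_cost(tokens: int, chat_price_per_million: float) -> float:
    """Cost of embedding `tokens` tokens at one tenth of the chat price."""
    embedding_price_per_million = chat_price_per_million / 10
    return tokens / 1_000_000 * embedding_price_per_million


# Illustrative only: 2M tokens with a hypothetical chat price of 5.0 per million.
cost = embedding_cost(2_000_000, 5.0)
```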

SUPPORT

Need help getting started?

Hands-on support for integration, migration, and optimization.

📚