OpenAI-compatible API gateway for sovereign models. Route, secure, and observe every request from one endpoint.
POST /v1/chat/completions HTTP/1.1
Host: api.internal.mx4
Authorization: Bearer mx4_sk_...
Content-Type: application/json

{
  "model": "auto-router",
  "messages": [{"role": "user", "content": "..."}]
}
CORE API
Routing and observability baked in for enterprise deployments.
One endpoint for Falcon, Jais, Llama, Qwen, and your fine-tuned checkpoints.
Per-team budgets, token caps, and attribution you can review.
Local infrastructure activity journal with residency evidence.
On-prem inference next to your data, not across oceans.
GET STARTED
Your first request in under five minutes.
Create team-scoped keys, rotate on schedule, and set IP allowlists.
Use the OpenAI SDKs and switch `baseURL` to your Atlas endpoint.
Pick a model or let routing handle cost, latency, and sovereignty.
INTEGRATION
Switch in minutes. Point `baseURL` to Atlas and keep the same OpenAI tooling.
import OpenAI from 'openai';
// 1) Point the OpenAI client to Atlas
const client = new OpenAI({
apiKey: 'mx4_sk_live_...',
baseURL: 'https://api.internal.mx4/v1',
});
// 2) Use standard Chat Completions
const completion = await client.chat.completions.create({
model: 'mx4-atlas-core',
messages: [
{ role: 'user', content: 'لخص هذا التقرير المالي' }
],
// Optional Atlas routing hints
extra_body: {
routing_preference: 'cost', // cost | performance | balanced
}
});
console.log(completion.choices[0].message);
MODELS
| Model | Sizes | Specialty |
|---|---|---|
| Falcon | 7B, 40B | Arabic Native |
| JAIS | 13B | Arabic Native |
| Llama 3.1 | 8B, 70B | English / Code |
| Qwen-1.5 | 14B, 72B | Multilingual |
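The same endpoint also supports `stream: true`, returning Server-Sent Events in OpenAI's streaming format (see the FAQ). As a minimal sketch, here is how content deltas could be extracted from such a stream; the chunk shape follows OpenAI's published streaming format, and real code would read lines incrementally from the HTTP response rather than from a string:

```typescript
// Extract assistant text deltas from OpenAI-style SSE lines.
function contentDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split('\n')) {
    // Each event line is "data: <json>"; the stream ends with "data: [DONE]".
    if (!line.startsWith('data: ') || line === 'data: [DONE]') continue;
    const chunk = JSON.parse(line.slice('data: '.length));
    const delta = chunk.choices?.[0]?.delta?.content;
    if (typeof delta === 'string') deltas.push(delta);
  }
  return deltas;
}
```

Concatenating the returned deltas reproduces the full assistant message.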
DEVELOPER TOOLS
Start fast with officially supported client libraries.
pip install mx4-atlas (v1.2.4)
npm install @mx4/atlas (v1.1.0)
go get github.com/mx4/atlas-go (v0.9.8)
maven: com.mx4.atlas (v1.0.1)
ROUTING
Rule-driven routing across cost, performance, language, and residency requirements.
Route simple queries to smaller, cheaper models like Llama 3 8B.
Ensure sensitive data stays on-prem by forcing local routes.
Automatically route Arabic prompts to Jais or Falcon-7B.
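To make the rule semantics concrete before the YAML example, here is a minimal sketch of first-match routing. The request fields (`language`, `promptTokens`) and the first-match evaluation order are assumptions for illustration, not Atlas's documented internals:

```typescript
// A routing rule: a name, a predicate over the request, and a target model.
interface RouteRequest { language: string; promptTokens: number; }
interface Rule {
  name: string;
  condition: (req: RouteRequest) => boolean;
  model: string;
}

const rules: Rule[] = [
  // Rule 1: route Arabic to a specialized model
  { name: 'arabic-native', condition: (r) => r.language === 'ar', model: 'jais-13b' },
  // Rule 2: cost optimization for short prompts
  { name: 'fast-path', condition: (r) => r.promptTokens < 100, model: 'llama-3-8b-quantized' },
];

// Return the first matching rule's model, or a fallback default.
function route(req: RouteRequest, fallback = 'mx4-atlas-core'): string {
  for (const rule of rules) {
    if (rule.condition(req)) return rule.model;
  }
  return fallback;
}
```

An Arabic prompt would match the first rule regardless of length; a short English prompt would take the fast path; everything else falls through to the default.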
routes:
# Rule 1: Route Arabic to specialized model
- name: "arabic-native"
condition: "language == 'ar'"
model: "jais-13b"
# Rule 2: Cost optimization for simple queries
- name: "fast-path"
condition: "prompt_tokens < 100"
model: "llama-3-8b-quantized"
API REFERENCE
Production-ready endpoints with OpenAI-compatible semantics.
| Header | Description |
|---|---|
| X-Sovereign-ID | Unique identifier for the sovereign enclave processing the request. |
| X-Audit-Trace-ID | Trace ID linking the request to the infrastructure activity journal. |
| X-Route-Policy | Routing policy applied (e.g., local-only, latency-optimized). |
| X-Data-Residency | Confirmed location of data processing (e.g., "KSA-Riyadh-Local"). |
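For audit logging on the client side, you might collect these headers from each response. A small sketch, assuming headers arrive as a lowercased name-to-value record (as Node's HTTP layer provides them):

```typescript
// Pick out Atlas sovereignty headers (names from the table above) for client-side audit logs.
const SOVEREIGN_HEADERS = new Set([
  'x-sovereign-id',
  'x-audit-trace-id',
  'x-route-policy',
  'x-data-residency',
]);

function sovereigntyInfo(headers: Record<string, string>): Record<string, string> {
  const info: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    const key = name.toLowerCase();
    if (SOVEREIGN_HEADERS.has(key)) info[key] = value;
  }
  return info;
}
```

Recording `x-audit-trace-id` alongside your own request IDs makes it easy to cross-reference the infrastructure activity journal later.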
| Code | Meaning |
|---|---|
| 400 Bad Request | Invalid input or malformed JSON. |
| 401 Unauthorized | Invalid or missing API key. |
| 403 Forbidden | Insufficient permissions (RBAC) or data sovereignty rule violation. |
| 429 Too Many Requests | Rate limit exceeded (per-key or per-IP). |
| 451 Unavailable For Legal Reasons | Blocked by Sovereignty Guard (e.g., data residency violation). |
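A client typically splits these codes into retryable and terminal failures. This is a sketch of one reasonable policy, not an Atlas-mandated one:

```typescript
// Classify Atlas HTTP status codes (from the table above) for retry logic.
function isRetryable(status: number): boolean {
  // 429 is retryable after backing off; 5xx are transient server errors.
  if (status === 429) return true;
  if (status >= 500) return true;
  // 400/401/403/451 indicate a request, credential, or policy problem;
  // retrying the same request will not help.
  return false;
}
```

Note that 451 (Sovereignty Guard) is deliberately terminal here: resending the same payload would violate the same residency rule.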
SECURITY
All requests must include a valid API key. Keys are scoped by team and tracked for usage and quotas.
curl -X POST https://api.mx4.ai/v1/chat/completions \
-H "Authorization: Bearer mx4_sk_live_abc123..." \
-H "Content-Type: application/json" \
-d '{
"model": "falcon-h1-7b",
"messages": [{"role": "user", "content": "Hello"}]
}'
OBSERVABILITY
Requests can be logged to a local journal with residency evidence. Entries are hash-chained for integrity, and you control retention, access, and export on your own infrastructure.
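Hash-chaining means each entry's hash commits to the entry before it, so tampering or reordering is detectable. The exact chaining scheme is Atlas-defined; as an illustration only, this sketch assumes `journal_hash = sha256(previous journal_hash + this entry's prompt_hash)`:

```typescript
import { createHash } from 'node:crypto';

interface JournalEntry { prompt_hash: string; journal_hash: string; }

function sha256(data: string): string {
  return 'sha256:' + createHash('sha256').update(data).digest('hex');
}

// Walk the journal, recomputing each link; any mismatch means tampering or reordering.
function verifyChain(entries: JournalEntry[], genesis = 'sha256:0'): boolean {
  let prev = genesis;
  for (const e of entries) {
    if (e.journal_hash !== sha256(prev + e.prompt_hash)) return false;
    prev = e.journal_hash;
  }
  return true;
}

// Build a valid two-entry chain for demonstration.
const e1 = { prompt_hash: 'sha256:aaa', journal_hash: sha256('sha256:0' + 'sha256:aaa') };
const e2 = { prompt_hash: 'sha256:bbb', journal_hash: sha256(e1.journal_hash + 'sha256:bbb') };
```

Swapping `e1` and `e2` (or editing either hash) makes `verifyChain` fail, which is exactly the property an auditor relies on.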
{
"timestamp": "2026-02-04T14:23:45Z",
"request_id": "req-uuid-12345",
"actor_id": "user-xyz",
"action": "chat.completions",
"model": "falcon-h1-7b",
"prompt_hash": "sha256:a3f7d2...",
"routing_policy": "sovereign_enforcement",
"residency_boundary": "UAE On-Prem",
"journal_hash": "sha256:8b12c4...",
"status": "success"
}
RATE LIMITS
Control spend with flexible limits per API key, team, or endpoint.
| Plan | Requests/min | Tokens/day | Burst |
|---|---|---|---|
| Development | 60 | 100K | 2x for 10s |
| Professional | 600 | 10M | 5x for 30s |
| Enterprise | Custom | Custom | Custom |
Every response includes headers showing your current limits and remaining quota.
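A client can use those headers to back off before retrying. `X-RateLimit-Reset` appears in the FAQ below; `X-RateLimit-Remaining` is an assumed companion header here, named after the common convention. Keys are assumed lowercased, as Node's HTTP layer delivers them:

```typescript
// Compute how long to wait (in seconds) before retrying, from rate-limit headers.
function backoffSeconds(headers: Record<string, string>, nowEpochSec: number): number {
  const remaining = Number(headers['x-ratelimit-remaining'] ?? '1');
  if (remaining > 0) return 0; // quota left, no need to wait
  // Assumes X-RateLimit-Reset is an epoch-seconds timestamp for the window reset.
  const reset = Number(headers['x-ratelimit-reset'] ?? String(nowEpochSec));
  return Math.max(0, reset - nowEpochSec);
}
```

Pairing this with the burst allowances in the table above lets a client ride out short spikes without dropping requests.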
WEBHOOKS
Real-time notifications for async operations, routing rules, and cost thresholds.
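Before acting on a webhook, a receiver should verify that it really came from Atlas. The signing scheme is not specified in this document, so this sketch assumes a common pattern: an HMAC-SHA256 of the raw body, sent hex-encoded in a hypothetical signature header:

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Verify an HMAC-SHA256 webhook signature (assumed scheme; header name hypothetical).
function verifyWebhook(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const given = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws if lengths differ, so check length first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

// Demonstration values (the secret here is made up).
const demoBody = '{"event":"guardrail.violation"}';
const demoSig = createHmac('sha256', 'whsec_demo').update(demoBody).digest('hex');
```

Using `timingSafeEqual` rather than `===` avoids leaking signature bytes through timing differences.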
{
"event": "guardrail.violation",
"timestamp": "2026-02-04T14:30:00Z",
"request_id": "req_abc123",
"details": {
"violation_type": "pii_detected",
"entities": ["email", "phone"],
"action_taken": "request_blocked"
},
"metadata": {
"team_id": "team_xyz",
"user_id": "user_789"
}
}
VERSIONING
Atlas follows semantic versioning. Minor updates are backwards compatible; major versions require migration.
FAQ
Common questions about integrating with Atlas.
Can I use my existing OpenAI code with Atlas?
Yes. Just change the base_url to your Atlas instance and use your Atlas API key. Our API is 100% compatible with OpenAI's chat completions format.
How do I stream responses?
Set stream: true in your request. Atlas returns Server-Sent Events (SSE) compatible with OpenAI's streaming format.
What happens when I hit a rate limit?
You'll receive a 429 error with an X-RateLimit-Reset header indicating when your quota resets. Enterprise plans include burst allowances.
Is there a sandbox for testing?
Yes. We offer a managed sandbox environment for development and testing. Contact sales for access.
How are embeddings billed?
Embeddings are charged per million tokens at 1/10th the cost of chat completions for the same model size.
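That pricing rule is easy to apply directly. A tiny sketch (the chat price is an input; no actual prices are stated in this document):

```typescript
// Embedding cost = (tokens / 1M) * (chat price per 1M tokens / 10), per the rule above.
function embeddingCostUSD(tokens: number, chatPricePerMillionTokens: number): number {
  return (tokens / 1_000_000) * (chatPricePerMillionTokens / 10);
}
```

For example, embedding one million tokens against a model whose chat rate is $5 per million tokens would cost $0.50.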
SUPPORT
Hands-on support for integration, migration, and optimization.