
Sovereign Architecture

Deep dive into the Atlas Runtime environment and how we guarantee data sovereignty at every layer.

Last updated on February 2, 2026

Design Principles

Atlas is built on four non-negotiable principles that inform every architectural decision—from network topology to GPU scheduling.

Zero Data Egress

No data leaves the deployment boundary by default. Outbound connections are customer‑controlled.

Defense in Depth

mTLS, RBAC, and infrastructure activity journaling at every layer — not just the perimeter.

Isolation by Design

Isolation is enforced by deployment topology and configuration to prevent cross‑tenant exposure.

Operational Visibility

Infrastructure activity journaling provides operational visibility and integrity.

The Zero-Trust Enclave

Unlike traditional API providers that process data in a shared, multi-tenant public cloud, MX4 Atlas is built on a "Sovereign Enclave" architecture. The entire inference stack—from the load balancer to the GPU memory—runs within a strictly defined security boundary.

Data Flow Diagram

Every request passes through five security layers before reaching the model

Client → Gateway → Runtime → GPU Cluster → Storage

  • Client to Gateway: mTLS + signed requests
  • Gateway, Runtime, and GPU Cluster operate inside the air-gapped boundary
  • Storage: encrypted activity store, integrity‑protected

Security layers, in order:
  ① API Key Validation & RBAC
  ② Residency Router
  ③ Rate Limiter & Routing Rules
  ④ Activity Journal
  ⑤ Model Inference (GPU Cluster)
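
To make the first hop concrete, here is a minimal client-side sketch in Python, assuming an HMAC-signed request body; the endpoint URL, certificate paths, secret, and the X-Atlas-Signature header name are illustrative, not part of the documented API.

    # Minimal sketch of an mTLS call with an HMAC-signed request body. The URL,
    # certificate paths, secret, and the X-Atlas-Signature header are hypothetical;
    # the actual wire format is defined by your gateway configuration.
    import hashlib
    import hmac
    import json

    import requests

    GATEWAY_URL = "https://atlas.internal.example/v1/chat/completions"  # hypothetical endpoint
    CLIENT_CERT = ("/etc/atlas/client.crt", "/etc/atlas/client.key")    # client cert + key for mTLS
    CA_BUNDLE = "/etc/atlas/enclave-ca.pem"                             # private CA for the enclave
    SIGNING_KEY = b"replace-with-a-per-tenant-secret"

    payload = json.dumps(
        {"model": "local-model", "messages": [{"role": "user", "content": "Hello"}]}
    ).encode()

    # Sign the exact bytes that go on the wire so the gateway can verify integrity.
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

    resp = requests.post(
        GATEWAY_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer <api-key>",
            "X-Atlas-Signature": signature,  # hypothetical header name
        },
        cert=CLIENT_CERT,  # present the client certificate (mutual TLS)
        verify=CA_BUNDLE,  # trust only the enclave's private CA
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json())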

Runtime Components

The Atlas Runtime is a self-contained inference engine. These are the key subsystems that handle every request.

API Gateway

Varies by deployment

Terminates mTLS, validates API keys, enforces tenant-level RBAC policies

Stack: Envoy proxy with custom auth filter
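
As a rough illustration of the two checks named above (API key validation and tenant-level RBAC), the sketch below shows the logic in Python; the key store, roles, and permission names are assumptions, not the actual Envoy auth filter.

    # Illustrative auth check: API key validation followed by tenant-level RBAC.
    # The key store, roles, and permission names are hypothetical.
    import hashlib
    import hmac

    API_KEYS = {  # sha256(api_key) -> (tenant_id, role)
        hashlib.sha256(b"demo-key").hexdigest(): ("tenant-a", "analyst"),
    }
    ROLE_PERMISSIONS = {  # role -> allowed actions
        "analyst": {"inference:invoke"},
        "admin": {"inference:invoke", "journal:read", "models:deploy"},
    }

    def authorize(api_key: str, action: str) -> str:
        """Return the tenant id if the key is valid and its role permits the action."""
        key_hash = hashlib.sha256(api_key.encode()).hexdigest()
        for stored_hash, (tenant, role) in API_KEYS.items():
            # Constant-time comparison avoids leaking key material via timing.
            if hmac.compare_digest(stored_hash, key_hash):
                if action in ROLE_PERMISSIONS.get(role, set()):
                    return tenant
                raise PermissionError(f"role '{role}' may not perform '{action}'")
        raise PermissionError("unknown API key")

    # The gateway would run a check like this before forwarding to the runtime.
    print(authorize("demo-key", "inference:invoke"))  # -> tenant-a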

Residency Router

Varies by routing rules

Routes sensitive workflows to local models to keep data within the boundary

Stack: Regex + NER model pipeline with Arabic morphology support
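
A simplified picture of the routing decision, with made-up regex patterns standing in for the full regex + NER pipeline; the patterns and backend names are assumptions for illustration only.

    # Toy residency router: pattern hits keep the request on the local model.
    # The patterns and backend names are illustrative; the real pipeline combines
    # regexes with an NER model (including Arabic morphology support).
    import re

    SENSITIVE_PATTERNS = [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # SSN-like identifier
        re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),  # IBAN-like identifier
        re.compile(r"\b\d{16}\b"),                        # bare card-number-like digits
    ]

    def route(prompt: str) -> str:
        """Return the backend a prompt should be served from."""
        if any(p.search(prompt) for p in SENSITIVE_PATTERNS):
            return "local-sensitive-model"  # pinned inside the deployment boundary
        return "default-local-model"        # also local; egress is never implied

    print(route("Transfer to IBAN SA4420000001234567891234"))  # -> local-sensitive-model
    print(route("Summarize this public press release"))        # -> default-local-model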

Routing Layer

Configurable

Residency, rate limits, and workload isolation rules

Stack: Classifier ensemble + rule engine
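
One way to picture the rule side of this layer is a per-tenant token bucket; the limits below are placeholder values, not Atlas defaults.

    # Token-bucket rate limiting per tenant, as one example of a routing-layer rule.
    # Bucket sizes and refill rates are placeholders, not Atlas defaults.
    import time

    class TokenBucket:
        def __init__(self, capacity: float, refill_per_sec: float):
            self.capacity = capacity
            self.refill_per_sec = refill_per_sec
            self.tokens = capacity  # start with a full burst budget
            self.last = time.monotonic()

        def allow(self) -> bool:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.refill_per_sec)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False

    buckets = {"tenant-a": TokenBucket(capacity=20, refill_per_sec=5)}
    print(buckets["tenant-a"].allow())  # True until the burst budget is spent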

Inference Engine

Depends on model and hardware

Model loading, batching, KV-cache management, and GPU scheduling across multi-GPU clusters

Stack: vLLM-based serving with continuous batching
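
A minimal sketch of local serving through the vLLM Python API, assuming weights already staged on local disk; the model path and sampling values are placeholders, and the production serving loop is managed by the runtime rather than a script like this.

    # Minimal offline-inference sketch with the vLLM Python API. The model path
    # and sampling values are placeholders; the Atlas runtime wraps this in its
    # own scheduling, batching, and journaling layers.
    from vllm import LLM, SamplingParams

    # Load weights from local storage (no network pull in an air-gapped setup).
    llm = LLM(model="/models/llama-3-8b-instruct", tensor_parallel_size=1)
    params = SamplingParams(temperature=0.2, max_tokens=256)

    prompts = [
        "Summarize the attached procurement policy in three bullet points.",
        "Translate 'data sovereignty' into Arabic.",
    ]

    # vLLM batches these prompts internally (continuous batching on the GPU).
    for output in llm.generate(prompts, params):
        print(output.outputs[0].text)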

Activity Journal

Async, non‑blocking

Writes signed activity records for infrastructure events

Stack: Append-only log with integrity checks
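
A sketch of how signed, append-only records can be made tamper-evident: each entry carries an HMAC over its content plus the previous entry's signature, forming a hash chain that breaks if any record is edited or removed. The file path and the signing-key source are assumptions.

    # Hash-chained, HMAC-signed activity journal sketch (tamper-evident, append-only).
    # The journal path and the signing-key source are illustrative assumptions.
    import hashlib
    import hmac
    import json
    import os
    import time

    SIGNING_KEY = b"load-from-your-kms-or-hsm"  # assumption: key material comes from your KMS/HSM
    JOURNAL = "atlas-activity.jsonl"            # assumption: local append-only file

    def append_event(event: dict, prev_sig: str) -> str:
        """Append one signed record; return its signature for chaining the next one."""
        record = {"ts": time.time(), "event": event, "prev": prev_sig}
        payload = json.dumps(record, sort_keys=True).encode()
        sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
        with open(JOURNAL, "a") as f:
            f.write(json.dumps({"record": record, "sig": sig}) + "\n")
        return sig

    def verify() -> bool:
        """Recompute every signature and the chain; any edit or deletion breaks it."""
        prev = "genesis"
        with open(JOURNAL) as f:
            for line in f:
                entry = json.loads(line)
                payload = json.dumps(entry["record"], sort_keys=True).encode()
                expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
                if entry["record"]["prev"] != prev or not hmac.compare_digest(expected, entry["sig"]):
                    return False
                prev = entry["sig"]
        return True

    if os.path.exists(JOURNAL):
        os.remove(JOURNAL)  # start a fresh chain for this demonstration
    sig = append_event({"type": "inference.request", "tenant": "tenant-a"}, "genesis")
    append_event({"type": "inference.complete", "tenant": "tenant-a"}, sig)
    print(verify())  # True unless the journal has been altered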

Deployment Models

Atlas deploys exclusively on customer infrastructure. Choose a deployment model based on your connectivity and sovereignty requirements.

Private Cloud

Deploy Atlas Runtime into your existing VPC (AWS, Azure, GCP, Oracle). Full stack on your infrastructure, managed by your team.

Your VPC · Zero Peering · Full Control
  • Terraform / Helm deployment for your VPC
  • Auto-scaling based on GPU utilization
  • Integrated with your existing IAM & secrets management

Air-Gapped

Physical deployment on your own hardware with zero internet connectivity. Maximum sovereignty for highly restricted environments.

On-Premise · No Egress · Complete Isolation
  • USB-based model distribution & updates
  • Optional HSM key storage
  • Offline license validation available (sketched below)
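
The offline license check could work along these lines: the runtime verifies an Ed25519 signature over the license file against a public key it ships with, so no network call is needed. The license fields and the choice of the 'cryptography' library are assumptions for illustration, not the actual Atlas license format.

    # Offline license validation sketch: an Ed25519 signature over the license
    # payload is checked against a public key shipped with the runtime, with no
    # network access. The license fields and the 'cryptography' library choice
    # are illustrative assumptions.
    import json
    import time

    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey,
        Ed25519PublicKey,
    )
    from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

    # Issuer side (vendor); shown only so the sketch runs end to end.
    issuer_key = Ed25519PrivateKey.generate()
    license_payload = json.dumps(
        {"customer": "tenant-a", "expires": time.time() + 365 * 86400, "seats": 8},
        sort_keys=True,
    ).encode()
    signature = issuer_key.sign(license_payload)
    embedded_public_key = issuer_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

    # Runtime side: runs fully offline.
    def validate_license(payload: bytes, sig: bytes, pub: bytes) -> bool:
        """Verify the signature and the expiry without any outbound connection."""
        try:
            Ed25519PublicKey.from_public_bytes(pub).verify(sig, payload)
        except Exception:
            return False
        return json.loads(payload)["expires"] > time.time()

    print(validate_license(license_payload, signature, embedded_public_key))  # True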

Atlas vs. Traditional API Providers

How a sovereign architecture differs from mainstream LLM API services.

Characteristic         | MX4 Atlas                       | Typical Cloud API
-----------------------|---------------------------------|-------------------------
Data residency         | Customer‑controlled             | Provider region
Tenancy                | Single‑tenant or dedicated GPU  | Multi‑tenant shared
Network egress         | Not required by default         | Required for every call
Activity journal       | Local append‑only journal       | Provider‑managed logs
Data boundary control  | Local routing policies          | Sent to cloud
Model updates          | Customer‑approved               | Provider‑managed
Deployment control     | Customer‑owned stack            | Provider‑managed

Data Sovereignty Guarantees

Atlas enforces data sovereignty at the infrastructure level — your data stays within your deployment boundary by default.

  • Data Residency: Processing occurs on your infrastructure — cloud, on‑prem, or air‑gapped — based on your residency requirements.
  • Infrastructure Activity Journal: Activity records can be signed to provide operational visibility and integrity.
  • Zero External Calls: No outbound connections are required by default; egress is customer‑controlled.
  • Encryption at Every Layer: Encryption in transit and at rest, with key management options aligned to your security posture.
  • No Training on Your Data: Inference data is not used for model improvement; retention is controlled by your configuration.

Recommended hardware

Hardware sizing depends on model size, throughput targets, and deployment mode. We provide a sizing guide during pilots to align GPU, CPU, memory, and storage to your workloads.

  • Dedicated GPU capacity for inference and batching
  • High‑throughput storage for model artifacts and logs
  • Network tuned to your deployment topology

What Atlas Does Not Do

Clarity about boundaries is as important as feature lists.

  • Telemetry is off by default; any outbound reporting requires customer approval.
  • Model updates require explicit customer approval.
  • Isolation is enforced by deployment topology; dedicated resources are recommended for strict isolation.
  • Prompt and completion retention is controlled by your configuration.
  • Air‑gapped mode is supported; internet is not required for inference.