
Sovereign Architecture

Deep dive into the Atlas Runtime environment and how we guarantee data sovereignty at every layer.

Last updated on February 2, 2026

Design Principles

Atlas is built on four non-negotiable principles that inform every architectural decision—from network topology to GPU scheduling.

Zero Data Egress

No data leaves the deployment boundary. Ever. Not for telemetry, not for error reporting, not for model improvement.

Defense in Depth

mTLS, RBAC, and infrastructure activity journaling at every layer—not just the perimeter.

Hardware Isolation

Dedicated GPU memory per tenant. No shared caches, no shared VRAM, no cross-tenant inference.

Operational Visibility

Cryptographically signed activity journal entries give operators a verifiable record of infrastructure events.

The Zero-Trust Enclave

Unlike traditional API providers that process data in a shared, multi-tenant public cloud, MX4 Atlas is built on a "Sovereign Enclave" architecture. The entire inference stack—from the load balancer to the GPU memory—runs within a strictly defined security boundary.

Data Flow Diagram

Every request passes through five security layers before reaching the model

  • Client → Gateway: mTLS (TLS 1.3) with HMAC request signing
  • Runtime (inside the air-gapped boundary):
    ① API Key Validation & RBAC
    ② Residency Router
    ③ Rate Limiter & Guardrails
    ④ Activity Journal
    ⑤ Model Inference (GPU) on the H100 cluster
  • Storage: Encrypted Activity Store (AES-256-GCM), tamper-proof
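To make the first hop concrete, the sketch below shows how a client might sign a request with HMAC and present a client certificate for mTLS. The header names, signing scheme, and certificate paths are illustrative assumptions, not the documented Atlas wire format.

```python
# Illustrative client-side request signing; header names, the signing scheme,
# and the certificate paths are assumptions, not the Atlas wire format.
import hashlib
import hmac
import time

import requests

API_KEY = "atlas_example_key"               # hypothetical tenant API key
HMAC_SECRET = b"per-tenant-shared-secret"   # hypothetical signing secret


def signed_post(url: str, body: bytes) -> requests.Response:
    # Sign timestamp + body so the gateway can detect tampering and replays.
    timestamp = str(int(time.time()))
    signature = hmac.new(HMAC_SECRET, timestamp.encode() + body, hashlib.sha256).hexdigest()
    return requests.post(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "X-Timestamp": timestamp,
            "X-Signature": signature,
        },
        cert=("client.crt", "client.key"),  # client certificate for mTLS
        verify="enclave-ca.pem",            # pin the enclave's CA bundle
    )
```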

Runtime Components

The Atlas Runtime is a self-contained inference engine. These are the key subsystems that handle every request.

API Gateway

<2ms overhead

Terminates mTLS, validates API keys, enforces tenant-level RBAC policies

Stack: Envoy proxy with custom auth filter
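As a rough illustration of the tenant-level RBAC step, the sketch below checks an API key's roles against a route policy. The role names and policy shape are assumptions; in production this logic lives in the Envoy auth filter, not in Python.

```python
# Minimal RBAC sketch: the policy shape and role names are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiKeyRecord:
    tenant_id: str
    roles: frozenset[str]


# Hypothetical in-memory policy: which roles may call which routes.
ROLE_POLICY = {
    "inference:user": {"/v1/completions", "/v1/chat"},
    "tenant:admin": {"/v1/completions", "/v1/chat", "/v1/admin"},
}


def authorize(key_record: ApiKeyRecord, path: str) -> bool:
    """Return True if any of the key's roles grants access to the route."""
    return any(path in ROLE_POLICY.get(role, set()) for role in key_record.roles)


# Example: an inference-only key may not hit admin routes.
key = ApiKeyRecord(tenant_id="acme", roles=frozenset({"inference:user"}))
assert authorize(key, "/v1/chat")
assert not authorize(key, "/v1/admin")
```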

Residency Router

~5ms per request

Routes sensitive workflows to local models to keep data within the boundary

Stack: Regex + NER model pipeline with Arabic morphology support
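For intuition, a stripped-down routing decision might look like the sketch below, combining regex matches with NER labels. The patterns, entity labels, and routing targets are placeholders; the production pipeline (including Arabic morphology handling) is considerably more involved.

```python
# Illustrative routing sketch: patterns, entity labels, and targets are assumptions.
import re

# Hypothetical patterns for identifiers that must never leave the boundary.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{4}-\d{7}\b"),   # a national-ID-like format
    re.compile(r"\b[A-Z]{2}\d{20,22}\b"),   # an IBAN-like format
]


def route(prompt: str, ner_entities: list[str]) -> str:
    """Return 'local' if the prompt looks sensitive, else 'default'.

    `ner_entities` stands in for labels produced by the NER stage
    (e.g. PERSON, NATIONAL_ID); the label set here is assumed.
    """
    if any(p.search(prompt) for p in SENSITIVE_PATTERNS):
        return "local"
    if {"PERSON", "NATIONAL_ID"} & set(ner_entities):
        return "local"
    return "default"


print(route("Transfer 500 to SA4420000001234567891234", ner_entities=[]))  # -> local
```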

Guardrail Layer

~8ms per request

Enforces per-tenant rate limits, guardrail policies, and workload isolation rules

Stack: Classifier ensemble + rule engine
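As a sketch of the rate-limiting half of this layer, here is a simple per-tenant token bucket; the limits and in-memory storage are assumptions, and the real layer also combines classifier verdicts with its rule engine.

```python
# Per-tenant token-bucket rate limiter sketch; limits and storage are assumptions.
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    rate: float          # tokens added per second
    capacity: float      # burst size
    tokens: float = 0.0
    updated: float = field(default_factory=time.monotonic)

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


buckets: dict[str, TokenBucket] = {}


def check_rate_limit(tenant_id: str, rate: float = 10.0, burst: float = 20.0) -> bool:
    bucket = buckets.setdefault(tenant_id, TokenBucket(rate=rate, capacity=burst, tokens=burst))
    return bucket.allow()
```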

Inference Engine

TTFT <200ms (Atlas Core, 8B)

Model loading, batching, KV-cache management, and GPU scheduling across multi-GPU clusters

Stack: vLLM-based serving with continuous batching
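For readers unfamiliar with vLLM, a minimal offline-serving example is shown below. The model path, parallelism setting, and sampling parameters are placeholders for locally stored weights, not the Atlas production configuration.

```python
# Minimal vLLM usage sketch; model path and settings are placeholders only.
from vllm import LLM, SamplingParams

# Load weights from local storage; no network access is needed at inference time.
llm = LLM(model="/models/atlas-core-8b", tensor_parallel_size=2)  # adjust to your GPUs

params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Summarize the data residency policy."], params)

for output in outputs:
    print(output.outputs[0].text)
```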

Activity Journal

Asynchronous; adds no latency to the inference path

Writes cryptographically signed activity records for infrastructure events

Stack: Append-only log with SHA-256 chain
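The hash-chaining idea is simple to sketch: each record's digest covers the previous record's digest, so any in-place edit breaks every later link. The record fields below are simplified assumptions, and the signing step the real journal applies is omitted.

```python
# Simplified SHA-256 hash-chain sketch; record fields are assumptions and the
# per-record signature used in production is omitted here.
import hashlib
import json
import time


class ActivityJournal:
    def __init__(self) -> None:
        self.entries: list[dict] = []
        self.last_digest = "0" * 64  # genesis value

    def append(self, event: dict) -> dict:
        record = {"timestamp": time.time(), "event": event, "prev": self.last_digest}
        payload = json.dumps(record, sort_keys=True).encode()
        record["digest"] = hashlib.sha256(payload).hexdigest()
        self.last_digest = record["digest"]
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered record breaks it."""
        prev = "0" * 64
        for record in self.entries:
            body = {k: v for k, v in record.items() if k != "digest"}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if record["prev"] != prev or record["digest"] != expected:
                return False
            prev = record["digest"]
        return True
```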

Deployment Models

Atlas deploys exclusively on customer infrastructure. Choose a deployment model based on your connectivity and sovereignty requirements.

Private Cloud

Deploy Atlas Runtime into your existing VPC (AWS, Azure, GCP, Oracle). Full stack on your infrastructure, managed by your team.

Your VPC · Zero Peering · Full Control
  • Terraform / Helm deployment in < 2 hours
  • Auto-scaling based on GPU utilization
  • Integrated with your existing IAM & secrets management

Air-Gapped

Physical deployment on your own hardware with zero internet connectivity. Maximum sovereignty for classified environments.

On-Premise · No Egress · Complete Isolation
  • USB-based model distribution & updates
  • Hardware Security Module (HSM) key storage
  • Offline license validation via signed tokens
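The last point can be illustrated with a small offline verification routine using a vendor public key embedded in the runtime image. The token layout, claim names, and the Ed25519 choice are assumptions made for this sketch, not the actual Atlas license format.

```python
# Offline license-validation sketch; the token layout and Ed25519 choice are
# assumptions, not the actual Atlas license format.
import base64
import json
import time

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def validate_license(token: str, vendor_public_key: Ed25519PublicKey) -> dict:
    """Verify a `payload.signature` token (both parts base64) with no network call."""
    payload_b64, signature_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    signature = base64.urlsafe_b64decode(signature_b64)
    try:
        vendor_public_key.verify(signature, payload)
    except InvalidSignature as exc:
        raise ValueError("license signature invalid") from exc
    claims = json.loads(payload)
    if claims["expires_at"] < time.time():
        raise ValueError("license expired")
    return claims


# Self-contained demo: in a real deployment only the public key ships with the runtime.
vendor_key = Ed25519PrivateKey.generate()
claims = json.dumps({"tenant": "acme", "expires_at": time.time() + 86400}).encode()
token = (
    base64.urlsafe_b64encode(claims).decode()
    + "."
    + base64.urlsafe_b64encode(vendor_key.sign(claims)).decode()
)
print(validate_license(token, vendor_key.public_key()))
```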

Atlas vs. Traditional API Providers

How a sovereign architecture differs from mainstream LLM API services.

Characteristic         | MX4 Atlas                          | Typical Cloud API
Data residency         | Customer-controlled                | Provider region
Tenancy                | Single-tenant or dedicated GPU     | Multi-tenant shared
Network egress         | Zero — air-gap capable             | Required for every call
Activity journal       | Local append-only journal          | Provider-managed logs
Data boundary control  | Local routing rules                | Sent to cloud
Model updates          | Customer-approved via USB/private  | Auto-updated by provider
Deployment control     | Customer-owned stack               | Provider-managed

Data Sovereignty Guarantees

Atlas enforces data sovereignty at the infrastructure level—your data never leaves your boundary.

  • Data Residency: All processing occurs on your infrastructure—cloud, on-prem, or air-gapped. No cross-border data movement.
  • Infrastructure Activity Journal: Cryptographically signed SHA-256 chain for operational visibility and integrity.
  • Zero External Calls: No outbound connections to MX4 or third parties during inference. No telemetry, no phone-home, no metrics collection.
  • Encryption at Every Layer: TLS 1.3 in transit, AES-256-GCM at rest. Model weights encrypted with customer-managed keys (BYOK). A minimal at-rest encryption sketch follows this list.
  • No Training on Your Data: Inference data is never used for model improvement. Zero data retention post-response unless you configure local activity journaling.
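To make the at-rest guarantee concrete, here is a minimal AES-256-GCM sketch using the cryptography library. The key handling is purely illustrative; in a BYOK deployment the key would come from the customer's own KMS or HSM rather than being generated in place.

```python
# AES-256-GCM at-rest encryption sketch; key handling is illustrative only.
# In a BYOK deployment the key would come from the customer's HSM or KMS.
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_record(key: bytes, plaintext: bytes, associated_data: bytes) -> bytes:
    # 96-bit nonce per NIST recommendation; stored alongside the ciphertext.
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, plaintext, associated_data)


def decrypt_record(key: bytes, blob: bytes, associated_data: bytes) -> bytes:
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, associated_data)


key = AESGCM.generate_key(bit_length=256)   # customer-managed in practice
blob = encrypt_record(key, b"journal entry", b"tenant:acme")
assert decrypt_record(key, blob, b"tenant:acme") == b"journal entry"
```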

Recommended Hardware

Atlas Runtime is optimized for NVIDIA H100 and A100 GPUs. Minimum specifications for production deployments:

Component | Minimum              | Recommended
GPU       | 2× NVIDIA A100 80GB  | 4× NVIDIA H100 80GB
CPU       | 32 vCPUs             | 64 vCPUs (AMD EPYC)
RAM       | 256 GB               | 512 GB
Storage   | 1 TB NVMe SSD        | 2 TB NVMe RAID-1
Network   | 10 Gbps              | 25 Gbps (RDMA for multi-node)

What Atlas Does Not Do

Clarity about boundaries is as important as feature lists.

  • Atlas does not send telemetry or crash reports to MX4 servers.
  • Atlas does not auto-update models — all updates require explicit customer approval.
  • Atlas does not share GPU memory between tenants in any deployment mode.
  • Atlas does not retain prompt or completion data after response delivery.
  • Atlas does not require internet access for inference — air-gap is a first-class mode.