Sovereign Architecture
Deep dive into the Atlas Runtime environment and how we guarantee data sovereignty at every layer.
Design Principles
Atlas is built on four non-negotiable principles that inform every architectural decision—from network topology to GPU scheduling.
Zero Data Egress
No data leaves the deployment boundary by default. Outbound connections are customer‑controlled.
Defense in Depth
mTLS, RBAC, and infrastructure activity journaling at every layer — not just the perimeter.
Isolation by Design
Isolation is enforced by deployment topology and configuration to prevent cross‑tenant exposure.
Operational Visibility
Infrastructure activity journaling provides operational visibility and integrity.
The Zero-Trust Enclave
Unlike traditional API providers that process data in a shared, multi-tenant public cloud, MX4 Atlas is built on a "Sovereign Enclave" architecture. The entire inference stack—from the load balancer to the GPU memory—runs within a strictly defined security boundary.
Data Flow Diagram
Every request passes through five security layers before reaching the model
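To make that ordering concrete, here is a minimal sketch that chains the five subsystems described under Runtime Components below into one request path. The stage names are placeholders and the bodies are stubs; it illustrates sequence, not implementation.

```python
# Illustrative only: each stage is a stub standing in for the subsystem
# described under Runtime Components; the point is the order a request
# traverses before reaching the model.
PIPELINE = [
    ("api_gateway",      lambda req: req),  # mTLS, API keys, tenant RBAC
    ("residency_router", lambda req: req),  # sensitive data stays on local models
    ("routing_layer",    lambda req: req),  # residency, rate limits, isolation rules
    ("inference_engine", lambda req: req),  # batching, KV-cache, GPU scheduling
    ("activity_journal", lambda req: req),  # signed infrastructure record
]

def handle(request: dict) -> dict:
    for _name, stage in PIPELINE:
        request = stage(request)
    return request
```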
Runtime Components
The Atlas Runtime is a self-contained inference engine. These are the key subsystems that handle every request.
API Gateway
Varies by deployment
Terminates mTLS, validates API keys, enforces tenant-level RBAC policies
Stack: Envoy proxy with custom auth filter
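As an illustration of the RBAC step, here is a minimal sketch of the decision the auth filter makes once mTLS is terminated and the API key is validated. The policy table, tenant, role, and action names are hypothetical; the production check runs inside the Envoy filter.

```python
# Illustrative only: the policy table, roles, and action names are hypothetical.
from dataclasses import dataclass

POLICIES = {  # tenant -> action -> roles allowed to perform it
    "tenant-a": {"inference:invoke": {"analyst", "service"}},
}

@dataclass
class Request:
    tenant: str
    role: str
    action: str

def authorize(req: Request) -> bool:
    allowed = POLICIES.get(req.tenant, {}).get(req.action, set())
    return req.role in allowed

assert authorize(Request("tenant-a", "analyst", "inference:invoke"))
assert not authorize(Request("tenant-b", "analyst", "inference:invoke"))
```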
Residency Router
Varies by routing rules
Routes sensitive workflows to local models to keep data within the boundary
Stack: Regex + NER model pipeline with Arabic morphology support
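A minimal sketch of the routing decision, assuming a regex pass plus an NER hook. The patterns and the `ner` stub are illustrative stand-ins for the production pipeline, which adds Arabic morphology handling.

```python
# Illustrative only: the patterns and the ner() stub stand in for the
# production regex + NER pipeline.
import re
from typing import Callable

SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{4}-\d{7}\b"),   # an example national-ID shape
    re.compile(r"[\w.+-]+@[\w-]+\.\w+"),    # email addresses
]

def route(prompt: str, ner: Callable[[str], list]) -> str:
    hits = any(p.search(prompt) for p in SENSITIVE_PATTERNS) or ner(prompt)
    return "local" if hits else "default"   # both pools stay inside the boundary

# An ID-shaped token forces the request onto local models:
assert route("Review customer 784-1990-1234567", ner=lambda s: []) == "local"
```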
Routing Layer
Configurable
Residency, rate limits, and workload isolation rules
Stack: Classifier ensemble + rule engine
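A minimal sketch of the rule-engine half of this layer: a per-tenant rate check followed by ordered rules that choose a serving pool. The limit, pool names, and rule order are placeholders.

```python
# Illustrative only: the limit, pool names, and rule order are placeholders.
from collections import defaultdict

RATE_LIMIT = 100                      # requests per window, per tenant
_counters: dict = defaultdict(int)

def decide(tenant: str, residency: str, workload: str) -> str:
    _counters[tenant] += 1
    if _counters[tenant] > RATE_LIMIT:
        return "reject:rate-limited"
    if residency == "local":
        return "pool:restricted"      # isolated GPUs for sensitive workloads
    if workload == "batch":
        return "pool:batch"
    return "pool:default"

print(decide("tenant-a", "local", "interactive"))  # -> pool:restricted
```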
Inference Engine
Depends on model and hardware
Model loading, batching, KV-cache management, and GPU scheduling across multi-GPU clusters
Stack: vLLM-based serving with continuous batching
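A minimal sketch of offline generation with vLLM, the serving engine named above. The model path, parallelism, and sampling values are placeholders rather than Atlas defaults.

```python
# Illustrative only: model path, parallelism, and sampling values are
# placeholders. Requires vLLM installed on a GPU host.
from vllm import LLM, SamplingParams

llm = LLM(model="/models/atlas-local",   # local artifact, nothing downloaded
          tensor_parallel_size=2)        # shard weights across two GPUs
params = SamplingParams(temperature=0.2, max_tokens=256)

# vLLM's continuous batching schedules these alongside in-flight requests.
for output in llm.generate(["Summarize the attached contract."], params):
    print(output.outputs[0].text)
```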
Activity Journal
Async, non‑blocking
Writes signed activity records for infrastructure events
Stack: Append-only log with integrity checks
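A minimal sketch of an append-only, hash-chained journal with HMAC-signed records. The field names and key handling are illustrative; production keys would come from your KMS or HSM.

```python
# Illustrative only: record fields and key handling are placeholders;
# production keys would come from your KMS or HSM.
import hashlib, hmac, json, time

KEY = b"replace-with-a-key-from-your-kms-or-hsm"

def append_record(path: str, event: dict, prev_hash: str) -> str:
    record = {"ts": time.time(), "event": event, "prev": prev_hash}
    body = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(KEY, body, hashlib.sha256).hexdigest()
    with open(path, "a") as f:                  # append-only: never rewritten
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return hashlib.sha256(body).hexdigest()     # chains into the next record

h = append_record("journal.log", {"type": "inference.request"}, prev_hash="genesis")
```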
Deployment Models
Atlas deploys exclusively on customer infrastructure. Choose a deployment model based on your connectivity and sovereignty requirements.
Private Cloud
Deploy Atlas Runtime into your existing VPC (AWS, Azure, GCP, Oracle). Full stack on your infrastructure, managed by your team.
- Terraform / Helm deployment for your VPC
- Auto-scaling based on GPU utilization (sketched after this list)
- Integrated with your existing IAM & secrets management
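The auto-scaling decision referenced above can be read as the usual utilization-to-target calculation. The sketch below is illustrative only; thresholds and bounds are placeholders, and in practice the work is typically delegated to your orchestrator's autoscaler.

```python
# Illustrative only: target utilization and replica bounds are placeholders.
def desired_replicas(current: int, gpu_util: float,
                     target: float = 0.7, lo: int = 1, hi: int = 8) -> int:
    desired = round(current * gpu_util / target) or lo
    return max(lo, min(hi, desired))

print(desired_replicas(current=2, gpu_util=0.95))  # -> 3
```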
Air-Gapped
Physical deployment on your own hardware with zero internet connectivity. Maximum sovereignty for highly restricted environments.
- USB-based model distribution & updates
- Optional HSM key storage
- Offline license validation available (see the sketch below)
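A minimal sketch of what offline license validation can look like: verifying a detached Ed25519 signature over the license file with a public key shipped alongside the installer. Paths and key distribution are illustrative assumptions; no network access is involved.

```python
# Illustrative only: paths and key distribution are placeholders. Requires
# the 'cryptography' package; no network access is involved.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def license_is_valid(license_path: str, sig_path: str, pubkey_bytes: bytes) -> bool:
    pub = Ed25519PublicKey.from_public_bytes(pubkey_bytes)
    try:
        with open(license_path, "rb") as lic, open(sig_path, "rb") as sig:
            pub.verify(sig.read(), lic.read())   # raises InvalidSignature if tampered
        return True
    except InvalidSignature:
        return False
```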
Atlas vs. Traditional API Providers
How a sovereign architecture differs from mainstream LLM API services.
| Characteristic | MX4 Atlas | Typical Cloud API |
|---|---|---|
| Data residency | Customer‑controlled | Provider region |
| Tenancy | Single‑tenant or dedicated GPU | Multi‑tenant shared |
| Network egress | Not required by default | Required for every call |
| Activity journal | Local append‑only journal | Provider‑managed logs |
| Data boundary control | Local routing policies | Data sent to provider's cloud |
| Model updates | Customer‑approved | Provider‑managed |
| Deployment control | Customer‑owned stack | Provider‑managed |
Data Sovereignty Guarantees
Atlas enforces data sovereignty at the infrastructure level — your data stays within your deployment boundary by default.
- Data Residency: Processing occurs on your infrastructure — cloud, on‑prem, or air‑gapped — based on your residency requirements.
- Infrastructure Activity Journal: Activity records can be signed to provide operational visibility and integrity.
- Zero External Calls: No outbound connections are required by default; egress is customer‑controlled (see the sketch after this list).
- Encryption at Every Layer: Encryption in transit and at rest, with key management options aligned to your security posture.
- No Training on Your Data: Inference data is not used for model improvement; retention is controlled by your configuration.
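A minimal sketch of the deny-by-default posture behind the Zero External Calls guarantee: no destination is reachable until the customer allowlists it. The hostname is a placeholder, and in practice the same policy is also enforced at the network layer.

```python
# Illustrative only: the hostname is a placeholder; the same deny-by-default
# policy is also enforced at the network layer.
ALLOWED_EGRESS: set = set()            # empty by default: zero external calls

def egress_permitted(host: str) -> bool:
    return host in ALLOWED_EGRESS

assert not egress_permitted("api.example.com")   # denied until approved
```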
Recommended Hardware
Hardware sizing depends on model size, throughput targets, and deployment mode. We provide a sizing guide during pilots to align GPU, CPU, memory, and storage to your workloads.
- Dedicated GPU capacity for inference and batching
- High‑throughput storage for model artifacts and logs
- Network tuned to your deployment topology
What Atlas Does Not Do
Clarity about boundaries is as important as feature lists.
- Telemetry is off by default; any outbound reporting requires customer approval.
- Model updates require explicit customer approval.
- Isolation is enforced by deployment topology; dedicated resources are recommended for strict isolation.
- Prompt and completion retention is controlled by your configuration.
- Air‑gapped mode is supported; internet is not required for inference.