Sovereign Architecture
Deep dive into the Atlas Runtime environment and how we guarantee data sovereignty at every layer.
Design Principles
Atlas is built on four non-negotiable principles that inform every architectural decision—from network topology to GPU scheduling.
Zero Data Egress
No data leaves the deployment boundary. Ever. Not for telemetry, not for error reporting, not for model improvement.
Defense in Depth
mTLS, RBAC, and infrastructure activity journaling at every layer—not just the perimeter.
Hardware Isolation
Dedicated GPU memory per tenant. No shared caches, no shared VRAM, no cross-tenant inference.
Operational Visibility
Every infrastructure event is recorded as a cryptographically signed activity journal entry, giving operators a verifiable record.
The Zero-Trust Enclave
Unlike traditional API providers that process data in a shared, multi-tenant public cloud, MX4 Atlas is built on a "Sovereign Enclave" architecture. The entire inference stack—from the load balancer to the GPU memory—runs within a strictly defined security boundary.
Data Flow Diagram
Every request passes through five security layers before reaching the model
Runtime Components
The Atlas Runtime is a self-contained inference engine. These are the key subsystems that handle every request.
API Gateway
Latency: <2ms overhead
Terminates mTLS, validates API keys, and enforces tenant-level RBAC policies.
Stack: Envoy proxy with custom auth filter
Residency Router
Latency: ~5ms per request
Routes sensitive workflows to local models to keep data within the boundary.
Stack: Regex + NER model pipeline with Arabic morphology support
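As an illustration of the regex stage only, here is a minimal sketch of how a prompt might be flagged for local routing. The patterns, the `route` function, and the `"local"` / `"default"` target names are all hypothetical; the NER model and the Arabic morphology handling are not shown.

```python
import re

# Hypothetical patterns, for illustration only. A real router would pair
# these with an NER model and locale-specific rules.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # SSN-like identifier
    re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),  # IBAN-like account number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),          # email address
]


def route(prompt: str) -> str:
    """Return "local" if any sensitive pattern matches, else "default"."""
    if any(pattern.search(prompt) for pattern in SENSITIVE_PATTERNS):
        return "local"
    return "default"
```

A regex pass like this is cheap but only catches well-structured identifiers, which is presumably why the stack pairs it with an NER model.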
Guardrail Layer
Latency: ~8ms per request
Applies routing rules for residency, rate limits, and workload isolation.
Stack: Classifier ensemble + rule engine
Inference Engine
Latency: TTFT <200ms (Atlas Core, 8B)
Handles model loading, batching, KV-cache management, and GPU scheduling across multi-GPU clusters.
Stack: vLLM-based serving with continuous batching
Activity Journal
Latency: asynchronous, zero impact on inference
Writes cryptographically signed activity records for infrastructure events.
Stack: Append-only log with SHA-256 chain
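To make the "append-only log with SHA-256 chain" idea concrete, here is a minimal sketch of a hash-chained journal. It is not the Atlas implementation: the entry layout and function names are hypothetical, and HMAC-SHA256 with a shared key stands in for whatever signature scheme the product actually uses.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with the serialized event."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def _sign(entry_hash: str, key: bytes) -> str:
    # Assumption: HMAC is a stand-in for a real (likely asymmetric) signature.
    return hmac.new(key, entry_hash.encode("ascii"), hashlib.sha256).hexdigest()


def append_entry(journal: list, event: dict, key: bytes) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = journal[-1]["hash"] if journal else GENESIS
    entry_hash = _digest(prev_hash, event)
    entry = {"event": event, "prev": prev_hash,
             "hash": entry_hash, "sig": _sign(entry_hash, key)}
    journal.append(entry)
    return entry


def verify(journal: list, key: bytes) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in journal:
        expected = _digest(prev_hash, entry["event"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        if not hmac.compare_digest(entry["sig"], _sign(expected, key)):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an old record invalidates every later entry, so tampering is detectable without trusting the storage layer.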
Deployment Models
Atlas deploys exclusively on customer infrastructure. Choose a deployment model based on your connectivity and sovereignty requirements.
Private Cloud
Deploy Atlas Runtime into your existing VPC (AWS, Azure, GCP, Oracle). Full stack on your infrastructure, managed by your team.
- Terraform / Helm deployment in < 2 hours
- Auto-scaling based on GPU utilization
- Integrated with your existing IAM & secrets management
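The source does not say how GPU-utilization-based auto-scaling is computed. One common approach is proportional scaling (the same shape as the Kubernetes Horizontal Pod Autoscaler formula), sketched below. The function name, the 70% target, and the replica bounds are assumptions chosen for illustration.

```python
def desired_replicas(current: int, gpu_util_pct: int, target_util_pct: int = 70,
                     min_replicas: int = 1, max_replicas: int = 8) -> int:
    """Proportional scaling: ceil(current * observed / target), then clamp.

    Utilization is given as an integer percentage so the ceiling can be
    computed with exact integer arithmetic.
    """
    if current < 1 or target_util_pct <= 0 or gpu_util_pct < 0:
        raise ValueError("current must be >= 1 and utilizations non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    raw = (current * gpu_util_pct + target_util_pct - 1) // target_util_pct
    return max(min_replicas, min(max_replicas, raw))
```

For example, two replicas running at 90% against a 60% target scale to three, while an idle deployment settles at `min_replicas`.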
Air-Gapped
Physical deployment on your own hardware with zero internet connectivity. Maximum sovereignty for classified environments.
- USB-based model distribution & updates
- Hardware Security Module (HSM) key storage
- Offline license validation via signed tokens
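Offline license validation means the runtime can check a token without contacting a license server. The sketch below shows the general shape under stated assumptions: the token format, claim names, and functions are hypothetical, and HMAC-SHA256 stands in for the signature. A real offline scheme would more likely use an asymmetric signature so that the deployment holds only a public verification key.

```python
import base64
import hashlib
import hmac
import json


def issue_token(claims: dict, key: bytes) -> str:
    """Encode claims and attach a MAC: base64url(claims).base64url(mac)."""
    body = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode("utf-8"))
    sig = base64.urlsafe_b64encode(hmac.new(key, body, hashlib.sha256).digest())
    return (body + b"." + sig).decode("ascii")


def validate_token(token: str, key: bytes, now: int):
    """Return the claims if the MAC is valid and the token has not expired."""
    try:
        body, sig = token.encode("ascii").split(b".")
    except ValueError:  # wrong number of parts or non-ASCII input
        return None
    expected = base64.urlsafe_b64encode(
        hmac.new(key, body, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    # "exp" is a hypothetical expiry claim in seconds since the epoch.
    if claims.get("exp", 0) < now:
        return None
    return claims
```

The caller passes `now` explicitly, which keeps the check testable and makes it clear that an air-gapped host must have a trustworthy local clock for expiry to mean anything.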
Atlas vs. Traditional API Providers
How a sovereign architecture differs from mainstream LLM API services.
| Characteristic | MX4 Atlas | Typical Cloud API |
|---|---|---|
| Data residency | Customer-controlled | Provider region |
| Tenancy | Single-tenant or dedicated GPU | Multi-tenant shared |
| Network egress | Zero — air-gap capable | Required for every call |
| Activity journal | Local append-only journal | Provider-managed logs |
| Data boundary control | Local routing rules | Sent to cloud |
| Model updates | Customer-approved via USB/private | Auto-updated by provider |
| Deployment control | Customer-owned stack | Provider-managed |
Data Sovereignty Guarantees
Atlas enforces data sovereignty at the infrastructure level—your data never leaves your boundary.
- Data Residency: All processing occurs on your infrastructure—cloud, on-prem, or air-gapped. No cross-border data movement.
- Infrastructure Activity Journal: Cryptographically signed entries linked in a SHA-256 hash chain, for operational visibility and integrity.
- Zero External Calls: No outbound connections to MX4 or third parties during inference. No telemetry, no phone-home, no metrics collection.
- Encryption at Every Layer: TLS 1.3 in transit, AES-256-GCM at rest. Model weights encrypted with customer-managed keys (BYOK).
- No Training on Your Data: Inference data is never used for model improvement. Zero data retention post-response unless you configure local activity journaling.
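For the in-transit guarantee, the following sketch shows how a Python service could refuse anything below TLS 1.3 and require client certificates (mTLS) using the standard `ssl` module. It illustrates the policy only and is not Atlas configuration; the function name is hypothetical, and certificate loading is left as a comment because the file paths would be deployment-specific.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side context that enforces TLS 1.3 and mutual TLS."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Reject TLS 1.2 and older handshakes.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # Require the client to present a certificate (the "m" in mTLS).
    ctx.verify_mode = ssl.CERT_REQUIRED
    # In a real deployment, also call ctx.load_cert_chain(...) for the server
    # identity and ctx.load_verify_locations(...) for the trusted client CA.
    return ctx
```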
Recommended Hardware
Atlas Runtime is optimized for NVIDIA H100 and A100 GPUs. Minimum and recommended specifications for production deployments:
| Component | Minimum | Recommended |
|---|---|---|
| GPU | 2× NVIDIA A100 80GB | 4× NVIDIA H100 80GB |
| CPU | 32 vCPUs | 64 vCPUs (AMD EPYC) |
| RAM | 256 GB | 512 GB |
| Storage | 1 TB NVMe SSD | 2 TB NVMe RAID-1 |
| Network | 10 Gbps | 25 Gbps (RDMA for multi-node) |
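The Minimum column above can be turned into a simple preflight check before installation. This is a sketch, not an Atlas tool: the dictionary keys and the `shortfalls` function are hypothetical, and it compares numeric values only, so it does not verify the GPU model or storage type.

```python
# Values copied from the Minimum column; key names are illustrative.
MINIMUM = {
    "gpu_count": 2,
    "gpu_memory_gb": 80,
    "vcpus": 32,
    "ram_gb": 256,
    "nvme_tb": 1,
    "network_gbps": 10,
}


def shortfalls(host: dict) -> list:
    """List every resource where the host falls below the minimum."""
    return [
        f"{name}: have {host.get(name, 0)}, need {required}"
        for name, required in MINIMUM.items()
        if host.get(name, 0) < required
    ]
```

An empty list means the host meets every numeric minimum; a missing key is treated as zero and reported as a shortfall.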
What Atlas Does Not Do
Clarity about boundaries is as important as feature lists.
- Atlas does not send telemetry or crash reports to MX4 servers.
- Atlas does not auto-update models — all updates require explicit customer approval.
- Atlas does not share GPU memory between tenants in any deployment mode.
- Atlas does not retain prompt or completion data after response delivery unless you configure local activity journaling.
- Atlas does not require internet access for inference — air-gap is a first-class mode.