Sovereign Architecture
Deep dive into the Atlas Runtime environment and how we guarantee data sovereignty at every layer.
Design Principles
Atlas is built on four non-negotiable principles that inform every architectural decision—from network topology to GPU scheduling.
Zero Data Egress
No data leaves the deployment boundary by default. Outbound connections are customer‑controlled.
Defense in Depth
mTLS, RBAC, and infrastructure activity journaling at every layer — not just the perimeter.
Isolation by Design
Isolation is enforced by deployment topology and configuration to prevent cross‑tenant exposure.
Operational Visibility
Infrastructure activity journaling provides operational visibility and integrity.
The Zero-Trust Enclave
Unlike traditional API providers that process data in a shared, multi-tenant public cloud, MX4 Atlas is built on a "Sovereign Enclave" architecture. The entire inference stack—from the load balancer to the GPU memory—runs within a strictly defined security boundary.
Data Flow Diagram
Every request passes through five security layers before reaching the model
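To make that ordering concrete, here is a minimal sketch that chains the five subsystems described under Runtime Components below into one request path. The stage names are placeholders and the bodies are stubs; it illustrates sequence, not implementation.

```python
# Illustrative only: each stage is a stub standing in for the subsystem
# described under Runtime Components; the point is the order a request
# traverses before reaching the model.
PIPELINE = [
    ("api_gateway",      lambda req: req),  # mTLS, API keys, tenant RBAC
    ("residency_router", lambda req: req),  # sensitive data stays on local models
    ("routing_layer",    lambda req: req),  # residency, rate limits, isolation rules
    ("inference_engine", lambda req: req),  # batching, KV-cache, GPU scheduling
    ("activity_journal", lambda req: req),  # signed infrastructure record
]

def handle(request: dict) -> dict:
    for _name, stage in PIPELINE:
        request = stage(request)
    return request
```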
Runtime Components
The Atlas Runtime is a self-contained inference engine. These are the key subsystems that handle every request.
API Gateway
Varies by deployment
Terminates mTLS, validates API keys, enforces tenant-level RBAC policies
Stack: Envoy proxy with custom auth filter
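As an illustration of the RBAC step, here is a minimal sketch of the decision the auth filter makes once mTLS is terminated and the API key is validated. The policy table, tenant, role, and action names are hypothetical; the production check runs inside the Envoy filter.

```python
# Illustrative only: the policy table, roles, and action names are hypothetical.
from dataclasses import dataclass

POLICIES = {  # tenant -> action -> roles allowed to perform it
    "tenant-a": {"inference:invoke": {"analyst", "service"}},
}

@dataclass
class Request:
    tenant: str
    role: str
    action: str

def authorize(req: Request) -> bool:
    allowed = POLICIES.get(req.tenant, {}).get(req.action, set())
    return req.role in allowed

assert authorize(Request("tenant-a", "analyst", "inference:invoke"))
assert not authorize(Request("tenant-b", "analyst", "inference:invoke"))
```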
Residency Router
Varies by routing rules
Routes sensitive workflows to local models to keep data within the boundary
Stack: Regex + NER model pipeline with Arabic morphology support
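A minimal sketch of the routing decision, assuming a regex pass plus an NER hook. The patterns and the `ner` stub are illustrative stand-ins for the production pipeline, which adds Arabic morphology handling.

```python
# Illustrative only: the patterns and the ner() stub stand in for the
# production regex + NER pipeline.
import re
from typing import Callable

SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{4}-\d{7}\b"),   # an example national-ID shape
    re.compile(r"[\w.+-]+@[\w-]+\.\w+"),    # email addresses
]

def route(prompt: str, ner: Callable[[str], list]) -> str:
    hits = any(p.search(prompt) for p in SENSITIVE_PATTERNS) or ner(prompt)
    return "local" if hits else "default"   # both pools stay inside the boundary

# An ID-shaped token forces the request onto local models:
assert route("Review customer 784-1990-1234567", ner=lambda s: []) == "local"
```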
Routing Layer
Configurable
Residency, rate limits, and workload isolation rules
Stack: Classifier ensemble + rule engine
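A minimal sketch of the rule-engine half of this layer: a per-tenant rate check followed by ordered rules that choose a serving pool. The limit, pool names, and rule order are placeholders.

```python
# Illustrative only: the limit, pool names, and rule order are placeholders.
from collections import defaultdict

RATE_LIMIT = 100                      # requests per window, per tenant
_counters: dict = defaultdict(int)

def decide(tenant: str, residency: str, workload: str) -> str:
    _counters[tenant] += 1
    if _counters[tenant] > RATE_LIMIT:
        return "reject:rate-limited"
    if residency == "local":
        return "pool:restricted"      # isolated GPUs for sensitive workloads
    if workload == "batch":
        return "pool:batch"
    return "pool:default"

print(decide("tenant-a", "local", "interactive"))  # -> pool:restricted
```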
Inference Engine
Depends on model and hardware
Model loading, batching, KV-cache management, and GPU scheduling across multi-GPU clusters
Stack: vLLM-based serving with continuous batching
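A minimal sketch of offline generation with vLLM, the serving engine named above. The model path, parallelism, and sampling values are placeholders rather than Atlas defaults.

```python
# Illustrative only: model path, parallelism, and sampling values are
# placeholders. Requires vLLM installed on a GPU host.
from vllm import LLM, SamplingParams

llm = LLM(model="/models/atlas-local",   # local artifact, nothing downloaded
          tensor_parallel_size=2)        # shard weights across two GPUs
params = SamplingParams(temperature=0.2, max_tokens=256)

# vLLM's continuous batching schedules these alongside in-flight requests.
for output in llm.generate(["Summarize the attached contract."], params):
    print(output.outputs[0].text)
```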
Activity Journal
Async, non‑blocking
Writes signed activity records for infrastructure events
Stack: Append-only log with integrity checks
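A minimal sketch of an append-only, hash-chained journal with HMAC-signed records. The field names and key handling are illustrative; production keys would come from your KMS or HSM.

```python
# Illustrative only: record fields and key handling are placeholders;
# production keys would come from your KMS or HSM.
import hashlib, hmac, json, time

KEY = b"replace-with-a-key-from-your-kms-or-hsm"

def append_record(path: str, event: dict, prev_hash: str) -> str:
    record = {"ts": time.time(), "event": event, "prev": prev_hash}
    body = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(KEY, body, hashlib.sha256).hexdigest()
    with open(path, "a") as f:                  # append-only: never rewritten
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return hashlib.sha256(body).hexdigest()     # chains into the next record

h = append_record("journal.log", {"type": "inference.request"}, prev_hash="genesis")
```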
Deployment Models
Atlas deploys exclusively on customer infrastructure. Choose a deployment model based on your connectivity and sovereignty requirements.
Private Cloud
Deploy Atlas Runtime into your existing VPC (AWS, Azure, GCP, Oracle). Full stack on your infrastructure, managed by your team.
- Terraform / Helm deployment for your VPC
- Auto-scaling based on GPU utilization (sketched after this list)
- Integrated with your existing IAM & secrets management
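The auto-scaling decision referenced above can be read as the usual utilization-to-target calculation. The sketch below is illustrative only; thresholds and bounds are placeholders, and in practice the work is typically delegated to your orchestrator's autoscaler.

```python
# Illustrative only: target utilization and replica bounds are placeholders.
def desired_replicas(current: int, gpu_util: float,
                     target: float = 0.7, lo: int = 1, hi: int = 8) -> int:
    desired = round(current * gpu_util / target) or lo
    return max(lo, min(hi, desired))

print(desired_replicas(current=2, gpu_util=0.95))  # -> 3
```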
Air-Gapped
Physical deployment on your own hardware with zero internet connectivity. Maximum sovereignty for highly restricted environments.
- USB-based model distribution & updates
- Optional HSM key storage
- Offline license validation available (see the sketch below)
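A minimal sketch of what offline license validation can look like: verifying a detached Ed25519 signature over the license file with a public key shipped alongside the installer. Paths and key distribution are illustrative assumptions; no network access is involved.

```python
# Illustrative only: paths and key distribution are placeholders. Requires
# the 'cryptography' package; no network access is involved.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def license_is_valid(license_path: str, sig_path: str, pubkey_bytes: bytes) -> bool:
    pub = Ed25519PublicKey.from_public_bytes(pubkey_bytes)
    try:
        with open(license_path, "rb") as lic, open(sig_path, "rb") as sig:
            pub.verify(sig.read(), lic.read())   # raises InvalidSignature if tampered
        return True
    except InvalidSignature:
        return False
```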
Atlas vs. Traditional API Providers
How a sovereign architecture differs from mainstream LLM API services.
| Characteristic | MX4 Atlas | Typical Cloud API |
|---|---|---|
| Data residency | Customer‑controlled | Provider region |
| Tenancy | Single‑tenant or dedicated GPU | Multi‑tenant shared |
| Network egress | Not required by default | Required for every call |
| Activity journal | Local append‑only journal | Provider‑managed logs |
| Data boundary control | Local routing policies | Data sent to provider's cloud |
| Model updates | Customer‑approved | Provider‑managed |
| Deployment control | Customer‑owned stack | Provider‑managed |
Data Sovereignty Guarantees
Atlas enforces data sovereignty at the infrastructure level — your data stays within your deployment boundary by default.
- Data Residency: Processing occurs on your infrastructure — cloud, on‑prem, or air‑gapped — based on your residency requirements.
- Infrastructure Activity Journal: Activity records can be signed to provide operational visibility and integrity.
- Zero External Calls: No outbound connections are required by default; egress is customer‑controlled (see the sketch after this list).
- Encryption at Every Layer: Encryption in transit and at rest, with key management options aligned to your security posture.
- No Training on Your Data: Inference data is not used for model improvement; retention is controlled by your configuration.
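A minimal sketch of the deny-by-default posture behind the Zero External Calls guarantee: no destination is reachable until the customer allowlists it. The hostname is a placeholder, and in practice the same policy is also enforced at the network layer.

```python
# Illustrative only: the hostname is a placeholder; the same deny-by-default
# policy is also enforced at the network layer.
ALLOWED_EGRESS: set = set()            # empty by default: zero external calls

def egress_permitted(host: str) -> bool:
    return host in ALLOWED_EGRESS

assert not egress_permitted("api.example.com")   # denied until approved
```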
Recommended Hardware
Hardware sizing depends on model size, throughput targets, and deployment mode. We provide a sizing guide during pilots to align GPU, CPU, memory, and storage to your workloads.
- Dedicated GPU capacity for inference and batching
- High‑throughput storage for model artifacts and logs
- Network tuned to your deployment topology
What Atlas Does Not Do
Clarity about boundaries is as important as feature lists.
- Telemetry is off by default; any outbound reporting requires customer approval.
- Model updates require explicit customer approval.
- Isolation is enforced by deployment topology; dedicated resources are recommended for strict isolation.
- Prompt and completion retention is controlled by your configuration.
- Air‑gapped mode is supported; internet is not required for inference.