Rate Limits
Understand MX4 Atlas API rate limits and how to handle them effectively.
Rate Limit Overview
MX4 Atlas implements rate limiting to ensure fair usage and maintain service stability. Rate limits are applied per API key and are measured in requests per minute (RPM) and tokens per minute (TPM).
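Because both limits apply at once, the throughput you can actually sustain is the lower of the two budgets. A minimal sketch of that calculation (the numbers below are illustrative placeholders, not actual MX4 plan limits):

```python
def effective_rpm(max_rpm: int, max_tpm: int, avg_tokens_per_request: int) -> int:
    """Requests per minute you can actually sustain under both budgets."""
    # The token budget caps throughput at max_tpm / avg_tokens requests per minute.
    token_capped = max_tpm // avg_tokens_per_request
    return min(max_rpm, token_capped)

# 100 RPM and 10,000 TPM with ~500-token requests leaves
# only 20 requests per minute in practice.
print(effective_rpm(100, 10_000, 500))  # 20
```

In other words, with large prompts the token budget, not the request budget, is usually the binding constraint.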
Default Limits (Starter Plan)
Rate Limit Headers
Every API response includes headers that indicate your current rate limit status:
x-ratelimit-limit-requests: Maximum requests allowed per minute
x-ratelimit-limit-tokens: Maximum tokens allowed per minute
x-ratelimit-remaining-requests: Requests remaining in the current window
x-ratelimit-remaining-tokens: Tokens remaining in the current window
x-ratelimit-reset-requests: Unix timestamp at which the request count resets
x-ratelimit-reset-tokens: Unix timestamp at which the token count resets

Handling Rate Limits
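These headers can drive client-side throttling directly. A minimal sketch that decides how long to sleep given a dict of response headers (the sample header values below are illustrative, not real API output):

```python
def seconds_until_reset(headers: dict, now: float) -> float:
    """Return how long to wait before the next request, based on the
    remaining-request count and the reset timestamp headers."""
    remaining = int(headers["x-ratelimit-remaining-requests"])
    if remaining > 0:
        return 0.0  # budget left; no need to wait
    reset_at = int(headers["x-ratelimit-reset-requests"])
    return max(0.0, reset_at - now)

headers = {
    "x-ratelimit-remaining-requests": "0",
    "x-ratelimit-reset-requests": "1700000060",
}
print(seconds_until_reset(headers, now=1700000000.0))  # 60.0
```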
When you exceed a rate limit, the API returns HTTP status 429 (Too Many Requests). Implement exponential backoff and retry logic in your applications.
```python
import time
import openai
from openai import OpenAI

client = OpenAI(
    api_key="mx4-sk-...",
    base_url="https://api.mx4.ai/v1"
)

def make_request_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="mx4-atlas-core",
                messages=messages
            )
            return response
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: wait 2**attempt seconds
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time} seconds...")
            time.sleep(wait_time)
    return None
```
Advanced Retry Strategies
Exponential Backoff with Jitter
Reduce the thundering-herd effect by adding random jitter to retry delays, so that many clients rate-limited at the same moment do not all retry at once:
```python
import time
import random
import openai

def request_with_exponential_backoff(messages, max_retries=5):
    base_delay = 1  # start with a 1-second delay

    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="mx4-atlas-core",
                messages=messages
            )
            return response
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise

            # Exponential backoff with up to 10% random jitter
            delay = base_delay * (2 ** attempt)
            jitter = random.uniform(0, delay * 0.1)
            total_delay = delay + jitter

            print(f"Attempt {attempt + 1}: waiting {total_delay:.2f} seconds")
            time.sleep(total_delay)
        except openai.APIError:
            # Don't retry on non-rate-limit errors
            raise
```
Monitoring & Optimization
Log Rate Limit Headers
Track remaining tokens/requests in your monitoring system to predict limit breaches.
Batch Requests
Group multiple queries into single API calls where possible to reduce request count.
Request Queue
Implement a request queue to smooth out traffic spikes and avoid rate limit bursts.
Token Optimization
Use shorter messages, remove unnecessary context, and implement caching for repeated queries.
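The caching tip above can be sketched as a small in-memory memo keyed on the serialized message list; `call_model` here is a hypothetical stand-in for the real API call:

```python
import hashlib
import json

_cache: dict = {}

def cached_completion(messages, call_model):
    """Serve repeated queries from the cache; call the model otherwise."""
    # Key on the serialized messages so identical prompts hit the cache.
    key = hashlib.sha256(json.dumps(messages, sort_keys=True).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(messages)
    return _cache[key]

# Demonstration with a fake model: the second identical query
# never reaches the "API".
calls = []
def fake_model(messages):
    calls.append(messages)
    return "hello"

msgs = [{"role": "user", "content": "Hi"}]
cached_completion(msgs, fake_model)
cached_completion(msgs, fake_model)
print(len(calls))  # 1
```

A production version would bound the cache size and expire entries, but the idea is the same: repeated queries cost zero requests and zero tokens.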
Optimization Techniques
Request Queuing Pattern
```python
import asyncio
import time
from collections import deque

class RateLimitedQueue:
    def __init__(self, max_rpm=100, max_tpm=10000):
        self.max_rpm = max_rpm
        self.max_tpm = max_tpm
        self.queue = deque()
        self.request_times = []  # timestamps of requests in the current window
        self.token_usage = []    # (timestamp, tokens) pairs in the current window

    async def add_request(self, messages, tokens_estimate=200):
        while True:
            now = time.time()
            # Drop entries older than the 1-minute window
            self.request_times = [t for t in self.request_times if now - t < 60]
            self.token_usage = [(t, n) for t, n in self.token_usage if now - t < 60]

            if (len(self.request_times) < self.max_rpm and
                    sum(n for _, n in self.token_usage) + tokens_estimate <= self.max_tpm):
                self.request_times.append(now)
                self.token_usage.append((now, tokens_estimate))
                return

            # A budget is exhausted; wait before checking again
            await asyncio.sleep(1)
```
Monitoring Best Practices
Alert on Thresholds
Set alerts when remaining tokens drop below 20% of limit to proactively manage load.
Track Usage Patterns
Monitor peak usage times and adjust request distribution to avoid consistent bottlenecks.
Upgrade Planning
If consistently hitting 80%+ of limits, upgrade your plan for better performance and cost efficiency.
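The 20% alert threshold suggested above is straightforward to compute from the rate-limit headers; a minimal sketch:

```python
def should_alert(remaining_tokens: int, token_limit: int, threshold: float = 0.2) -> bool:
    """True when remaining capacity has dropped below the alert threshold."""
    return remaining_tokens < threshold * token_limit

print(should_alert(1_500, 10_000))  # True: only 15% of the budget remains
print(should_alert(5_000, 10_000))  # False: half the budget remains
```

The same check works for the request budget by swapping in the request-count headers.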
Increasing Limits
Higher rate limits are available for enterprise customers. Contact our sales team to discuss your requirements and upgrade your plan.
Sales: sales@mx4.ai