
Embeddings

Get a vector representation of a given input — optimized for Arabic morphology with dialect-aware tokenization.

Last updated on February 2, 2026
POST https://api.mx4.ai/v1/embeddings

Available Models

Model                Dimensions   Max Tokens   Arabic MTEB   Best For
mx4-embed-v1         1536         8,192        74.2%         General purpose, semantic search
mx4-embed-v1-large   3072         8,192        78.6%         High-precision retrieval, classification

Arabic MTEB scores measured on our internal benchmark suite covering MSA, Gulf, Levantine, and Egyptian dialects.

Request Body

input: string | string[] (Required)

The text(s) to embed. Pass a single string or an array of up to 50 strings for batch processing.

model: string (Required)

ID of the model to use: 'mx4-embed-v1' (1536-dim) or 'mx4-embed-v1-large' (3072-dim).

encoding_format: string (Optional)

Output format: 'float' (default) or 'base64'. Use base64 for bandwidth-sensitive applications.
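The byte layout of the base64 output is not specified above; OpenAI-compatible endpoints conventionally pack embeddings as little-endian float32, so here is a decode sketch under that assumption (the round-trip vector is synthetic, not API output):

```python
import base64
import numpy as np

def decode_embedding(b64: str) -> np.ndarray:
    """Decode a base64-encoded embedding into a float32 vector.

    Assumes little-endian float32 packing, the convention used by
    OpenAI-compatible endpoints; verify against a live response.
    """
    raw = base64.b64decode(b64)
    return np.frombuffer(raw, dtype="<f4")

# Round-trip check with a synthetic vector
vec = np.array([0.0123, -0.0456, 0.0789], dtype="<f4")
encoded = base64.b64encode(vec.tobytes()).decode("ascii")
decoded = decode_embedding(encoded)
```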

Request

bash
curl https://api.mx4.ai/v1/embeddings \
  -H "Authorization: Bearer $MX4_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["السيادة على البيانات مطلب أساسي", "Data sovereignty is essential"],
    "model": "mx4-embed-v1"
  }'

Response

json
{
  "object": "list",
  "model": "mx4-embed-v1",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789, "...1536 floats"]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0234, -0.0567, 0.0891, "...1536 floats"]
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "total_tokens": 18
  }
}

Arabic Embedding Notes

  • Root-Aware Tokenization: MX4 embeddings use a custom Arabic tokenizer that preserves morphological roots. The word "كتبوا" (they wrote) shares vector proximity with "كتاب" (book) and "مكتبة" (library) — something standard BPE tokenizers miss entirely.
  • Dialect Coverage: Trained on MSA, Gulf, Egyptian, Levantine, and Maghrebi corpora. Cross-dialect retrieval accuracy is ~12% higher than multilingual-e5-large.
  • Mixed-Language Support: Handles Arabic-English code-switching common in Gulf business contexts. Embedding a mixed sentence like "نحتاج meeting بعد الظهر" produces coherent vectors.
  • Diacritics Handling: Embeddings are stable with or without tashkeel (diacritical marks). وَلَد and ولد produce near-identical vectors.
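Although the model tolerates tashkeel, stripping diacritics client-side is still useful for deduplicating cache keys before calling the API. A minimal sketch; the Unicode range below covering the common tashkeel marks is our assumption, not part of the MX4 API:

```python
import re

# Common Arabic diacritics (fathatan through sukun) plus superscript alef
TASHKEEL = re.compile(r"[\u064B-\u0652\u0670]")

def strip_tashkeel(text: str) -> str:
    """Remove diacritical marks so vocalized and unvocalized spellings
    of the same word map to a single cache key."""
    return TASHKEEL.sub("", text)
```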

Use Cases

Semantic Search

Find documents matching a query by comparing embedding vectors. Ideal for Arabic knowledge bases where keyword matching fails due to morphological complexity.

RAG Retrieval

Power retrieval-augmented generation by embedding document chunks and retrieving the top-k most relevant passages for context injection.

Duplicate Detection

Identify duplicate or near-duplicate documents using cosine similarity — across MSA and dialect variants of the same content.
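The duplicate-detection flow can be sketched as follows, assuming unit-length vectors so the dot product equals the cosine similarity; the 0.92 threshold and the vectors in the check are illustrative, not an MX4 recommendation:

```python
import numpy as np

def find_duplicates(embeddings, threshold=0.92):
    """Return (i, j, similarity) for every pair whose cosine similarity
    meets the threshold. Assumes L2-normalized input vectors, so the
    pairwise dot-product matrix is the similarity matrix."""
    vecs = np.asarray(embeddings)
    sims = vecs @ vecs.T
    pairs = []
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if sims[i, j] >= threshold:
                pairs.append((i, j, float(sims[i, j])))
    return pairs

# Synthetic unit vectors: the first two are near-duplicates
v1 = np.array([1.0, 0.0])
v2 = np.array([0.96, 0.28])
v3 = np.array([0.0, 1.0])
pairs = find_duplicates([v1, v2, v3])
```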

Classification

Use embeddings as features for downstream classifiers — sentiment analysis, topic categorization, intent detection in Arabic customer support.

Best Practices

✓ Batch Requests

Send up to 50 texts in a single request for optimal throughput. Batching reduces per-text latency by ~60% compared to individual calls.
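A hypothetical helper for respecting the 50-text cap when embedding a large corpus:

```python
def batches(texts, size=50):
    """Yield successive slices of at most `size` texts, the documented
    per-request maximum for the embeddings endpoint."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]

chunk_sizes = [len(chunk) for chunk in batches(list(range(120)))]
```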

✓ Cache Embeddings

Store embeddings in a vector database (Qdrant, Weaviate, pgvector) to avoid recomputing. Embeddings are deterministic — the same input always produces the same output.
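Because embeddings are deterministic, a content-addressed cache never goes stale. A minimal in-memory sketch; the key scheme and the `EmbeddingCache` name are ours, and in production the dict would be backed by Qdrant, Weaviate, or pgvector:

```python
import hashlib

def cache_key(model: str, text: str) -> str:
    """Deterministic key: the same (model, text) pair always yields the
    same embedding, so a content hash is a safe cache key."""
    digest = hashlib.sha256(f"{model}\x00{text}".encode("utf-8")).hexdigest()
    return f"{model}:{digest[:16]}"

class EmbeddingCache:
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # e.g. a wrapper around client.embeddings.create
        self.store = {}

    def get(self, model, text):
        key = cache_key(model, text)
        if key not in self.store:
            self.store[key] = self.embed_fn(model, text)
        return self.store[key]

# Fake embed function to show the cache absorbing repeat calls
calls = []
def fake_embed(model, text):
    calls.append(text)
    return [0.1, 0.2]

cache = EmbeddingCache(fake_embed)
cache.get("mx4-embed-v1", "مرحبا")
result = cache.get("mx4-embed-v1", "مرحبا")  # served from cache
```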

✓ Normalize Before Comparison

MX4 embeddings are already L2-normalized, so cosine similarity equals dot product. Use dot product for faster comparisons at scale.
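A quick check of that equivalence: for unit-length vectors the norm terms in the cosine formula are 1, so the dot product alone suffices (the vectors below are synthetic):

```python
import numpy as np

def cosine(a, b):
    """Full cosine similarity, norms included."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v = np.array([0.6, 0.8])  # ||v|| = 1
w = np.array([0.8, 0.6])  # ||w|| = 1
dot = float(np.dot(v, w))  # identical to cosine(v, w) for unit vectors
```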

✓ Chunk Arabic Text Carefully

Arabic sentences typically yield more tokens than comparable English sentences. Aim for 256–512 token chunks for retrieval. Use sentence boundaries, not fixed character counts.
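One way to sketch sentence-boundary chunking, using word count as a rough proxy for the tokenizer; the word-to-token expansion for Arabic is our working assumption, so measure with the real tokenizer before relying on these limits:

```python
import re

# Split after Arabic or Latin sentence terminators (. ! ? ؟ ؛)
SENTENCE_END = re.compile(r"(?<=[.!?؟؛])\s+")

def chunk_sentences(text, max_words=200):
    """Group whole sentences into chunks of roughly `max_words` words.

    Assuming ~1.5-2 tokens per Arabic word, ~200 words lands near the
    256-512 token window suggested above.
    """
    chunks, current, count = [], [], 0
    for sentence in SENTENCE_END.split(text.strip()):
        words = len(sentence.split())
        if current and count + words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks

chunks = chunk_sentences("A b c. D e f. G h i.", max_words=6)
```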

Example: Semantic Search with Python

semantic_search.py (python)
import openai
import numpy as np

# MX4 is OpenAI-compatible — use the standard SDK
client = openai.OpenAI(
    api_key="mx4-sk-...",
    base_url="https://api.mx4.ai/v1"
)

# Embed a query in Arabic
query = "ما هي السيادة على البيانات؟"  # What is data sovereignty?
query_embedding = client.embeddings.create(
    model="mx4-embed-v1",
    input=query
).data[0].embedding

# Embed a document corpus (batch for efficiency)
documents = [
    "السيادة على البيانات تعني بقاء البيانات في البلد الذي تم جمعها فيه.",
    "عاصمة فرنسا هي باريس.",
    "التشفير يحمي البيانات أثناء النقل وفي حالة السكون.",
    "يتطلب نظام حماية البيانات الشخصية (PDPL) معالجة البيانات محلياً.",
]

doc_response = client.embeddings.create(
    model="mx4-embed-v1",
    input=documents
)
doc_embeddings = [d.embedding for d in doc_response.data]

# Dot product = cosine similarity (MX4 embeddings are L2-normalized)
similarities = [np.dot(query_embedding, doc_emb) for doc_emb in doc_embeddings]
ranked = sorted(enumerate(similarities), key=lambda x: x[1], reverse=True)

for idx, score in ranked[:3]:
    print(f"[{score:.3f}] {documents[idx]}")

Example: Batch Embeddings with Node.js

batch_embed.js (javascript)
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.MX4_API_KEY,
  baseURL: "https://api.mx4.ai/v1",
});

async function embedDocuments(texts) {
  const response = await client.embeddings.create({
    model: "mx4-embed-v1",
    input: texts, // Up to 50 texts per batch
  });

  return response.data.map((item) => ({
    index: item.index,
    embedding: item.embedding, // 1536-dim float array
  }));
}

// Usage
const docs = [
  "نظام الحوكمة الرقمية في المملكة العربية السعودية",
  "Digital governance framework in Saudi Arabia",
];

const embeddings = await embedDocuments(docs);
console.log(`Embedded ${embeddings.length} documents (${embeddings[0].embedding.length} dimensions)`);

Rate Limits & Performance

Latency: <100ms (single text, mx4-embed-v1)
Batch Size: 50 texts per request (max 8,192 tokens per text)
Rate Limit: 3,000 RPM (Enterprise: unlimited)
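When you hit the 3,000 RPM ceiling, retry with exponential backoff and jitter. A sketch only: matching on "429" in the exception message is a stand-in for your SDK's rate-limit error class (e.g. `openai.RateLimitError` in the OpenAI Python SDK):

```python
import random
import time

def with_backoff(call, max_attempts=5, base_delay=1.0):
    """Retry `call` on rate-limit errors, doubling the delay each
    attempt (capped at 30s) and adding jitter to avoid thundering herds."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Stand-in check; replace with your SDK's rate-limit exception
            if "429" not in str(exc) or attempt == max_attempts - 1:
                raise
            time.sleep(min(base_delay * 2 ** attempt, 30) + random.random() * base_delay)

# Fake call that fails twice with a 429 before succeeding
state = {"n": 0}
def flaky():
    state["n"] += 1
    if state["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = with_backoff(flaky, base_delay=0)
```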