Our Agents Run in the Gulf.
We Know Where They Fail.
Production data from agents working with Gulf businesses. We see exactly where models break, and how to fix them.
{
"source": "banking_agent_prod",
"dialect": "gulf_najdi",
"failure_type": "dialectal_confusion",
"user_input": "ابي اشتري جوال جديد",
"model_read_as": "my_father_buying_phone",
"actual_intent": "i_want_to_buy_phone",
"correction": {
"rlhf_pair": true,
"expert_validated": true
}
}
The Arabic AI Market Gap
The Arab AI market is projected to reach $320B by 2030. But without quality Arabic data, models will never truly understand the region.
Arabic Speaker Distribution by Dialect Group
The Challenge Labs Face
Research Impact
Studies demonstrate that dialect-specific fine-tuning can nearly double model accuracy on Arabic tasks.
Voice AI is Broken in Arabic
Voice-to-voice models are the next frontier. But Arabic voice infrastructure is 12 to 18 months behind, and you can't fix it from Silicon Valley.
ASR Performance Crisis
The Problem: Commercial ASR shows 57-123% higher word error rates on dialectal Arabic than on Modern Standard Arabic (MSA), making voice agents unusable for 420M+ speakers.
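To make the metric behind that claim concrete, here is a minimal sketch of how word error rate (WER) and a relative increase between two WER values could be computed. The function names are hypothetical, and any numbers passed to them are illustrative only, not measurements from our evaluations.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than the current row) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_increase(dialect_wer: float, msa_wer: float) -> float:
    """Percentage by which the dialect WER exceeds the MSA WER."""
    if msa_wer <= 0:
        raise ValueError("msa_wer must be positive")
    return (dialect_wer - msa_wer) / msa_wer * 100
```

A "57% higher" figure in this framing means `relative_increase(dialect_wer, msa_wer)` returns roughly 57 for the same ASR system evaluated on dialectal and MSA test sets.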
Native Dialect Speakers
Our Voice Infrastructure
Why This Can't Be Solved From Silicon Valley
“GCC governments are investing $100B+ in AI. Voice agents are the priority. The infrastructure doesn't exist yet - we're building it.”
We catch what others miss.
Every agent interaction surfaces real failure patterns. Dialectal misunderstandings. Cultural context gaps. Domain-specific errors. We turn each one into training data.
'ابي' (aby) means 'I want' in Gulf Arabic, but in MSA it reads as 'my father'. Models trained mostly on formal Arabic therefore misread a request such as 'ابي اشتري جوال جديد' ('I want to buy a new phone') as a statement about the speaker's father.
How it becomes training data
Capture failure from production
Categorize by failure type
Expert validates correction
Package as RLHF data
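The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our production pipeline: the `FailureRecord` class and `to_preference_pair` function are hypothetical names, the field names are borrowed from the sample failure record on this page, and the prompt/chosen/rejected layout is one common convention for preference data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureRecord:
    """A captured production failure (fields mirror the sample record)."""
    source: str
    dialect: str
    failure_type: str          # step 2: category assigned to the failure
    user_input: str
    model_read_as: str         # what the model wrongly understood
    actual_intent: str         # the corrected reading
    expert_validated: bool = False  # step 3: set once an expert signs off


def to_preference_pair(record: FailureRecord) -> Optional[dict]:
    """Step 4: package a validated failure as a preference pair.

    Returns None for records an expert has not validated, so that
    unreviewed corrections never reach the training set.
    """
    if not record.expert_validated:
        return None
    return {
        "prompt": record.user_input,
        "chosen": record.actual_intent,
        "rejected": record.model_read_as,
        "metadata": {
            "source": record.source,
            "dialect": record.dialect,
            "failure_type": record.failure_type,
        },
    }
```

Gating on `expert_validated` keeps the validation step explicit: a captured and categorized failure is only a candidate until a human confirms the correction.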
Training Data That Works
We work with labs to create verified, dialect-specific training data across multiple formats optimized for your training pipeline.
High-quality (instruction, response) pairs verified across multiple dimensions. Each pair includes dialect classification, grammatical validation, and domain accuracy scores.
{
"instruction": "اكتب رسالة رسمية للعميل تشرح فيها أسباب رفض طلب القرض الشخصي مع الحفاظ على لهجة خليجية مهنية ومطابقة لمتطلبات ساما",
"output": "بسم الله الرحمن الرحيم\n\nالسيد محمد المحترم،\n\nبعد مراجعة طلبكم للقرض الشخصي...",
"metadata": {
"dialect": "gulf",
"dialect_score": 0.92,
"domain": "banking",
"sama_compliant": true,
"grammatical_score": 0.95
}
}
Task Library
Task definitions across dialects and domains
Complete task definitions with instructions in multiple dialects, verification criteria, oracle solutions, and domain-specific context.
How We Ensure Data Quality
Every piece of data passes through 6 verification stages. This is what separates generic data from real training data.
Dialect Classification
Our expert-built classification system identifies output across 7 Arabic dialects with calibrated confidence scores from human validation.
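As a rough illustration of how a confidence score from this stage might be used downstream, the sketch below keeps only records whose dialect label matches the target and whose score clears a threshold. The function name and the 0.9 default threshold are assumptions made for the example; the `metadata.dialect` and `metadata.dialect_score` fields follow the sample pair shown earlier on this page.

```python
def passes_dialect_gate(
    record: dict,
    expected_dialect: str,
    min_dialect_score: float = 0.9,  # assumed threshold, for illustration only
) -> bool:
    """Return True if the record is labeled with the expected dialect
    and its classifier confidence meets the minimum score."""
    metadata = record.get("metadata", {})
    return (
        metadata.get("dialect") == expected_dialect
        and metadata.get("dialect_score", 0.0) >= min_dialect_score
    )


def filter_by_dialect(records: list[dict], expected_dialect: str) -> list[dict]:
    """Keep only the records that pass the dialect gate."""
    return [r for r in records if passes_dialect_gate(r, expected_dialect)]
```

In practice the threshold would be tuned against human validation rather than fixed in code.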
Our data is worth
5-10x typical Arabic data
Task Specification Schema
Every task follows a rigorous specification schema with multi-dimensional evaluation rubrics and oracle annotation protocols.
{
"task_id": "awn-gulf-banking-001",
"domain": {
"primary": "banking",
"regulatory": ["sama", "cbuae"]
},
"linguistic_config": {
"dialect": "gulf_najdi",
"register": "formal_with_dialect",
"code_switching": { "max_ratio": 0.15 }
},
"evaluation_rubric": {
"dialect_authenticity": 0.25,
"grammatical_correctness": 0.15,
"task_completion": 0.25,
"cultural_alignment": 0.15,
"domain_accuracy": 0.10,
"regulatory_compliance": 0.10
}
}
Quality Metrics
Multi-dimensional evaluation
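One way to read the `evaluation_rubric` values in the task schema is as weights that sum to 1.0 and combine per-dimension scores into a single number. That interpretation is an assumption made for this sketch, and `weighted_score` is a hypothetical helper, not part of the schema.

```python
import math

# Values copied from the "evaluation_rubric" block of the sample task,
# treated here as weights (an assumption).
RUBRIC = {
    "dialect_authenticity": 0.25,
    "grammatical_correctness": 0.15,
    "task_completion": 0.25,
    "cultural_alignment": 0.15,
    "domain_accuracy": 0.10,
    "regulatory_compliance": 0.10,
}


def weighted_score(scores: dict[str, float], rubric: dict[str, float] = RUBRIC) -> float:
    """Combine per-dimension scores in [0, 1] into one weighted score."""
    if not math.isclose(sum(rubric.values()), 1.0):
        raise ValueError("rubric weights must sum to 1.0")
    missing = set(rubric) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weight * scores[name] for name, weight in rubric.items())
```

Under this reading, dialect authenticity and task completion together account for half of the final score, so a response that completes the task in the wrong dialect cannot score well.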
Why Choose Awn
We're not just data vendors. We're your partners in making AI models truly understand Arabic.
What We Bring
| Capability | In-House | With Awn |
|---|---|---|
| Gulf Arabic speakers | Hard to hire | Network of 1,500+ |
| Compliance knowledge | Requires Saudi experts | Domain expert validated |
| Dialect classification | Would use biased models | Expert-built classification system |
| Multi-dimensional rewards | Don't know dimensions | Research-backed taxonomy |
Partnership Benefits
On-Ground Presence
Operations across Saudi Arabia, UAE, and Egypt with native teams.
Production Feedback
Real enterprise deployments generating insights about model failures.
Expert Network
Verified domain experts across healthcare, finance, government.
Continuous Iteration
Not a one-time handoff. Ongoing collaboration as your models improve.
Our philosophy: We're not a data company trying to understand Arabic. We're an Arabic AI company that happens to have the best data.
Partnership Models
We offer flexible collaboration models designed to help you improve Arabic capabilities at any stage.
Pilot Project
Getting Started
Start with a focused pilot to validate the partnership. Test our data quality and see the impact on your Arabic benchmarks.
Research Collaboration
Joint Research
Partner on Arabic AI research. Co-develop evaluation benchmarks, publish findings, and push the state of the art together.
Strategic Partnership
Long-term
Deep integration for ongoing Arabic AI improvement. Continuous data pipelines, dedicated teams, and strategic alignment.
Let's Talk
Every lab has unique needs. We'd love to understand yours and design a collaboration that works.