Awn Labs

Our Agents Run in the Gulf.
We Know Where They Fail.

Production data from agents working with Gulf businesses. We see exactly where models break, and how to fix them.

production_insights.json
{
  "source": "banking_agent_prod",
  "dialect": "gulf_najdi",
  "failure_type": "dialectal_confusion",
  "user_input": "ابي اشتري جوال جديد",
  "model_read_as": "my_father_buying_phone",
  "actual_intent": "i_want_to_buy_phone",
  "correction": {
    "rlhf_pair": true,
    "expert_validated": true
  }
}
The Opportunity

The Arabic AI Market Gap

The Arab AI market is projected to reach $320B by 2030. But without quality Arabic data, models will never truly understand the region.

$100B+
GCC AI investment
420M
Arabic speakers
$320B
Arab AI market by 2030

Arabic Speaker Distribution by Dialect Group

Egyptian Arabic: 120M
Maghrebi Arabic: 90M
Gulf Arabic: 58M
Levantine Arabic: 48M
420M total Arabic speakers; 0% speak MSA natively

The Challenge Labs Face

89% of Arabic training data is MSA, but 0% of native speakers use it daily
Dialectal speech patterns differ drastically from written Arabic
GCC business context requires domain-specific knowledge (SAMA, ZATCA, etc.)
Existing multilingual benchmarks miss Arabic-specific failure modes
Building this infrastructure in-house takes 12-18+ months

Research Impact

47.97% → 84.21%
Dialect Fine-tuning Impact

Studies demonstrate that dialect-specific fine-tuning can raise model accuracy on Arabic tasks from 47.97% to 84.21%, a gain of about 36 percentage points (roughly 1.76x).

Critical Bottleneck

Voice AI is Broken in Arabic

Voice-to-voice models are the next frontier. But Arabic voice infrastructure is 12-18 months behind - and you can't fix it from Silicon Valley.

12-18mo
behind English voice AI
95-99%
less dialectal voice data
<1K hrs
quality dialectal audio
60%+
of ME messages are voice

ASR Performance Crisis

Word Error Rate by Dialect (Production Target: <10%)
MSA (Standard): 12.5%
Egyptian: 27.5%
Gulf: 33.8%
Levantine: 30%
Sudanese: 57.1%
Legend: Acceptable, Needs work, Unusable

The Problem: Commercial ASR shows 57-123% higher word error rates on dialectal Arabic vs MSA, making voice agents unusable for 420M+ speakers.
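
For readers who want to reproduce figures like the ones in the chart above: word error rate is conventionally the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition. The function names and the sample hypothesis are illustrative assumptions, not Awn's evaluation code; only the 10% target comes from the page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


PRODUCTION_TARGET = 0.10  # the "<10%" target quoted in the chart


def meets_production_target(wer: float) -> bool:
    return wer < PRODUCTION_TARGET


# Hypothetical example: the recognizer drops the last word of the utterance.
reference = "ابي اشتري جوال جديد"
hypothesis = "ابي اشتري جوال"
wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.2%}, meets target: {meets_production_target(wer)}")
```

Dialectal Arabic has no standard orthography, so in practice both transcripts would need a shared normalization step before scoring; that step is left out of this sketch.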

Native Dialect Speakers

Our Voice Infrastructure

Professional recording studios across GCC
Native speaker talent pools per dialect
Acoustic quality control pipelines
Scalable annotation workflows

Why This Can't Be Solved From Silicon Valley

No standard orthography for dialectal Arabic
Mutual unintelligibility between dialects
Code-switching with English/French
Cultural context embedded in phonetics
Requires native speaker intuition
Regional studio infrastructure needed

GCC governments are investing $100B+ in AI. Voice agents are the priority. The infrastructure doesn't exist yet - we're building it.

Real Data From Real Failures

We catch what others miss.

Every agent interaction surfaces real failure patterns. Dialectal misunderstandings. Cultural context gaps. Domain-specific errors. We turn each one into training data.

What we capture
Dialectal Failure
Original Input
ابي اشتري جوال جديد ("I want to buy a new phone")
Issue Identified

'ابي' (aby) means 'I want' in Gulf Arabic, but in MSA it means 'my father'. Models trained on formal Arabic completely misinterpret this common phrase.

Data Generated
SFT Data, RLHF/DPO Pair, Expert Validated
Each failure is expert-validated before becoming training data

How it becomes training data

01 Capture failure from production
02 Categorize by failure type
03 Expert validates correction
04 Package as RLHF data
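
The four steps above can be sketched as a small transformation from a captured failure record into a preference pair. This is an illustration only: the record fields mirror the production_insights.json sample near the top of the page, while the `FailureRecord` class, the `to_preference_pair` function, and the `prompt`/`chosen`/`rejected` key names are assumptions borrowed from common preference-data conventions, not Awn's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureRecord:
    source: str
    dialect: str
    failure_type: str      # step 02: category assigned to the failure
    user_input: str        # step 01: text captured from production
    model_read_as: str     # the model's incorrect interpretation
    actual_intent: str     # the corrected interpretation
    expert_validated: bool = False  # step 03: set once an expert signs off


def to_preference_pair(record: FailureRecord) -> Optional[dict]:
    """Step 04: package a validated failure as a chosen/rejected pair."""
    if not record.expert_validated:
        # Unvalidated corrections never become training data.
        return None
    return {
        "prompt": record.user_input,
        "chosen": record.actual_intent,
        "rejected": record.model_read_as,
        "metadata": {
            "source": record.source,
            "dialect": record.dialect,
            "failure_type": record.failure_type,
        },
    }


record = FailureRecord(
    source="banking_agent_prod",
    dialect="gulf_najdi",
    failure_type="dialectal_confusion",
    user_input="ابي اشتري جوال جديد",
    model_read_as="my_father_buying_phone",
    actual_intent="i_want_to_buy_phone",
    expert_validated=True,
)
pair = to_preference_pair(record)
print(pair["chosen"], "preferred over", pair["rejected"])
```

Gating on `expert_validated` reflects the statement that each failure is expert-validated before it becomes training data.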

What We Offer

Training Data That Works

We work with labs to create verified, dialect-specific training data across multiple formats optimized for your training pipeline.

High-quality (instruction, response) pairs verified across multiple dimensions. Each pair includes dialect classification, grammatical validation, and domain accuracy scores.

Multi-tier quality verification ensures accuracy
Metadata includes dialect scores and compliance flags
Ready for direct fine-tuning pipelines
sft_training_data.json
{
  "instruction": "اكتب رسالة رسمية للعميل تشرح فيها أسباب رفض طلب القرض الشخصي مع الحفاظ على لهجة خليجية مهنية ومطابقة لمتطلبات ساما",
  "output": "بسم الله الرحمن الرحيم\n\nالسيد محمد المحترم،\n\nبعد مراجعة طلبكم للقرض الشخصي...",
  "metadata": {
    "dialect": "gulf",
    "dialect_score": 0.92,
    "domain": "banking",
    "sama_compliant": true,
    "grammatical_score": 0.95
  }
}
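
A lab consuming records in this shape might gate them on the metadata before fine-tuning. The sketch below shows one way to do that. The threshold values and the `accept_sft_record` function are hypothetical; only the field names and the sample metadata values come from the record above, and the instruction and output strings are shortened placeholders.

```python
MIN_DIALECT_SCORE = 0.90      # hypothetical acceptance threshold
MIN_GRAMMATICAL_SCORE = 0.90  # hypothetical acceptance threshold


def accept_sft_record(record: dict, dialect: str = "gulf") -> bool:
    """Return True when a record has both texts and passes every metadata check."""
    meta = record.get("metadata", {})
    return (
        bool(record.get("instruction"))
        and bool(record.get("output"))
        and meta.get("dialect") == dialect
        and meta.get("dialect_score", 0.0) >= MIN_DIALECT_SCORE
        and meta.get("grammatical_score", 0.0) >= MIN_GRAMMATICAL_SCORE
        and meta.get("sama_compliant") is True
    )


# Metadata copied from sft_training_data.json; the texts are placeholders.
sample = {
    "instruction": "<instruction text>",
    "output": "<response text>",
    "metadata": {
        "dialect": "gulf",
        "dialect_score": 0.92,
        "domain": "banking",
        "sama_compliant": True,
        "grammatical_score": 0.95,
    },
}

print(accept_sft_record(sample))
```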

Task Library

Task definitions across dialects and domains

Complete task definitions with instructions in multiple dialects, verification criteria, oracle solutions, and domain-specific context.

7
Dialects
10+
Domains
10K+
Patterns
Instructions in 7 dialects (Gulf, Egyptian, Levantine, Maghrebi, Sudanese, Yemeni, Iraqi)
Verification criteria and rubrics
Expert-written oracle solutions
Domain-specific context and compliance rules
Coverage
Banking & Finance
Healthcare
E-commerce
Government
HR & Recruitment
Customer Service
Legal
Education
More Domains
Quality Guarantee

How We Ensure Data Quality

Every piece of data passes through 6 verification stages. This is what separates generic data from real training data.

01

Dialect Classification

Our expert-built classification system assigns each output to one of 7 Arabic dialects and attaches a confidence score calibrated against human validation.

7 Arabic dialects supported
Confidence calibration from human validation
Detects code-switching patterns
Dialects covered: 7 dialects
Expert-verified at every stage (Stage 1 of 6)
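
One common way to check whether classifier confidence is calibrated against human labels is expected calibration error (ECE): predictions are grouped into confidence bins, and each bin's average confidence is compared with the fraction that human validators marked correct. The page does not say which calibration method Awn uses, so the sketch below is a generic illustration with hypothetical names and data.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between confidence and accuracy across bins."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need equally sized, non-empty inputs")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: four dialect predictions made at 0.9 confidence,
# three of which human validators confirmed.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
print(f"ECE = {ece:.3f}")
```

A lower value means the reported confidence tracks human agreement more closely.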

Our data is worth

5-10x typical Arabic data

Technical Deep-Dive

Task Specification Schema

Every task follows a rigorous specification schema with multi-dimensional evaluation rubrics and oracle annotation protocols.

task_specification.json
{
  "task_id": "awn-gulf-banking-001",
  "domain": {
    "primary": "banking",
    "regulatory": ["sama", "cbuae"]
  },
  "linguistic_config": {
    "dialect": "gulf_najdi",
    "register": "formal_with_dialect",
    "code_switching": { "max_ratio": 0.15 }
  },
  "evaluation_rubric": {
    "dialect_authenticity": 0.25,
    "grammatical_correctness": 0.15,
    "task_completion": 0.25,
    "cultural_alignment": 0.15,
    "domain_accuracy": 0.10,
    "regulatory_compliance": 0.10
  }
}
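
Two properties of this specification are easy to check mechanically: the rubric weights should sum to 1, and a response's share of code-switched tokens should stay within `max_ratio`. The sketch below shows one possible check. Counting whitespace-separated Latin-script tokens is an assumed, deliberately crude proxy for code-switching, and both function names are hypothetical.

```python
import math
import re

# A token counts as code-switched when it is written entirely in Latin
# letters (apostrophes and hyphens allowed). This is a rough heuristic.
LATIN_TOKEN = re.compile(r"^[A-Za-z][A-Za-z'\-]*$")
PUNCTUATION = ".,!?:;؟،"


def rubric_is_normalized(rubric: dict) -> bool:
    """True when the evaluation weights sum to 1 (within float tolerance)."""
    return math.isclose(sum(rubric.values()), 1.0, abs_tol=1e-9)


def code_switching_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens written in Latin script."""
    tokens = text.split()
    if not tokens:
        return 0.0
    latin = sum(1 for t in tokens if LATIN_TOKEN.match(t.strip(PUNCTUATION)))
    return latin / len(tokens)


rubric = {
    "dialect_authenticity": 0.25,
    "grammatical_correctness": 0.15,
    "task_completion": 0.25,
    "cultural_alignment": 0.15,
    "domain_accuracy": 0.10,
    "regulatory_compliance": 0.10,
}
MAX_RATIO = 0.15  # code_switching.max_ratio from the specification

print(rubric_is_normalized(rubric))
# Hypothetical utterance with one English token out of four.
ratio = code_switching_ratio("ابي اشتري iPhone جديد")
print(ratio, ratio <= MAX_RATIO)
```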

Quality Metrics

Multi-dimensional evaluation

Dialect Authenticity: 0.94
Grammatical Correctness: 0.97
Task Completion: 0.91
Cultural Alignment: 0.96
Regulatory Compliance: 1.00
Weighted Aggregate: 0.956 (High Quality)
Expert-validated metrics, Gulf Dialect
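
To make the aggregation concrete, here is a sketch of a weighted average that uses the rubric weights from task_specification.json. The metrics listed here do not include Domain Accuracy, so the sketch renormalizes the weights over the dimensions that have a score. That renormalization is an assumption: the page does not state how its own aggregate is computed, so the result need not equal the displayed figure.

```python
WEIGHTS = {
    "dialect_authenticity": 0.25,
    "grammatical_correctness": 0.15,
    "task_completion": 0.25,
    "cultural_alignment": 0.15,
    "domain_accuracy": 0.10,
    "regulatory_compliance": 0.10,
}

SCORES = {
    "dialect_authenticity": 0.94,
    "grammatical_correctness": 0.97,
    "task_completion": 0.91,
    "cultural_alignment": 0.96,
    "regulatory_compliance": 1.00,
}


def weighted_aggregate(scores: dict, weights: dict) -> float:
    """Weighted mean over scored dimensions, with weights renormalized."""
    used = {name: weights[name] for name in scores}  # KeyError if a weight is missing
    total_weight = sum(used.values())
    if total_weight <= 0:
        raise ValueError("weights for the scored dimensions must be positive")
    return sum(scores[name] * used[name] for name in scores) / total_weight


aggregate = weighted_aggregate(SCORES, WEIGHTS)
unweighted = sum(SCORES.values()) / len(SCORES)
print(f"renormalized weighted aggregate: {aggregate:.3f}")
print(f"unweighted mean: {unweighted:.3f}")
```

Under these assumptions the renormalized aggregate comes to about 0.947, while the unweighted mean of the same five scores is 0.956.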
SFT: SFT Pairs, 2-3 per task
RL: RLHF Tuples, 4 per task
DPO: DPO Triplets, 3-6 per task
Partnership

Why Choose Awn

We're not just data vendors. We're your partners in making AI models truly understand Arabic.

What We Bring

Capability | In-House | With Awn
Gulf Arabic speakers | Hard to hire | Network of 1,500+
Compliance knowledge | Requires Saudi experts | Domain expert validated
Dialect classification | Would use biased models | Expert-built classification system
Multi-dimensional rewards | Don't know dimensions | Research-backed taxonomy

Partnership Benefits

On-Ground Presence

Operations across Saudi Arabia, UAE, and Egypt with native teams.

Production Feedback

Real enterprise deployments generating insights about model failures.

Expert Network

Verified domain experts across healthcare, finance, government.

Continuous Iteration

Not a one-time handoff. Ongoing collaboration as your models improve.

Our philosophy: We're not a data company trying to understand Arabic. We're an Arabic AI company that happens to have the best data.

Work With Us

Partnership Models

We offer flexible collaboration models designed to help you improve Arabic capabilities at any stage.

Pilot Project

Getting Started

Start with a focused pilot to validate the partnership. Test our data quality and see the impact on your Arabic benchmarks.

Scoped dialect or domain focus
Sample data with full verification
Benchmark results comparison
Start a Pilot

Research Collaboration

Joint Research

Partner on Arabic AI research. Co-develop evaluation benchmarks, publish findings, and push the state of the art together.

Joint benchmark development
Shared research publication
Access to production insights
Explore Research

Strategic Partnership

Long-term

Deep integration for ongoing Arabic AI improvement. Continuous data pipelines, dedicated teams, and strategic alignment.

Continuous data generation
Dedicated expert team
Roadmap collaboration
Discuss Partnership

Let's Talk

Every lab has unique needs. We'd love to understand yours and design a collaboration that works.

Book a Demo
Prefer email? [email protected]
No commitment required
30-min technical deep dive
NDA available