Every Script.
Every Language.

Verify identities from anywhere in the world. Our multi-script OCR engine reads documents in 100+ languages, from Arabic to Chinese to Russian, in under 300 ms and with 94% or higher accuracy on Latin, Arabic, CJK, and Cyrillic scripts.


Truly Global Verification

The world doesn't write in just one alphabet. Our OCR engine handles every major writing system on Earth, so you can verify customers from any country.

100+ Languages Supported
12+ Script Families
<300ms Extraction Speed
RTL: Full Right-to-Left Support

Every Writing System, Covered

From left-to-right alphabets to right-to-left scripts to complex logographic systems—we read them all.

Latin Scripts

English, Spanish, French, German, Portuguese, Italian, Dutch, Polish, and dozens more—including accents and diacritics.

Hello · Olá · Bonjour · Guten Tag

Arabic & RTL Scripts

Full right-to-left support for Arabic, Persian, Urdu, and Hebrew—including mixed-direction text and bidirectional handling.

مرحبا · שלום · خوش آمدید

CJK Languages

Chinese (Simplified and Traditional), Japanese (Kanji, Hiragana, Katakana), and Korean—including vertical text orientation.

你好 · こんにちは · 안녕하세요

Cyrillic Scripts

Russian, Ukrainian, Bulgarian, Serbian, and other Slavic languages using the Cyrillic alphabet.

Привет · Здраво · Вітаю

Indic Scripts

Hindi, Bengali, Tamil, Telugu, Gujarati, and other languages from the Indian subcontinent—including Aadhaar card support.

नमस्ते · வணக்கம் · નમસ્તે

Southeast Asian

Thai, Vietnamese, Lao, Khmer, and other Southeast Asian scripts with their unique character systems.

สวัสดี · Xin chào · ສະບາຍດີ

Why Multi-Script OCR is Hard

Most OCR engines are trained on Latin characters. When they encounter a Russian passport, Saudi ID, or Chinese driver's license, accuracy plummets. Our engine was built from the ground up for global documents.

Text Direction

Arabic and Hebrew read right-to-left, while embedded numbers and Latin text still run left-to-right

Character Complexity

CJK languages use thousands of distinct characters, versus the 26 letters of the English alphabet

Connected Scripts

Arabic letters change shape based on their position in a word

Diacritical Marks

Accents, vowel marks, and tone marks can change a word's meaning
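The connected-scripts point can be made concrete with Unicode itself. The sketch below takes the four positional presentation forms of a single Arabic letter (BEH) and folds each one back to the same base letter with compatibility normalization. The helper name `toBaseLetter` is made up for illustration, and NFKC folding is shown only to demonstrate the one-letter-many-shapes problem; it is not a description of how the Owl Eyes engine works internally.

```javascript
// The four positional presentation forms of ARABIC LETTER BEH (U+0628),
// from the Unicode "Arabic Presentation Forms-B" block.
const BEH_FORMS = {
  isolated: "\uFE8F",
  final: "\uFE90",
  initial: "\uFE91",
  medial: "\uFE92",
};

// Hypothetical helper: fold a presentation form back to its base letter
// using NFKC (compatibility) normalization.
function toBaseLetter(glyph) {
  return glyph.normalize("NFKC");
}

// Every positional form maps back to the same code point.
for (const [position, glyph] of Object.entries(BEH_FORMS)) {
  const base = toBaseLetter(glyph);
  const hex = base.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  console.log(`${position}: U+${hex}`);
}
```

An OCR model sees four visually different shapes here, yet the extracted text should contain one and the same letter.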

Intelligent Script Processing

Our multi-script engine automatically detects, routes, and processes documents through the optimal pipeline for each script type.

1. Script Detection

Automatically identifies the script family before text extraction begins, ensuring optimal processing.
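As a rough illustration of detect-then-route, the sketch below counts characters per Unicode script in a text sample and picks a pipeline for the dominant script. This is an assumption-heavy toy: the real engine identifies scripts before any text has been extracted, so it works on image data, and the pipeline names and the `routeByDominantScript` function are hypothetical, not part of the Owl Eyes API.

```javascript
// Hypothetical routing table; the real pipeline names are not documented here.
const PIPELINES = {
  Latin: "latin-pipeline",
  Arabic: "rtl-pipeline",
  Hebrew: "rtl-pipeline",
  Cyrillic: "cyrillic-pipeline",
  Han: "cjk-pipeline",
  Hiragana: "cjk-pipeline",
  Katakana: "cjk-pipeline",
  Hangul: "cjk-pipeline",
  Devanagari: "indic-pipeline",
  Thai: "southeast-asian-pipeline",
};

// One Unicode property-escape regex per script, e.g. /\p{Script=Arabic}/u.
const ROUTE_MATCHERS = Object.keys(PIPELINES).map((name) => [
  name,
  new RegExp(`\\p{Script=${name}}`, "u"),
]);

// Count how many characters belong to each known script.
// Spaces, digits, and punctuation match none of the scripts and are skipped.
function countScripts(text) {
  const counts = {};
  for (const ch of text) {
    for (const [name, re] of ROUTE_MATCHERS) {
      if (re.test(ch)) {
        counts[name] = (counts[name] || 0) + 1;
        break;
      }
    }
  }
  return counts;
}

// Pick the script with the most characters and look up its pipeline.
function routeByDominantScript(text) {
  const ranked = Object.entries(countScripts(text)).sort((a, b) => b[1] - a[1]);
  if (ranked.length === 0) {
    return { script: null, pipeline: "fallback-pipeline" };
  }
  const [script] = ranked[0];
  return { script, pipeline: PIPELINES[script] };
}

console.log(routeByDominantScript("Иванов Иван"));
console.log(routeByDominantScript("こんにちは"));
```

Routing on the dominant script keeps the common case simple; documents that mix scripts need per-region handling rather than a single choice.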

2. Specialized Extraction

Routes to script-optimized models trained specifically for each writing system's unique patterns.

3. Text Normalization

Standardizes output with proper Unicode handling, text direction, and character normalization.
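Why normalization matters downstream can be seen in a few lines. The same visible name can be encoded in more than one way, and string comparison fails until both sides are in the same normalization form. The sketch below uses the standard `String.prototype.normalize` with NFC; `normalizeField` is an illustrative helper name, not an Owl Eyes function.

```javascript
// "José" encoded two ways: precomposed é (U+00E9) vs. "e" + combining acute (U+0301).
const composed = "Jos\u00E9";
const decomposed = "Jose\u0301";

// Hangul syllable 한 written as three conjoining jamo.
const jamo = "\u1112\u1161\u11AB";

// Illustrative helper: bring a field value into NFC and trim whitespace.
function normalizeField(value) {
  return value.normalize("NFC").trim();
}

// Raw strings differ even though they render identically.
console.log(composed === decomposed);

// After NFC both encodings compare equal.
console.log(normalizeField(composed) === normalizeField(decomposed));

// NFC also composes the three jamo into the single syllable U+D55C.
console.log(normalizeField(jamo) === "\uD55C");
```

Comparing an extracted name against a database record is only reliable when both values were normalized the same way.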

4. Data Extraction

Extracts structured fields like name, DOB, and document number regardless of language.

Global Document Coverage

From passports to national IDs to driver's licenses—we extract data from government-issued documents worldwide.

Americas

  • US Passports & Driver's Licenses
  • Canadian IDs & Passports
  • Mexican IFE/INE Cards
  • Brazilian CPF & RG Documents

Europe

  • EU National ID Cards
  • UK Passports & Driving Licences
  • German Personalausweis
  • French Carte d'Identité

Middle East

  • Saudi National ID
  • UAE Emirates ID
  • Israeli Teudat Zehut
  • Egyptian National ID

Asia Pacific

  • Chinese Resident ID (身份证)
  • Japanese Passports & Driver's Licenses
  • Korean National ID (주민등록증)
  • Indian Aadhaar & PAN Cards

Eastern Europe & Central Asia

  • Russian Internal Passport
  • Ukrainian ID Cards
  • Kazakh National ID
  • Turkish Nüfus Cüzdanı

Southeast Asia

  • Thai National ID Card
  • Vietnamese CCCD
  • Philippine PhilSys ID
  • Indonesian KTP

The Owl Eyes Advantage

Lightning Fast

Extraction in under 300 ms regardless of script complexity. No delays for your users.

High Accuracy

Trained on millions of real documents. Handles noise, blur, and low-quality images.

Automatic Detection

No need to specify document type or language. We detect and route automatically.

Unicode Normalized

Clean, standardized output with proper character encoding for downstream processing.

Confidence Scoring

Per-field confidence scores so you know exactly how reliable each extraction is.

Seamless Integration

Multi-script OCR runs automatically on every document. Same API, global capabilities.

Handles Mixed-Script Documents

Many international documents contain multiple scripts—Arabic name with English transliteration, Chinese characters with Pinyin, Cyrillic with Latin. Our engine processes all scripts on a single document seamlessly.

Arabic + English
محمد أحمد / Mohammed Ahmed
Chinese + Pinyin
张伟 / Zhang Wei
Russian + Latin
Иванов / Ivanov
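To show what handling several scripts in one field can mean in practice, here is a minimal sketch that splits a mixed string into runs of a single script, keeping spaces and punctuation with the run they follow. It operates on text that has already been extracted, and `splitScriptRuns` and the script list are assumptions for illustration rather than Owl Eyes internals.

```javascript
// Scripts this sketch knows about; anything else is treated as neutral.
const RUN_SCRIPTS = ["Latin", "Arabic", "Hebrew", "Cyrillic", "Han", "Hangul"];
const RUN_MATCHERS = RUN_SCRIPTS.map((name) => [
  name,
  new RegExp(`\\p{Script=${name}}`, "u"),
]);

// Return the script name for a character, or null for neutral characters
// such as spaces, digits, and punctuation.
function scriptOf(ch) {
  for (const [name, re] of RUN_MATCHERS) {
    if (re.test(ch)) return name;
  }
  return null;
}

// Group consecutive characters of the same script into runs.
function splitScriptRuns(text) {
  const runs = [];
  for (const ch of text) {
    const script = scriptOf(ch);
    const last = runs[runs.length - 1];
    if (last && (script === null || script === last.script)) {
      last.text += ch; // same script, or a neutral char that stays with the run
    } else if (script !== null) {
      runs.push({ script, text: ch }); // a new script starts a new run
    }
    // Neutral characters before the first run are dropped.
  }
  return runs.map((run) => ({ script: run.script, text: run.text.trim() }));
}

console.log(splitScriptRuns("张伟 Zhang Wei"));
console.log(splitScriptRuns("محمد أحمد Mohammed Ahmed"));
```

Each run can then be mapped to its own output field, such as a native-script name and its Latin transliteration.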

Built for Global Business

Global Banking

Onboard customers from any country with consistent KYC compliance regardless of document language.

Travel & Hospitality

Check in international guests instantly. Read passports from any country without friction.

Global Marketplaces

Verify sellers and buyers worldwide with the same speed and accuracy as domestic users.

Remote Hiring

Verify international candidates' identities during remote onboarding, regardless of their home country.

Mobility Services

Verify international drivers' licenses for ride-sharing, car rental, and fleet management.

Healthcare

Verify patient identities globally for telemedicine and international healthcare services.

One API, 100+ Languages

Multi-script OCR is built into every Owl Eyes verification. No extra configuration, no language parameters to set—just send the document and we handle the rest.

  • Automatic script and language detection
  • Consistent JSON output structure for all documents
  • Works with existing SDK integrations
  • Per-field confidence scores included

// Standard verification call
const result = await owlEyes.verify(sessionId);

// Multi-script OCR runs automatically
console.log(result.extractedData);
// {
//   name: {
//     full: "محمد أحمد",
//     latin: "Mohammed Ahmed",
//     confidence: 0.97
//   },
//   dateOfBirth: "1985-03-15",
//   documentNumber: "A12345678",
//   detectedScripts: ["arabic", "latin"],
//   extractionConfidence: 0.96
// }
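
One way to act on the confidence values in a response like the one above is to flag low-scoring fields for manual review. The sketch below is a hedged example: the sample object mirrors the shape printed in the snippet, while `fieldsNeedingReview` and the 0.9 default threshold are assumptions chosen for illustration, not part of the Owl Eyes SDK or a recommended policy.

```javascript
// Sample data mirroring the response shape shown in the snippet above.
const extractedData = {
  name: { full: "محمد أحمد", latin: "Mohammed Ahmed", confidence: 0.97 },
  dateOfBirth: "1985-03-15",
  documentNumber: "A12345678",
  detectedScripts: ["arabic", "latin"],
  extractionConfidence: 0.96,
};

// Hypothetical helper: list every field whose confidence is below the threshold.
// The threshold is an assumed policy value, to be tuned per use case.
function fieldsNeedingReview(data, threshold = 0.9) {
  const flagged = [];
  for (const [field, value] of Object.entries(data)) {
    const isScored =
      value !== null &&
      typeof value === "object" &&
      typeof value.confidence === "number";
    if (isScored && value.confidence < threshold) {
      flagged.push(field);
    }
  }
  // Also check the document-level score.
  if (
    typeof data.extractionConfidence === "number" &&
    data.extractionConfidence < threshold
  ) {
    flagged.push("extractionConfidence");
  }
  return flagged;
}

// With the default threshold nothing is flagged for this sample.
console.log(fieldsNeedingReview(extractedData));

// A stricter threshold flags both the name field and the overall score.
console.log(fieldsNeedingReview(extractedData, 0.99));
```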

Go Global with Confidence

Multi-script OCR is live and ready to verify documents from anywhere in the world. Expand your reach without expanding your complexity.


Technical Specifications

Supported Scripts

  • Latin (English, Spanish, French, German, etc.)
  • Arabic (RTL with bidirectional support)
  • Hebrew (RTL)
  • Cyrillic (Russian, Ukrainian, Bulgarian)
  • Chinese (Simplified & Traditional)
  • Japanese (Kanji, Hiragana, Katakana)
  • Korean (Hangul)
  • Devanagari (Hindi, Sanskrit, Nepali)
  • Tamil, Telugu, Kannada, Malayalam
  • Thai, Greek, Georgian, Armenian

Performance

  • Processing: <300ms per document
  • Latin accuracy: 96-98%
  • Arabic accuracy: 94%+
  • CJK accuracy: 96%+
  • Cyrillic accuracy: 95%+
  • Script detection: >99%
  • Batch processing supported

Document Types

  • Passports (all countries)
  • National ID cards
  • Driver's licenses
  • Residence permits
  • Travel documents
  • Visa documents
  • Birth certificates

Output Format

  • Unicode-normalized (NFC)
  • Per-field confidence scores
  • Bounding box coordinates
  • Detected script types
  • Original + transliterated names
  • Consistent JSON structure
  • Webhook delivery

Frequently Asked Questions

What is multi-script OCR?

Multi-script OCR (Optical Character Recognition) is technology that reads and extracts text from documents written in multiple writing systems: not just English, but also Arabic, Chinese, Hindi, Russian, Japanese, Korean, and other scripts. Owl Eyes supports 100+ languages across 12+ script families.

Which languages does Owl Eyes OCR support?

Owl Eyes supports 100+ languages including English, Spanish, French, German, Arabic, Hebrew, Persian, Chinese (Simplified/Traditional), Japanese, Korean, Russian, Ukrainian, Hindi, Tamil, Telugu, Thai, Greek, and many more. We support all major writing systems used in government-issued identity documents.

How does Owl Eyes handle Arabic and Hebrew documents?

Our OCR engine includes specialized RTL (right-to-left) processing with the Unicode Bidirectional Algorithm. This ensures proper text ordering for Arabic, Hebrew, Persian, and Urdu documents—including mixed-direction text where numbers and Latin characters appear alongside RTL text.
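The first step of the Unicode Bidirectional Algorithm is choosing a paragraph's base direction from its first strongly directional character. The sketch below approximates that step in plain JavaScript. It is a simplification under stated assumptions: JavaScript regular expressions do not expose Unicode bidi classes, so letters in the Arabic and Hebrew scripts stand in for the strong right-to-left classes, other right-to-left scripts are ignored, and `baseDirection` is an illustrative name. The full algorithm also resolves embedding levels and reorders runs, which this does not attempt.

```javascript
const ANY_LETTER = /\p{L}/u;
const RTL_LETTER = /[\p{Script=Arabic}\p{Script=Hebrew}]/u;

// Approximate the paragraph base direction: the first letter decides.
// Digits, spaces, and punctuation are not strong and are skipped.
// Returns "neutral" when no letter is found (the full algorithm would
// default to left-to-right in that case).
function baseDirection(text) {
  for (const ch of text) {
    if (!ANY_LETTER.test(ch)) continue;
    return RTL_LETTER.test(ch) ? "rtl" : "ltr";
  }
  return "neutral";
}

console.log(baseDirection("محمد أحمد"));      // Arabic name
console.log(baseDirection("123 שלום"));       // digits first, then Hebrew
console.log(baseDirection("A12345678 محمد")); // Latin letter comes first
console.log(baseDirection("1985-03-15"));     // no letters at all
```

A field such as a document number followed by an Arabic name shows why the base direction has to be decided per field rather than per document.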

What is the accuracy of multi-script OCR?

Owl Eyes achieves 96-98% accuracy on Latin scripts, 94%+ on Arabic, 96%+ on CJK languages, and 95%+ on Cyrillic. Our models are specifically trained on identity documents including passports, national IDs, and driver's licenses from countries worldwide.

How fast is document processing?

Owl Eyes processes documents in under 300 milliseconds regardless of script complexity. That is roughly 7 to 10 times faster than cloud OCR APIs like Google Document AI or AWS Textract, which typically take 2-3 seconds per document.

Is there additional cost for multi-script OCR?

No. Multi-script OCR is included with every Owl Eyes verification at no additional charge. Cloud OCR APIs charge $1.50-$2.00 per 1,000 documents as a separate service, whereas OCR is built into our verification pricing, a saving we estimate at 75-150x at scale compared to separate OCR services.