
Cyrillic Text CAPTCHA Solving with CaptchaAI

Russian, Ukrainian, Bulgarian, and Serbian websites often use Cyrillic text CAPTCHAs that look deceptively similar to Latin ones. Characters such as А, В, С, Е, Н, and О look identical to their Latin counterparts but are entirely different Unicode codepoints. This creates recognition and submission problems that Latin-only OCR does not handle.

Cyrillic vs. Latin Confusable Characters

| Looks like | Latin | Cyrillic | Note |
|---|---|---|---|
| A | A (U+0041) | А (U+0410) | Different codepoints |
| B | B (U+0042) | В (U+0412) | Cyrillic is "Ve" |
| C | C (U+0043) | С (U+0421) | Cyrillic is "Es" |
| E | E (U+0045) | Е (U+0415) | Different codepoints |
| H | H (U+0048) | Н (U+041D) | Cyrillic is "En" |
| O | O (U+004F) | О (U+041E) | Different codepoints |
| P | P (U+0050) | Р (U+0420) | Cyrillic is "Er" |

Submitting the wrong codepoint causes form validation to reject correct-looking text.
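
To see why two visually identical strings fail an equality check, it can help to print each character's codepoint and official Unicode name. The sketch below uses only Python's standard `unicodedata` module; `describe` is a hypothetical helper name chosen for illustration and is not part of CaptchaAI.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text]


latin = "CAB"                      # Latin C, A, B
cyrillic = "\u0421\u0410\u0412"    # Cyrillic Es, A, Ve: renders the same

print(latin == cyrillic)           # False: the codepoints differ
for line in describe(cyrillic):
    print(line)
```

Running `describe` on a rejected CAPTCHA answer shows quickly whether a Latin letter slipped into a Cyrillic word.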

Python: Cyrillic Image CAPTCHA

import requests
import base64
import time

API_KEY = "YOUR_API_KEY"
SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"


def _solve_base64(image_b64: str) -> str:
    """Submit a base64-encoded image and poll until the answer is ready."""
    submit = requests.post(SUBMIT_URL, data={
        "key": API_KEY,
        "method": "base64",
        "body": image_b64,
        "language": 2,          # Non-Latin character support
        "json": 1,
    }, timeout=30).json()

    if submit.get("status") != 1:
        raise RuntimeError(f"Submit: {submit.get('request')}")

    task_id = submit["request"]
    for _ in range(24):         # 24 polls x 5 s = up to 2 minutes
        time.sleep(5)
        poll = requests.get(RESULT_URL, params={
            "key": API_KEY, "action": "get", "id": task_id, "json": 1,
        }, timeout=15).json()

        if poll.get("request") == "CAPCHA_NOT_READY":
            continue            # Still being solved; poll again
        if poll.get("status") == 1:
            return poll["request"]
        raise RuntimeError(f"Solve: {poll.get('request')}")

    raise RuntimeError("Timeout")


def solve_cyrillic_captcha(image_path: str) -> str:
    """Solve a Cyrillic text image CAPTCHA from a local file."""
    with open(image_path, "rb") as f:
        return _solve_base64(base64.b64encode(f.read()).decode())


def solve_cyrillic_from_session(session: requests.Session,
                                captcha_url: str) -> str:
    """Solve a Cyrillic CAPTCHA within a session context."""
    # Fetch the image with the session so cookies match the later form post
    resp = session.get(captcha_url, timeout=15)
    resp.raise_for_status()
    return _solve_base64(base64.b64encode(resp.content).decode())


def verify_cyrillic(text: str) -> bool:
    """Verify that solved text contains Cyrillic characters."""
    return any('\u0400' <= ch <= '\u04FF' for ch in text)


# --- Russian website form flow ---

def solve_russian_form(form_url: str, captcha_url: str,
                       form_data: dict) -> requests.Response:
    """Complete a Russian website form with CAPTCHA."""
    session = requests.Session()
    session.headers.update({
        "Accept-Language": "ru-RU,ru;q=0.9",
    })

    # Establish session
    session.get(form_url, timeout=15)

    # Solve CAPTCHA
    captcha_text = solve_cyrillic_from_session(session, captcha_url)
    print(f"Cyrillic CAPTCHA: {captcha_text}")

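    if verify_cyrillic(captcha_text):
        print("Confirmed: contains Cyrillic characters")
    else:
        print("Warning: answer contains no Cyrillic characters")

    # Copy the form data so the caller's dict is not mutated
    payload = {**form_data, "captcha": captcha_text}
    return session.post(form_url, data=payload, timeout=30)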


# --- Usage ---

text = solve_cyrillic_captcha("russian_captcha.png")
print(f"Solved: {text}")
print(f"Is Cyrillic: {verify_cyrillic(text)}")
print(f"Unicode codepoints: {[hex(ord(c)) for c in text]}")
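
The Cyrillic check can also drive a simple retry: if an answer comes back without any Cyrillic characters, request a new one. The following is a minimal sketch under that assumption. `solve_until_cyrillic` and `contains_cyrillic` are hypothetical helpers, and the solver is passed in as a plain callable so the block stays self-contained.

```python
from typing import Callable


def contains_cyrillic(text: str) -> bool:
    """True if at least one character is in the basic Cyrillic block."""
    return any("\u0400" <= ch <= "\u04FF" for ch in text)


def solve_until_cyrillic(solve: Callable[[str], str], image_path: str,
                         attempts: int = 3) -> str:
    """Call `solve` up to `attempts` times until the answer contains Cyrillic."""
    last = ""
    for _ in range(attempts):
        last = solve(image_path)
        if contains_cyrillic(last):
            return last
    raise ValueError(f"No Cyrillic answer after {attempts} attempts: {last!r}")
```

A call such as `solve_until_cyrillic(solve_cyrillic_captcha, "russian_captcha.png")` would reuse the solver defined earlier. Each attempt submits a new task, so keep `attempts` small.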

JavaScript: Cyrillic CAPTCHA Handling

const API_KEY = "YOUR_API_KEY";
const SUBMIT_URL = "https://ocr.captchaai.com/in.php";
const RESULT_URL = "https://ocr.captchaai.com/res.php";
const fs = require("fs");

async function solveCyrillicCaptcha(imagePath) {
  const imageB64 = fs.readFileSync(imagePath, "base64");

  const body = new URLSearchParams({
    key: API_KEY,
    method: "base64",
    body: imageB64,
    language: "2",
    json: "1",
  });

  const resp = await (await fetch(SUBMIT_URL, { method: "POST", body })).json();
  if (resp.status !== 1) throw new Error(`Submit: ${resp.request}`);

  const taskId = resp.request;
  for (let i = 0; i < 24; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    const url = `${RESULT_URL}?key=${API_KEY}&action=get&id=${taskId}&json=1`;
    const poll = await (await fetch(url)).json();
    if (poll.request === "CAPCHA_NOT_READY") continue;
    if (poll.status === 1) return poll.request;
    throw new Error(`Solve: ${poll.request}`);
  }
  throw new Error("Timeout");
}

function isCyrillic(text) {
  return /[\u0400-\u04FF]/.test(text);
}

function showCodepoints(text) {
  return [...text].map((ch) => `${ch}=U+${ch.codePointAt(0).toString(16).padStart(4, "0")}`);
}

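// Usage (Node.js 18+ for the built-in fetch).
// Wrapped in an async function because top-level await
// is not available in a CommonJS module that uses require().
async function main() {
  const text = await solveCyrillicCaptcha("russian_captcha.png");
  console.log(`Solved: ${text}`);
  console.log(`Is Cyrillic: ${isCyrillic(text)}`);
  console.log(`Codepoints: ${showCodepoints(text).join(", ")}`);
}

main().catch(console.error);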

Common Cyrillic CAPTCHA Patterns

| Pattern | Description | Example |
|---|---|---|
| Pure Cyrillic word | Random Russian word | ШКАФ, ПИРОГ |
| Mixed Latin + Cyrillic | Both scripts in one image | ABСDе (A,B,D Latin; С,е Cyrillic) |
| Cyrillic digits spelled out | Number words | ПЯТЬ (five), ТРИ (three) |
| Math in Russian | Arithmetic in words | два плюс три = ? |
| Distorted Cyrillic | Warped Russian text | Standard OCR challenge with Cyrillic |
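
For the number-word and arithmetic patterns, the recognized text may still need to be turned into a number before submission. Whether a given site expects the digit or the word is an assumption to check per site. The sketch below covers only zero to ten with plus and minus; `parse_number_captcha` and its vocabulary are hypothetical and would need extending for real use.

```python
import re

# Assumed vocabulary: zero to ten only
NUMBERS = {
    "ноль": 0, "один": 1, "два": 2, "три": 3, "четыре": 4, "пять": 5,
    "шесть": 6, "семь": 7, "восемь": 8, "девять": 9, "десять": 10,
}

# Assumed operators: addition and subtraction only
OPERATORS = {
    "плюс": lambda a, b: a + b,
    "минус": lambda a, b: a - b,
}


def parse_number_captcha(text: str) -> int:
    """Turn 'ПЯТЬ' or 'два плюс три = ?' into an integer.

    Raises KeyError for a number word outside NUMBERS and
    ValueError for any other unrecognized shape.
    """
    # Keep only lowercase Russian words; '=', '?' and spaces are dropped
    words = re.findall(r"[а-яё]+", text.lower())
    if len(words) == 1:
        return NUMBERS[words[0]]
    if len(words) == 3 and words[1] in OPERATORS:
        left, op, right = words
        return OPERATORS[op](NUMBERS[left], NUMBERS[right])
    raise ValueError(f"Unrecognized expression: {text!r}")


print(parse_number_captcha("два плюс три = ?"))
print(parse_number_captcha("ПЯТЬ"))
```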

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| Form rejects correct-looking text | Latin/Cyrillic homoglyph mismatch | Check Unicode codepoints: А (U+0410) ≠ A (U+0041) |
| Characters garbled on display | Wrong encoding | Use UTF-8 throughout; set response.encoding = 'utf-8' |
| Mixed script text partially wrong | OCR confused Latin and Cyrillic | CaptchaAI with language=2 distinguishes correctly |
| Ukrainian-specific characters missing | ґ, є, і, ї not recognized | These are supported with language=2 |
| CAPTCHA case sensitivity | Uppercase/lowercase matters | Submit exactly as returned by CaptchaAI |
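
When a site is known to accept only pure Cyrillic answers, one possible workaround for the homoglyph mismatch in the first row is to map stray Latin look-alikes to their Cyrillic counterparts before submitting. This is a sketch, not a CaptchaAI feature: `force_cyrillic` is a hypothetical helper, it covers only the seven uppercase pairs from the confusables table, and it would corrupt answers on sites that genuinely mix scripts.

```python
# Latin capital -> Cyrillic look-alike, limited to the pairs in the table above
LATIN_TO_CYRILLIC = str.maketrans({
    "A": "\u0410",  # Cyrillic A
    "B": "\u0412",  # Cyrillic Ve
    "C": "\u0421",  # Cyrillic Es
    "E": "\u0415",  # Cyrillic Ie
    "H": "\u041D",  # Cyrillic En
    "O": "\u041E",  # Cyrillic O
    "P": "\u0420",  # Cyrillic Er
})


def force_cyrillic(text: str) -> str:
    """Replace Latin look-alike capitals with Cyrillic ones; leave the rest."""
    return text.translate(LATIN_TO_CYRILLIC)


# Cyrillic Sha, Ka, then a Latin 'A', then Cyrillic Ef
mixed = "\u0428\u041aA\u0424"
fixed = force_cyrillic(mixed)
print([hex(ord(ch)) for ch in fixed])
```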

FAQ

How does CaptchaAI distinguish Cyrillic В from Latin B?

CaptchaAI's OCR models are trained on context and glyph features. When language=2 is set, the solver uses Cyrillic-aware models that return proper Unicode codepoints. The returned text will use Cyrillic characters (U+0400–U+04FF) for Russian text.

Does it handle Ukrainian-specific characters?

Yes. Ukrainian uses characters not present in Russian: ґ (U+0491), є (U+0454), і (U+0456), and ї (U+0457). CaptchaAI recognizes these with language=2. The solver handles Cyrillic-script languages including Russian, Ukrainian, Bulgarian, and Serbian.
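
To confirm that these letters survive the round trip, a quick check can list any Ukrainian-specific characters found in a solved answer. The sketch below is an illustration only; `ukrainian_specific_chars` is a hypothetical helper, and the uppercase forms (U+0490, U+0404, U+0406, U+0407) are added alongside the lowercase ones listed above.

```python
# Lowercase and uppercase forms of the four letters named above,
# written as escapes to avoid confusion with Latin 'i'
UKRAINIAN_ONLY = set("\u0491\u0454\u0456\u0457\u0490\u0404\u0406\u0407")


def ukrainian_specific_chars(text: str) -> list[str]:
    """Return the Ukrainian-specific letters in text, in order of appearance."""
    return [ch for ch in text if ch in UKRAINIAN_ONLY]


sample = "\u0457\u0436\u0430\u043a"   # a Ukrainian word that starts with U+0457
print([f"U+{ord(ch):04X}" for ch in ukrainian_specific_chars(sample)])
```

An empty list for a CAPTCHA that visibly contains one of these letters suggests the character was lost or replaced somewhere between the solver and your code.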

What if the CAPTCHA mixes Cyrillic and Latin?

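Some CAPTCHAs intentionally mix scripts to create ambiguity. CaptchaAI returns the text with correct Unicode codepoints for each character. Verify the result by inspecting the codepoints of each character; verify_cyrillic() only confirms that at least one Cyrillic character is present, so it cannot tell a pure Cyrillic answer from a mixed one.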
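
A per-character script check makes mixed answers easy to spot. The sketch below classifies letters by the prefix of their Unicode name using the standard `unicodedata` module; `script_of`, `scripts_used`, and `is_mixed_script` are hypothetical helper names, and the name-prefix test is a simplification rather than a full Unicode script lookup.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Classify one character as CYRILLIC, LATIN, or OTHER by its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("CYRILLIC"):
        return "CYRILLIC"
    if name.startswith("LATIN"):
        return "LATIN"
    return "OTHER"


def scripts_used(text: str) -> set[str]:
    """Return the set of scripts used by the letters in text."""
    return {script_of(ch) for ch in text if ch.isalpha()}


def is_mixed_script(text: str) -> bool:
    """True if text contains both Latin and Cyrillic letters."""
    return {"LATIN", "CYRILLIC"} <= scripts_used(text)


# Latin A, B, D with Cyrillic Es and Ie, as in the mixed-pattern example
answer = "AB\u0421D\u0435"
for ch in answer:
    print(f"{ch} U+{ord(ch):04X} {script_of(ch)}")
print(is_mixed_script(answer))
```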

Next Steps

Solve Cyrillic CAPTCHAs on Russian and other Slavic-language websites: get your CaptchaAI API key.
