Solving CAPTCHAs on Chinese Websites with CaptchaAI

Chinese websites use CAPTCHA types rarely seen in Western markets: Chinese character image CAPTCHAs, arithmetic puzzles written in Chinese, GeeTest slide verifications, and proprietary systems such as Tencent CAPTCHA. If you collect data from Baidu, Taobao, government portals, or academic databases, you need to handle these region-specific challenges.

Common CAPTCHA Types on Chinese Websites

| CAPTCHA type | Where used | CaptchaAI solver |
|---|---|---|
| Chinese character image | Government portals, academic databases | Image/OCR with language=2 |
| Arithmetic in Chinese | Registration forms | Image/OCR |
| GeeTest v3 slide puzzle | Baidu, Bilibili, many major platforms | GeeTest v3 |
| Click-on-character | Prompts such as "Click the Chinese characters in order" | Image/OCR (coordinate mode) |
| reCAPTCHA v2 | International Chinese sites | reCAPTCHA v2 |

Python: Chinese Character Image CAPTCHA

Chinese image CAPTCHAs usually sit beside a prompt such as 请输入验证码 ("please enter the verification code"), and the image itself contains Chinese characters that must be recognized by OCR.

import requests
import base64
import time

API_KEY = "YOUR_API_KEY"
SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"


def solve_chinese_image_captcha(image_path: str) -> str:
    """Solve a Chinese character image CAPTCHA."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    resp = requests.post(SUBMIT_URL, data={
        "key": API_KEY,
        "method": "base64",
        "body": image_b64,
        "language": 2,          # 2 = Chinese character recognition
        "json": 1,
    }, timeout=30).json()

    if resp.get("status") != 1:
        raise RuntimeError(f"Submit failed: {resp.get('request')}")

    task_id = resp["request"]
    start = time.monotonic()

    while time.monotonic() - start < 120:
        time.sleep(5)
        poll = requests.get(RESULT_URL, params={
            "key": API_KEY, "action": "get",
            "id": task_id, "json": 1,
        }, timeout=15).json()

        if poll.get("request") == "CAPCHA_NOT_READY":
            continue
        if poll.get("status") == 1:
            return poll["request"]
        raise RuntimeError(f"Solve failed: {poll.get('request')}")

    raise RuntimeError("Timeout")


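def solve_chinese_captcha_from_url(captcha_url: str, cookies: dict | None = None) -> str:
    """Download and solve a Chinese CAPTCHA from a URL."""
    session = requests.Session()
    if cookies:
        session.cookies.update(cookies)

    # Fetch the image with the session cookies so it matches the server-side challenge.
    resp = session.get(captcha_url, timeout=15)
    resp.raise_for_status()  # fail early on HTTP errors instead of submitting an error page
    image_b64 = base64.b64encode(resp.content).decode()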

    submit = requests.post(SUBMIT_URL, data={
        "key": API_KEY,
        "method": "base64",
        "body": image_b64,
        "language": 2,
        "json": 1,
    }, timeout=30).json()

    if submit.get("status") != 1:
        raise RuntimeError(f"Submit: {submit.get('request')}")

    task_id = submit["request"]
    for _ in range(24):
        time.sleep(5)
        poll = requests.get(RESULT_URL, params={
            "key": API_KEY, "action": "get", "id": task_id, "json": 1,
        }, timeout=15).json()

        if poll.get("request") == "CAPCHA_NOT_READY":
            continue
        if poll.get("status") == 1:
            return poll["request"]
        raise RuntimeError(f"Solve: {poll.get('request')}")

    raise RuntimeError("Timeout")


# --- GeeTest on Chinese platforms ---

def solve_geetest_chinese(gt: str, challenge: str, pageurl: str) -> dict:
    """Solve GeeTest v3 commonly found on Baidu, Bilibili, etc."""
    resp = requests.post(SUBMIT_URL, data={
        "key": API_KEY,
        "method": "geetest",
        "gt": gt,
        "challenge": challenge,
        "pageurl": pageurl,
        "json": 1,
    }, timeout=30).json()

    if resp.get("status") != 1:
        raise RuntimeError(f"Submit: {resp.get('request')}")

    task_id = resp["request"]
    for _ in range(36):
        time.sleep(5)
        poll = requests.get(RESULT_URL, params={
            "key": API_KEY, "action": "get", "id": task_id, "json": 1,
        }, timeout=15).json()

        if poll.get("request") == "CAPCHA_NOT_READY":
            continue
        if poll.get("status") == 1:
            # GeeTest returns challenge, validate, seccode
            return poll["request"]
        raise RuntimeError(f"Solve: {poll.get('request')}")

    raise RuntimeError("Timeout")


# Usage: Chinese government portal
text = solve_chinese_image_captcha("chinese_captcha.png")
print(f"Chinese CAPTCHA text: {text}")

# GeeTest on a Chinese platform
geetest_result = solve_geetest_chinese(
    gt="b46d1900d0a894591f1561f8c35670a7",
    challenge="dynamic_challenge_string",  # placeholder: fetch a fresh value for each session
    pageurl="https://www.example.cn/login",
)
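
The three Python functions above repeat the same polling loop. One possible way to factor it out is sketched below. The name `wait_for_result` and its `fetch` and `sleep` parameters are hypothetical and not part of CaptchaAI's API; `fetch` stands for any callable that performs one `res.php` request and returns the parsed JSON.

```python
import time
from typing import Any, Callable


def wait_for_result(
    fetch: Callable[[], dict],
    timeout: float = 120.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the solver returns a result, reports an error, or the deadline passes.

    `fetch` is a hypothetical callable that performs one res.php request
    and returns the decoded JSON response as a dict.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        sleep(interval)
        poll = fetch()
        if poll.get("request") == "CAPCHA_NOT_READY":
            continue  # still being solved, ask again after the next interval
        if poll.get("status") == 1:
            return poll["request"]
        raise RuntimeError(f"Solve failed: {poll.get('request')}")
    raise TimeoutError(f"No result after {timeout} seconds")
```

With `requests`, `fetch` could be a lambda that calls `requests.get(RESULT_URL, params={...}, timeout=15).json()`. Injecting `sleep` keeps the helper easy to test without waiting.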

JavaScript: Chinese Website CAPTCHA Flow

const API_KEY = "YOUR_API_KEY";
const SUBMIT_URL = "https://ocr.captchaai.com/in.php";
const RESULT_URL = "https://ocr.captchaai.com/res.php";
const fs = require("fs");

async function solveChineseImageCaptcha(imagePath) {
  const imageB64 = fs.readFileSync(imagePath, "base64");

  const body = new URLSearchParams({
    key: API_KEY,
    method: "base64",
    body: imageB64,
    language: "2",
    json: "1",
  });

  const resp = await (await fetch(SUBMIT_URL, { method: "POST", body })).json();
  if (resp.status !== 1) throw new Error(`Submit: ${resp.request}`);

  const taskId = resp.request;
  for (let i = 0; i < 24; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    const url = `${RESULT_URL}?key=${API_KEY}&action=get&id=${taskId}&json=1`;
    const poll = await (await fetch(url)).json();
    if (poll.request === "CAPCHA_NOT_READY") continue;
    if (poll.status === 1) return poll.request;
    throw new Error(`Solve: ${poll.request}`);
  }
  throw new Error("Timeout");
}

async function solveGeeTest(gt, challenge, pageurl) {
  const body = new URLSearchParams({
    key: API_KEY,
    method: "geetest",
    gt,
    challenge,
    pageurl,
    json: "1",
  });

  const resp = await (await fetch(SUBMIT_URL, { method: "POST", body })).json();
  if (resp.status !== 1) throw new Error(`Submit: ${resp.request}`);

  const taskId = resp.request;
  for (let i = 0; i < 36; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    const url = `${RESULT_URL}?key=${API_KEY}&action=get&id=${taskId}&json=1`;
    const poll = await (await fetch(url)).json();
    if (poll.request === "CAPCHA_NOT_READY") continue;
    if (poll.status === 1) return poll.request;
    throw new Error(`Solve: ${poll.request}`);
  }
  throw new Error("Timeout");
}

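// Usage (wrapped in an async function because CommonJS has no top-level await)
(async () => {
  const text = await solveChineseImageCaptcha("chinese_captcha.png");
  console.log(`Chinese text: ${text}`);
})().catch(console.error);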

Tips for Chinese Website Scraping

| Challenge | Solution |
|---|---|
| Characters render incorrectly | Ensure UTF-8 encoding in requests and responses |
| Session cookies required | Fetch the CAPTCHA image within the same session |
| GeeTest parameters in JavaScript | Extract gt and challenge from page scripts or API calls |
| Rate limiting by Chinese CDNs | Use proxies with Chinese IP addresses for better reliability |
| CAPTCHA refreshes on each load | Download the image once, solve it, then submit; don't reload |
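
The session and refresh tips can be combined into one flow: fetch the image once, solve it, and submit the form through the same session. The sketch below is illustrative only. The function name `fetch_solve_submit`, the `solve` callable, and the default form field name `captcha` are assumptions; `session` is expected to behave like `requests.Session`.

```python
import base64
from typing import Any, Callable


def fetch_solve_submit(
    session: Any,
    captcha_url: str,
    form_url: str,
    form_data: dict,
    solve: Callable[[str], str],
    captcha_field: str = "captcha",  # hypothetical field name, check the real form
) -> Any:
    """Fetch the CAPTCHA image once, solve it, and submit the form,
    all through the same session so the cookies stay consistent."""
    image = session.get(captcha_url, timeout=15)
    image.raise_for_status()
    image_b64 = base64.b64encode(image.content).decode()

    # `solve` takes the base64 image and returns the recognized text.
    answer = solve(image_b64)

    payload = {**form_data, captcha_field: answer}
    return session.post(form_url, data=payload, timeout=15)
```

Because the image is downloaded exactly once, the answer still matches the challenge the server stored for that session when the form arrives.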

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| Chinese characters garbled in result | Encoding mismatch | Ensure response is decoded as UTF-8 |
| GeeTest challenge expired | Challenge has short TTL | Extract and submit within seconds of loading |
| Image CAPTCHA returns wrong text | Mix of Chinese and Latin characters | Set language=2 for Chinese character recognition |
| Session rejected after solve | Cookies not maintained | Use the same session for CAPTCHA fetch and form submit |
| Solve rate low on government sites | Complex character rendering | Higher-resolution CAPTCHA images improve accuracy |
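
For the garbled-characters case, a minimal sketch of explicit UTF-8 decoding plus a quick sanity check is shown here. Both function names are hypothetical helpers, and the check only looks for the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), which is an assumption about what a valid answer contains.

```python
import json


def decode_solver_response(raw: bytes) -> dict:
    """Decode a raw response body as UTF-8 before parsing the JSON."""
    return json.loads(raw.decode("utf-8"))


def contains_cjk(text: str) -> bool:
    """Return True if the text has at least one character in U+4E00..U+9FFF."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)
```

With `requests`, the raw bytes are available as `resp.content`. If `resp.text` comes back garbled on a page fetch, setting `resp.encoding = "utf-8"` before reading it is another option.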

FAQ

Does CaptchaAI support simplified and traditional Chinese?

Yes. The Image/OCR solver handles both simplified (简体) and traditional (繁體) Chinese characters. Set language=2 to enable Chinese character recognition. CaptchaAI recognizes over 27,500 image CAPTCHA types across all languages.

How do I handle GeeTest on Chinese platforms like Bilibili?

Extract the gt parameter (static per site) and the challenge parameter (dynamic per session) from the page source or an API endpoint. Submit both to CaptchaAI's GeeTest solver. The response includes challenge, validate, and seccode values to submit with your request.
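
A rough sketch of both ends of that flow follows. The regular expressions are assumptions about how a page might embed the values, and the helper names are hypothetical. The `geetest_challenge`, `geetest_validate`, and `geetest_seccode` field names follow the usual GeeTest v3 form convention, but a given site may name them differently, so check the real request in your browser's network tab.

```python
import json
import re
from typing import Optional

# Assumed patterns: adjust them to the markup of the site you are working with.
_GT_RE = re.compile(r'\bgt["\']?\s*[:=]\s*["\']([0-9a-f]{32})["\']')
_CHALLENGE_RE = re.compile(r'\bchallenge["\']?\s*[:=]\s*["\']([0-9a-z]{32,34})["\']')


def params_from_json(body: str) -> dict:
    """Read gt and challenge from a JSON API response that exposes them."""
    data = json.loads(body)
    return {"gt": data["gt"], "challenge": data["challenge"]}


def params_from_page(html: str) -> Optional[dict]:
    """Search inline page scripts for gt and challenge; None if either is missing."""
    gt = _GT_RE.search(html)
    challenge = _CHALLENGE_RE.search(html)
    if not (gt and challenge):
        return None
    return {"gt": gt.group(1), "challenge": challenge.group(1)}


def geetest_form_fields(solution: dict) -> dict:
    """Map the solver result to GeeTest v3 style form fields.

    Accepts either short keys (challenge) or prefixed keys (geetest_challenge),
    since the exact shape of the solver response is an assumption here.
    """
    def pick(name: str) -> str:
        value = solution.get(name) or solution.get(f"geetest_{name}")
        if not value:
            raise KeyError(f"missing {name} in solver result")
        return value

    return {
        "geetest_challenge": pick("challenge"),
        "geetest_validate": pick("validate"),
        "geetest_seccode": pick("seccode"),
    }
```

Because the challenge expires quickly, extract it immediately before submitting to the solver instead of caching it.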

Are proxies required for Chinese websites?

Many Chinese websites perform IP-based geo-checks, so proxies with Chinese IP addresses improve success rates. If you pass a proxy to CaptchaAI, residential Chinese proxies give the best results.
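
On the scraping side, a proxy can be attached to the same `requests` session that fetches the CAPTCHA and submits the form. The helper below is a small sketch; `build_proxies` is a hypothetical name, and the host used in the usage note is a placeholder from a documentation address range, not a real proxy.

```python
from urllib.parse import quote


def build_proxies(host: str, port: int, user: str = "", password: str = "") -> dict:
    """Build a proxies mapping in the format that requests accepts."""
    # Percent-encode credentials so characters like "@" do not break the URL.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}@" if user else ""
    proxy_url = f"http://{auth}{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}
```

For example, `session.proxies.update(build_proxies("203.0.113.10", 8080))` routes every request made through that session via the proxy, so the image fetch and the form submit leave from the same IP address.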

Next Steps

Start solving CAPTCHAs on Chinese websites: get your CaptchaAI API key and handle any Chinese CAPTCHA type.

Full Working Code

Complete runnable examples for this article in Python, Node.js, PHP, Go, Java, C#, Ruby, Rust, Kotlin & Bash.
