
CAPTCHA Localization: How Language Settings Affect Challenges

The same website shows a reCAPTCHA challenge in English to one visitor and in Japanese to another. A Cloudflare Turnstile widget renders its loading text in the browser's language. Some sites serve completely different CAPTCHA types based on the visitor's detected region. Understanding how localization affects CAPTCHAs helps you handle them correctly in automation.

What Changes with Locale

| CAPTCHA provider | What localizes | What stays the same |
|---|---|---|
| reCAPTCHA | UI text, image labels, audio language | Sitekey, verification flow, token format |
| Turnstile | Widget text and error messages | Sitekey, token format, solve mechanism |
| hCaptcha | Challenge instructions, category labels | Sitekey, token format |
| Image/OCR | Character set, language of text | Image format, submit/poll flow |

How Language Gets Detected

CAPTCHA providers determine language through several signals:

1. Accept-Language Header

Accept-Language: ja-JP,ja;q=0.9,en-US;q=0.8,en;q=0.7

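This tells the server to prefer Japanese as used in Japan, then any Japanese, then US English, then any English. Entries without a q value default to q=1. reCAPTCHA and Turnstile use this preference order to select the UI language when no language is set explicitly.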
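To make the preference order concrete, here is a minimal sketch of how a server might rank the tags in that header. The function name parseAcceptLanguage is hypothetical, and real servers may apply extra matching rules beyond simple q-value sorting.

```javascript
// Parse an Accept-Language header into tags ordered by preference.
// Entries with no q parameter default to q=1; entries with q=0 are dropped.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { tag: tag.trim(), q: Number.isNaN(q) ? 0 : q, index };
    })
    .filter((entry) => entry.tag.length > 0 && entry.q > 0)
    // Higher q first; ties keep their original order in the header.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map(({ tag, q }) => ({ tag, q }));
}

const preferences = parseAcceptLanguage('ja-JP,ja;q=0.9,en-US;q=0.8,en;q=0.7');
console.log(preferences.map((p) => p.tag)); // [ 'ja-JP', 'ja', 'en-US', 'en' ]
```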

2. HTML hl Parameter

reCAPTCHA accepts an explicit language parameter when loaded:

<!-- Force English reCAPTCHA -->
<script src="https://www.google.com/recaptcha/api.js?hl=en"></script>

<!-- Force Japanese -->
<script src="https://www.google.com/recaptcha/api.js?hl=ja"></script>

The hl parameter overrides the Accept-Language header. You don't need to replicate it when solving: CaptchaAI returns a token regardless of the widget's UI language.
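When inspecting a target page, it can help to check whether the script tag pins a language. The sketch below builds and reads the hl query parameter with the standard URL API; the helper names recaptchaScriptUrl and readHl are hypothetical.

```javascript
// Build a reCAPTCHA loader URL, optionally pinning the UI language via hl.
function recaptchaScriptUrl(languageCode) {
  const url = new URL('https://www.google.com/recaptcha/api.js');
  if (languageCode) url.searchParams.set('hl', languageCode);
  return url.toString();
}

// Read hl back from a script src. null means no language was pinned,
// so the widget falls back to its own language detection.
function readHl(scriptSrc) {
  return new URL(scriptSrc).searchParams.get('hl');
}

console.log(recaptchaScriptUrl('ja')); // https://www.google.com/recaptcha/api.js?hl=ja
console.log(readHl('https://www.google.com/recaptcha/api.js?hl=en')); // en
console.log(readHl('https://www.google.com/recaptcha/api.js?render=explicit')); // null
```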

3. Geo-IP Location

Some CAPTCHA configurations vary by region:

| Signal | Effect |
|---|---|
| IP from China | May get GeeTest instead of reCAPTCHA (www.google.com, reCAPTCHA's default host, is blocked in mainland China) |
| IP from EU | May see GDPR consent before CAPTCHA |
| IP from restricted region | May get stricter challenges |

4. Browser navigator.language

JavaScript-based CAPTCHAs read the browser's language:

navigator.language       // "en-US"
navigator.languages      // ["en-US", "en", "ja"]

In headless browsers, these default to the system locale. Set them explicitly to match your target:

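// Playwright: locale sets both navigator.language and the Accept-Language header
const context = await browser.newContext({
  locale: 'ja-JP',
});

// Puppeteer: this changes only the Accept-Language request header.
// navigator.language still follows the browser's own locale; launching with
// args: ['--lang=ja-JP'] is a common way to change it as well.
const page = await browser.newPage();
await page.setExtraHTTPHeaders({
  'Accept-Language': 'ja-JP,ja;q=0.9',
});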
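
If you set the header by hand, it should look like what a real browser with the same language list would send. The sketch below derives an Accept-Language value from a navigator.languages-style array, assuming the common browser pattern of lowering q by 0.1 per entry. The function name buildAcceptLanguage is hypothetical.

```javascript
// Derive an Accept-Language header from an ordered language list.
// Assumption: q drops by 0.1 per entry (first entry has implicit q=1),
// with a floor of 0.1 for long lists.
function buildAcceptLanguage(languages) {
  return languages
    .map((tag, i) => {
      if (i === 0) return tag;
      const q = Math.max(1, 10 - i) / 10;
      return `${tag};q=${q}`;
    })
    .join(',');
}

const languages = ['ja-JP', 'ja', 'en-US', 'en'];
console.log(buildAcceptLanguage(languages)); // ja-JP,ja;q=0.9,en-US;q=0.8,en;q=0.7
```

The resulting string can be passed to setExtraHTTPHeaders so the header and the language list stay in step.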

Impact on Solving

Token-Based CAPTCHAs (reCAPTCHA, Turnstile, hCaptcha)

Language settings affect the UI but not the token. CaptchaAI's solving process is language-independent:

  • Submit the sitekey and page URL
  • CaptchaAI returns a valid token
  • The token works regardless of what language the CAPTCHA widget displays

No language parameter needed when calling CaptchaAI for token-based CAPTCHAs.
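
As a rough illustration of why language does not matter here, the sketch below pulls the sitekey out of two localized versions of the same page and builds the same task from each. The data-sitekey attribute is how reCAPTCHA, hCaptcha, and Turnstile widgets are commonly embedded, but sites that render the widget from JavaScript will not have it. The task object is a hypothetical shape, not CaptchaAI's documented request format.

```javascript
// Find the first data-sitekey attribute in a page's HTML, or null if absent.
function findSitekey(html) {
  const match = html.match(/data-sitekey\s*=\s*["']([^"']+)["']/i);
  return match ? match[1] : null;
}

// Hypothetical task shape: only the sitekey and page URL, no language field.
function buildTokenTask(html, pageUrl) {
  const sitekey = findSitekey(html);
  if (!sitekey) throw new Error('No data-sitekey attribute found');
  return { sitekey, pageUrl };
}

// Placeholder pages: same widget, different page language.
const englishPage = '<html lang="en"><div class="g-recaptcha" data-sitekey="EXAMPLE_SITEKEY"></div></html>';
const japanesePage = '<html lang="ja"><div class="g-recaptcha" data-sitekey="EXAMPLE_SITEKEY"></div></html>';

console.log(buildTokenTask(englishPage, 'https://example.com/login'));
console.log(buildTokenTask(japanesePage, 'https://example.com/login'));
```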

Image CAPTCHAs

Language directly affects the characters in the image:

| Site language | CAPTCHA content | CaptchaAI language param |
|---|---|---|
| English | "Enter the text: XKCD42" | 0 (default/Latin) |
| Russian | "Введите текст: ШКАФ" | 1 (Cyrillic) or 2 |
| Chinese | "请输入验证码: 汉字" | 2 (non-Latin) |
| Arabic | "أدخل النص: عربي" | 2 (non-Latin) |
| Japanese | "文字を入力: ひらがな" | 2 (non-Latin) |
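
One way to choose the parameter automatically is to look at the script of a text sample from the page, such as the CAPTCHA's instruction label. The sketch below maps a sample to the values listed in the table above (0, 1, 2). Those values are taken from this article and should be confirmed against CaptchaAI's API reference. The function name languageParamForText is hypothetical.

```javascript
// Map a text sample to the language values from the table above:
// 1 if it contains Cyrillic, 2 if it contains any other non-Latin letter,
// 0 otherwise (Latin letters, digits, punctuation only).
function languageParamForText(sample) {
  if (/\p{Script=Cyrillic}/u.test(sample)) return 1;
  for (const ch of sample) {
    const isLetter = /\p{L}/u.test(ch);
    const isLatin = /\p{Script=Latin}/u.test(ch);
    if (isLetter && !isLatin) return 2;
  }
  return 0;
}

console.log(languageParamForText('Enter the text')); // 0
console.log(languageParamForText('Введите текст')); // 1
console.log(languageParamForText('请输入验证码')); // 2
```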

Audio CAPTCHAs

reCAPTCHA audio challenges are spoken in the language matching the hl parameter or Accept-Language header. CaptchaAI handles these through its standard reCAPTCHA solving flow — the solving method doesn't depend on audio language.

Common Localization Issues

Mismatched Language Between Scraper and Target

If your scraper sends Accept-Language: en-US to a Japanese site, the CAPTCHA may render in English — which is fine for token-based CAPTCHAs but may cause issues if the site validates language consistency.
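
A simple pre-flight check can catch the most obvious mismatch before a site does. The sketch below compares the first language in the Accept-Language header with the first entry of navigator.languages. It assumes the header lists tags in preference order and compares only the primary subtag; the function names are hypothetical, and real sites may check other signals too.

```javascript
// Reduce a tag like "ja-JP" to its primary language subtag, "ja".
function primaryLanguage(tag) {
  return tag.trim().split('-')[0].toLowerCase();
}

// Extract the tags from an Accept-Language header, ignoring q values and "*".
function headerLanguages(acceptLanguage) {
  return acceptLanguage
    .split(',')
    .map((part) => part.split(';')[0].trim())
    .filter((tag) => tag.length > 0 && tag !== '*');
}

// Compare the top header language with the top navigator.languages entry.
function checkLocaleConsistency(acceptLanguage, navigatorLanguages) {
  const fromHeader = headerLanguages(acceptLanguage);
  const problems = [];
  if (fromHeader.length === 0 || navigatorLanguages.length === 0) {
    problems.push('missing language signal');
  } else if (primaryLanguage(fromHeader[0]) !== primaryLanguage(navigatorLanguages[0])) {
    problems.push(`header prefers ${fromHeader[0]} but navigator reports ${navigatorLanguages[0]}`);
  }
  return { consistent: problems.length === 0, problems };
}

console.log(checkLocaleConsistency('en-US,en;q=0.9', ['ja-JP', 'ja']));
console.log(checkLocaleConsistency('ja-JP,ja;q=0.9', ['ja-JP', 'ja']));
```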

Regional CAPTCHA Provider Differences

Some countries use different CAPTCHA providers:

| Region | Typical providers |
|---|---|
| Western markets | reCAPTCHA, Turnstile, hCaptcha |
| China | GeeTest, Tencent CAPTCHA, custom image |
| Russia/CIS | Custom image CAPTCHAs, reCAPTCHA |
| South Korea | Custom sliders, image CAPTCHAs |
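
Because the provider can change with the visitor's region, it is worth detecting which one a page actually served rather than assuming. The sketch below scans HTML for script hosts commonly associated with each provider. The signature list is a heuristic assumption, not an exhaustive or official registry, and the sample HTML strings are placeholders.

```javascript
// Heuristic signatures: script URL fragments commonly seen for each provider.
const PROVIDER_SIGNATURES = [
  { provider: 'reCAPTCHA', pattern: /(?:www\.google\.com|www\.recaptcha\.net)\/recaptcha\// },
  { provider: 'Turnstile', pattern: /challenges\.cloudflare\.com\/turnstile\// },
  { provider: 'hCaptcha', pattern: /hcaptcha\.com\/1\/api\.js/ },
  { provider: 'GeeTest', pattern: /geetest\.com/ },
];

// Return the names of all providers whose signature appears in the HTML.
function detectProviders(html) {
  return PROVIDER_SIGNATURES
    .filter((signature) => signature.pattern.test(html))
    .map((signature) => signature.provider);
}

// Placeholder responses for the same URL fetched from two regions.
const fromUsProxy = '<script src="https://www.google.com/recaptcha/api.js?hl=en"></script>';
const fromCnProxy = '<script src="https://static.geetest.com/v4/gt4.js"></script>';

console.log(detectProviders(fromUsProxy)); // [ 'reCAPTCHA' ]
console.log(detectProviders(fromCnProxy)); // [ 'GeeTest' ]
```

Running the same check on responses fetched through proxies in different regions shows whether the site switches providers by geography.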

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| reCAPTCHA shows different language than expected | hl parameter in script tag vs Accept-Language mismatch | Token is language-independent; doesn't affect solving |
| Image CAPTCHA wrong characters recognized | Language param doesn't match CAPTCHA script | Set language=2 for non-Latin CAPTCHAs |
| Site serves different CAPTCHA type by region | Geo-IP-based provider selection | Use proxy matching the target region |
| Headless browser shows wrong locale | Default system locale used | Set locale explicitly in browser context |
| Audio CAPTCHA in unexpected language | hl parameter overrides header | Doesn't affect CaptchaAI token-based solving |

FAQ

Does CaptchaAI need to know the CAPTCHA's display language?

For token-based CAPTCHAs (reCAPTCHA, Turnstile, hCaptcha), no. The solving process is language-independent. For Image/OCR CAPTCHAs, yes — set the language parameter to match the character set displayed in the image.

Should I match my Accept-Language header to the target site?

It's good practice for consistency. Some sites check for language mismatches between headers and other signals. Set your Accept-Language header to match the site's primary language to minimize detection risk.

Can the same sitekey show different CAPTCHA difficulty by locale?

Yes. CAPTCHA providers may adjust difficulty based on regional risk scores. Traffic from certain regions may face harder challenges. This doesn't affect CaptchaAI's solving — the API handles challenges of any difficulty.

Next Steps

Handle CAPTCHAs in any locale — get your CaptchaAI API key and configure language settings correctly.
