Proxy Setup Best Practices for CaptchaAI

Proxies are required for Cloudflare Challenge solving and recommended for high-volume CAPTCHA operations. This guide covers configuration, proxy types, and best practices.

When You Need Proxies

| CAPTCHA Type | Proxy Required? | Reason |
|---|---|---|
| Cloudflare Challenge | ✅ Required | cf_clearance cookie is IP-bound |
| reCAPTCHA v2/v3 | ⚠️ Recommended | Reduces IP risk scoring |
| Cloudflare Turnstile | ❌ Optional | Token is not IP-bound |
| Image/OCR | ❌ Not needed | No browser context |
| GeeTest | ⚠️ Recommended | Some implementations check IP |

Proxy Parameters

CaptchaAI accepts proxies via proxy and proxytype parameters:

proxy=username:password@host:port
proxytype=HTTP

Supported Proxy Types

| Type | Value | Use Case |
|---|---|---|
| HTTP | HTTP | Most common, works everywhere |
| HTTPS | HTTPS | Encrypted proxy connection |
| SOCKS4 | SOCKS4 | Legacy proxy support |
| SOCKS5 | SOCKS5 | Advanced routing, UDP support |
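
The two parameters can be assembled and validated before a request is sent. The following is a minimal sketch, not part of the CaptchaAI API: build_proxy_params is a hypothetical helper, and the unauthenticated host:port form is an assumption for IP-whitelisted proxies.

```python
# Values taken from the "Supported Proxy Types" table above.
SUPPORTED_PROXY_TYPES = {"HTTP", "HTTPS", "SOCKS4", "SOCKS5"}


def build_proxy_params(host, port, username=None, password=None, proxytype="HTTP"):
    """Return the proxy and proxytype request parameters as a dict.

    Hypothetical helper for illustration; it only formats and validates
    strings and does not contact the proxy.
    """
    proxytype = proxytype.upper()
    if proxytype not in SUPPORTED_PROXY_TYPES:
        raise ValueError(f"unsupported proxytype: {proxytype}")
    if not 1 <= int(port) <= 65535:
        raise ValueError(f"port out of range: {port}")

    address = f"{host}:{int(port)}"
    # Assumption: credentials are optional (e.g. IP-whitelisted proxies).
    if username is not None and password is not None:
        address = f"{username}:{password}@{address}"
    return {"proxy": address, "proxytype": proxytype}
```

The returned dict can be merged into the params of a submit request, for example params={"key": API_KEY, **build_proxy_params(...)}.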

Configuration Examples

Cloudflare Challenge (Proxy Required)

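import time

import requests

API_KEY = "YOUR_API_KEY"

# Submit the challenge together with the proxy the solver should use
resp = requests.get("https://ocr.captchaai.com/in.php", params={
    "key": API_KEY,
    "method": "cloudflare_challenge",
    "pageurl": "https://example.com",
    "proxy": "user:pass@proxy.example.com:8080",
    "proxytype": "HTTP",
}, timeout=30)
if not resp.text.startswith("OK|"):
    raise RuntimeError(f"Submit failed: {resp.text}")
task_id = resp.text.split("|", 1)[1]

# Poll for result, stopping if the API reports an error
while True:
    time.sleep(5)
    result = requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY, "action": "get", "id": task_id,
    }, timeout=30)
    if result.text.startswith("OK|"):
        # Returns cf_clearance cookie + user_agent
        data = result.text.split("|", 1)[1]
        print(data)
        break
    if result.text.startswith("ERROR"):
        raise RuntimeError(f"Solve failed: {result.text}")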

# Use the cf_clearance cookie with the SAME proxy
# The cookie is bound to the proxy IP

reCAPTCHA with Proxy

resp = requests.get("https://ocr.captchaai.com/in.php", params={
    "key": API_KEY,
    "method": "userrecaptcha",
    "googlekey": "6Le-wvkS...",
    "pageurl": "https://example.com",
    "proxy": "user:pass@proxy.example.com:8080",
    "proxytype": "HTTP",
})
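
The reCAPTCHA request above only submits the task; the result still has to be polled from res.php, as in the Cloudflare example. The following is a minimal sketch of a reusable polling helper with a timeout. The names fetch_result and wait_for_result are hypothetical, the timeout values are arbitrary, and the sketch assumes that any response not starting with OK| or ERROR means the task is still in progress.

```python
import time
import urllib.parse
import urllib.request

RES_URL = "https://ocr.captchaai.com/res.php"


def fetch_result(api_key, task_id):
    """Fetch the raw res.php response body for one task."""
    query = urllib.parse.urlencode({"key": api_key, "action": "get", "id": task_id})
    with urllib.request.urlopen(f"{RES_URL}?{query}", timeout=30) as response:
        return response.read().decode("utf-8").strip()


def wait_for_result(api_key, task_id, fetch=fetch_result, interval=5.0,
                    timeout=180.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the task returns OK|..., reports an error, or times out."""
    deadline = clock() + timeout
    while clock() < deadline:
        sleep(interval)
        text = fetch(api_key, task_id)
        if text.startswith("OK|"):
            # Everything after the first "|" is the solution payload.
            return text.split("|", 1)[1]
        if text.startswith("ERROR"):
            raise RuntimeError(f"solve failed: {text}")
        # Assumption: any other response means "not ready yet".
    raise TimeoutError(f"no result for task {task_id} within {timeout} seconds")
```

The fetch, sleep, and clock parameters exist so the loop can be exercised without network access; in normal use only api_key and task_id need to be passed.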

Node.js

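import axios from "axios";

const API_KEY = "YOUR_API_KEY";

// Top-level await requires an ES module (e.g. a .mjs file)
const resp = await axios.get("https://ocr.captchaai.com/in.php", {
  params: {
    key: API_KEY,
    method: "cloudflare_challenge",
    pageurl: "https://example.com",
    proxy: "user:pass@proxy.example.com:8080",
    proxytype: "HTTP",
  },
});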

Proxy Rotation

For high-volume scraping, rotate proxies to avoid IP-based blocking:

import random

PROXIES = [
    "user:pass@us1.proxy.com:8080",
    "user:pass@us2.proxy.com:8080",
    "user:pass@eu1.proxy.com:8080",
    "user:pass@eu2.proxy.com:8080",
]

def solve_with_rotation(site_key, page_url):
    proxy = random.choice(PROXIES)

    resp = requests.get("https://ocr.captchaai.com/in.php", params={
        "key": API_KEY,
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
        "proxy": proxy,
        "proxytype": "HTTP",
    })
    return resp.text
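
Random selection can keep returning a proxy that has already failed. The following is a minimal sketch of an alternative, assuming a round-robin policy with a simple failed set; ProxyRotator is a hypothetical class for illustration, not part of any CaptchaAI client.

```python
import itertools


class ProxyRotator:
    """Round-robin over proxies, skipping any that were marked as failed."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._proxies = list(proxies)
        self._failed = set()
        self._cycle = itertools.cycle(self._proxies)

    def next_proxy(self):
        # One full pass over the pool is enough to find a healthy proxy.
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._failed:
                return proxy
        raise RuntimeError("all proxies are marked as failed")

    def mark_failed(self, proxy):
        """Exclude a proxy from rotation, e.g. after a proxy-related error."""
        self._failed.add(proxy)

    def mark_healthy(self, proxy):
        """Return a previously failed proxy to rotation."""
        self._failed.discard(proxy)
```

A caller would take rotator.next_proxy() in place of random.choice(PROXIES) and call mark_failed when a request through that proxy fails.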

Sticky Sessions for Cloudflare

Cloudflare cookies are IP-bound. Use the same proxy for solving and subsequent requests:

def solve_cloudflare_and_scrape(target_url, proxy_str):
    """Solve Cloudflare challenge and use the same proxy for scraping."""

    # Solve with proxy
    resp = requests.get("https://ocr.captchaai.com/in.php", params={
        "key": API_KEY,
        "method": "cloudflare_challenge",
        "pageurl": target_url,
        "proxy": proxy_str,
        "proxytype": "HTTP",
    }, timeout=30)
    if not resp.text.startswith("OK|"):
        raise RuntimeError(f"Submit failed: {resp.text}")
    task_id = resp.text.split("|", 1)[1]

    # Poll for result, stopping if the API reports an error
    while True:
        time.sleep(5)
        result = requests.get("https://ocr.captchaai.com/res.php", params={
            "key": API_KEY, "action": "get", "id": task_id,
        }, timeout=30)
        if result.text.startswith("OK|"):
            data = result.text.split("|", 1)[1]
            break
        if result.text.startswith("ERROR"):
            raise RuntimeError(f"Solve failed: {result.text}")

    # Parse cf_clearance and user_agent from "key=value;key=value" pairs,
    # ignoring surrounding whitespace and any item without "="
    parts = dict(
        item.strip().split("=", 1) for item in data.split(";") if "=" in item
    )
    cf_clearance = parts.get("cf_clearance", "")
    user_agent = parts.get("user_agent", "")

    # Use SAME proxy with the cookie
    proxy_url = f"http://{proxy_str}"
    session = requests.Session()
    session.proxies = {"http": proxy_url, "https": proxy_url}
    session.cookies.set("cf_clearance", cf_clearance)
    session.headers["User-Agent"] = user_agent

    return session.get(target_url, timeout=30)
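
When a pool of proxies is in use, each target site needs to keep landing on the same proxy for the cookie to stay valid. The following is a minimal sketch of one way to do that, assuming a deterministic hash of the target host; sticky_proxy_for is a hypothetical helper, and the hashing scheme is an illustrative choice rather than anything CaptchaAI prescribes.

```python
import hashlib
from urllib.parse import urlsplit


def sticky_proxy_for(url, proxies):
    """Pick the same proxy every time for a given target host."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    # Fall back to the raw string if the URL has no parseable host.
    host = urlsplit(url).hostname or url
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(proxies)
    return proxies[index]
```

The mapping only stays stable while the proxy list is unchanged; adding or removing a proxy can move a host to a different proxy, after which the challenge has to be solved again through the new one.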

Proxy Quality Checklist

| Factor | Good | Bad |
|---|---|---|
| Type | Residential, ISP | Datacenter (for Cloudflare) |
| Speed | <500ms latency | >2s latency |
| Uptime | 99%+ | Frequent drops |
| Location | Match target geo | Random location |
| Authentication | Username/password | IP whitelist (less flexible) |

Best Practices

  1. Match proxy location to target site — A US proxy for US sites reduces suspicion
  2. Use residential proxies for Cloudflare — Datacenter IPs are flagged more often
  3. Keep proxy consistent per session — Don't switch proxies mid-session for the same site
  4. Test proxy before submitting — Verify the proxy works before sending to CaptchaAI
  5. Monitor proxy health — Replace dead or slow proxies automatically
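
Items 4 and 5 can be covered by a small pre-flight check. The following is a minimal sketch using only the standard library: check_proxy and open_via_proxy are hypothetical helpers, https://example.com is a placeholder test URL, the sketch assumes an HTTP-type proxy, and the 2-second cutoff is taken from the "Bad" latency column of the checklist above.

```python
import time
import urllib.request


def open_via_proxy(proxy_str, test_url, timeout):
    """Fetch test_url through an HTTP proxy given as user:pass@host:port."""
    proxy_url = f"http://{proxy_str}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(test_url, timeout=timeout) as response:
        return response.status


def check_proxy(proxy_str, test_url="https://example.com", timeout=5.0,
                max_latency=2.0, open_fn=open_via_proxy, clock=time.monotonic):
    """Return (ok, latency_seconds); ok is False on errors or slow responses."""
    start = clock()
    try:
        status = open_fn(proxy_str, test_url, timeout)
    except OSError:
        # Covers connection failures, timeouts, and urllib's URLError/HTTPError.
        return False, None
    latency = clock() - start
    return status == 200 and latency <= max_latency, latency
```

Running this before each submit, or on a schedule, gives a simple signal for dropping dead or slow proxies from the pool.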

Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| ERROR_PROXY_NOT_AUTHORIZED | Bad proxy credentials | Check username/password |
| ERROR_CAPTCHA_UNSOLVABLE | Proxy blocked by target | Try different proxy |
| Token works but page still blocked | IP mismatch | Use same proxy for requests |
| Slow solve times with proxy | High latency proxy | Switch to closer proxy |
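
The two error codes in the table can be turned into a simple dispatch step in client code. The following is a minimal sketch; next_action and the action names it returns are hypothetical, and treating every other non-OK, non-ERROR response as "still in progress" is an assumption.

```python
def next_action(response_text):
    """Map a raw API response to a suggested next step."""
    text = response_text.strip()
    if text.startswith("OK|"):
        return "use_result"
    if text == "ERROR_PROXY_NOT_AUTHORIZED":
        # Bad proxy credentials: retrying with the same values will not help.
        return "check_proxy_credentials"
    if text == "ERROR_CAPTCHA_UNSOLVABLE":
        # The proxy may be blocked by the target: try a different one.
        return "retry_with_different_proxy"
    if text.startswith("ERROR"):
        # Errors not listed in the table above need manual inspection.
        return "inspect_error"
    # Assumption: anything else means the task is still in progress.
    return "keep_polling"
```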

FAQ

Do I need proxies for reCAPTCHA?

Not required, but recommended for high-volume use. Spreading solves across proxies keeps requests from concentrating on a single IP, which lowers the chance that Google's risk analysis flags it.

Can I use free proxies?

Free proxies are unreliable and often blacklisted. Use a paid proxy service for consistent results.

What's the best proxy type?

Residential proxies typically offer the best success rates, especially for Cloudflare, where datacenter IPs are flagged more often. ISP proxies are a good balance of speed and reliability.
