
Rotating Residential Proxies: Best Practices for CAPTCHA Solving

Rotating residential proxies assign a new real-user IP to each request or to each session. For CAPTCHA-heavy workflows, the right rotation strategy can cut challenge rates by 50-80%, and CaptchaAI handles the challenges that remain.


Why Residential IPs Reduce CAPTCHAs

| Proxy Type | IP Source | CAPTCHA Rate | Why |
|---|---|---|---|
| Datacenter | Cloud/hosting provider | High (30-70%) | Known IP ranges flagged |
| Residential | Real home ISPs | Low (5-15%) | Looks like real user |
| Mobile | Cellular carriers | Very low (1-5%) | Shared by thousands of users |
| ISP | Static residential-grade | Low (5-10%) | Combines speed + trust |

CAPTCHA systems (especially reCAPTCHA) check IP reputation databases. Residential IPs from real ISPs tend to have cleaner histories because they belong to real subscribers rather than to hosting ranges that are already flagged.


Rotation Strategies

1. Per-Request Rotation (Default)

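import requests

# Rotating gateway: the provider assigns a new exit IP to every request
proxies = {
    "http": "http://user:pass@gateway.proxy.com:7777",
    "https": "http://user:pass@gateway.proxy.com:7777",
}

# Placeholder list of pages to fetch
urls = ["https://example.com/page-1", "https://example.com/page-2"]

for url in urls:
    # Each request leaves through a different IP
    resp = requests.get(url, proxies=proxies, timeout=30)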

Best for: Scraping many pages across different sites. CAPTCHA risk: Low per request, but the IP changes on every call, so it cannot hold a session.
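
Before relying on per-request rotation, it can help to confirm that the gateway really hands out a new IP each time. The sketch below counts distinct exit IPs over a few samples. The helper name `rotation_report` and the `fetch_ip` callable are assumptions for illustration; in practice `fetch_ip` would call an IP-echo endpoint through the rotating gateway and return the IP as a string.

```python
from collections import Counter
from typing import Callable


def rotation_report(fetch_ip: Callable[[], str], samples: int = 10) -> dict:
    """Call fetch_ip() `samples` times and summarize how often the exit IP changed.

    fetch_ip is a hypothetical callable supplied by the caller; it is expected
    to return the exit IP observed for one request through the gateway.
    """
    if samples < 1:
        raise ValueError("samples must be at least 1")
    seen = Counter(fetch_ip() for _ in range(samples))
    return {
        "samples": samples,
        "distinct_ips": len(seen),
        # 1.0 means every sampled request used a different IP
        "rotation_ratio": len(seen) / samples,
    }


if __name__ == "__main__":
    # Stand-in for a real proxy call: replays documentation-range IPs.
    fake_ips = iter(["203.0.113.1", "203.0.113.2", "203.0.113.3", "203.0.113.1"])
    print(rotation_report(lambda: next(fake_ips), samples=4))
```

A ratio well below 1.0 on a per-request gateway may indicate a small pool or an accidentally sticky username format.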

2. Sticky Session Rotation

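import random
import string

import requests

def get_sticky_session(duration_min=10):
    """Build a proxy config pinned to one exit IP via a random session ID.

    duration_min is not encoded in the URL here: providers expose the sticky
    TTL in different ways, so check your provider's username format.
    """
    session_id = "".join(random.choices(string.ascii_lowercase, k=8))
    proxy_url = f"http://user-session-{session_id}:pass@gateway.proxy.com:7777"
    return {"http": proxy_url, "https": proxy_url}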

# Same IP for the whole CAPTCHA workflow
session_proxy = get_sticky_session(duration_min=10)

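# Step 1: Load page
resp = requests.get("https://target.com/form", proxies=session_proxy, timeout=30)

# Step 2: Solve CAPTCHA while the sticky session stays open
# (solve_recaptcha is defined in the CaptchaAI Integration Pattern section;
# sitekey is the data-sitekey value parsed from the page)
token = solve_recaptcha("https://target.com/form", sitekey)

# Step 3: Submit through the same IP that loaded the page;
# a changed IP can get the token rejected
resp = requests.post("https://target.com/submit", proxies=session_proxy, timeout=30, data={
    "g-recaptcha-response": token,
})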

Best for: CAPTCHA-protected forms and multi-step workflows. CAPTCHA risk: Very low — consistent IP builds trust.
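
Because a sticky session can expire in the middle of a workflow, it may be worth tracking how old the session is before starting a solve. The following is a minimal sketch under stated assumptions: the `StickySession` class and its `ttl_seconds` default are hypothetical, and the real TTL is whatever the proxy provider enforces.

```python
import random
import string
import time
from dataclasses import dataclass, field
from typing import Optional


def _new_session_id(length: int = 8) -> str:
    return "".join(random.choices(string.ascii_lowercase, k=length))


@dataclass
class StickySession:
    """Tracks the age of a sticky session so a workflow can avoid starting late.

    ttl_seconds is an assumed value; the provider decides the real TTL.
    """
    ttl_seconds: float = 600.0
    session_id: str = field(default_factory=_new_session_id)
    started_at: float = field(default_factory=time.monotonic)

    def remaining(self, now: Optional[float] = None) -> float:
        """Seconds left before the assumed TTL runs out (never negative)."""
        now = time.monotonic() if now is None else now
        return max(0.0, self.ttl_seconds - (now - self.started_at))

    def has_time_for(self, needed_seconds: float, now: Optional[float] = None) -> bool:
        """True if the session should outlive a workflow of the given length."""
        return self.remaining(now) >= needed_seconds
```

A workflow could call `has_time_for(120)` before loading the form and create a fresh session when it returns False, rather than discovering the rotation at submit time.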

3. Per-Domain Rotation

domain_sessions = {}

def get_domain_proxy(domain):
    """Reuse one sticky IP per domain; each new domain gets its own IP."""
    if domain not in domain_sessions:
        # get_sticky_session() generates its own random session ID
        domain_sessions[domain] = get_sticky_session()
    return domain_sessions[domain]

# site-a.com always uses the same IP
proxy_a = get_domain_proxy("site-a.com")
# site-b.com gets a different IP
proxy_b = get_domain_proxy("site-b.com")
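
Per-domain rotation only works if every URL for a site maps to the same key. A small sketch of one way to derive that key from a full URL follows; the helper name `domain_key` is an assumption, and treating subdomains as separate sites is a design choice rather than a requirement.

```python
from urllib.parse import urlsplit


def domain_key(url: str) -> str:
    """Return the lowercase hostname without a leading 'www.' for use as a session key."""
    host = urlsplit(url).hostname  # already lowercased, port stripped
    if host is None:
        raise ValueError(f"URL has no hostname: {url!r}")
    return host[4:] if host.startswith("www.") else host


if __name__ == "__main__":
    print(domain_key("https://www.site-a.com/login?next=/account"))
```

The result can be passed straight to `get_domain_proxy(domain_key(url))` so that `www.site-a.com` and `site-a.com` share one IP.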

Best Practices

1. Match Geo to Target

# Target site is US-based? Use US residential IP
us_proxy = "http://user-country-us:pass@gateway.proxy.com:7777"

# Target is Germany? Use DE residential IP
de_proxy = "http://user-country-de:pass@gateway.proxy.com:7777"

Geo-mismatched IPs trigger more CAPTCHAs (e.g., accessing a US site from a Nigerian IP).
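
To avoid hand-writing one proxy string per country, the geo-targeted username can be built by a helper. This is a sketch only: the `user-country-xx` pattern is taken from the example above, the function name `geo_proxies` is hypothetical, and real providers may use a different username format.

```python
from urllib.parse import quote


def geo_proxies(country: str, user: str = "user", password: str = "pass",
                gateway: str = "gateway.proxy.com:7777") -> dict:
    """Build a requests-style proxies dict for a two-letter country code."""
    code = country.strip().lower()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a two-letter country code, got {country!r}")
    # Percent-encode the password so characters like '@' do not break the URL
    proxy_url = f"http://{user}-country-{code}:{quote(password, safe='')}@{gateway}"
    return {"http": proxy_url, "https": proxy_url}


if __name__ == "__main__":
    print(geo_proxies("DE")["https"])
```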

2. Set Realistic Request Rates

import time
import random

def human_pace_requests(urls, proxy):
    results = []
    for url in urls:
        resp = requests.get(url, proxies=proxy, timeout=30)
        results.append(resp)

        # Random delay between requests
        delay = random.uniform(2, 8)
        time.sleep(delay)

    return results

| Rate | CAPTCHA Risk |
|---|---|
| 1 req/sec | High (bot pattern) |
| 1 req/5-10 sec | Medium |
| 1 req/10-30 sec | Low (human-like) |
| Random 3-15 sec | Very low |
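
Slower pacing has a throughput cost that is easy to estimate. With a uniform random delay of 3-15 seconds, the mean wait is 9 seconds, so one worker makes roughly 3600 / 9 = 400 requests per hour. The sketch below generalizes that arithmetic; the function names are hypothetical, and it assumes each worker runs sequentially on its own IP.

```python
import math


def hourly_throughput(min_delay: float, max_delay: float,
                      avg_response: float = 0.0) -> float:
    """Expected requests per hour for one worker using uniform random delays."""
    if min_delay < 0 or max_delay < min_delay:
        raise ValueError("need 0 <= min_delay <= max_delay")
    mean_cycle = (min_delay + max_delay) / 2 + avg_response
    if mean_cycle <= 0:
        raise ValueError("mean cycle time must be positive")
    return 3600 / mean_cycle


def workers_needed(pages_per_day: int, min_delay: float, max_delay: float,
                   avg_response: float = 0.0) -> int:
    """Number of parallel workers required to reach a daily page target."""
    per_worker_per_day = hourly_throughput(min_delay, max_delay, avg_response) * 24
    return math.ceil(pages_per_day / per_worker_per_day)


if __name__ == "__main__":
    print(hourly_throughput(3, 15))      # one worker, 3-15 s random delay
    print(workers_needed(50_000, 3, 15))  # workers for 50,000 pages/day
```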

3. Use Proper Headers

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/126.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    # Advertise "br" only if the brotli package is installed; otherwise
    # requests cannot decode a Brotli-compressed body
    "Accept-Encoding": "gzip, deflate",
    "DNT": "1",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
}

# url and proxy come from the earlier examples
resp = requests.get(url, proxies=proxy, headers=headers, timeout=30)

4. Handle IP Bans Gracefully

def fetch_with_retry(url, max_retries=3):
    for attempt in range(max_retries):
        # Fresh sticky session (new exit IP) for each attempt
        proxy = get_sticky_session()

        try:
            resp = requests.get(url, proxies=proxy, timeout=30)
        except (requests.exceptions.ProxyError, requests.exceptions.Timeout):
            print(f"Proxy failed, rotating... (attempt {attempt + 1})")
            continue

        if resp.status_code == 403:
            print(f"IP banned, rotating... (attempt {attempt + 1})")
            time.sleep(5)
            continue

        if resp.status_code == 429:
            print("Rate limited, backing off...")
            time.sleep(30)
            continue

        return resp

    raise RuntimeError(f"Failed after {max_retries} retries")
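
The fixed 5 and 30 second sleeps above are simple to reason about. A common alternative, which is a swapped-in technique and not part of the original example, is exponential backoff with full jitter, optionally honoring a `Retry-After` header on 429 responses. The sketch below uses hypothetical helper names and handles only the delta-seconds form of `Retry-After`; an HTTP-date value falls back to the default.

```python
import random
from typing import Mapping


def backoff_delay(attempt: int, base: float = 5.0, cap: float = 120.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


def retry_after_seconds(headers: Mapping[str, str], default: float) -> float:
    """Read Retry-After as whole seconds; fall back to `default` otherwise."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        return float(max(0, int(value)))
    except ValueError:
        # Covers the HTTP-date form, which this sketch does not parse
        return default


if __name__ == "__main__":
    for attempt in range(4):
        print(attempt, round(backoff_delay(attempt), 2))
```

Inside a retry loop, `time.sleep(retry_after_seconds(resp.headers, backoff_delay(attempt)))` would prefer the server's hint and otherwise spread retries out randomly.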

5. Warm Up New IPs

def warm_up_proxy(proxy):
    """Visit neutral sites before the target to build session trust."""
    warm_urls = [
        "https://www.google.com",
        "https://www.wikipedia.org",
    ]
    for url in warm_urls:
        try:
            requests.get(url, proxies=proxy, timeout=15)
        except requests.RequestException:
            # A failed warm-up request is not fatal; move on to the next URL
            continue
        time.sleep(random.uniform(2, 5))

CaptchaAI Integration Pattern

import requests
import time
import re
import random
import string

CAPTCHAAI_KEY = "YOUR_API_KEY"
CAPTCHAAI_URL = "https://ocr.captchaai.com"


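def solve_recaptcha(site_url, sitekey):
    """Submit a reCAPTCHA task to CaptchaAI and poll until the token is ready."""
    resp = requests.post(f"{CAPTCHAAI_URL}/in.php", data={
        "key": CAPTCHAAI_KEY,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": site_url,
        "json": 1,
    }, timeout=30)
    # Assumes the JSON reply carries status 1 on success and an error code
    # in "request" otherwise
    submitted = resp.json()
    if submitted.get("status") != 1:
        raise RuntimeError(f"Task submission failed: {submitted.get('request')}")
    task_id = submitted["request"]

    # Poll every 5 seconds, up to 60 times (about 5 minutes)
    for _ in range(60):
        time.sleep(5)
        resp = requests.get(f"{CAPTCHAAI_URL}/res.php", params={
            "key": CAPTCHAAI_KEY, "action": "get",
            "id": task_id, "json": 1,
        }, timeout=30)
        data = resp.json()
        if data.get("status") == 1:
            return data["request"]
        # Anything other than "not ready" is an error code, not a token
        if data["request"] != "CAPCHA_NOT_READY":
            raise RuntimeError(f"Solve failed: {data['request']}")

    raise TimeoutError("CAPTCHA not solved within 5 minutes")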


def scrape_url(url, proxy_config):
    """Complete scrape with proxy + CAPTCHA solving."""
    session_id = "".join(random.choices(string.ascii_lowercase, k=8))
    proxy = {
        "http": f"http://user-session-{session_id}:{proxy_config['pass']}@{proxy_config['host']}",
        "https": f"http://user-session-{session_id}:{proxy_config['pass']}@{proxy_config['host']}",
    }

    resp = requests.get(url, proxies=proxy, timeout=30)

    match = re.search(r'data-sitekey="([^"]+)"', resp.text)
    if match:
        token = solve_recaptcha(url, match.group(1))
        # Assumes the form at /form posts to /submit; adjust for the real target
        resp = requests.post(
            url.replace("/form", "/submit"),
            proxies=proxy,
            data={"g-recaptcha-response": token},
            timeout=30,
        )

    return resp.text
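
The `data-sitekey="..."` regex in `scrape_url` only matches double-quoted attributes. Where pages use single quotes or unquoted values, an HTML parser is more forgiving. The following is a sketch using the standard library; the names `find_sitekey` and `_SitekeyFinder` are hypothetical, and it will not find sitekeys that are injected by JavaScript after page load.

```python
from html.parser import HTMLParser
from typing import Optional


class _SitekeyFinder(HTMLParser):
    """Remembers the first data-sitekey attribute seen on any start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sitekey: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.sitekey is not None:
            return
        # HTMLParser lowercases attribute names before passing them here
        for name, value in attrs:
            if name == "data-sitekey" and value:
                self.sitekey = value
                break


def find_sitekey(html: str) -> Optional[str]:
    """Return the first data-sitekey value in the HTML, or None if absent."""
    finder = _SitekeyFinder()
    finder.feed(html)
    finder.close()
    return finder.sitekey


if __name__ == "__main__":
    print(find_sitekey("<div class='g-recaptcha' data-sitekey='example-key'></div>"))
```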

Cost Optimization

| Strategy | Impact |
|---|---|
| Use residential only for CAPTCHA-heavy sites | Save 50-70% vs. all-residential |
| Reduce CAPTCHA frequency with session warming | Fewer CaptchaAI calls |
| Batch pages before CAPTCHA-protected ones | Fewer sticky sessions needed |
| Monitor IP quality and drop bad pools | Higher first-try success |
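
The first row can be checked with simple arithmetic. The sketch below compares an all-residential setup with one that routes only CAPTCHA-heavy traffic through residential IPs. The per-GB prices and the 30% traffic share in the example are illustrative assumptions, not quotes from any provider, and the function names are hypothetical.

```python
def blended_cost(total_gb: float, captcha_heavy_share: float,
                 residential_per_gb: float, datacenter_per_gb: float) -> float:
    """Cost when only the CAPTCHA-heavy share of traffic uses residential IPs."""
    if not 0 <= captcha_heavy_share <= 1:
        raise ValueError("captcha_heavy_share must be between 0 and 1")
    residential_gb = total_gb * captcha_heavy_share
    datacenter_gb = total_gb - residential_gb
    return residential_gb * residential_per_gb + datacenter_gb * datacenter_per_gb


def savings_vs_all_residential(total_gb: float, captcha_heavy_share: float,
                               residential_per_gb: float,
                               datacenter_per_gb: float) -> float:
    """Fraction saved compared with sending all traffic through residential IPs."""
    all_residential = total_gb * residential_per_gb
    blended = blended_cost(total_gb, captcha_heavy_share,
                           residential_per_gb, datacenter_per_gb)
    return 1 - blended / all_residential


if __name__ == "__main__":
    # Hypothetical inputs: 100 GB/month, 30% CAPTCHA-heavy, $8 vs $1 per GB
    print(f"{savings_vs_all_residential(100, 0.3, 8.0, 1.0):.0%}")
```

With those assumed inputs the saving comes out at about 61%, which sits inside the 50-70% range in the table; the real figure depends on actual prices and traffic mix.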

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| CAPTCHA on every page | Bad IP pool | Switch proxy provider or zone |
| Token rejected | IP rotated between solve and submit | Use sticky sessions |
| 407 errors | Auth format wrong | Check provider's exact username format |
| Slow responses | Overloaded gateway | Try off-peak hours or different endpoint |
| Session expires mid-workflow | Sticky TTL too short | Request longer session duration |
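
Spotting a bad IP pool is easier with a running challenge rate per pool or zone. The sketch below keeps simple counters and flags pools that exceed a threshold. The class name `PoolMonitor`, the pool labels, and the default thresholds are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import List


class PoolMonitor:
    """Counts requests and CAPTCHA challenges per proxy pool."""

    def __init__(self, min_samples: int = 20, max_challenge_rate: float = 0.5) -> None:
        self.min_samples = min_samples
        self.max_challenge_rate = max_challenge_rate
        self._requests = defaultdict(int)
        self._challenges = defaultdict(int)

    def record(self, pool: str, challenged: bool) -> None:
        """Log one request and whether it hit a CAPTCHA."""
        self._requests[pool] += 1
        if challenged:
            self._challenges[pool] += 1

    def challenge_rate(self, pool: str) -> float:
        """Share of recorded requests that were challenged (0.0 if none recorded)."""
        total = self._requests.get(pool, 0)
        return self._challenges.get(pool, 0) / total if total else 0.0

    def bad_pools(self) -> List[str]:
        """Pools with enough samples whose challenge rate exceeds the threshold."""
        return sorted(
            pool for pool, total in self._requests.items()
            if total >= self.min_samples
            and self.challenge_rate(pool) > self.max_challenge_rate
        )
```

Calling `record(pool, challenged)` after each response and checking `bad_pools()` periodically gives a concrete signal for when to switch zones.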

FAQ

How many residential IPs do I need?

For moderate scraping (1000 pages/day), any major provider's pool is sufficient. The rotation gateway handles IP management automatically.

Do I rotate IPs during a CAPTCHA solve?

Never. Keep the same IP from page load through token submission. Rotate after the task completes.

Is rotating faster than sticky?

Per-request rotation typically has slightly lower latency because the gateway does no session lookup. Sticky sessions add a few milliseconds per request. For CAPTCHA workflows, where a solve takes seconds, the difference is negligible.



Optimize your proxy rotation for fewer CAPTCHAs — get your CaptchaAI key to solve the rest.
