IP Reputation and CAPTCHA Solving: Best Practices

Your IP address reputation directly affects CAPTCHA difficulty and success rates. Understanding IP scoring helps you maintain high solve rates with CaptchaAI.


How IP Reputation Affects CAPTCHAs

| IP Quality | CAPTCHA Behavior | reCAPTCHA v3 Score |
|---|---|---|
| Clean residential | Few CAPTCHAs, easy challenges | 0.7-0.9 |
| Used residential | Moderate CAPTCHAs | 0.5-0.7 |
| Clean datacenter | Frequent CAPTCHAs | 0.3-0.5 |
| Flagged datacenter | Hardest CAPTCHAs, may block | 0.1-0.3 |
| Blacklisted | Instant block | 0.1 |

IP Quality Checker

import requests


def check_ip_quality(proxy=None):
    """Basic IP quality assessment."""
    proxies = {"https": proxy} if proxy else None

    info = {}

    # Get IP info
    try:
        resp = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10)
        info["ip"] = resp.json()["origin"]
    except Exception as e:
        info["error"] = str(e)
        return info

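    # Check if datacenter IP (basic heuristic based on the org name)
    try:
        resp = requests.get(
            f"https://ipinfo.io/{info['ip']}/json",
            timeout=10,
        )
        # Without this, a rate-limit or error response would yield an empty
        # org and be misreported as "not a datacenter"
        resp.raise_for_status()
        data = resp.json()
        info["org"] = data.get("org", "")
        info["country"] = data.get("country", "")
        info["city"] = data.get("city", "")

        # Datacenter indicators (substring match on the lowercased org name).
        # "digitalocean" is listed without a space because that is how the
        # provider's name is written in org strings.
        dc_keywords = ["hosting", "cloud", "server", "datacenter", "amazon",
                       "google", "microsoft", "digitalocean", "digital ocean",
                       "ovh", "hetzner"]
        org_lower = info["org"].lower()
        info["likely_datacenter"] = any(kw in org_lower for kw in dc_keywords)
    except Exception:
        # Lookup failed: report "unknown" rather than guessing
        info["likely_datacenter"] = None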

    return info


# Check your current IP
result = check_ip_quality()
print(f"IP: {result.get('ip')}")
print(f"Org: {result.get('org')}")
print(f"Datacenter: {result.get('likely_datacenter')}")
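
The same check can be run across a whole proxy pool before any traffic is sent. The sketch below is one possible way to do that, not part of the original checker: `screen_proxies` and its bucket names are hypothetical, and it accepts any callable that returns a dict shaped like the one `check_ip_quality()` builds, so it can be exercised without network access.

```python
from typing import Callable, Dict, Iterable, List


def screen_proxies(
    proxies: Iterable[str],
    check: Callable[[str], dict],
) -> Dict[str, List[str]]:
    """Sort proxies into buckets based on a checker's report.

    `check` takes a proxy URL and returns a dict with either an "error"
    key or a "likely_datacenter" key (True, False, or None).
    """
    buckets: Dict[str, List[str]] = {
        "residential_like": [],
        "datacenter_like": [],
        "unknown": [],
        "unreachable": [],
    }
    for proxy in proxies:
        info = check(proxy)
        if "error" in info:
            # The proxy could not even fetch its own IP
            buckets["unreachable"].append(proxy)
        elif info.get("likely_datacenter") is None:
            # The org lookup failed, so the heuristic has no answer
            buckets["unknown"].append(proxy)
        elif info["likely_datacenter"]:
            buckets["datacenter_like"].append(proxy)
        else:
            buckets["residential_like"].append(proxy)
    return buckets
```

With the checker above, a call would look like `screen_proxies(my_proxies, check_ip_quality)`. Keep in mind that the keyword match is only a heuristic, so the "residential_like" bucket means "no datacenter keyword found", not a verified residential IP.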

Proxy Rotation Strategy

import time
from collections import defaultdict


class ProxyRotator:
    """Rotate proxies to distribute IP reputation impact."""

    def __init__(self, proxies):
        self.proxies = list(proxies)
        self.usage_count = defaultdict(int)
        self.last_used = defaultdict(float)
        self.failures = defaultdict(int)
        self._index = 0

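    def get_proxy(self, cooldown=30):
        """Get the least recently used healthy proxy, honoring the cooldown."""
        if not self.proxies:
            raise ValueError("No proxies configured")

        now = time.time()
        # Proxies with 5 or more recorded failures are skipped
        healthy = [p for p in self.proxies if self.failures[p] < 5]
        available = [
            p for p in healthy
            if now - self.last_used[p] >= cooldown
        ]

        # If every healthy proxy is on cooldown, ignore the cooldown;
        # if none are healthy, fall back to the full list
        candidates = available or healthy or self.proxies

        # Least recently used first (never-used proxies have last_used == 0)
        proxy = min(candidates, key=lambda p: self.last_used[p])
        self.last_used[proxy] = now
        self.usage_count[proxy] += 1
        return proxy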

    def report_success(self, proxy):
        """Mark proxy as working."""
        self.failures[proxy] = max(0, self.failures[proxy] - 1)

    def report_failure(self, proxy):
        """Mark proxy as failed."""
        self.failures[proxy] += 1

    def get_stats(self):
        """Get proxy usage statistics."""
        return {
            proxy: {
                "uses": self.usage_count[proxy],
                "failures": self.failures[proxy],
            }
            for proxy in self.proxies
        }


# Usage
rotator = ProxyRotator([
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
    "http://user:pass@proxy3.example.com:8080",
])

proxy = rotator.get_proxy(cooldown=30)
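
The rotator only learns which proxies are healthy if each outcome is reported back. A minimal sketch of that feedback loop follows; `run_with_rotation` and `task` are hypothetical names, and the function works with any object that exposes `get_proxy()`, `report_success()`, and `report_failure()`.

```python
from typing import Callable, Optional


def run_with_rotation(
    rotator,
    task: Callable[[str], bool],
    max_attempts: int = 3,
    cooldown: float = 30,
) -> Optional[str]:
    """Try `task` through successive proxies, reporting each outcome.

    `task` receives a proxy URL and returns True when the request worked.
    Returns the proxy that succeeded, or None if every attempt failed.
    """
    for _ in range(max_attempts):
        proxy = rotator.get_proxy(cooldown=cooldown)
        try:
            ok = task(proxy)
        except Exception:
            ok = False  # treat raised errors like any other failure
        if ok:
            rotator.report_success(proxy)
            return proxy
        rotator.report_failure(proxy)
    return None
```

Here `task` could be a function that fetches the target page through the given proxy and returns whether the response was not blocked. Because a success only decrements the failure counter by one, a proxy that fails often needs several good requests to clear its record.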

IP Warmup Pattern

New IPs need gradual activity before heavy use:

import time


class IPWarmer:
    """Gradually increase request rate for new IPs."""

    PHASES = [
        {"duration": 300, "requests_per_min": 1, "label": "Warm-up"},
        {"duration": 600, "requests_per_min": 3, "label": "Light"},
        {"duration": 900, "requests_per_min": 6, "label": "Moderate"},
        {"duration": 0,   "requests_per_min": 10, "label": "Normal"},
    ]

    def __init__(self):
        self.start_time = time.time()

    def get_current_rate(self):
        """Get allowed requests per minute for current phase."""
        elapsed = time.time() - self.start_time
        cumulative = 0

        for phase in self.PHASES:
            cumulative += phase["duration"]
            if phase["duration"] == 0 or elapsed < cumulative:
                return phase["requests_per_min"], phase["label"]

        return self.PHASES[-1]["requests_per_min"], self.PHASES[-1]["label"]

    def get_delay(self):
        """Get delay in seconds between requests for current phase."""
        rate, _label = self.get_current_rate()
        return 60.0 / rate


warmer = IPWarmer()
# First 5 min: 1 req/min, next 10 min: 3 req/min, etc.
delay = warmer.get_delay()
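
To see where the phase boundaries fall, the schedule can be evaluated at fixed elapsed times instead of against the wall clock. The sketch below copies the phase values from `IPWarmer.PHASES` into a standalone table; `phase_at` is a hypothetical helper written for this illustration.

```python
# Phase table copied from IPWarmer.PHASES: (duration_s, requests_per_min, label).
# A duration of 0 marks the final, open-ended phase.
PHASES = [
    (300, 1, "Warm-up"),
    (600, 3, "Light"),
    (900, 6, "Moderate"),
    (0, 10, "Normal"),
]


def phase_at(elapsed_s: float):
    """Return (label, requests_per_min, delay_s) for a given elapsed time."""
    boundary = 0
    for duration, rpm, label in PHASES:
        boundary += duration
        if duration == 0 or elapsed_s < boundary:
            return label, rpm, 60.0 / rpm
    # Only reached if no open-ended phase is defined
    duration, rpm, label = PHASES[-1]
    return label, rpm, 60.0 / rpm


# Print the schedule on either side of each boundary
for t in (0, 299, 300, 899, 900, 1799, 1800):
    label, rpm, delay = phase_at(t)
    print(f"t={t:>4}s  {label:<9} {rpm:>2} req/min  delay={delay:.0f}s")
```

Because the durations are cumulative, the phases change at 5, 15, and 30 minutes after the warmer is created, and the delay between requests drops from 60 seconds to 20, 10, and finally 6 seconds.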

Proxy Types and When to Use Each

| Proxy Type | Cost | IP Quality | Best For |
|---|---|---|---|
| Residential | High | Excellent | Sites with strict detection |
| ISP/Static residential | Medium-High | Very good | Consistent sessions |
| Mobile | Highest | Best | Hardest sites |
| Datacenter | Low | Fair | Low-protection sites |
| Rotating residential | Medium | Good | High-volume, varied sites |

CaptchaAI Proxy Parameters

When you pass a proxy to CaptchaAI, the solver routes its requests through that proxy, so the CAPTCHA is solved from the same IP you use to submit the form:

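from urllib.parse import urlsplit

import requests


def solve_with_proxy(api_key, sitekey, pageurl, proxy_url):
    """Submit a reCAPTCHA task that CaptchaAI solves through your proxy.

    Returns the parsed JSON response from in.php.
    """
    # Parse proxy: http://user:pass@host:port
    # Assumption: like other 2Captcha-style APIs, the "proxy" field expects
    # user:pass@host:port without the scheme; confirm against the CaptchaAI docs.
    parts = urlsplit(proxy_url)
    if parts.netloc:
        proxy, proxytype = parts.netloc, parts.scheme.upper()
    else:
        # No scheme given: pass the value through unchanged
        proxy, proxytype = proxy_url, "HTTP"

    resp = requests.post("https://ocr.captchaai.com/in.php", data={
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": pageurl,
        "proxy": proxy,            # Your proxy
        "proxytype": proxytype,    # HTTP, HTTPS, SOCKS4, SOCKS5
        "json": 1,
    }, timeout=30)
    resp.raise_for_status()

    return resp.json()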
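
The call to `in.php` only submits the task, so the token still has to be collected afterwards. The sketch below shows one way to poll for it. It rests on assumptions that should be checked against the CaptchaAI documentation: that results are served from a `res.php` endpoint on the same host, that it takes `action=get` and the task `id`, and that a pending task is reported as `CAPCHA_NOT_READY`, as in other 2Captcha-style APIs. `fetch_result` and `wait_for_token` are hypothetical helper names, and the standard library is used for the HTTP call so the block runs without extra dependencies.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional

RES_URL = "https://ocr.captchaai.com/res.php"  # assumed result endpoint


def fetch_result(api_key: str, task_id: str) -> dict:
    """Make one status request for a submitted task (network call)."""
    query = urllib.parse.urlencode({
        "key": api_key, "action": "get", "id": task_id, "json": 1,
    })
    with urllib.request.urlopen(f"{RES_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def wait_for_token(
    api_key: str,
    task_id: str,
    fetch: Callable[[str, str], dict] = fetch_result,
    interval: float = 5.0,
    max_wait: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the task is solved, fails, or max_wait elapses.

    Returns the token, or None if the task was still pending at max_wait.
    """
    waited = 0.0
    while waited <= max_wait:
        data = fetch(api_key, task_id)
        if data.get("status") == 1:
            return data["request"]  # the solved token
        if data.get("request") != "CAPCHA_NOT_READY":  # assumed pending marker
            raise RuntimeError(f"Solver error: {data.get('request')}")
        sleep(interval)
        waited += interval
    return None
```

The `fetch` and `sleep` parameters exist so the loop can be tested with stubs. In normal use only the API key and the task ID taken from the submit response are passed in.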

FAQ

Does CaptchaAI need my proxy?

CaptchaAI can solve CAPTCHAs without your proxy, using its own infrastructure. Sending your proxy is optional; it helps when the site checks that the IP used to solve the CAPTCHA matches the IP that submits the form.

Should I use residential or datacenter proxies?

For sites with strong anti-bot detection (Cloudflare, Akamai), residential proxies produce better results. For simpler sites, datacenter proxies work fine and cost less.

How many CAPTCHAs can I solve per IP?

There's no fixed limit. It depends on the target site's detection thresholds. A safe starting point is 10-20 solves per IP per hour for residential, 5-10 for datacenter.
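
One way to enforce such a starting point is a sliding-window counter per IP. The sketch below is illustrative only: `SolveBudget` is a hypothetical class, and its default limits are simply the lower ends of the ranges above, to be tuned per target site.

```python
import time
from collections import defaultdict, deque


class SolveBudget:
    """Sliding-window cap on solves per IP per hour."""

    def __init__(self, limits=None, window_s=3600, clock=time.monotonic):
        # Defaults use the lower end of the suggested hourly ranges
        self.limits = limits or {"residential": 10, "datacenter": 5}
        self.window_s = window_s
        self.clock = clock
        self._stamps = defaultdict(deque)  # ip -> timestamps of recent solves

    def _prune(self, ip, now):
        """Drop timestamps that have aged out of the window."""
        stamps = self._stamps[ip]
        while stamps and now - stamps[0] >= self.window_s:
            stamps.popleft()
        return stamps

    def try_acquire(self, ip, ip_type="residential"):
        """Record a solve and return True if the IP is under its cap."""
        now = self.clock()
        stamps = self._prune(ip, now)
        if len(stamps) >= self.limits[ip_type]:
            return False
        stamps.append(now)
        return True

    def remaining(self, ip, ip_type="residential"):
        """Solves still available to this IP in the current window."""
        return self.limits[ip_type] - len(self._prune(ip, self.clock()))
```

Call `try_acquire(ip, ip_type)` before each solve and switch to another IP when it returns False. A sliding window avoids the burst that a counter reset on the hour would allow.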



