Optimizing CaptchaAI Speed and Cost

Every second matters in CAPTCHA solving. This guide covers proven strategies to reduce solve times, lower costs, and maximize throughput with CaptchaAI.

Speed Optimization

1. Start Solving Before You Need the Token

Submit the CAPTCHA as soon as you detect it, then continue scraping other pages while it solves:

import asyncio
import aiohttp

async def prefetch_solve(solver, session, site_key, page_url):
    """Start solving in advance, return a future."""
    return asyncio.create_task(
        solver.solve(session, {
            "method": "userrecaptcha",
            "googlekey": site_key,
            "pageurl": page_url,
        })
    )

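# Start the solve immediately (run this inside an async function).
# Awaiting prefetch_solve returns the Task at once; it does not wait for the token.
solve_task = await prefetch_solve(solver, session, site_key, url)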

# Do other work while CAPTCHA is being solved
other_data = await fetch_other_pages(session)

# Now retrieve the token (already solved or almost done)
token = await solve_task
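
To see why prefetching helps, here is a minimal offline sketch that replaces the real solve and the real page fetches with `asyncio.sleep`. The names `fake_solve`, `fake_other_work`, and `overlapped` are hypothetical stand-ins, not part of CaptchaAI.

```python
import asyncio
import time


async def fake_solve():
    await asyncio.sleep(0.2)  # stands in for the CAPTCHA solve
    return "token"


async def fake_other_work():
    await asyncio.sleep(0.2)  # stands in for fetching other pages
    return "pages"


async def overlapped():
    start = time.perf_counter()
    solve_task = asyncio.create_task(fake_solve())  # scheduled right away
    pages = await fake_other_work()                 # runs while the solve is pending
    token = await solve_task                        # usually finished by now
    return token, pages, time.perf_counter() - start


token, pages, elapsed = asyncio.run(overlapped())
```

Run one after the other, the two steps would take about 0.4 seconds. Overlapped, the total is close to the longer of the two.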

2. Use the Right Method

Different methods have different solve times:

Method                     Avg. Solve Time   Cost
Image/OCR                  ~3-5s             Lowest
reCAPTCHA v3               ~5-8s             Low
Cloudflare Turnstile       ~8-12s            Medium
reCAPTCHA v2               ~10-15s           Medium
reCAPTCHA v2 Enterprise    ~12-18s           Higher
Cloudflare Challenge       ~12-20s           Highest

If the site uses reCAPTCHA v3, solve it as v3 rather than falling back to v2: per the table above, v3 is both faster and cheaper.

3. Reduce Poll Frequency Smartly

Don't poll every 1 second. Start at 5 seconds, then adjust:

async def smart_poll(session, solver, task_id):
    """Poll with increasing intervals based on expected solve time."""
    intervals = [5, 5, 5, 10, 10, 15, 15, 30]  # seconds

    for wait in intervals:
        await asyncio.sleep(wait)
        result = await solver.check(session, task_id)
        if result:
            return result

    raise TimeoutError("Solve timed out")
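
One way to adjust the schedule is to take the first wait from the expected solve times listed under "Use the Right Method". The sketch below assumes that approach. The dictionary keys and the `poll_schedule` helper are hypothetical names chosen for illustration, not CaptchaAI identifiers.

```python
# Expected solve-time ranges in seconds, copied from the method table above.
# The keys are hypothetical labels, not CaptchaAI method names.
EXPECTED_SOLVE_SECONDS = {
    "image": (3, 5),
    "recaptcha_v3": (5, 8),
    "turnstile": (8, 12),
    "recaptcha_v2": (10, 15),
    "recaptcha_v2_enterprise": (12, 18),
    "cloudflare_challenge": (12, 20),
}


def poll_schedule(captcha_type, step=5, max_total=120):
    """Return wait intervals: one initial wait, then fixed steps up to max_total."""
    low, _high = EXPECTED_SOLVE_SECONDS[captcha_type]
    intervals = [low]  # first check at the earliest expected finish
    total = low
    while total + step <= max_total:
        intervals.append(step)
        total += step
    return intervals
```

The returned list can replace the hard-coded `intervals` in `smart_poll`, so slow types do not waste early polls and fast types are not kept waiting.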

4. Use Callbacks for High Volume

For 100+ CAPTCHAs/hour, use the callback (pingback) mechanism instead of polling:

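import os

import requests

API_KEY = os.environ["CAPTCHAAI_API_KEY"]

resp = requests.get("https://ocr.captchaai.com/in.php", params={
    "key": API_KEY,
    "method": "userrecaptcha",
    "googlekey": site_key,
    "pageurl": page_url,
    # Placeholder URL: point this at an endpoint you control
    "pingback": "https://your-server.com/captcha-done",
})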

This eliminates all polling requests, reducing API calls by 60-80%.
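
The receiving side could look roughly like the sketch below, built on the standard library only. It assumes the callback arrives as a form-encoded POST with fields named `id` and `code`. That payload shape is an assumption, not something this guide states, so check it against the CaptchaAI documentation. `parse_pingback`, `PingbackHandler`, `serve`, and `solved_tokens` are hypothetical names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

solved_tokens = {}  # task id -> token, filled in by the callback handler


def parse_pingback(body):
    """Extract (task_id, token) from a form-encoded request body.

    ASSUMPTION: the fields are named "id" and "code".
    """
    fields = parse_qs(body.decode("utf-8"))
    return fields["id"][0], fields["code"][0]


class PingbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            task_id, token = parse_pingback(self.rfile.read(length))
        except (KeyError, UnicodeDecodeError):
            self.send_response(400)  # missing fields or undecodable body
            self.end_headers()
            return
        solved_tokens[task_id] = token
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Block forever, accepting callbacks on the given port."""
    HTTPServer(("", port), PingbackHandler).serve_forever()
```

A production receiver would also need HTTPS and some check that the request really comes from the solving service.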

Cost Optimization

1. Avoid Unnecessary Solves

Check if a CAPTCHA is actually required before solving:

async def scrape_smart(url, session, solver):
    resp = await session.get(url)
    html = await resp.text()

    # Only solve if CAPTCHA is present
    if "g-recaptcha" not in html and "cf-turnstile" not in html:
        return html  # No CAPTCHA needed

    # Solve only when necessary
    token = await solver.solve(...)
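
The presence check can be pulled into a small helper that also says which kind of CAPTCHA was found, so the caller can pick the matching solve method. This is a sketch: the marker strings come from the check above, while `detect_captcha` and the labels it returns are hypothetical.

```python
# Marker substrings from the check above, mapped to hypothetical type labels.
CAPTCHA_MARKERS = {
    "g-recaptcha": "recaptcha",
    "cf-turnstile": "turnstile",
}


def detect_captcha(html):
    """Return a label for the first CAPTCHA marker found in html, or None."""
    for marker, label in CAPTCHA_MARKERS.items():
        if marker in html:
            return label
    return None
```

Substring matching is cheap but coarse. A page that only mentions a marker in a script or comment would still count as a match.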

2. Cache Tokens When Possible

Some tokens stay valid for a minute or two. If a solved token has not been submitted yet, reuse it instead of paying for a new solve:

import time

token_cache = {}

def get_or_solve(site_key, page_url, solver, cache_ttl=60):
    cache_key = f"{site_key}:{page_url}"

    if cache_key in token_cache:
        token, timestamp = token_cache[cache_key]
        if time.time() - timestamp < cache_ttl:
            return token

    # Cache miss or expired entry: solve again (this solver is a synchronous client)
    token = solver.solve({
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
    })

    token_cache[cache_key] = (token, time.time())
    return token

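Note: reCAPTCHA tokens typically expire after 120 seconds and can be verified only once, so a cached token is useful only until it is submitted. Cloudflare Turnstile tokens have their own lifetime (Cloudflare documents 300 seconds). Test the effective lifetime for your target site.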
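
Because a reCAPTCHA token can be verified only once, a cache that hands the same token out twice will produce rejections. The sketch below is one possible variant that removes a token when it is taken. `TokenCache` and its methods are hypothetical names, and the injectable `clock` exists only to make expiry easy to test.

```python
import time


class TokenCache:
    """Holds unsubmitted tokens and hands each one out at most once."""

    def __init__(self, ttl=60, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._items = {}  # cache key -> (token, stored_at)

    def put(self, key, token):
        self._items[key] = (token, self.clock())

    def take(self, key):
        """Return a fresh token and remove it, or None if absent or expired."""
        entry = self._items.pop(key, None)
        if entry is None:
            return None
        token, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            return None  # too old to submit safely
        return token
```

A prefetching job can call `put` as solves complete, and the scraper calls `take` just before submitting the form.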

3. Report Bad Solves

Report incorrect results to get credits back and improve quality:

def report_bad(task_id):
    requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY,
        "action": "reportbad",
        "id": task_id,
    })

4. Monitor Your Spending

Check your balance regularly and set alerts:

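def check_balance():
    resp = requests.get("https://ocr.captchaai.com/res.php", params={
        "key": API_KEY,
        "action": "getbalance",
    }, timeout=10)
    resp.raise_for_status()
    # float() raises ValueError if the body is not a plain number
    balance = float(resp.text)

    if balance < 1.0:
        # send_alert is a placeholder for your own notification hook
        send_alert(f"Low CaptchaAI balance: ${balance:.2f}")

    return balance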
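
If the balance request returns something other than a number, for example an error code in the response body, a bare `float()` call fails with a message that hides what the server said. A small parsing helper makes that failure easier to diagnose. `parse_balance` is a hypothetical name, and the exact error format is not specified in this guide.

```python
def parse_balance(text):
    """Return the balance as a float, or raise ValueError quoting the raw reply."""
    cleaned = text.strip()
    try:
        return float(cleaned)
    except ValueError:
        raise ValueError(f"Unexpected balance response: {cleaned!r}") from None
```

Inside `check_balance`, `float(resp.text)` could be swapped for `parse_balance(resp.text)`.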

5. Use Image OCR for Simple CAPTCHAs

Image CAPTCHAs cost less than token-based ones. If the site uses simple text CAPTCHAs, solve them as images instead of using reCAPTCHA methods.

Architecture Patterns

Producer-Consumer Queue

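import asyncio
import os
from asyncio import Queue

import aiohttp

# AsyncCaptchaAI is not defined in this guide; it is assumed to be an async
# client class that exposes solve(session, params).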

async def captcha_worker(queue, solver, session, results):
    """Worker that processes CAPTCHA tasks from a queue."""
    while True:
        task = await queue.get()
        try:
            token = await solver.solve(session, task["params"])
            results[task["id"]] = token
        except Exception:
            results[task["id"]] = None
        finally:
            queue.task_done()

async def run_pipeline(tasks, num_workers=5):
    solver = AsyncCaptchaAI(os.environ["CAPTCHAAI_API_KEY"])
    queue = Queue()
    results = {}

    async with aiohttp.ClientSession() as session:
        # Start workers
        workers = [
            asyncio.create_task(captcha_worker(queue, solver, session, results))
            for _ in range(num_workers)
        ]

        # Add tasks
        for task in tasks:
            await queue.put(task)

        # Wait for completion
        await queue.join()

        # Cancel workers and wait for the cancellations to finish
        for w in workers:
            w.cancel()
        await asyncio.gather(*workers, return_exceptions=True)

    return results
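
To try the queue pattern without spending credits or touching the network, the same structure can be run against a stub. In the sketch below, `FakeSolver`, `worker`, `run_demo`, and the `fail` flag are all hypothetical and exist only for the demonstration. The session argument is passed as `None` because the stub never uses it.

```python
import asyncio


class FakeSolver:
    """Stand-in for the real client: sleeps briefly and returns a fake token."""

    async def solve(self, session, params):
        await asyncio.sleep(0.01)
        if params.get("fail"):
            raise RuntimeError("simulated solve failure")
        return f"token-for-{params['pageurl']}"


async def worker(queue, solver, session, results):
    while True:
        task = await queue.get()
        try:
            results[task["id"]] = await solver.solve(session, task["params"])
        except Exception:
            results[task["id"]] = None  # failed solves are recorded as None
        finally:
            queue.task_done()


async def run_demo(tasks, num_workers=3):
    queue, results, solver = asyncio.Queue(), {}, FakeSolver()
    workers = [
        asyncio.create_task(worker(queue, solver, None, results))
        for _ in range(num_workers)
    ]
    for task in tasks:
        await queue.put(task)
    await queue.join()  # returns once every task has been marked done
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


demo_tasks = [
    {"id": 1, "params": {"pageurl": "https://example.com/a"}},
    {"id": 2, "params": {"pageurl": "https://example.com/b", "fail": True}},
]
demo_results = asyncio.run(run_demo(demo_tasks))
```

The failing task ends up as `None` in the results instead of stopping its worker, which is the behavior the pipeline above relies on.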

Troubleshooting

Symptom                 Cause                            Fix
Solve times over 30s    Server load or complex CAPTCHA   Retry; check CAPTCHA type
High cost per page      Solving CAPTCHAs unnecessarily   Cache tokens; check before solving
Tokens rejected         Token expired                    Submit within 60s of receiving
Balance draining fast   Duplicate solves                 Deduplicate requests; cache tokens

FAQ

What's the fastest CAPTCHA type to solve?

Image/OCR CAPTCHAs solve in 3-5 seconds. reCAPTCHA v3 solves in 5-8 seconds. Use the simplest method your target site accepts.

How much does CaptchaAI cost per solve?

Pricing varies by CAPTCHA type. Image CAPTCHAs are the cheapest; Enterprise reCAPTCHA costs more. Check current pricing at captchaai.com.

Can I preload CAPTCHA tokens?

Yes. Submit solves ahead of time and cache the tokens. This reduces perceived latency to near-zero.
