CAPTCHA Session State Management Across Distributed Workers

When multiple workers solve CAPTCHAs for the same site, they share a problem: each worker has its own session. The target site sees different cookies, different IPs, and different browser fingerprints. Session state management synchronizes context across workers so solves are consistent and the target site sees coherent sessions.

The Session State Problem

Worker 1 → Login → Solve CAPTCHA → Get cookie A → Submit form ✅
Worker 2 → New session → Solve CAPTCHA → Get cookie B → Submit form ✅
Worker 3 → Reuse cookie A? → Cookie expired → Solve CAPTCHA → Fail ❌

Without shared state, workers waste solves on expired sessions and produce inconsistent behavior that target sites can detect.

What Session State Includes

State Component | Lifetime | Sharing Strategy
--- | --- | ---
Authentication cookies | Minutes to hours | Redis with TTL
CAPTCHA tokens | 90–300 seconds | Redis list (short TTL)
cf_clearance cookies | ~30 minutes | Redis hash
CSRF tokens | Per page load | Don't share; each worker gets its own
Browser fingerprint | Permanent | Configuration, not runtime state
Proxy assignment | Per session | Redis-backed proxy pool
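
The table can also be written down as configuration so that every worker applies the same TTLs. The sketch below is one possible way to do that. The `StatePolicy` name, the 20% safety margin, and the concrete lifetimes (taken from the low end of the ranges in the table) are illustrative assumptions, not values prescribed by any target site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatePolicy:
    """How one kind of session state is shared (hypothetical helper)."""
    redis_type: Optional[str]  # "hash", "list", "set", or None if never shared
    lifetime_s: Optional[int]  # assumed lifetime on the target site, in seconds

    def ttl(self, margin_pct: int = 20) -> Optional[int]:
        """Redis TTL that is margin_pct percent shorter than the real lifetime."""
        if self.redis_type is None or self.lifetime_s is None:
            return None
        return self.lifetime_s * (100 - margin_pct) // 100


# Assumed example values; measure the real lifetimes for your target site.
POLICIES = {
    "auth_cookies": StatePolicy("hash", 30 * 60),
    "captcha_tokens": StatePolicy("list", 90),
    "cf_clearance": StatePolicy("hash", 30 * 60),
    "csrf_tokens": StatePolicy(None, None),  # per page load, never shared
}

for name, policy in POLICIES.items():
    print(name, policy.redis_type, policy.ttl())
```

Keeping the margin in one place makes it easier to tighten every TTL at once if workers start receiving expired state.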

Architecture

┌────────────────────────────────────────────┐
│            Session State Store             │
│                  (Redis)                   │
│                                            │
│  session:cookies:{domain} → Hash           │
│  session:tokens:{domain}:{sitekey} → List  │
│  session:proxy_pool → Set                  │
│  session:lock:{domain} → String            │
└───────┬─────────────┬─────────────┬────────┘
        │             │             │
    ┌───▼───┐     ┌───▼───┐     ┌───▼───┐
    │Worker1│     │Worker2│     │Worker3│
    └───────┘     └───────┘     └───────┘

Python Implementation

Session Store

import os
import json
import time
import redis
import requests

r = redis.Redis(
    host=os.environ.get("REDIS_HOST", "localhost"),
    port=int(os.environ.get("REDIS_PORT", 6379)),
    decode_responses=True
)

API_KEY = os.environ["CAPTCHAAI_API_KEY"]


class SessionStore:
    """Shared session state across distributed workers."""

    def __init__(self, domain):
        self.domain = domain
        self.cookie_key = f"session:cookies:{domain}"
        self.token_key = f"session:tokens:{domain}"

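    def save_cookies(self, cookies, ttl=1800):
        """Store cookies from a successful session."""
        cookie_data = {name: value for name, value in cookies.items()}
        if not cookie_data:
            return  # HSET with an empty mapping raises DataError in redis-py
        # Write the hash and its TTL in one transaction so the key can never
        # be left behind without an expiry.
        pipe = r.pipeline()
        pipe.hset(self.cookie_key, mapping=cookie_data)
        pipe.expire(self.cookie_key, ttl)
        pipe.execute()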

    def get_cookies(self):
        """Retrieve shared cookies."""
        cookies = r.hgetall(self.cookie_key)
        return cookies if cookies else None

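    def save_token(self, sitekey, token, ttl=80):
        """Store a solved CAPTCHA token together with its own expiry time."""
        key = f"{self.token_key}:{sitekey}"
        # EXPIRE covers the whole list and restarts on every push, so each
        # entry carries its own deadline (assumes worker clocks are in sync).
        entry = json.dumps({"token": token, "expires_at": time.time() + ttl})
        pipe = r.pipeline()
        pipe.rpush(key, entry)
        pipe.expire(key, ttl)
        pipe.execute()

    def get_token(self, sitekey):
        """Pop the oldest cached CAPTCHA token that has not expired."""
        key = f"{self.token_key}:{sitekey}"
        while (entry := r.lpop(key)) is not None:
            data = json.loads(entry)
            if data["expires_at"] > time.time():
                return data["token"]
        return None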

    def acquire_session_lock(self, worker_id, ttl=300):
        """Ensure only one worker manages the session at a time."""
        lock_key = f"session:lock:{self.domain}"
        return r.set(lock_key, worker_id, nx=True, ex=ttl)

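    def release_session_lock(self, worker_id):
        """Release session lock if this worker holds it."""
        lock_key = f"session:lock:{self.domain}"
        # Compare-and-delete in a single Lua script. A separate GET followed
        # by DELETE could remove a lock that expired and was re-acquired by
        # another worker in between.
        script = (
            "if redis.call('get', KEYS[1]) == ARGV[1] then "
            "return redis.call('del', KEYS[1]) else return 0 end"
        )
        return r.eval(script, 1, lock_key, worker_id) == 1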
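
Because a lock that is acquired but never released blocks other workers until its TTL runs out, it can help to wrap the two lock calls in a context manager. The following is a minimal sketch, assuming a store object that exposes `acquire_session_lock` and `release_session_lock` as in the class above; `session_lock` is a hypothetical helper name, not part of any library.

```python
from contextlib import contextmanager


@contextmanager
def session_lock(store, worker_id, ttl=300):
    """Yield True if this worker obtained the lock, False otherwise.

    `store` is any object with acquire_session_lock(worker_id, ttl=...)
    and release_session_lock(worker_id).
    """
    # redis-py returns True or None for SET NX, so normalize to a bool.
    acquired = bool(store.acquire_session_lock(worker_id, ttl=ttl))
    try:
        yield acquired
    finally:
        # Release only a lock this worker actually obtained, even if the
        # body of the with-block raised an exception.
        if acquired:
            store.release_session_lock(worker_id)


# Usage sketch (login() is a hypothetical function you would supply):
#
# with session_lock(store, "worker-1") as have_lock:
#     if have_lock:
#         store.save_cookies(login())
```

Workers that receive `False` should not log in themselves; they can wait briefly and read the shared cookies again.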

Worker with Shared State

class CaptchaWorker:
    def __init__(self, worker_id, domain):
        self.worker_id = worker_id
        self.store = SessionStore(domain)
        self.session = requests.Session()

    def setup_session(self):
        """Load shared cookies into this worker's session."""
        cookies = self.store.get_cookies()
        if cookies:
            for name, value in cookies.items():
                self.session.cookies.set(name, value)
            return True
        return False

    def solve_captcha(self, sitekey, pageurl):
        """Return a cached token if one is available, otherwise solve via the API."""
        # Check for a pre-solved token first
        cached = self.store.get_token(sitekey)
        if cached:
            return {"solution": cached, "source": "cache"}

        # Submit the task to CaptchaAI
        resp = requests.post("https://ocr.captchaai.com/in.php", data={
            "key": API_KEY,
            "method": "userrecaptcha",
            "googlekey": sitekey,
            "pageurl": pageurl,
            "json": 1
        }, timeout=30)
        data = resp.json()
        if data.get("status") != 1:
            return {"error": data.get("request")}

        captcha_id = data["request"]

        # Poll every 5 seconds, for up to 5 minutes
        for _ in range(60):
            time.sleep(5)
            result = requests.get("https://ocr.captchaai.com/res.php", params={
                "key": API_KEY, "action": "get",
                "id": captcha_id, "json": 1
            }, timeout=30).json()

            if result.get("status") == 1:
                # Return the token without caching it: reCAPTCHA tokens are
                # single-use and this one is about to be submitted. The cache
                # is for tokens pre-solved with store.save_token().
                return {"solution": result["request"], "source": "api"}

            # Any status other than "not ready yet" is a real error
            if result.get("request") != "CAPCHA_NOT_READY":
                return {"error": result.get("request")}

        return {"error": "TIMEOUT"}

    def process_page(self, url, sitekey):
        """Full workflow: setup session → solve CAPTCHA → submit."""
        # Load shared session
        self.setup_session()

        # Solve CAPTCHA
        result = self.solve_captcha(sitekey, url)
        if "error" in result:
            return result

        # Submit form with token
        response = self.session.post(url, data={
            "g-recaptcha-response": result["solution"]
        }, timeout=30)

        # Share the resulting cookies only if the submission succeeded, so a
        # failed or blocked request cannot overwrite a good shared session
        cookies = self.session.cookies.get_dict()
        if response.ok and cookies:
            self.store.save_cookies(cookies)

        return {"status": response.status_code, "source": result["source"]}

Proxy Pool Management

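class ProxyPool:
    """Distribute proxies across workers to avoid IP conflicts."""

    def __init__(self, proxies):
        self.pool_key = "session:proxy_pool"
        self.assigned_key = "session:proxy_assigned"
        self.lease_prefix = "session:proxy_lease:"
        # Seed the pool, skipping proxies that are already assigned. Without
        # this check, every worker that constructs a ProxyPool would put
        # in-use proxies back into circulation.
        in_use = set(r.hvals(self.assigned_key))
        available = [p for p in proxies if p not in in_use]
        if available:
            r.sadd(self.pool_key, *available)

    def acquire_proxy(self, worker_id, ttl=600):
        """Assign an unused proxy to a worker and renew its lease."""
        proxy = r.hget(self.assigned_key, worker_id)
        if not proxy:
            # Pop from available pool
            proxy = r.spop(self.pool_key)
            if not proxy:
                return None
            r.hset(self.assigned_key, worker_id, proxy)
        # The TTL lives on a per-worker lease key. Calling EXPIRE on the
        # shared hash would drop every worker's assignment at once.
        r.set(f"{self.lease_prefix}{worker_id}", proxy, ex=ttl)
        return proxy

    def release_proxy(self, worker_id):
        """Return proxy to the pool."""
        proxy = r.hget(self.assigned_key, worker_id)
        if proxy:
            r.sadd(self.pool_key, proxy)
            r.hdel(self.assigned_key, worker_id)
            r.delete(f"{self.lease_prefix}{worker_id}")

    def reclaim_expired(self):
        """Return proxies whose lease expired, e.g. after a worker crash."""
        for worker_id in r.hkeys(self.assigned_key):
            if not r.exists(f"{self.lease_prefix}{worker_id}"):
                self.release_proxy(worker_id)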
JavaScript Implementation

const Redis = require("ioredis");
const axios = require("axios");

const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379");
const API_KEY = process.env.CAPTCHAAI_API_KEY;

class SessionStore {
  constructor(domain) {
    this.domain = domain;
    this.cookieKey = `session:cookies:${domain}`;
    this.tokenKey = `session:tokens:${domain}`;
  }

  async saveCookies(cookies, ttl = 1800) {
    const entries = Object.entries(cookies).flat();
    if (entries.length > 0) {
      await redis.hset(this.cookieKey, ...entries);
      await redis.expire(this.cookieKey, ttl);
    }
  }

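  async getCookies() {
    // ioredis returns {} for a missing hash; return null instead so callers
    // can test the result directly, as in the Python version
    const cookies = await redis.hgetall(this.cookieKey);
    return Object.keys(cookies).length > 0 ? cookies : null;
  }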

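  async saveToken(sitekey, token, ttl = 80) {
    const key = `${this.tokenKey}:${sitekey}`;
    // EXPIRE covers the whole list and restarts on every push, so each
    // entry carries its own deadline (assumes worker clocks are in sync).
    const entry = JSON.stringify({ token, expires_at: Date.now() / 1000 + ttl });
    await redis.multi().rpush(key, entry).expire(key, ttl).exec();
  }

  async getToken(sitekey) {
    const key = `${this.tokenKey}:${sitekey}`;
    let entry;
    // Pop until a token that has not expired is found
    while ((entry = await redis.lpop(key)) !== null) {
      const data = JSON.parse(entry);
      if (data.expires_at > Date.now() / 1000) return data.token;
    }
    return null;
  }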

  async acquireLock(workerId, ttl = 300) {
    const result = await redis.set(`session:lock:${this.domain}`, workerId, "NX", "EX", ttl);
    return result === "OK";
  }

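  async releaseLock(workerId) {
    // Compare-and-delete atomically so we never remove a lock that expired
    // and was re-acquired by another worker between a GET and a DEL.
    const script =
      "if redis.call('get', KEYS[1]) == ARGV[1] then " +
      "return redis.call('del', KEYS[1]) else return 0 end";
    const released = await redis.eval(script, 1, `session:lock:${this.domain}`, workerId);
    return released === 1;
  }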
}

async function workerSolve(store, sitekey, pageurl) {
  // Use a pre-solved token if the pool has one
  const cached = await store.getToken(sitekey);
  if (cached) return { solution: cached, source: "cache" };

  // Submit the task to CaptchaAI
  const submit = await axios.post("https://ocr.captchaai.com/in.php", null, {
    params: { key: API_KEY, method: "userrecaptcha", googlekey: sitekey, pageurl, json: 1 },
    timeout: 30000,
  });
  if (submit.data.status !== 1) return { error: submit.data.request };

  // Poll every 5 seconds, for up to 5 minutes
  const captchaId = submit.data.request;
  for (let i = 0; i < 60; i++) {
    await new Promise((resolve) => setTimeout(resolve, 5000));
    const poll = await axios.get("https://ocr.captchaai.com/res.php", {
      params: { key: API_KEY, action: "get", id: captchaId, json: 1 },
      timeout: 30000,
    });
    if (poll.data.status === 1) {
      // Do not cache this token: reCAPTCHA tokens are single-use and the
      // caller is about to submit it. Use saveToken() for pre-solved tokens.
      return { solution: poll.data.request, source: "api" };
    }
    if (poll.data.request !== "CAPCHA_NOT_READY") return { error: poll.data.request };
  }
  return { error: "TIMEOUT" };
}

State Management Patterns

Pattern | When to Use
--- | ---
Session lock | One worker manages login, others consume cookies
Token pool | High-throughput: pre-solve and distribute tokens
Cookie sharing | Workers need authenticated sessions
Proxy affinity | Target site tracks IP-session binding
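
The token pool pattern can be sketched as a small producer that tops the shared pool up to a target size. The version below is an illustration under stated assumptions: `top_up_token_pool` and its parameters are hypothetical names, and the three callables stand in for whatever the deployment uses (with Redis, `pool_size` could wrap `LLEN` on the token list). Since tokens are short-lived and single-use, the target is best kept small.

```python
def top_up_token_pool(pool_size, solve, save, target=5, max_solves=10):
    """Pre-solve CAPTCHAs until the shared pool holds `target` tokens.

    pool_size  -- callable returning the current number of cached tokens
    solve      -- callable returning a fresh token, or None on failure
    save       -- callable that stores one token in the shared pool
    max_solves -- upper bound on solve attempts per call, to cap API spend
    Returns the number of tokens added.
    """
    added = 0
    attempts = 0
    while pool_size() < target and attempts < max_solves:
        attempts += 1
        token = solve()
        if token is None:
            continue  # failed solve: the attempt still counts
        save(token)
        added += 1
    return added


# In-memory demonstration with a fake solver:
pool = []
counter = iter(range(100))
added = top_up_token_pool(
    pool_size=lambda: len(pool),
    solve=lambda: f"token-{next(counter)}",
    save=pool.append,
    target=3,
)
print(added, pool)
```

A dedicated process can call this on a timer while the other workers only consume tokens, which keeps solving costs in one place.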

Troubleshooting

Issue | Cause | Fix
--- | --- | ---
Workers get different sessions | Cookies not shared via Redis | Verify save_cookies is called after successful requests
Token expired before other worker uses it | TTL too long or network delay | Lower the TTL to leave a wider safety margin; use tokens within 10 seconds of retrieval
Session lock never released | Worker crashed | TTL on lock key auto-releases (300s default)
Target site blocks workers | All workers using same proxy | Use proxy pool with per-worker affinity

FAQ

Should every worker share cookies?

Only for sites that require authenticated sessions. For stateless CAPTCHA solving (submit sitekey → get token), workers don't need shared cookies — just shared tokens.

How do I handle session expiration?

Set Redis TTL slightly shorter than the session lifetime. When cookies expire, one worker acquires the session lock, re-authenticates, and stores fresh cookies for the others.
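
The refresh flow described above might look like the sketch below. It assumes a store with the four methods used in the Python implementation; `ensure_cookies` and the `login` callable are hypothetical and would be supplied by the deployment.

```python
import time


def ensure_cookies(store, worker_id, login, wait_s=1.0, max_wait_s=30.0):
    """Return shared cookies, re-authenticating under the lock if they expired.

    store -- object with get_cookies(), save_cookies(), acquire_session_lock()
             and release_session_lock()
    login -- hypothetical callable that logs in and returns a cookie dict
    """
    deadline = time.monotonic() + max_wait_s
    while True:
        cookies = store.get_cookies()
        if cookies:
            return cookies

        if store.acquire_session_lock(worker_id):
            try:
                # Re-check: another worker may have refreshed the cookies
                # between our read and our lock acquisition.
                cookies = store.get_cookies()
                if not cookies:
                    cookies = login()
                    store.save_cookies(cookies)
                return cookies
            finally:
                store.release_session_lock(worker_id)

        # Another worker holds the lock and is logging in; poll again shortly.
        if time.monotonic() >= deadline:
            raise TimeoutError("no worker refreshed the session in time")
        time.sleep(wait_s)
```

Only the lock holder calls `login`, so a burst of workers hitting an expired session produces one login instead of one per worker.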

What about browser-based sessions (Puppeteer/Playwright)?

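Serialize the browser cookies and store them in Redis, then load them in the other workers. In Puppeteer, read them with page.cookies() and restore them with page.setCookie(). In Playwright, cookies belong to the browser context, so use context.cookies() and context.addCookies(). This works across separate browser instances and machines.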
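
For workers that drive a browser from Python, the serialization step could be sketched as follows. The helper names are hypothetical. The cookie shape assumed here (dicts with `name`, `value`, `domain`, and an optional `expires` Unix timestamp where -1 marks a session cookie) is the one Playwright and Puppeteer return.

```python
import json
import time


def dump_browser_cookies(cookies, now=None):
    """Serialize browser cookie dicts to JSON, dropping expired ones.

    A missing "expires", None, or -1 is treated as a session cookie and kept.
    """
    now = time.time() if now is None else now
    live = [
        c for c in cookies
        if c.get("expires", -1) in (-1, None) or c["expires"] > now
    ]
    return json.dumps(live)


def load_browser_cookies(blob):
    """Parse cookies stored by dump_browser_cookies; empty list if none."""
    return json.loads(blob) if blob else []


# Usage sketch with Playwright for Python (context is a BrowserContext):
#
# blob = dump_browser_cookies(context.cookies())
# ... store blob in Redis with a TTL, read it back in another worker ...
# other_context.add_cookies(load_browser_cookies(blob))
```

Filtering expired cookies before storing them avoids handing other workers a session that the site has already discarded.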

Next Steps

Coordinate your distributed CAPTCHA workers efficiently — get your CaptchaAI API key.
