Bulkhead Pattern: Isolating CAPTCHA Solving Failures

Your scraper solves reCAPTCHA v2, Turnstile, and image CAPTCHAs concurrently through one shared pool of 50 slots. When the reCAPTCHA service slows down, all 50 slots fill with waiting reCAPTCHA tasks, and Turnstile and image solves queue behind them. The bulkhead pattern partitions those slots into isolated compartments so that one failing type can't starve the others.

How Bulkheads Work

Named after ship compartments that contain flooding, the pattern assigns each CAPTCHA type its own resource pool:

| Pool | Max Concurrent | Max Queued | Effect of Failure |
|---|---|---|---|
| reCAPTCHA | 20 | 10 | Only reCAPTCHA tasks slow down |
| Turnstile | 15 | 10 | Turnstile keeps solving normally |
| Image | 10 | 20 | Image queue stays independent |
| Default | 5 | 5 | Unknown types get minimal resources |
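
To make the isolation concrete before the full solver, here is a minimal sketch that simulates one slow type and one fast type with asyncio.sleep. It is not the solver's code: the names (worker, run_shared, run_isolated), the two-slot pool size, and the delays are all illustrative assumptions.

```python
import asyncio


async def worker(name: str, delay: float, pool: asyncio.Semaphore, done: list[str]) -> None:
    """Hold one slot of `pool` for `delay` seconds, then record completion."""
    async with pool:
        await asyncio.sleep(delay)
        done.append(name)


async def run_shared() -> list[str]:
    """Both types compete for the same two slots."""
    done: list[str] = []
    shared = asyncio.Semaphore(2)
    await asyncio.gather(
        worker("slow-1", 0.2, shared, done),
        worker("slow-2", 0.2, shared, done),
        worker("fast", 0.01, shared, done),  # has to wait for a slow task to finish
    )
    return done


async def run_isolated() -> list[str]:
    """Each type gets its own pool, so the fast task never waits on slow ones."""
    done: list[str] = []
    slow_pool = asyncio.Semaphore(2)
    fast_pool = asyncio.Semaphore(1)
    await asyncio.gather(
        worker("slow-1", 0.2, slow_pool, done),
        worker("slow-2", 0.2, slow_pool, done),
        worker("fast", 0.01, fast_pool, done),
    )
    return done


shared_order = asyncio.run(run_shared())
isolated_order = asyncio.run(run_isolated())
print("shared:  ", shared_order)
print("isolated:", isolated_order)
```

With the shared pool the fast task finishes last, because both slots are held by slow tasks when it arrives. With separate pools it finishes first.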

Python: Semaphore Bulkheads

import asyncio
import aiohttp
import time
from dataclasses import dataclass

API_KEY = "YOUR_API_KEY"
SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"


@dataclass
class BulkheadConfig:
    max_concurrent: int
    max_queued: int
    timeout: int = 180


class Bulkhead:
    """Resource-limited compartment for a CAPTCHA type."""

    def __init__(self, name: str, config: BulkheadConfig):
        self.name = name
        self._semaphore = asyncio.Semaphore(config.max_concurrent)
        self._max_queued = config.max_queued
        self._queued = 0
        self._active = 0
        self._rejected = 0
        self.timeout = config.timeout

    @property
    def stats(self) -> dict:
        return {
            "name": self.name,
            "active": self._active,
            "queued": self._queued,
            "rejected": self._rejected,
        }

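    async def execute(self, coro):
        """Run a coroutine within the bulkhead's resource limits."""
        # Reject only when every slot is busy and the wait queue is full.
        if self._semaphore.locked() and self._queued >= self._max_queued:
            self._rejected += 1
            coro.close()  # avoids a "coroutine was never awaited" warning
            raise BulkheadFullError(
                f"Bulkhead '{self.name}' full: {self._active} active, "
                f"{self._queued} queued (max {self._max_queued})"
            )

        self._queued += 1
        try:
            await self._semaphore.acquire()
        except BaseException:
            coro.close()  # cancelled while waiting for a slot
            raise
        finally:
            # Runs exactly once, whether the acquire succeeded or was cancelled,
            # so a later timeout can no longer decrement the counter twice.
            self._queued -= 1

        self._active += 1
        try:
            return await asyncio.wait_for(coro, timeout=self.timeout)
        finally:
            self._active -= 1
            self._semaphore.release()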


class BulkheadFullError(Exception):
    pass


class IsolatedCaptchaSolver:
    """CAPTCHA solver with bulkhead isolation per type."""

    def __init__(self, api_key: str, bulkheads: dict[str, BulkheadConfig] | None = None):
        self.api_key = api_key
        defaults = {
            "recaptcha": BulkheadConfig(max_concurrent=20, max_queued=10),
            "turnstile": BulkheadConfig(max_concurrent=15, max_queued=10),
            "image": BulkheadConfig(max_concurrent=10, max_queued=20),
            "default": BulkheadConfig(max_concurrent=5, max_queued=5),
        }
        configs = {**defaults, **(bulkheads or {})}
        self._bulkheads = {name: Bulkhead(name, cfg) for name, cfg in configs.items()}

    def _get_bulkhead(self, method: str) -> Bulkhead:
        if "recaptcha" in method:
            return self._bulkheads["recaptcha"]
        if method == "turnstile":
            return self._bulkheads["turnstile"]
        if method in ("base64", "post"):
            return self._bulkheads["image"]
        return self._bulkheads["default"]

    async def _submit_and_poll(self, session: aiohttp.ClientSession, params: dict) -> str:
        # Copy the dict so the caller's params are not mutated.
        params = {**params, "key": self.api_key, "json": 1}

        async with session.post(SUBMIT_URL, data=params) as resp:
            data = await resp.json(content_type=None)
        if data.get("status") != 1:
            raise RuntimeError(f"Submit failed: {data.get('request')}")

        task_id = data["request"]
        for _ in range(60):
            await asyncio.sleep(5)
            poll_params = {"key": self.api_key, "action": "get", "id": task_id, "json": 1}
            async with session.get(RESULT_URL, params=poll_params) as resp:
                poll = await resp.json(content_type=None)

            if poll.get("request") == "CAPCHA_NOT_READY":
                continue
            if poll.get("status") == 1:
                return poll["request"]
            raise RuntimeError(f"Solve failed: {poll.get('request')}")

        raise RuntimeError("Timeout")

    async def solve(self, params: dict) -> str:
        """Solve a CAPTCHA within its type-specific bulkhead."""
        method = params.get("method", "default")
        bulkhead = self._get_bulkhead(method)

        async with aiohttp.ClientSession() as session:
            return await bulkhead.execute(
                self._submit_and_poll(session, params)
            )

    def get_stats(self) -> list[dict]:
        return [bh.stats for bh in self._bulkheads.values()]


# --- Usage ---

async def main():
    solver = IsolatedCaptchaSolver(API_KEY)

    tasks = []
    # 30 reCAPTCHA — fills the recaptcha bulkhead
    for _ in range(30):
        tasks.append(solver.solve({
            "method": "userrecaptcha",
            "googlekey": "SITEKEY_A",
            "pageurl": "https://site-a.com",
        }))

    # 10 Turnstile — runs in its own pool, unaffected by reCAPTCHA
    for _ in range(10):
        tasks.append(solver.solve({
            "method": "turnstile",
            "sitekey": "SITEKEY_B",
            "pageurl": "https://site-b.com",
        }))

    results = await asyncio.gather(*tasks, return_exceptions=True)

    solved = sum(1 for r in results if isinstance(r, str))
    rejected = sum(1 for r in results if isinstance(r, BulkheadFullError))
    errors = sum(1 for r in results if isinstance(r, Exception) and not isinstance(r, BulkheadFullError))

    print(f"Solved: {solved}, Rejected: {rejected}, Errors: {errors}")
    for stat in solver.get_stats():
        print(f"  {stat['name']}: rejected={stat['rejected']}")


asyncio.run(main())

JavaScript: Bulkhead with Concurrency Limiter

const API_KEY = "YOUR_API_KEY";
const SUBMIT_URL = "https://ocr.captchaai.com/in.php";
const RESULT_URL = "https://ocr.captchaai.com/res.php";

class Bulkhead {
  constructor(name, maxConcurrent, maxQueued) {
    this.name = name;
    this.maxConcurrent = maxConcurrent;
    this.maxQueued = maxQueued;
    this.active = 0;
    this.queue = [];
    this.rejected = 0;
  }

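  async execute(fn) {
    if (this.active >= this.maxConcurrent) {
      if (this.queue.length >= this.maxQueued) {
        this.rejected++;
        throw new Error(`Bulkhead '${this.name}' full`);
      }
      // Wait for a finishing task to hand its slot over. The slot stays
      // counted in `active` during the handoff, so a newcomer cannot grab it
      // and push the pool past maxConcurrent.
      await new Promise((resolve) => this.queue.push(resolve));
    } else {
      this.active++;
    }

    try {
      return await fn();
    } finally {
      const next = this.queue.shift();
      if (next) {
        next(); // transfer this slot directly to the next waiter
      } else {
        this.active--;
      }
    }
  }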
}

const bulkheads = {
  recaptcha: new Bulkhead("recaptcha", 20, 10),
  turnstile: new Bulkhead("turnstile", 15, 10),
  image: new Bulkhead("image", 10, 20),
  default: new Bulkhead("default", 5, 5),
};

function getBulkhead(method) {
  if (method.includes("recaptcha")) return bulkheads.recaptcha;
  if (method === "turnstile") return bulkheads.turnstile;
  // Same routing as the Python version: both image methods share one pool.
  if (method === "base64" || method === "post") return bulkheads.image;
  return bulkheads.default;
}

async function submitAndPoll(params) {
  const body = new URLSearchParams({ key: API_KEY, json: "1", ...params });
  const resp = await (await fetch(SUBMIT_URL, { method: "POST", body })).json();
  if (resp.status !== 1) throw new Error(`Submit: ${resp.request}`);

  const taskId = resp.request;
  for (let i = 0; i < 60; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    const url = `${RESULT_URL}?key=${API_KEY}&action=get&id=${taskId}&json=1`;
    const poll = await (await fetch(url)).json();
    if (poll.request === "CAPCHA_NOT_READY") continue;
    if (poll.status === 1) return poll.request;
    throw new Error(`Solve: ${poll.request}`);
  }
  throw new Error("Timeout");
}

async function solve(params) {
  const bulkhead = getBulkhead(params.method);
  return bulkhead.execute(() => submitAndPoll(params));
}

// Usage: Turnstile solves continue even if reCAPTCHA is overloaded.
// Top-level await requires running this file as an ES module.
const results = await Promise.allSettled([
  ...Array(30).fill(null).map(() =>
    solve({ method: "userrecaptcha", googlekey: "SITEKEY_A", pageurl: "https://site-a.com" })
  ),
  ...Array(10).fill(null).map(() =>
    solve({ method: "turnstile", sitekey: "SITEKEY_B", pageurl: "https://site-b.com" })
  ),
]);

const fulfilled = results.filter((r) => r.status === "fulfilled").length;
// A rejected promise here covers both bulkhead rejections and solve errors.
const failed = results.filter((r) => r.status === "rejected").length;
console.log(`Solved: ${fulfilled}, Failed: ${failed}`);

Sizing Bulkheads

| Factor | Guidance |
|---|---|
| Average solve time | Longer solve times need more slots for the same throughput |
| Request volume per type | Allocate more slots to higher-volume types |
| Failure tolerance | Smaller pools = less resource waste during outages |
| API rate limits | Total across all pools shouldn't exceed your rate limit |
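
The first two factors can be combined with Little's law (average concurrency = arrival rate × average time in the system). The sketch below applies it per type. The traffic profile, the 25% headroom factor, and the 50-slot account limit are hypothetical numbers chosen for illustration; substitute your own measurements.

```python
import math


def slots_needed(solves_per_minute: float, avg_solve_seconds: float, headroom: float = 1.25) -> int:
    """Estimate concurrent slots: arrival rate (per second) x solve time, plus headroom."""
    return math.ceil(solves_per_minute * avg_solve_seconds * headroom / 60)


# Hypothetical traffic profile: (solves per minute, average solve time in seconds).
profile = {
    "recaptcha": (38, 25),
    "turnstile": (60, 12),
    "image": (120, 4),
}

CONCURRENCY_LIMIT = 50  # hypothetical account-wide limit

plan = {name: slots_needed(rate, secs) for name, (rate, secs) in profile.items()}
total = sum(plan.values())
fits = total <= CONCURRENCY_LIMIT

print(plan, "total:", total, "fits limit:", fits)
```

Note how image solves have the highest volume in this profile but need the fewest slots, because each solve is short.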

Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| All requests rejected | Bulkhead too small for traffic | Increase max_concurrent or max_queued |
| One type still affects others | Wrong bulkhead mapping | Verify _get_bulkhead routes the method correctly |
| Queue grows unbounded | No queue limit set | Always set max_queued to prevent memory issues |
| Deadlock under load | Semaphore not released on error | Use try/finally to always release the semaphore |
| Stats show zero rejected | Bulkhead too large | Size pools based on actual traffic patterns |

FAQ

How do I choose bulkhead sizes?

Start by splitting your total capacity in proportion to each type's expected peak concurrency. Then monitor rejection rates: if a pool rejects frequently, increase its size. If it rarely fills, shrink it and give the freed capacity to busier pools.
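
One way to compute a starting split is sketched below. The split_capacity helper and the peak figures are hypothetical; the sketch assumes at least one type has a non-zero peak and hands leftover slots to the types with the largest fractional remainders so the shares always add up to the total.

```python
def split_capacity(total_slots: int, peak_by_type: dict[str, int]) -> dict[str, int]:
    """Split total_slots in proportion to each type's expected peak concurrency."""
    peak_sum = sum(peak_by_type.values())
    exact = {name: total_slots * peak / peak_sum for name, peak in peak_by_type.items()}
    plan = {name: int(share) for name, share in exact.items()}  # round down first
    leftover = total_slots - sum(plan.values())
    # Hand the leftover slots to the types with the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda name: exact[name] - plan[name], reverse=True)
    for name in by_remainder[:leftover]:
        plan[name] += 1
    return plan


# Hypothetical peak concurrency per type, split across 50 total slots.
plan = split_capacity(50, {"recaptcha": 60, "turnstile": 40, "image": 30, "default": 10})
print(plan)
```

Treat the result as a first guess and adjust it from the rejected counters each bulkhead reports.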

Should I combine bulkheads with circuit breakers?

Yes. The bulkhead limits concurrency and the circuit breaker stops sending requests when failure rates are too high. Together, they prevent resource exhaustion and avoid hammering a failing service.
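
A minimal circuit breaker sketch follows, assuming a simple policy: open after a number of consecutive failures, then allow trial calls again after a cool-down. CircuitBreaker, CircuitOpenError, and guarded are hypothetical names, not part of any library, and a production breaker would usually limit how many trial calls pass while half-open.

```python
import asyncio
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is refusing calls."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; retries after `reset_after` seconds."""

    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._failures = 0
        self._opened_at = None  # monotonic timestamp of when the breaker opened

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        # After the cool-down, let trial requests through (half-open state).
        return time.monotonic() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = time.monotonic()


async def guarded(breaker: CircuitBreaker, make_call):
    """Await make_call() unless the breaker is open; record the outcome."""
    if not breaker.allow():
        raise CircuitOpenError("circuit open, skipping call")
    try:
        result = await make_call()
    except Exception:
        breaker.record_failure()
        raise
    breaker.record_success()
    return result


async def _demo() -> list[str]:
    breaker = CircuitBreaker(max_failures=2, reset_after=60.0)

    async def failing_call():
        raise RuntimeError("service down")

    outcomes = []
    for _ in range(4):
        try:
            await guarded(breaker, failing_call)
        except CircuitOpenError:
            outcomes.append("skipped")
        except RuntimeError:
            outcomes.append("failed")
    return outcomes


outcomes = asyncio.run(_demo())
print(outcomes)
```

In the demo the first two calls fail and open the breaker, and the remaining two are skipped without touching the failing service. To combine it with the solver above, one option is one breaker per bulkhead, wrapping the call as guarded(breaker, lambda: solver.solve(params)). You may want to exclude BulkheadFullError from the failure count, since a full pool is a local condition rather than a service failure.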

What happens to rejected requests?

Rejected tasks get a BulkheadFullError. Your caller decides what to do: retry after a delay, route to a different pool, or return a cached result. Don't silently drop rejected tasks.
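
As one possible caller policy, the sketch below retries rejected tasks with exponential backoff and jitter, then re-raises if the pool is still full. solve_with_retry and its parameters are hypothetical, and BulkheadFullError is redefined here only so the example runs on its own; in the solver above you would import the real class instead.

```python
import asyncio
import random


class BulkheadFullError(Exception):
    """Stand-in for the article's exception so this sketch is self-contained."""


async def solve_with_retry(solve, params: dict, attempts: int = 4, base_delay: float = 1.0):
    """Retry only bulkhead rejections, backing off exponentially with jitter."""
    for attempt in range(attempts):
        try:
            # Call solve() again on every attempt: a coroutine object cannot be reused.
            return await solve(params)
        except BulkheadFullError:
            if attempt == attempts - 1:
                raise  # surface the rejection instead of silently dropping the task
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))


async def _demo() -> tuple[str, int]:
    calls = 0

    async def flaky_solve(params: dict) -> str:
        """Simulated solver: rejects twice, then succeeds."""
        nonlocal calls
        calls += 1
        if calls < 3:
            raise BulkheadFullError("pool full")
        return "TOKEN"

    token = await solve_with_retry(flaky_solve, {"method": "turnstile"}, base_delay=0.01)
    return token, calls


token, calls = asyncio.run(_demo())
print(token, calls)
```

Other exceptions, such as solve errors and timeouts, pass through untouched, so a failing service is not retried by this wrapper.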

Next Steps

Isolate CAPTCHA failures properly — get your CaptchaAI API key and implement bulkheads.