Batch reCAPTCHA Solving for Form Submission Pipelines

When you need to submit many forms protected by reCAPTCHA, solving each token only at submission time makes every form wait for a full solve. A pipeline approach pre-solves tokens in the background while earlier submissions are in progress, keeping a small buffer of fresh tokens ready for immediate use.

The Pipeline Architecture

Token Solver (background)        Form Submitter (foreground)
─────────────────────────        ──────────────────────────
Submit CAPTCHA task  ──┐
Submit CAPTCHA task  ──┤
Submit CAPTCHA task  ──┼──→  Token Queue  ──→  Take token → Submit form
Poll results         ──┤     (max age      ──→  Take token → Submit form
Submit more tasks    ──┘      < 110s)       ──→  Take token → Submit form

Key constraints:

  • reCAPTCHA v2 tokens expire in ~120 seconds
  • Use tokens within 110 seconds to have a safety margin
  • Pre-solve only as many tokens as you can consume before expiration
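The 110-second budget from the constraints above can be written as a small helper. This is a minimal sketch, not part of any library: the names `TOKEN_LIFETIME`, `SAFETY_MARGIN`, `seconds_left`, and `is_usable` are illustrative assumptions, and it presumes you record a `time.monotonic()` reading at the moment the solver returns each token.

```python
import time

TOKEN_LIFETIME = 120  # approximate reCAPTCHA v2 token lifetime, in seconds
SAFETY_MARGIN = 10    # assumed margin, which leaves a 110-second usable window


def seconds_left(created_at, now=None):
    """Return the usable seconds remaining for a token (never negative).

    created_at and now are time.monotonic() readings. now defaults to the
    current reading, so a fixed value can be injected when testing.
    """
    if now is None:
        now = time.monotonic()
    usable_window = TOKEN_LIFETIME - SAFETY_MARGIN
    return max(0.0, usable_window - (now - created_at))


def is_usable(created_at, now=None):
    """True while the token is still inside the usable window."""
    return seconds_left(created_at, now) > 0
```

Using a monotonic clock rather than wall-clock time keeps the age calculation correct even if the system clock is adjusted while the pipeline runs.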

Python Pipeline

import time
import threading
import queue
import requests

API_KEY = "YOUR_API_KEY"
SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"
TOKEN_MAX_AGE = 110  # seconds — use before 120s expiry


class TokenSolver(threading.Thread):
    """Background thread that keeps a queue of fresh reCAPTCHA tokens."""

    def __init__(self, sitekey, pageurl, token_queue, buffer_size=3):
        super().__init__(daemon=True)
        self.sitekey = sitekey
        self.pageurl = pageurl
        self.token_queue = token_queue
        self.buffer_size = buffer_size
        self.running = True

    def run(self):
        while self.running:
            # Keep the queue at buffer_size
            if self.token_queue.qsize() < self.buffer_size:
                token, created_at = self._solve()
                if token:
                    self.token_queue.put((token, created_at))
            else:
                time.sleep(1)

    def _solve(self):
        """Submit and poll a single reCAPTCHA task."""
        try:
            response = requests.post(SUBMIT_URL, data={
                "key": API_KEY,
                "method": "userrecaptcha",
                "googlekey": self.sitekey,
                "pageurl": self.pageurl,
                "json": 1,
            }, timeout=30)
            result = response.json()

            if result.get("status") != 1:
                print(f"Submit error: {result.get('request')}")
                return None, None

            task_id = result["request"]

            # Poll
            for _ in range(60):
                time.sleep(5)
                poll = requests.get(RESULT_URL, params={
                    "key": API_KEY, "action": "get",
                    "id": task_id, "json": 1,
                }, timeout=15).json()

                if poll.get("request") == "CAPCHA_NOT_READY":
                    continue
                if poll.get("status") == 1:
                    return poll["request"], time.monotonic()
                print(f"Poll error: {poll.get('request')}")
                return None, None

        except (requests.RequestException, ValueError) as e:
            # ValueError also covers non-JSON responses on older requests versions
            print(f"Network or parse error: {e}")
        return None, None

    def stop(self):
        self.running = False


def get_fresh_token(token_queue, timeout=120):
    """Get a token from the queue, discarding expired ones."""
    while True:
        try:
            token, created_at = token_queue.get(timeout=timeout)
            age = time.monotonic() - created_at
            if age < TOKEN_MAX_AGE:
                return token
            print(f"Discarded expired token (age={age:.0f}s)")
        except queue.Empty:
            return None


def submit_form(form_data, token, session):
    """Submit a form with the reCAPTCHA token."""
    form_data["g-recaptcha-response"] = token
    response = session.post(
        form_data.pop("__submit_url"),
        data=form_data,
        timeout=30,
    )
    return response.status_code, response.text[:200]


def run_pipeline(sitekey, pageurl, forms, buffer_size=3):
    """Process multiple form submissions with pre-solved tokens."""
    token_queue = queue.Queue(maxsize=buffer_size + 2)

    # Start background solver
    solver = TokenSolver(sitekey, pageurl, token_queue, buffer_size)
    solver.start()

    session = requests.Session()
    results = []

    print(f"Processing {len(forms)} forms with buffer_size={buffer_size}")

    for i, form_data in enumerate(forms):
        # Get a fresh token
        token = get_fresh_token(token_queue)
        if not token:
            results.append({"index": i, "status": "error", "error": "No token available"})
            continue

        # Submit the form
        try:
            status_code, body = submit_form({**form_data}, token, session)
            results.append({
                "index": i,
                "status": "submitted",
                "http_status": status_code,
            })
            print(f"  [{i + 1}/{len(forms)}] Submitted — HTTP {status_code}")
        except requests.RequestException as e:
            results.append({"index": i, "status": "error", "error": str(e)})
            print(f"  [{i + 1}/{len(forms)}] Failed — {e}")

    solver.stop()
    return results


# Usage
forms = [
    {
        "__submit_url": "https://example.com/apply",
        "name": "User 1",
        "email": "user1@example.com",
    },
    {
        "__submit_url": "https://example.com/apply",
        "name": "User 2",
        "email": "user2@example.com",
    },
    # Add more forms...
]

results = run_pipeline(
    sitekey="6LeIxAcTAAAAAJcZVRqyHh71UMIEGNQ_MXjiZKhI",
    pageurl="https://example.com/apply",
    forms=forms,
    buffer_size=3,
)

JavaScript Pipeline (Node.js)

// Requires Node.js 18+ for the built-in global fetch.
const API_KEY = "YOUR_API_KEY";
const SUBMIT_URL = "https://ocr.captchaai.com/in.php";
const RESULT_URL = "https://ocr.captchaai.com/res.php";
const TOKEN_MAX_AGE = 110000; // 110 seconds in ms

class TokenPool {
  constructor(sitekey, pageurl, bufferSize = 3) {
    this.sitekey = sitekey;
    this.pageurl = pageurl;
    this.bufferSize = bufferSize;
    this.tokens = []; // { token, createdAt }
    this.running = false;
    this.pendingSolves = 0;
  }

  start() {
    this.running = true;
    this._refill();
  }

  stop() {
    this.running = false;
  }

  async _refill() {
    while (this.running) {
      // Remove expired tokens
      this.tokens = this.tokens.filter(
        (t) => Date.now() - t.createdAt < TOKEN_MAX_AGE
      );

      const needed = this.bufferSize - this.tokens.length - this.pendingSolves;
      for (let i = 0; i < needed; i++) {
        this.pendingSolves++;
        this._solve().then((result) => {
          this.pendingSolves--;
          if (result) this.tokens.push(result);
        });
      }

      await new Promise((r) => setTimeout(r, 2000));
    }
  }

  async _solve() {
    try {
      const response = await fetch(SUBMIT_URL, {
        method: "POST",
        body: new URLSearchParams({
          key: API_KEY,
          method: "userrecaptcha",
          googlekey: this.sitekey,
          pageurl: this.pageurl,
          json: 1,
        }),
      });
      const result = await response.json();
      if (result.status !== 1) return null;

      const taskId = result.request;

      for (let i = 0; i < 60; i++) {
        await new Promise((r) => setTimeout(r, 5000));
        const url = new URL(RESULT_URL);
        url.searchParams.set("key", API_KEY);
        url.searchParams.set("action", "get");
        url.searchParams.set("id", taskId);
        url.searchParams.set("json", "1");

        const poll = await (await fetch(url)).json();
        if (poll.request === "CAPCHA_NOT_READY") continue;
        if (poll.status === 1) return { token: poll.request, createdAt: Date.now() };
        return null;
      }
    } catch {
      return null;
    }
    return null;
  }

  async getToken(timeoutMs = 120000) {
    const start = Date.now();
    while (Date.now() - start < timeoutMs) {
      // Find a fresh token
      const index = this.tokens.findIndex(
        (t) => Date.now() - t.createdAt < TOKEN_MAX_AGE
      );
      if (index >= 0) {
        return this.tokens.splice(index, 1)[0].token;
      }
      await new Promise((r) => setTimeout(r, 1000));
    }
    return null;
  }
}

async function submitForm(url, formData, token) {
  const body = new URLSearchParams({ ...formData, "g-recaptcha-response": token });
  const response = await fetch(url, { method: "POST", body });
  return { status: response.status, ok: response.ok };
}

async function runPipeline(sitekey, pageurl, forms, bufferSize = 3) {
  const pool = new TokenPool(sitekey, pageurl, bufferSize);
  pool.start();

  const results = [];

  for (let i = 0; i < forms.length; i++) {
    const token = await pool.getToken();
    if (!token) {
      results.push({ index: i, status: "error", error: "No token" });
      console.log(`  [${i + 1}/${forms.length}] Failed — no token`);
      continue;
    }

    try {
      const { submitUrl, ...formData } = forms[i];
      const result = await submitForm(submitUrl, formData, token);
      results.push({ index: i, status: "submitted", httpStatus: result.status });
      console.log(`  [${i + 1}/${forms.length}] Submitted — HTTP ${result.status}`);
    } catch (err) {
      results.push({ index: i, status: "error", error: err.message });
    }
  }

  pool.stop();
  return results;
}

// Usage
runPipeline(
  "6LeIxAcTAAAAAJcZVRqyHh71UMIEGNQ_MXjiZKhI",
  "https://example.com/apply",
  [
    { submitUrl: "https://example.com/apply", name: "User 1", email: "user1@example.com" },
    { submitUrl: "https://example.com/apply", name: "User 2", email: "user2@example.com" },
  ],
  3
)
  .then((results) => console.log(results))
  .catch((err) => console.error(err));

Buffer Size Tuning

| Form Submission Speed | Recommended Buffer | Rationale |
| --- | --- | --- |
| < 5 seconds per form | 2–3 | Tokens consumed fast; small buffer stays fresh |
| 5–30 seconds per form | 3–5 | Moderate consumption; balance freshness vs. availability |
| > 30 seconds per form | 1–2 | Slow consumption; tokens may expire in larger buffers |

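Formula: buffer_size = min(5, ceil(TOKEN_MAX_AGE / avg_form_time) - 1), where avg_form_time is the average number of seconds per form submission. Treat the result as an upper bound: for very fast submissions it returns the cap of 5, while the table above recommends a smaller buffer.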
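As a quick check of the formula, here is a minimal Python sketch. The function name `recommended_buffer_size` and the constant `MAX_BUFFER` are illustrative assumptions. `TOKEN_MAX_AGE` matches the value used in the pipeline code.

```python
import math

TOKEN_MAX_AGE = 110  # seconds, same value as in the pipeline code
MAX_BUFFER = 5       # cap taken from the formula


def recommended_buffer_size(avg_form_time):
    """Apply buffer_size = min(5, ceil(TOKEN_MAX_AGE / avg_form_time) - 1)."""
    if avg_form_time <= 0:
        raise ValueError("avg_form_time must be positive")
    return min(MAX_BUFFER, math.ceil(TOKEN_MAX_AGE / avg_form_time) - 1)


# Print the formula's result for a few average submission times (seconds).
for seconds in (5, 30, 60, 120):
    print(seconds, recommended_buffer_size(seconds))
```

A result of 0, which appears once a single submission takes 110 seconds or longer, suggests that pre-solving is not worthwhile and solving on demand is the better fit.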

Token Freshness Strategy

| Strategy | When to Use |
| --- | --- |
| Pre-solve buffer | Fast form submission (< 30s per form) |
| Solve-on-demand | Slow submission (> 60s per form); solve when needed |
| Hybrid | Start pre-solving when queue drops below threshold |
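The strategy choice can be sketched as a small decision helper. This is only an illustration under stated assumptions: the function names are hypothetical, and treating the 30–60 second range (which the table leaves open) as the hybrid case is my reading, not something the table specifies.

```python
def choose_strategy(avg_form_time):
    """Pick a token strategy from the average seconds per form submission."""
    if avg_form_time <= 0:
        raise ValueError("avg_form_time must be positive")
    if avg_form_time < 30:
        return "pre-solve buffer"
    if avg_form_time > 60:
        return "solve-on-demand"
    # Assumption: the 30-60 second range is handled with the hybrid strategy.
    return "hybrid"


def should_start_presolving(queue_size, threshold):
    """Hybrid rule: start solving once the queue drops below the threshold."""
    return queue_size < threshold
```

With a threshold of 2, for example, `should_start_presolving(1, 2)` is true and a new solve would be started, while a queue already holding 2 tokens would be left alone.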

Troubleshooting

| Issue | Cause | Fix |
| --- | --- | --- |
| Tokens expiring before use | Buffer too large for submission speed | Reduce buffer_size; increase submission throughput |
| Forms rejected despite valid token | Token used in different session/IP | Check that pageurl matches the page hosting the form, and submit from the same session and IP throughout |
| "Invalid reCAPTCHA" on form submit | Token already expired or used twice | Check token age < 110s; ensure each token is used exactly once |
| Pipeline stalls waiting for tokens | All solves failing or slow | Check API key balance; verify sitekey and pageurl |
| Memory usage growing | Expired tokens accumulating | The pool removes expired tokens on each refill pass (JavaScript) or on retrieval (Python); verify that cleanup runs |

FAQ

Can I reuse a reCAPTCHA token for multiple form submissions?

No. Each reCAPTCHA token is single-use. Once submitted with a form, it's consumed. You need one token per form submission.

How do I handle forms on different pages with different sitekeys?

Create separate TokenPool instances for each sitekey/pageurl combination. Each pool manages its own token buffer independently.
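One way to organize this is a small registry keyed by the sitekey/pageurl pair. The sketch below is hypothetical: `PoolRegistry` and `pool_factory` are illustrative names, and the factory stands in for whatever builds your pool (for instance, a function that creates a queue and starts a `TokenSolver` for it).

```python
class PoolRegistry:
    """Lazily creates and caches one pool per (sitekey, pageurl) pair."""

    def __init__(self, pool_factory):
        # pool_factory(sitekey, pageurl) is assumed to return a started pool.
        self._pool_factory = pool_factory
        self._pools = {}

    def get(self, sitekey, pageurl):
        """Return the pool for this pair, creating it on first use."""
        key = (sitekey, pageurl)
        if key not in self._pools:
            self._pools[key] = self._pool_factory(sitekey, pageurl)
        return self._pools[key]

    def __len__(self):
        return len(self._pools)
```

This version is not thread-safe. If several threads may request pools at the same time, guard `get` with a `threading.Lock`.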

What if the form requires additional CAPTCHA validation after submission?

Some forms re-challenge on server-side validation failure. In that case, implement a retry with a new token: detect the validation error in the response, get another token, and resubmit.
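The retry described above could look roughly like this. It is a sketch with hypothetical names: `submit`, `get_token`, and `is_captcha_rejection` are callables you supply, because how a site signals a CAPTCHA validation failure differs from form to form.

```python
def submit_with_retry(submit, get_token, is_captcha_rejection, max_attempts=3):
    """Submit with a fresh token per attempt until accepted or attempts run out.

    submit(token)                  -> response object of your choice
    get_token()                    -> a fresh token, or None if none is available
    is_captcha_rejection(response) -> True if the site rejected the CAPTCHA

    Returns (last_response, attempts_made). last_response is None when no
    submission could be made at all.
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        token = get_token()  # never reuse a token: each one is single-use
        if token is None:
            return response, attempt - 1
        response = submit(token)
        if not is_captcha_rejection(response):
            return response, attempt
    return response, max_attempts
```

In the Python pipeline, `get_token` could wrap `get_fresh_token(token_queue)` and `submit` could wrap `submit_form`. Keeping `max_attempts` small avoids burning tokens on a form that fails for reasons unrelated to the CAPTCHA.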

Next Steps

Start building your form submission pipeline — get your CaptchaAI API key and implement the token pool.
