Notion API + CaptchaAI: Automated Data Entry with CAPTCHA Handling

Notion's API lets you read and write to databases programmatically, making it a flexible task management layer for automation workflows. When those workflows encounter CAPTCHA-protected forms, CaptchaAI handles the solving. This guide shows how to use a Notion database as a CAPTCHA task queue — storing URLs and sitekeys, triggering solves, and writing results back.

Real-World Scenario

Your team maintains a Notion database of website URLs that need periodic data extraction. Some of these URLs are CAPTCHA-protected. A script:

  1. Reads pending tasks from Notion
  2. Solves CAPTCHAs via CaptchaAI
  3. Updates each Notion record with the token and status

Prerequisites

  • Notion integration (Internal integration via developers.notion.com)
  • A Notion database shared with the integration
  • CaptchaAI API key
  • Python 3.8+ or Node.js 18+

Notion Database Setup

Create a Notion database with these properties:

| Property  | Type      | Purpose                          |
|-----------|-----------|----------------------------------|
| Name      | Title     | Task identifier                  |
| URL       | URL       | Target page with CAPTCHA         |
| Sitekey   | Rich text | reCAPTCHA sitekey                |
| Status    | Select    | Pending, Solving, Solved, Failed |
| Token     | Rich text | Solved CAPTCHA token             |
| Solved At | Date      | Solve timestamp                  |
| Error     | Rich text | Error message if failed          |

Share the database with your Notion integration.
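
If you would rather create the database from code than by hand, the properties above map onto the request body of Notion's create-database endpoint (`POST /v1/databases`). The following is a minimal sketch, assuming a parent page that is already shared with the integration. The function names, the default title, and the `parent_page_id` argument are illustrative and not part of the original script.

```python
def build_database_payload(parent_page_id, title="CAPTCHA Tasks"):
    """Build a create-database request body matching the property table above."""
    statuses = ["Pending", "Solving", "Solved", "Failed"]
    return {
        "parent": {"type": "page_id", "page_id": parent_page_id},
        "title": [{"type": "text", "text": {"content": title}}],
        "properties": {
            "Name": {"title": {}},
            "URL": {"url": {}},
            "Sitekey": {"rich_text": {}},
            "Status": {"select": {"options": [{"name": s} for s in statuses]}},
            "Token": {"rich_text": {}},
            "Solved At": {"date": {}},
            "Error": {"rich_text": {}},
        },
    }


def create_database(notion_token, parent_page_id):
    """Send the schema to Notion and return the new database ID."""
    # Third-party dependency, imported here so the builder above needs nothing extra.
    import requests

    resp = requests.post(
        "https://api.notion.com/v1/databases",
        headers={
            "Authorization": f"Bearer {notion_token}",
            "Content-Type": "application/json",
            "Notion-Version": "2022-06-28",
        },
        json=build_database_payload(parent_page_id),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]
```

The ID returned by `create_database` is the value to export as `NOTION_DB_ID`. A database created this way inherits access from its parent page, so it should already be visible to the integration.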

Python Implementation

# notion_captcha_worker.py
import os
import time
import requests

NOTION_TOKEN = os.environ.get("NOTION_TOKEN")
NOTION_DB_ID = os.environ.get("NOTION_DB_ID")
CAPTCHAAI_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")

NOTION_HEADERS = {
    "Authorization": f"Bearer {NOTION_TOKEN}",
    "Content-Type": "application/json",
    "Notion-Version": "2022-06-28",
}

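def get_pending_tasks():
    """Fetch all tasks with Status = Pending from Notion, following pagination."""
    url = f"https://api.notion.com/v1/databases/{NOTION_DB_ID}/query"
    payload = {
        "filter": {
            "property": "Status",
            "select": {"equals": "Pending"},
        }
    }
    tasks = []
    while True:
        resp = requests.post(url, headers=NOTION_HEADERS, json=payload, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        tasks.extend(data["results"])
        # Notion returns at most 100 results per call; follow the cursor for the rest.
        if not data.get("has_more"):
            return tasks
        payload["start_cursor"] = data["next_cursor"]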

def update_task(page_id, properties):
    """Update a Notion page with new property values."""
    url = f"https://api.notion.com/v1/pages/{page_id}"
    payload = {"properties": properties}
    resp = requests.patch(url, headers=NOTION_HEADERS, json=payload)
    resp.raise_for_status()

def set_status(page_id, status, token=None, error=None):
    """Update task status in Notion."""
    props = {"Status": {"select": {"name": status}}}

    if token:
        props["Token"] = {"rich_text": [{"text": {"content": token[:2000]}}]}
        # Store the timestamp in UTC with an explicit "Z" so it is unambiguous.
        props["Solved At"] = {"date": {"start": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())}}

    if error:
        props["Error"] = {"rich_text": [{"text": {"content": error[:200]}}]}

    update_task(page_id, props)

def solve_captcha(sitekey, pageurl):
    """Submit to CaptchaAI and poll for result."""
    # Submit
    resp = requests.get("https://ocr.captchaai.com/in.php", params={
        "key": CAPTCHAAI_KEY,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": pageurl,
        "json": "1",
    })
    result = resp.json()

    if result.get("status") != 1:
        raise Exception(f"Submit failed: {result.get('request')}")

    task_id = result["request"]

    # Poll
    time.sleep(15)
    for _ in range(25):
        poll = requests.get("https://ocr.captchaai.com/res.php", params={
            "key": CAPTCHAAI_KEY,
            "action": "get",
            "id": task_id,
            "json": "1",
        })
        poll_result = poll.json()

        if poll_result.get("status") == 1:
            return poll_result["request"]
        if poll_result.get("request") != "CAPCHA_NOT_READY":
            raise Exception(f"Solve failed: {poll_result.get('request')}")

        time.sleep(5)

    raise Exception("Polling timeout")

def extract_property(page, prop_name, prop_type="rich_text"):
    """Extract a property value from a Notion page."""
    prop = page["properties"].get(prop_name, {})
    if prop_type == "rich_text":
        # A rich_text value can arrive as several segments; join them all.
        texts = prop.get("rich_text", [])
        return "".join(t["plain_text"] for t in texts)
    elif prop_type == "url":
        # Notion returns null for an empty URL property.
        return prop.get("url") or ""
    return ""

def main():
    tasks = get_pending_tasks()
    print(f"Found {len(tasks)} pending tasks")

    for task in tasks:
        page_id = task["id"]
        sitekey = extract_property(task, "Sitekey")
        pageurl = extract_property(task, "URL", "url")

        if not sitekey or not pageurl:
            set_status(page_id, "Failed", error="Missing sitekey or URL")
            continue

        print(f"Solving: {pageurl}")
        set_status(page_id, "Solving")

        try:
            token = solve_captcha(sitekey, pageurl)
            set_status(page_id, "Solved", token=token)
            print("  Solved successfully")
        except Exception as e:
            set_status(page_id, "Failed", error=str(e))
            print(f"  Failed: {e}")

        time.sleep(1)  # Rate limit for Notion API

    print("All tasks processed")

if __name__ == "__main__":
    main()
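
The fixed one-second pause in `main()` keeps the loop slow enough in the common case, but Notion can still answer with HTTP 429 when other clients share the same integration. One way to cope is to retry the request after the interval given in the `Retry-After` response header. The sketch below is a hedged example of that idea: `call_with_retry`, `max_attempts`, and the injectable `sleep` argument are illustrative names, and the exponential fallback for a missing header is an assumption rather than anything Notion prescribes.

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send is a zero-argument callable returning a requests-style response,
    meaning an object with .status_code and .headers.
    """
    attempt = 1
    while True:
        resp = send()
        if resp.status_code != 429 or attempt >= max_attempts:
            return resp
        try:
            # Retry-After is expressed in seconds.
            delay = float(resp.headers.get("Retry-After", ""))
        except ValueError:
            # Header missing or unparsable: fall back to exponential backoff.
            delay = float(2 ** (attempt - 1))
        sleep(delay)
        attempt += 1
```

Inside `update_task`, the call could then be written as `resp = call_with_retry(lambda: requests.patch(url, headers=NOTION_HEADERS, json=payload))`, followed by the existing `resp.raise_for_status()`.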

JavaScript Implementation

// notion_captcha_worker.js
const { Client } = require('@notionhq/client');
const axios = require('axios');

const notion = new Client({ auth: process.env.NOTION_TOKEN });
const DB_ID = process.env.NOTION_DB_ID;
const API_KEY = process.env.CAPTCHAAI_KEY || 'YOUR_API_KEY';

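async function getPendingTasks() {
  const tasks = [];
  let cursor;
  do {
    // Notion returns at most 100 results per call; follow the cursor for the rest.
    const response = await notion.databases.query({
      database_id: DB_ID,
      filter: { property: 'Status', select: { equals: 'Pending' } },
      start_cursor: cursor,
    });
    tasks.push(...response.results);
    cursor = response.has_more ? response.next_cursor : undefined;
  } while (cursor);
  return tasks;
}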

async function updateTask(pageId, status, token, error) {
  const properties = {
    Status: { select: { name: status } },
  };
  if (token) {
    properties.Token = { rich_text: [{ text: { content: token.slice(0, 2000) } }] };
    properties['Solved At'] = { date: { start: new Date().toISOString() } };
  }
  if (error) {
    properties.Error = { rich_text: [{ text: { content: error.slice(0, 200) } }] };
  }
  await notion.pages.update({ page_id: pageId, properties });
}

async function solveCaptcha(sitekey, pageurl) {
  const submit = await axios.get('https://ocr.captchaai.com/in.php', {
    params: {
      key: API_KEY, method: 'userrecaptcha',
      googlekey: sitekey, pageurl, json: '1',
    },
  });
  if (submit.data.status !== 1) throw new Error(submit.data.request);

  await new Promise(r => setTimeout(r, 15000));

  for (let i = 0; i < 25; i++) {
    const poll = await axios.get('https://ocr.captchaai.com/res.php', {
      params: { key: API_KEY, action: 'get', id: submit.data.request, json: '1' },
    });
    if (poll.data.status === 1) return poll.data.request;
    if (poll.data.request !== 'CAPCHA_NOT_READY') throw new Error(poll.data.request);
    await new Promise(r => setTimeout(r, 5000));
  }
  throw new Error('Timeout');
}

async function main() {
  const tasks = await getPendingTasks();
  console.log(`Found ${tasks.length} pending tasks`);

  for (const task of tasks) {
    // Join every rich_text segment rather than reading only the first one.
    const sitekey = task.properties.Sitekey?.rich_text?.map(t => t.plain_text).join('');
    const pageurl = task.properties.URL?.url;

    if (!sitekey || !pageurl) {
      await updateTask(task.id, 'Failed', null, 'Missing sitekey or URL');
      continue;
    }

    console.log(`Solving: ${pageurl}`);
    await updateTask(task.id, 'Solving');

    try {
      const token = await solveCaptcha(sitekey, pageurl);
      await updateTask(task.id, 'Solved', token);
      console.log('  Solved');
    } catch (e) {
      await updateTask(task.id, 'Failed', null, e.message);
      console.log(`  Failed: ${e.message}`);
    }

    await new Promise(r => setTimeout(r, 1000));
  }
}

main().catch(console.error);

Troubleshooting

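| Problem | Cause | Fix |
|---------|-------|-----|
| 401 Unauthorized from Notion | Integration token is missing or invalid | Check that NOTION_TOKEN holds the integration's current secret |
| 404 object_not_found from Notion | Integration not connected to the database | Share the database with your integration in Notion |
| Property names don't match | Case sensitivity | Notion property names are case-sensitive; match them exactly |
| Token truncated | Each rich_text item is limited to 2,000 characters | Shorter tokens are stored whole; a longer token has to be split across several rich_text items, because a sliced token is no longer valid |
| Notion rate limit (429) | Too many API calls | Add 1-second delays between Notion updates and wait for the Retry-After interval before retrying |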
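
For the truncation case, a token can be stored intact by spreading it over several rich_text items and joining the segments again when reading. The helpers below are a small sketch of that approach; `to_rich_text`, `from_rich_text`, and `NOTION_TEXT_LIMIT` are hypothetical names introduced here, not part of the worker script.

```python
NOTION_TEXT_LIMIT = 2000  # maximum characters in one rich_text item


def to_rich_text(value, limit=NOTION_TEXT_LIMIT):
    """Split a string into rich_text items of at most `limit` characters each."""
    return [
        {"text": {"content": value[i:i + limit]}}
        for i in range(0, len(value), limit)
    ]


def from_rich_text(prop):
    """Join every segment of a rich_text property as returned by the API."""
    return "".join(item["plain_text"] for item in prop.get("rich_text", []))
```

In `set_status`, this would mean writing `props["Token"] = {"rich_text": to_rich_text(token)}`. Whatever later consumes the token should read it with `from_rich_text(page["properties"]["Token"])` so that all segments are included.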

FAQ

Can I run this as a scheduled job?

Yes. Use cron (Linux), Task Scheduler (Windows), or a cloud scheduler (AWS EventBridge, Google Cloud Scheduler) to run the script on a schedule.

How do I set up the Notion integration?

Go to notion.so/my-integrations, create a new internal integration, copy the secret token, and share your database with the integration.

Can I process other CAPTCHA types?

Yes. Add a "CAPTCHA Type" property to the Notion database and modify the solve function to use the appropriate CaptchaAI method (turnstile, geetest, base64, etc.).
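
As a rough illustration of that change, the submit parameters can be chosen from the value of the "CAPTCHA Type" select property. This is a sketch under assumptions: the reCAPTCHA branch repeats the parameters the worker script already sends, while the Turnstile parameter name (`sitekey`) is assumed from the usual convention and should be confirmed against the CaptchaAI documentation. The function names and the option labels `recaptcha` and `turnstile` are made up for this example.

```python
def build_submit_params(api_key, captcha_type, sitekey, pageurl):
    """Return in.php query parameters for the given "CAPTCHA Type" value."""
    base = {"key": api_key, "pageurl": pageurl, "json": "1"}
    if captcha_type == "recaptcha":
        # Same parameters the worker script already sends.
        return {**base, "method": "userrecaptcha", "googlekey": sitekey}
    if captcha_type == "turnstile":
        # Assumption: Turnstile takes the key as "sitekey"; verify in the docs.
        return {**base, "method": "turnstile", "sitekey": sitekey}
    raise ValueError(f"Unsupported CAPTCHA type: {captcha_type!r}")


def extract_select(page, prop_name, default=""):
    """Read the option name of a select property from a Notion page object."""
    select = page["properties"].get(prop_name, {}).get("select")
    return select["name"] if select else default
```

`solve_captcha` could then take the type as an extra argument and pass `build_submit_params(...)` as `params`. Defaulting to `recaptcha` when the property is empty keeps existing rows working without changes.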

Next Steps

Turn your Notion databases into automated CAPTCHA-solving queues — get your CaptchaAI API key.

Full Working Code

Complete runnable examples for this article in Python, Node.js, PHP, Go, Java, C#, Ruby, Rust, Kotlin & Bash.
