Use Cases

CAPTCHA Handling for Domain WHOIS Lookup Automation

WHOIS lookup portals protect domain registration data with reCAPTCHA v2, image CAPTCHAs, and rate limiting. Whether you're checking domain availability, verifying ownership, or monitoring expiration dates, CAPTCHAs appear after just a few queries. Here's how to handle them.

CAPTCHA Patterns on WHOIS Portals

Portal type CAPTCHA Trigger threshold
ICANN WHOIS reCAPTCHA v2 3–5 queries per session
Registrar lookup pages reCAPTCHA v2/v3 5–10 queries per minute
Regional NIR (APNIC, RIPE) Image CAPTCHA 10–20 queries
Domain auction WHOIS Cloudflare Turnstile Rapid domain checks
Bulk WHOIS tools Custom CAPTCHA After free tier limit

WHOIS Lookup with CAPTCHA Solving

import requests
import time
import re

class WhoisLookup:
    def __init__(self, api_key):
        self.api_key = api_key
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        })

    def lookup(self, domain, whois_url):
        """Look up WHOIS data for a domain, solving CAPTCHAs as needed."""
        response = self.session.get(whois_url, params={"domain": domain})

        if self._has_recaptcha(response.text):
            site_key = self._extract_site_key(response.text)
            token = self._solve_recaptcha(site_key, whois_url)
            response = self.session.post(whois_url, data={
                "domain": domain,
                "g-recaptcha-response": token
            })

        return self._parse_whois(response.text)

    def bulk_lookup(self, domains, whois_url, delay=3):
        """Look up WHOIS for multiple domains."""
        results = {}
        for domain in domains:
            try:
                results[domain] = self.lookup(domain, whois_url)
            except Exception as e:
                results[domain] = {"error": str(e)}
            time.sleep(delay)
        return results

    def check_availability(self, domains, whois_url):
        """Check which domains are available for registration."""
        results = self.bulk_lookup(domains, whois_url)
        available = []
        taken = []

        for domain, data in results.items():
            if data.get("error") or data.get("status") == "available":
                available.append(domain)
            else:
                taken.append(domain)

        return {"available": available, "taken": taken}

    def _has_recaptcha(self, html):
        return "g-recaptcha" in html or "recaptcha" in html.lower()

    def _extract_site_key(self, html):
        match = re.search(r'data-sitekey="([^"]+)"', html)
        if match:
            return match.group(1)
        raise ValueError("reCAPTCHA site key not found")

    def _solve_recaptcha(self, site_key, page_url):
        resp = requests.post("https://ocr.captchaai.com/in.php", data={
            "key": self.api_key,
            "method": "userrecaptcha",
            "googlekey": site_key,
            "pageurl": page_url,
            "json": 1
        })
        task_id = resp.json()["request"]

        for _ in range(60):
            time.sleep(3)
            result = requests.get("https://ocr.captchaai.com/res.php", params={
                "key": self.api_key,
                "action": "get",
                "id": task_id,
                "json": 1
            })
            data = result.json()
            if data["status"] == 1:
                return data["request"]

        raise TimeoutError("reCAPTCHA solve timed out")

    def _parse_whois(self, html):
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(html, "html.parser")

        # Look for WHOIS data in pre-formatted blocks or tables
        raw_whois = soup.select_one("pre, .whois-data, #whois-result")
        if raw_whois:
            text = raw_whois.get_text()
            return self._extract_fields(text)

        return {"raw": soup.get_text()[:2000]}

    def _extract_fields(self, text):
        fields = {}
        patterns = {
            "registrar": r"Registrar:\s*(.+)",
            "created": r"Creat(?:ed|ion) Date:\s*(.+)",
            "expires": r"(?:Expir(?:y|ation)|Registry Expiry) Date:\s*(.+)",
            "updated": r"Updated Date:\s*(.+)",
            "status": r"(?:Domain )?Status:\s*(.+)",
            "nameservers": r"Name Server:\s*(.+)",
            "registrant": r"Registrant (?:Name|Organization):\s*(.+)"
        }

        for field, pattern in patterns.items():
            matches = re.findall(pattern, text, re.IGNORECASE)
            if matches:
                fields[field] = matches if len(matches) > 1 else matches[0].strip()

        return fields


# Usage
whois = WhoisLookup("YOUR_API_KEY")

# Single lookup
result = whois.lookup("example.com", "https://whois.example.com/lookup")
print(f"Registrar: {result.get('registrar')}")
print(f"Expires: {result.get('expires')}")

# Bulk availability check
domains = ["startup-name.com", "my-project.io", "cool-app.dev"]
availability = whois.check_availability(domains, "https://whois.example.com/lookup")
print(f"Available: {availability['available']}")

Domain Monitoring (JavaScript)

class DomainMonitor {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.watchList = new Map();
  }

  addDomain(domain, whoisUrl) {
    this.watchList.set(domain, { url: whoisUrl, history: [] });
  }

  async checkExpirations() {
    const expiring = [];

    for (const [domain, config] of this.watchList) {
      try {
        const data = await this.lookup(domain, config.url);
        config.history.push({ ...data, checkedAt: new Date().toISOString() });

        if (data.expires) {
          const daysLeft = Math.ceil(
            (new Date(data.expires) - new Date()) / (1000 * 60 * 60 * 24)
          );
          if (daysLeft <= 30) {
            expiring.push({ domain, daysLeft, expires: data.expires });
          }
        }
      } catch (error) {
        console.error(`Failed to check ${domain}: ${error.message}`);
      }
    }

    return expiring;
  }

  async lookup(domain, whoisUrl) {
    const response = await fetch(`${whoisUrl}?domain=${domain}`);
    const html = await response.text();

    if (html.includes('g-recaptcha')) {
      return this.solveAndLookup(domain, whoisUrl, html);
    }

    return this.parseWhois(html);
  }

  async solveAndLookup(domain, whoisUrl, html) {
    const match = html.match(/data-sitekey="([^"]+)"/);
    if (!match) throw new Error('No reCAPTCHA site key found');

    const submitResp = await fetch('https://ocr.captchaai.com/in.php', {
      method: 'POST',
      body: new URLSearchParams({
        key: this.apiKey,
        method: 'userrecaptcha',
        googlekey: match[1],
        pageurl: whoisUrl,
        json: '1'
      })
    });
    const { request: taskId } = await submitResp.json();

    for (let i = 0; i < 60; i++) {
      await new Promise(r => setTimeout(r, 3000));
      const result = await fetch(
        `https://ocr.captchaai.com/res.php?key=${this.apiKey}&action=get&id=${taskId}&json=1`
      );
      const data = await result.json();
      if (data.status === 1) {
        const response = await fetch(whoisUrl, {
          method: 'POST',
          body: new URLSearchParams({
            domain,
            'g-recaptcha-response': data.request
          })
        });
        return this.parseWhois(await response.text());
      }
    }
    throw new Error('reCAPTCHA solve timed out');
  }

  parseWhois(html) {
    const extract = (pattern) => {
      const match = html.match(pattern);
      return match ? match[1].trim() : null;
    };

    return {
      registrar: extract(/Registrar:\s*([^\n<]+)/i),
      created: extract(/Creat(?:ed|ion) Date:\s*([^\n<]+)/i),
      expires: extract(/(?:Expir(?:y|ation)|Registry Expiry) Date:\s*([^\n<]+)/i),
      status: extract(/(?:Domain )?Status:\s*([^\n<]+)/i)
    };
  }
}

// Usage
const monitor = new DomainMonitor('YOUR_API_KEY');
monitor.addDomain('example.com', 'https://whois.example.com/lookup');
monitor.addDomain('mysite.io', 'https://whois.example.com/lookup');

const expiring = await monitor.checkExpirations();
expiring.forEach(d => console.log(`${d.domain} expires in ${d.daysLeft} days`));

WHOIS Query Optimization

Strategy Benefit
Cache results locally Avoid repeated lookups for same domain
Use 3–5 second delays Reduce CAPTCHA trigger rate
Rotate between WHOIS portals Distribute load across providers
Session persistence Maintain CAPTCHA clearance state

Troubleshooting

Issue Cause Fix
CAPTCHA after 3 queries Portal rate limit Increase delay, use proxies
WHOIS returns "No match" Privacy/RDAP redaction Try alternative WHOIS portal
reCAPTCHA token rejected Token expired before submit Solve and submit within 2 minutes
Blocked IP Exceeded daily query limit Rotate residential proxies

FAQ

How many WHOIS lookups can I automate per day?

Most web-based WHOIS portals allow 50–200 queries per IP per day before aggressive rate limiting. With proxy rotation and CaptchaAI handling CAPTCHAs, you can scale to thousands of queries.

Should I use WHOIS protocol (port 43) instead of web portals?

Port 43 WHOIS doesn't have CAPTCHAs but has strict rate limits and limited data due to GDPR redaction. Web portals often show more data behind CAPTCHAs.

Can I monitor domain expiration dates automatically?

Yes. Schedule daily or weekly WHOIS lookups for your watched domains. CaptchaAI handles any CAPTCHAs that appear during the checks.

Next Steps

Automate domain lookups — get your CaptchaAI API key and handle WHOIS portal CAPTCHAs.

Full Working Code

Complete runnable examples for this article in Python, Node.js, PHP, Go, Java, C#, Ruby, Rust, Kotlin & Bash.

View on GitHub →

Discussions (0)

No comments yet.