ERROR_PAGEURL: URL Mismatch Troubleshooting Guide

ERROR_PAGEURL means the pageurl parameter you submitted doesn't match the URL of the page where the CAPTCHA is loaded. Solvers validate tokens against the origin domain, so the URL must match exactly.


Common Causes

Cause                        | Example
Missing protocol             | example.com instead of https://example.com
Wrong domain                 | www.example.com vs example.com
Redirect changed URL         | Form at /login redirected to /auth/login
SPA route mismatch           | JS route /app/login not matching server URL
URL encoding issues          | Spaces or special chars not encoded
Iframe from different domain | CAPTCHA loaded from subdomain
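
Several of the causes above can be spotted by comparing the pageurl you submit with the URL you actually see in the address bar. The following is a minimal sketch of such a check; the function name diagnose_pageurl and its message strings are hypothetical and not part of any CaptchaAI API.

```python
from urllib.parse import urlparse


def _strip_www(host):
    """Drop a leading 'www.' so www and non-www hosts can be compared."""
    return host[4:] if host.startswith("www.") else host


def diagnose_pageurl(submitted, observed):
    """Return a list of differences between the submitted and observed URLs."""
    sub = urlparse(submitted)
    obs = urlparse(observed)

    if not sub.scheme:
        # Without a scheme, urlparse puts the host into .path, so stop here.
        return ["missing protocol"]

    problems = []
    if sub.scheme != obs.scheme:
        problems.append(f"protocol differs: {sub.scheme} vs {obs.scheme}")

    sub_host = (sub.hostname or "").lower()
    obs_host = (obs.hostname or "").lower()
    if sub_host != obs_host:
        if _strip_www(sub_host) == _strip_www(obs_host):
            problems.append("www vs non-www mismatch")
        else:
            problems.append(f"domain differs: {sub_host} vs {obs_host}")

    if sub.path != obs.path:
        problems.append(f"path differs: {sub.path} vs {obs.path}")

    return problems
```

An empty list means the scheme, host, and path agree. Query strings are not compared here, since whether they matter depends on the site.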

How to Get the Correct URL

Rule: Use the URL in the browser address bar where the CAPTCHA is visible.

# WRONG — incomplete URL
pageurl = "example.com/login"

# WRONG - protocol does not match (the page is served over https)
pageurl = "http://example.com/login"

# CORRECT — full URL with protocol
pageurl = "https://example.com/login"

# CORRECT — with www if that's what the page uses
pageurl = "https://www.example.com/login"

URL Validation Helper

from urllib.parse import urlparse


def validate_pageurl(url):
    """Validate pageurl before API submission."""
    parsed = urlparse(url)

    if not parsed.scheme:
        raise ValueError(f"Missing protocol: {url}. Use https://")

    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"Invalid protocol: {parsed.scheme}")

    if not parsed.netloc:
        raise ValueError(f"Missing domain: {url}")

    # Remove fragment (hash) — not sent to server
    clean = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    if parsed.query:
        clean += f"?{parsed.query}"

    return clean


# Usage
url = validate_pageurl("https://example.com/login#section")
# Returns: "https://example.com/login"
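
The helper above does not address the "URL encoding issues" cause (spaces or special characters left unencoded). One possible way to handle it is to percent-encode the path and query with the standard library, as sketched below. The name encode_pageurl is hypothetical, and the set of characters treated as safe is an assumption that may need adjusting for a particular site.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_pageurl(url):
    """Percent-encode unsafe characters in the path and query of a URL."""
    parts = urlsplit(url)
    # "%" is kept safe so sequences that are already encoded are not
    # encoded a second time (a stray literal "%" is therefore left alone).
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Usage
encoded = encode_pageurl("https://example.com/my page?q=a b")
# Returns: "https://example.com/my%20page?q=a%20b"
```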

Handling Redirects

import requests


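def get_final_url(url):
    """Follow HTTP redirects and return the URL of the final response."""
    # Only HTTP (3xx) redirects are followed; redirects performed by
    # JavaScript or <meta refresh> need a real browser.
    resp = requests.get(url, allow_redirects=True, timeout=15)
    return resp.url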


# If the login page redirects
original = "https://example.com/login"
final = get_final_url(original)
print(f"Final URL: {final}")
# Use final URL as pageurl
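
When debugging a mismatch, it can help to see every hop rather than only the final URL. A requests response keeps the intermediate responses in its history attribute, oldest first. The sketch below reads that attribute; the function name redirect_chain is hypothetical.

```python
def redirect_chain(resp):
    """Return every URL visited for a response, ending with the final one."""
    # resp.history holds the intermediate redirect responses, oldest first.
    chain = [hop.url for hop in resp.history]
    chain.append(resp.url)
    return chain


# Usage (requires network access):
#   import requests
#   resp = requests.get("https://example.com/login", timeout=15)
#   for step in redirect_chain(resp):
#       print(step)
```

If the chain has more than one entry, the page was redirected and the last entry is the candidate pageurl.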

Handling SPAs (Single Page Applications)

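SPAs change the URL via JavaScript without a full page load, so the route in the address bar may not correspond to any URL the server returns. Use the address-bar URL of the page showing the CAPTCHA, not the API endpoint the form submits to: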

# For SPAs, use the domain root + the route shown in the address bar
# NOT the API endpoint that the form submits to

# WRONG — API endpoint
pageurl = "https://api.example.com/v1/auth/login"

# CORRECT — the page URL shown in browser
pageurl = "https://example.com/login"

Iframe-Loaded CAPTCHAs

When a CAPTCHA loads inside an iframe served from a different domain, keep using the URL of the main page:

# If the CAPTCHA is on the MAIN page
pageurl = "https://example.com/register"  # Main page URL

# If the CAPTCHA is in an IFRAME with a different domain
# Still use the main page URL, not the iframe src
pageurl = "https://example.com/register"
# NOT: "https://captcha-frame.example.com/challenge"

Correct Submission

import requests

# Validate the URL first (validate_pageurl is the helper defined above)
pageurl = validate_pageurl("https://example.com/login")

resp = requests.post("https://ocr.captchaai.com/in.php", data={
    "key": "YOUR_API_KEY",
    "method": "userrecaptcha",
    "googlekey": "SITE_KEY",
    "pageurl": pageurl,
    "json": 1,
}, timeout=15)
result = resp.json()

if result.get("status") == 1:
    print(f"Task ID: {result['request']}")
else:
    print(f"Error: {result.get('request')}")

Troubleshooting

Issue                              | Cause                      | Fix
Error despite correct-looking URL  | www vs non-www mismatch    | Check address bar exactly
URL works sometimes, fails others  | Page has A/B test URLs     | Capture URL at solve time
Token solved but rejected by site  | pageurl domain mismatch    | Token domain must match site domain
Works in browser but fails in code | Redirect not followed      | Use get_final_url()
URL has query parameters           | Parameters may be required | Include necessary query params

FAQ

Does the URL path matter or just the domain?

The domain is the critical part for token validation. However, provide the full path for best results, as some sites validate the complete URL.

Should I include query parameters?

Include them if they're part of the visible URL. Remove tracking parameters like utm_source that don't affect the page content.
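
Removing tracking parameters can be done with the standard library, as in the sketch below. The name strip_tracking_params is hypothetical, and treating every parameter that starts with utm_ as tracking is an assumption; extend the prefix tuple to suit your site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters with these prefixes never affect page content.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url):
    """Return the URL with tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # urlencode re-encodes the kept values (a space becomes "+").
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Usage
clean = strip_tracking_params("https://example.com/login?next=%2Fhome&utm_source=mail")
# Returns: "https://example.com/login?next=%2Fhome"
```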

How do I handle URLs that change per session?

Extract the URL dynamically in your automation script. Don't hardcode URLs that include session IDs or tokens.
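
As one illustration, if the automation script drives a browser through Selenium (an assumption; the guide does not name a tool), the live URL is available as the driver's current_url property. The sketch below builds the same form fields used in the submission example from that value; the function name build_task_payload is hypothetical.

```python
def build_task_payload(driver, api_key, site_key):
    """Build the in.php form data from the URL the browser is on right now."""
    return {
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": site_key,
        # Read at solve time so session-specific URLs are never hardcoded.
        "pageurl": driver.current_url,
        "json": 1,
    }


# Usage (requires a running Selenium WebDriver instance named driver):
#   payload = build_task_payload(driver, "YOUR_API_KEY", "SITE_KEY")
#   resp = requests.post("https://ocr.captchaai.com/in.php", data=payload, timeout=15)
```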



Get the URL right — solve with CaptchaAI.
