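hCaptcha is used by Discord, Cloudflare, and thousands of other sites. It issues image-selection challenges and returns an h-captcha-response token, not a g-recaptcha-response. This tutorial covers the complete Python pipeline: extracting the site key, submitting the task to a solver API, and injecting the returned token into the target form.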
For background on how hCaptcha works and where it's deployed, see the hCaptcha Guide.
Prerequisites
pip install requests playwright
playwright install chromium
Step 1 — Extract the Site Key
hCaptcha embeds the site key in the page HTML, typically in a <div> with data-sitekey:
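import re
import requests

def extract_hcaptcha_sitekey(page_url: str) -> str:
    """Fetch the page and return the first hCaptcha site key found in its HTML."""
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"}
    response = requests.get(page_url, headers=headers, timeout=15)
    response.raise_for_status()  # fail early instead of searching an error page
    html = response.text
    patterns = [
        # data-sitekey attribute on the widget container
        r'data-sitekey=["\']([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})["\']',
        # "sitekey": "..." inside inline JavaScript or JSON config
        r'"sitekey"\s*:\s*"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"',
        # data-sitekey attribute on the api.js script tag
        r'hcaptcha\.com/1/api\.js[^>]*data-sitekey=["\']([^"\']+)["\']',
    ]
    for p in patterns:
        m = re.search(p, html)
        if m:
            return m.group(1)
    raise ValueError(f"hCaptcha site key not found on {page_url}")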
hCaptcha site keys use the UUID format. For example, 10000000-ffff-ffff-ffff-000000000001 is the test/demo key.
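Because site keys follow the UUID layout, a quick format check can catch a bad extraction before a solver request is spent on it. The following is a minimal sketch; the helper name is_valid_sitekey is hypothetical and not part of any hCaptcha or solver library:

```python
import re

# UUID layout: 8-4-4-4-12 hexadecimal characters.
_SITEKEY_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_valid_sitekey(candidate: str) -> bool:
    """Return True if candidate has the UUID shape used by hCaptcha site keys.

    Hypothetical helper: it checks the format only, not whether the key
    is actually registered with hCaptcha.
    """
    return _SITEKEY_RE.fullmatch(candidate.strip()) is not None
```

Calling is_valid_sitekey on the value returned by extract_hcaptcha_sitekey before submitting a task is one way to fail fast on a malformed key.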
Step 2 — Submit the hCaptcha Task
import time
import requests

def solve_hcaptcha(
    api_key: str,
    page_url: str,
    site_key: str,
) -> str:
    """Submit and poll for hCaptcha token. Returns h-captcha-response."""
    # Submit the task
    payload = {
        "key": api_key,
        "method": "hcaptcha",
        "sitekey": site_key,
        "pageurl": page_url,
        "json": 1,
    }
    r = requests.post("https://ocr.captchaai.com/in.php", data=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    if data.get("status") != 1:
        raise RuntimeError(f"hCaptcha submit failed: {data}")
    task_id = data["request"]

    # Poll for the result: hCaptcha typically takes 12 to 20 s
    time.sleep(10)
    for _ in range(24):  # 24 polls x 5 s, about 2 minutes at most
        r = requests.get(
            "https://ocr.captchaai.com/res.php",
            params={"key": api_key, "action": "get", "id": task_id, "json": 1},
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        if data.get("status") == 1:
            return data["request"]  # h-captcha-response token
        # Accept both spellings of the "not ready" status; anything else is an error
        if data.get("request") not in ("CAPCHA_NOT_READY", "CAPTCHA_NOT_READY"):
            raise RuntimeError(f"Unexpected: {data}")
        time.sleep(5)
    raise TimeoutError("hCaptcha solve timed out")
Note: the request uses method=hcaptcha and the sitekey parameter (not googlekey). Both differ from a reCAPTCHA request.
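The parameter difference in the note above can be made explicit with a small payload builder. This is a sketch under assumptions: build_submit_payload is a hypothetical helper, and the reCAPTCHA branch is inferred only from the userrecaptcha and googlekey names this tutorial mentions, so check the solver's documentation before relying on it:

```python
def build_submit_payload(
    api_key: str,
    captcha_type: str,
    site_key: str,
    page_url: str,
) -> dict:
    """Build the submit form fields for an hCaptcha or reCAPTCHA task.

    Hypothetical helper: only the method name and the site-key field
    differ between the two branches.
    """
    if captcha_type == "hcaptcha":
        method, key_field = "hcaptcha", "sitekey"
    elif captcha_type == "recaptcha":
        # Assumed from the names referenced in this tutorial.
        method, key_field = "userrecaptcha", "googlekey"
    else:
        raise ValueError(f"Unsupported captcha type: {captcha_type!r}")

    return {
        "key": api_key,
        "method": method,
        key_field: site_key,
        "pageurl": page_url,
        "json": 1,
    }
```

Keeping the branch in one place makes it harder to send a sitekey under the googlekey field by accident when a pipeline handles both CAPTCHA types.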
Step 3 — Inject and Submit with Playwright
from playwright.sync_api import sync_playwright

def submit_hcaptcha_form(page_url: str, api_key: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(page_url, wait_until="networkidle")

            # Get site key from the live page
            site_key = page.get_attribute("[data-sitekey]", "data-sitekey")
            if not site_key:
                raise ValueError("hCaptcha site key not found in live page")

            token = solve_hcaptcha(api_key, page_url, site_key)

            # Inject the token. It is passed as an argument instead of being
            # interpolated into the script, so its content cannot break the JavaScript.
            page.evaluate(
                """(token) => {
                    // Primary injection point: hCaptcha uses a textarea, not a regular input
                    const textarea = document.querySelector('[name="h-captcha-response"]');
                    if (textarea) { textarea.value = token; }

                    // Some sites also check a g-recaptcha-response field
                    const gcr = document.querySelector('[name="g-recaptcha-response"]');
                    if (gcr) { gcr.value = token; }

                    // Make later hcaptcha.execute() calls resolve with the solved token
                    if (window.hcaptcha && window.hcaptcha.execute) {
                        window.hcaptcha.execute = () => Promise.resolve(token);
                    }
                }""",
                token,
            )

            page.click('button[type="submit"]')
            page.wait_for_load_state("networkidle")
        finally:
            # Close the browser even if solving or submitting fails
            browser.close()
With requests (no browser)
def submit_form_direct(
    form_url: str,
    api_key: str,
    site_key: str,
    page_url: str,
    extra_fields: dict,
) -> requests.Response:
    token = solve_hcaptcha(api_key, page_url, site_key)
    payload = {**extra_fields, "h-captcha-response": token}
    r = requests.post(form_url, data=payload, timeout=30)
    return r
Handling hCaptcha Enterprise
hCaptcha Enterprise uses the same API parameters but may enforce additional behavioral signals. If you receive low pass rates with Enterprise challenges:
- Pass rqdata if visible. Some pages embed an rqdata value in the widget HTML that must be passed in the solver request:
# Look for rqdata in the page source
rqdata_match = re.search(r'"rqdata"\s*:\s*"([^"]+)"', html)
if rqdata_match:
    payload["rqdata"] = rqdata_match.group(1)
- Use a matching User-Agent. The solver and your browser session should use the same User-Agent string.
- Pass the userAgent parameter to the solver:
payload["userAgent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
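The Enterprise adjustments above can be gathered into one step that runs before the task is submitted. A minimal sketch, assuming the rqdata value appears in the page source in the same "rqdata": "..." form used above; add_enterprise_fields is a hypothetical helper name:

```python
import re
from typing import Optional


def add_enterprise_fields(
    payload: dict,
    html: str,
    user_agent: Optional[str] = None,
) -> dict:
    """Return a copy of payload with rqdata and userAgent added when available.

    Hypothetical helper: the original payload is left unmodified so the
    caller can still submit the plain request if Enterprise fields are
    not needed.
    """
    enriched = dict(payload)

    # Look for rqdata in the page source (same pattern as above).
    match = re.search(r'"rqdata"\s*:\s*"([^"]+)"', html)
    if match:
        enriched["rqdata"] = match.group(1)

    # Forward the browser's User-Agent so solver and session match.
    if user_agent:
        enriched["userAgent"] = user_agent

    return enriched
```

Passing the same User-Agent string here and to the browser or requests session keeps the two sides consistent, which is the point of the second and third items in the list.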
Common Errors
| Error | Cause | Fix |
|---|---|---|
| ERROR_WRONG_CAPTCHA_ID | Wrong method used | Ensure method=hcaptcha, not userrecaptcha |
| invalid-or-already-seen-response | Token reused | Generate a new token per submission |
| Token rejected at target | Token expired (> 2 min) | Solve immediately before submit |
| ERROR_CAPTCHA_UNSOLVABLE | Challenge failed | Retry; check if site key is valid |
| No h-captcha-response field | Unusual form structure | Check if field is inside an iframe |
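Two rows of the table lend themselves to simple guards in code: retrying only errors the table marks as retryable, and refusing to submit a token that is close to the 2-minute expiry. The sketch below is illustrative; the helper names, the retryable set, and the 15-second safety margin are assumptions rather than solver-defined values:

```python
import time
from typing import Optional

# Per the table: an unsolvable challenge is worth retrying, while a
# wrong-method error needs a code fix and should not be retried.
RETRYABLE_ERRORS = {"ERROR_CAPTCHA_UNSOLVABLE"}

TOKEN_TTL_SECONDS = 120.0  # tokens expire after roughly 2 minutes
SAFETY_MARGIN_SECONDS = 15.0  # assumed buffer for the form submit itself


def should_retry(error_code: str) -> bool:
    """Return True if the error is one the table suggests retrying."""
    return error_code in RETRYABLE_ERRORS


def token_is_fresh(solved_at: float, now: Optional[float] = None) -> bool:
    """Return True if a token solved at solved_at is still safe to submit.

    Both timestamps are expected to come from time.monotonic().
    """
    current = time.monotonic() if now is None else now
    return (current - solved_at) < (TOKEN_TTL_SECONDS - SAFETY_MARGIN_SECONDS)
```

Recording time.monotonic() right after solve_hcaptcha returns, and checking token_is_fresh just before the submit, turns a silent rejection at the target into an explicit decision to solve again.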
Related Guides
- hCaptcha Guide — how hCaptcha works, where it's used, solver comparison
- CAPTCHA Solving in Python: Quick Start — multi-type overview
- How to Solve reCAPTCHA v2 in Python — g-recaptcha-response pattern
Production Readiness Notes
Use this tutorial as a decision and implementation aid, not just as a one-time reference. The practical test for any hCaptcha-solving approach is whether it behaves reliably when traffic is messy: rotating sessions, expired tokens, changing widget parameters, intermittent solver delays, and target pages that refresh without warning. For automation developers, the safest rollout is to start with a narrow fixture, record every submitted task, and compare the solver response with the browser state that finally submits the form. That makes failures explainable instead of mysterious, especially when a target alternates between visible challenges, invisible checks, and server-side verification.
Evaluation Criteria
Exercise this workflow against staging first, then promote it behind feature flags so failed solves can fall back without blocking the entire job. For hCaptcha work, the most useful scorecard combines technical acceptance with operational cost. A low nominal price is not enough if retries double the real cost per accepted token, and a fast median solve time is not enough if p95 latency stalls the queue. Track these criteria before you standardize the workflow:
- The challenge subtype, sitekey, action, rqdata, blob, captchaId, or page URL used for each task.
- Median and p95 solve time, separated by provider and target domain.
- Accepted-token rate on the target page, not just successful API responses.
- Retry count, timeout count, zero-balance incidents, and invalid-parameter errors.
- The exact browser, proxy region, and user-agent that submitted the solved token.
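Several of these criteria can be computed from a plain list of attempt records. The sketch below is one possible scorecard, not a prescribed format: the Attempt fields and function names are hypothetical, and the p95 uses the nearest-rank method, which is one of several common percentile definitions:

```python
import math
import statistics
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    solve_seconds: float  # time from task submit to token
    accepted: bool        # token accepted by the target page
    cost: float           # amount charged by the solver for this attempt


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def scorecard(attempts: List[Attempt]) -> dict:
    """Summarize a non-empty list of attempts into the criteria above."""
    times = [a.solve_seconds for a in attempts]
    accepted = sum(1 for a in attempts if a.accepted)
    total_cost = sum(a.cost for a in attempts)
    return {
        "median_s": statistics.median(times),
        "p95_s": percentile(times, 95),
        "accepted_rate": accepted / len(attempts),
        # Real cost per usable token, including failed attempts.
        "cost_per_accepted": total_cost / accepted if accepted else float("inf"),
    }
```

Grouping attempts by provider and target domain before calling scorecard gives the separated view the list asks for, and cost_per_accepted captures the case where retries inflate a low nominal price.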
Rollout Checklist
Before this guidance moves into a production job, build a small acceptance suite around the pages that matter most. Run it with a fixed browser profile, then repeat with the proxy and concurrency settings you expect in production. Keep the first release conservative: bounded polling, clear timeout handling, and a fallback path when the solver cannot return a usable answer. For hCaptcha, verify sitekey discovery, rqdata handling, challenge refresh behavior, proxy consistency, and token injection timing. That checklist keeps the integration sound after the first copy-paste, because it is judged by end-to-end completion rather than by whether a code sample returned a string.
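A bounded retry with a fallback path can be expressed as a thin wrapper around any solver call. This is a sketch under assumptions: solve_with_fallback is a hypothetical helper, and it catches RuntimeError and TimeoutError only because those are the exceptions solve_hcaptcha raises in Step 2:

```python
from typing import Callable, Optional


def solve_with_fallback(
    solve: Callable[[], str],
    max_attempts: int = 3,
    fallback: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[str]:
    """Call solve() up to max_attempts times, then use fallback if given.

    Returns the token from solve(), or whatever fallback() returns
    (for example None to signal "skip this job"). Raises RuntimeError
    if every attempt fails and no fallback was supplied.
    """
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return solve()
        except (RuntimeError, TimeoutError) as exc:
            last_error = exc  # remember the failure and try again

    if fallback is not None:
        return fallback()
    raise RuntimeError(f"Solver failed after {max_attempts} attempts") from last_error
```

A typical call would wrap the tutorial's function in a lambda, such as solve_with_fallback(lambda: solve_hcaptcha(api_key, page_url, site_key)), so the retry logic stays independent of the solver's parameters.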
Monitoring Signals
Healthy CAPTCHA automation is observable. Log the task id, provider, challenge type, target host, queue time, solve time, final submit status, and normalized error code for every attempt. Review those logs in daily batches at first, then move to alerts once the baseline is stable. Sudden drops usually come from target-side changes: a new sitekey, a changed action name, a stricter hostname check, an added managed challenge, or a proxy pool that no longer matches the expected geography. When you can see those shifts quickly, provider switching becomes a controlled decision instead of a late-night rewrite.
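One way to capture those fields is a single structured log line per attempt. The following is a minimal sketch using only the standard library; the AttemptRecord field names and the logger name are hypothetical choices, not a required schema:

```python
import json
import logging
from dataclasses import asdict, dataclass

logger = logging.getLogger("captcha.attempts")


@dataclass
class AttemptRecord:
    task_id: str
    provider: str
    challenge_type: str
    target_host: str
    queue_seconds: float
    solve_seconds: float
    submit_status: str    # e.g. "accepted" or "rejected"
    error_code: str = ""  # normalized error code, empty on success


def format_attempt(record: AttemptRecord) -> str:
    """Serialize one attempt as a JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def log_attempt(record: AttemptRecord) -> None:
    """Emit the attempt through the standard logging module."""
    logger.info(format_attempt(record))
```

JSON lines with sorted keys are easy to review in daily batches and later to feed into alerting, which matches the progression described above.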
Maintenance Cadence
Revisit the setup whenever the target UI changes, when the solver provider changes task names or pricing, or when benchmark data shows a sustained latency or solve-rate shift. Keep one known-good fixture for each CAPTCHA subtype and rerun it after dependency upgrades, browser updates, and proxy changes. If this guide informs vendor selection, repeat the same fixture across at least two providers before renewing a balance or migrating the whole pipeline. That habit keeps your hCaptcha-solving work aligned with the real target behavior rather than with stale assumptions.