GeeTest is a behavior-based CAPTCHA used by Binance, Huobi, and many Asian platforms. It exists in two major versions: v3 (older, three-parameter flow) and v4 (newer, task-based flow). This tutorial covers both.
For background on how GeeTest works, see the GeeTest Guide.
Prerequisites
pip install requests playwright
playwright install chromium
v3 vs v4 at a Glance
| Aspect | GeeTest v3 | GeeTest v4 |
|---|---|---|
| Key parameters | gt, challenge | captcha_id |
| Dynamic parameter | challenge changes each load | captcha_id is static |
| Result fields | geetest_validate, geetest_seccode, geetest_challenge | captcha_id, lot_number, pass_token, gen_time, captcha_output |
| Solver method | geetest | geetest4 |
v3 requires fetching a new challenge before each solve. v4's captcha_id is static but the result contains more fields.
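Before choosing a solver method, it can help to check which version a page embeds. The following is a minimal sketch, assuming the parameters appear as JSON-style `"gt": "..."` or `"captcha_id": "..."` pairs in the raw HTML; `detect_geetest_version` is a hypothetical helper name, and pages that load these values via JavaScript will need a different approach.

```python
import re
from typing import Optional

# Assumed patterns: 32 lowercase hex characters in a JSON-style key/value pair.
# Real pages may embed the values differently (data attributes, inline JS calls).
_GT_RE = re.compile(r'"gt"\s*:\s*"([0-9a-f]{32})"')
_CAPTCHA_ID_RE = re.compile(r'"captcha_id"\s*:\s*"([0-9a-f]{32})"')


def detect_geetest_version(html: str) -> Optional[str]:
    """Return "v4", "v3", or None depending on which key parameter is found."""
    if _CAPTCHA_ID_RE.search(html):
        return "v4"  # captcha_id is the v4 key parameter
    if _GT_RE.search(html):
        return "v3"  # gt is the v3 key parameter
    return None
```

A `None` result only means the static HTML did not contain either pattern; it does not prove the page has no GeeTest widget.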
Solving GeeTest v3
Extract Parameters
v3 requires gt (static) and challenge (dynamic, fetched from the site's API on each page load):
import re
import requests

def extract_geetest_v3_params(page_url: str) -> dict:
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"}
    html = requests.get(page_url, headers=headers, timeout=15).text
    # Static gt (usually in JS or a data attribute)
    gt_m = re.search(r'"gt"\s*:\s*"([0-9a-f]{32})"', html)
    # Dynamic challenge (only present when the page inlines it)
    challenge_m = re.search(r'"challenge"\s*:\s*"([0-9a-f]{32,})"', html)
    if not gt_m:
        raise ValueError("GeeTest v3 gt not found")
    if not challenge_m:
        raise ValueError("GeeTest v3 challenge not found; may need to call the init API")
    return {"gt": gt_m.group(1), "challenge": challenge_m.group(1)}
Some sites fetch the challenge dynamically via an AJAX request. Use your browser's DevTools → Network tab to find the init endpoint, then replicate it:
import time

def fetch_geetest_challenge(init_url: str, gt: str) -> str:
    """Call the site's GeeTest init endpoint to get a fresh challenge."""
    # Query parameter names vary by site; copy them from the request seen in DevTools
    r = requests.get(init_url, params={"gt": gt, "timestamp": int(time.time())}, timeout=15)
    r.raise_for_status()
    return r.json()["challenge"]
Solve v3
import time

def solve_geetest_v3(
    api_key: str,
    page_url: str,
    gt: str,
    challenge: str,
) -> dict:
    """Solve GeeTest v3. Returns dict with geetest_validate, geetest_seccode, geetest_challenge."""
    payload = {
        "key": api_key,
        "method": "geetest",
        "gt": gt,
        "challenge": challenge,
        "pageurl": page_url,
        "json": 1,
    }
    r = requests.post("https://ocr.captchaai.com/in.php", data=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    if data.get("status") != 1:
        raise RuntimeError(f"GeeTest v3 submit failed: {data}")
    task_id = data["request"]
    # GeeTest typically takes 15–25s, so wait before the first poll
    time.sleep(15)
    for _ in range(24):  # 24 polls at 5s intervals
        r = requests.get(
            "https://ocr.captchaai.com/res.php",
            params={"key": api_key, "action": "get", "id": task_id, "json": 1},
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        if data.get("status") == 1:
            # result format: "geetest_validate:...|geetest_seccode:...|geetest_challenge:..."
            result = {}
            key = None
            for part in data["request"].split("|"):
                if part.startswith("geetest_") and ":" in part:
                    key, value = part.split(":", 1)
                    result[key] = value
                elif key is not None:
                    # The value itself contained "|" (seccode often ends in "|jordan")
                    result[key] += "|" + part
            return result
        if data.get("request") not in ("CAPCHA_NOT_READY", "CAPTCHA_NOT_READY"):
            raise RuntimeError(f"Unexpected: {data}")
        time.sleep(5)
    raise TimeoutError("GeeTest v3 solve timed out")
Inject v3 Result
from playwright.sync_api import sync_playwright

def submit_geetest_v3_form(page_url: str, api_key: str, gt: str, challenge: str) -> None:
    result = solve_geetest_v3(api_key, page_url, gt, challenge)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(page_url, wait_until="networkidle")
        # Pass the result as an argument so token values are never spliced into JS source
        page.evaluate(
            """(r) => {
                // Inject the three GeeTest v3 result fields
                for (const name of ["geetest_validate", "geetest_seccode", "geetest_challenge"]) {
                    const el = document.querySelector(`[name="${name}"]`);
                    if (el) el.value = r[name];
                }
                // Trigger GeeTest's own success callback if accessible
                if (window.captcha && window.captcha.onSuccess) {
                    window.captcha.onSuccess({
                        geetest_validate: r.geetest_validate,
                        geetest_seccode: r.geetest_seccode,
                        geetest_challenge: r.geetest_challenge
                    });
                }
            }""",
            result,
        )
        page.click('button[type="submit"]')
        page.wait_for_load_state("networkidle")
        browser.close()
Solving GeeTest v4
Extract the captcha_id
v4 uses a static captcha_id (a 32-character hexadecimal string) embedded in the page:
def extract_geetest_v4_captcha_id(page_url: str) -> str:
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"}
    html = requests.get(page_url, headers=headers, timeout=15).text
    m = re.search(r'"captcha_id"\s*:\s*"([a-f0-9]{32})"', html)
    if m:
        return m.group(1)
    raise ValueError("GeeTest v4 captcha_id not found")
Solve v4
import json

def solve_geetest_v4(
    api_key: str,
    page_url: str,
    captcha_id: str,
) -> dict:
    """Solve GeeTest v4. Returns dict with captcha_id, lot_number, pass_token, gen_time, captcha_output."""
    payload = {
        "key": api_key,
        "method": "geetest4",
        "captcha_id": captcha_id,
        "pageurl": page_url,
        "json": 1,
    }
    r = requests.post("https://ocr.captchaai.com/in.php", data=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    if data.get("status") != 1:
        raise RuntimeError(f"GeeTest v4 submit failed: {data}")
    task_id = data["request"]
    # Same timing as v3: wait 15s, then poll every 5s
    time.sleep(15)
    for _ in range(24):
        r = requests.get(
            "https://ocr.captchaai.com/res.php",
            params={"key": api_key, "action": "get", "id": task_id, "json": 1},
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        if data.get("status") == 1:
            # v4 result is a JSON string; tolerate an already-decoded object as well
            answer = data["request"]
            return answer if isinstance(answer, dict) else json.loads(answer)
        if data.get("request") not in ("CAPCHA_NOT_READY", "CAPTCHA_NOT_READY"):
            raise RuntimeError(f"Unexpected: {data}")
        time.sleep(5)
    raise TimeoutError("GeeTest v4 solve timed out")
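The v3 and v4 solve functions repeat the same wait-then-poll loop. One way to factor it out is sketched below; `poll_until_ready` and its parameters are hypothetical names, and the `fetch` callable stands in for the `res.php` request so the loop can be exercised without network access.

```python
import time
from typing import Any, Callable

# Both spellings appear in the polling code above, so both are accepted here.
NOT_READY = ("CAPCHA_NOT_READY", "CAPTCHA_NOT_READY")


def poll_until_ready(
    fetch: Callable[[], dict],
    initial_delay: float = 15.0,
    interval: float = 5.0,
    max_polls: int = 24,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() until it reports status 1, then return its "request" payload."""
    sleep(initial_delay)  # GeeTest solves rarely finish sooner
    for _ in range(max_polls):
        data = fetch()
        if data.get("status") == 1:
            return data["request"]
        if data.get("request") not in NOT_READY:
            # Anything other than "not ready" is treated as a hard error
            raise RuntimeError(f"Unexpected: {data}")
        sleep(interval)
    raise TimeoutError(f"No result after {max_polls} polls")
```

Injecting `sleep` keeps the helper fast to test: pass a no-op or a list's `append` method instead of `time.sleep`.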
Inject v4 Result
v4 returns five fields, and all of them must be submitted:
def submit_geetest_v4_form(page_url: str, api_key: str, captcha_id: str) -> None:
    result = solve_geetest_v4(api_key, page_url, captcha_id)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(page_url, wait_until="networkidle")
        # Pass the result as an argument so token values are never spliced into JS source
        page.evaluate(
            """(r) => {
                const names = ["captcha_id", "lot_number", "pass_token", "gen_time", "captcha_output"];
                for (const name of names) {
                    const el = document.querySelector(`[name="${name}"]`);
                    if (el) el.value = r[name];
                }
            }""",
            result,
        )
        page.click('button[type="submit"]')
        page.wait_for_load_state("networkidle")
        browser.close()
Common Errors
| Error | Cause | Fix |
|---|---|---|
| ERROR_WRONG_CAPTCHA_ID | Wrong method name | Use geetest for v3, geetest4 for v4 |
| Challenge expired | Using a stale challenge value | Fetch a fresh challenge before each solve |
| v3 token rejected | Not all three fields submitted | Submit all three: geetest_validate, geetest_seccode, geetest_challenge |
| v4 token rejected | Not all five fields submitted | Submit all five v4 fields |
| Timeout on first poll | GeeTest solves often take 15–25 seconds | Wait 15s before the first poll, not 5s |
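Two of the rows above come down to missing result fields. A small pre-submit check can catch that before the browser step; this is a sketch using the field names from the comparison table, and `missing_result_fields` is a hypothetical helper name.

```python
# Field names as listed in the v3/v4 comparison table.
V3_FIELDS = ("geetest_validate", "geetest_seccode", "geetest_challenge")
V4_FIELDS = ("captcha_id", "lot_number", "pass_token", "gen_time", "captcha_output")


def missing_result_fields(result: dict, version: str) -> list:
    """Return the required field names that are absent or empty in a solver result."""
    if version == "v3":
        required = V3_FIELDS
    elif version == "v4":
        required = V4_FIELDS
    else:
        raise ValueError(f"Unknown GeeTest version: {version!r}")
    return [name for name in required if not result.get(name)]
```

Calling it right after the solve step and raising when the returned list is non-empty gives a clearer error than a silently rejected form.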
Related Guides
- GeeTest Guide — how GeeTest v3 and v4 work, solver comparison
- CAPTCHA Solving in Python: Quick Start — multi-type pipeline overview
Production Readiness Notes
Treat this tutorial as an implementation aid to revisit, not as a one-time reference. The practical test of a GeeTest solving pipeline is whether it behaves reliably when traffic is messy: rotating sessions, expired tokens, changing widget parameters, intermittent solver delays, and target pages that refresh without warning. For automation developers, the safest rollout is to start with a narrow fixture, record every submitted task, and compare the solver response with the browser state that finally submits the form. That makes failures explainable instead of mysterious, especially when a target alternates between visible challenges, invisible checks, and server-side verification.
Evaluation Criteria
Exercise the workflow against staging first, then promote it behind feature flags so failed solves can fall back without blocking the entire job. For GeeTest work, the most useful scorecard combines technical acceptance with operational cost. A low nominal price is not enough if retries double the real cost per accepted token, and a fast median solve time is not enough if p95 latency stalls the queue. Track these criteria before you standardize the workflow:
- The GeeTest version, gt, challenge, captcha_id, and page URL used for each task.
- Median and p95 solve time, separated by provider and target domain.
- Accepted-token rate on the target page, not just successful API responses.
- Retry count, timeout count, zero-balance incidents, and invalid-parameter errors.
- The exact browser, proxy region, and user-agent that submitted the solved token.
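The scorecard above can be computed from a list of recorded attempts. The sketch below assumes a hypothetical `Attempt` record and uses a simple nearest-rank percentile; a metrics library would work equally well.

```python
import statistics
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    solve_seconds: float
    accepted: bool  # did the target page accept the token, not just the API?
    retries: int = 0


def percentile(values: List[float], pct: int) -> float:
    """Nearest-rank percentile using integer arithmetic to avoid float rounding."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceil(pct * n / 100)
    return ordered[rank - 1]


def summarize(attempts: List[Attempt]) -> dict:
    """Median and p95 solve time, accepted-token rate, and total retries."""
    times = [a.solve_seconds for a in attempts]
    return {
        "median_s": statistics.median(times),
        "p95_s": percentile(times, 95),
        "accepted_rate": sum(a.accepted for a in attempts) / len(attempts),
        "total_retries": sum(a.retries for a in attempts),
    }
```

Grouping the attempts by provider and target domain before calling `summarize` gives the per-provider, per-domain breakdown the list asks for.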
Rollout Checklist
Before this workflow moves into a production job, build a small acceptance suite around the pages that matter most. Run it with a fixed browser profile, then repeat with the proxy and concurrency settings you expect in production. Keep the first release conservative: bounded polling, clear timeout handling, and a fallback path when the solver cannot return a usable answer. For GeeTest v3, fetch a fresh challenge for every attempt, because a stale challenge is the most common source of false failures; gt and captcha_id are static, but re-extract them if the target page changes. This checklist keeps the integration honest after the first copy-paste, because it is judged by end-to-end completion rather than by whether a code sample returned a string.
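A bounded retry loop with a fallback path might look like the following sketch. `solve_with_fresh_challenge` is a hypothetical wrapper; the two callables stand in for the site's challenge endpoint and the solver call, so the retry logic itself stays independent of any provider.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("geetest")


def solve_with_fresh_challenge(
    fetch_challenge: Callable[[], str],
    solve: Callable[[str], dict],
    max_attempts: int = 3,
) -> Optional[dict]:
    """Try up to max_attempts solves, fetching a new challenge before each one.

    Returns None when every attempt fails so the caller can take its fallback path.
    """
    for attempt in range(1, max_attempts + 1):
        challenge = fetch_challenge()  # never reuse a challenge across attempts
        try:
            return solve(challenge)
        except (RuntimeError, TimeoutError) as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, max_attempts, exc)
    return None
```

Returning `None` rather than raising keeps the fallback decision with the caller, which fits the feature-flag rollout described earlier.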
Monitoring Signals
Healthy CAPTCHA automation is observable. Log the task id, provider, challenge type, target host, queue time, solve time, final submit status, and normalized error code for every attempt. Review those logs in daily batches at first, then move to alerts once the baseline is stable. Sudden drops usually come from target-side changes: a new gt or captcha_id, a stricter hostname check, an added managed challenge, or a proxy pool that no longer matches the expected geography. When you can see those shifts quickly, provider switching becomes a controlled decision instead of a late-night rewrite.
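One way to capture those fields is a small record type serialized as one JSON object per line. The `SolveLog` class and its field names below are illustrative assumptions, not a required schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SolveLog:
    task_id: str
    provider: str
    challenge_type: str  # e.g. "geetest" or "geetest4"
    target_host: str
    queue_seconds: float
    solve_seconds: float
    submit_status: str  # outcome of the final form submit on the target page
    error_code: Optional[str] = None  # normalized error code, None on success


def to_log_line(entry: SolveLog) -> str:
    """Serialize one attempt as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)
```

JSON lines are easy to batch for the daily review and can feed an alerting pipeline later without changing the format.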
Maintenance Cadence
Revisit the setup whenever the target UI changes, when the solver provider changes task names or pricing, or when benchmark data shows a sustained latency or solve-rate shift. Keep one known-good fixture for each CAPTCHA subtype and rerun it after dependency upgrades, browser updates, and proxy changes. If this tutorial is used for vendor selection, repeat the same fixture across at least two providers before renewing a balance or migrating the whole pipeline. That habit keeps your GeeTest integration aligned with real target behavior rather than with stale assumptions.