
CAPTCHA Solver for RPA Automation

Robotic Process Automation (RPA) bots automate repetitive business tasks across web applications — form submissions, data extraction, portal logins. CAPTCHAs are one of the most common failure points for RPA workflows. This guide covers how to integrate a CAPTCHA solver API into different RPA environments.

For a broader overview of CAPTCHA solver use cases, see the CAPTCHA Solver Use Cases guide.

Where CAPTCHAs Break RPA Workflows

RPA workflows hit CAPTCHAs at several predictable points:

Trigger | Challenge type typically encountered
------- | ------------------------------------
Login portals | reCAPTCHA v2, hCaptcha, Cloudflare Turnstile
Form submissions | reCAPTCHA v2 invisible, hCaptcha
Government / regulatory portals | reCAPTCHA v2, image CAPTCHA
Financial platforms | reCAPTCHA Enterprise
E-commerce checkout bots | hCaptcha, Cloudflare Managed Challenge
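
Because the same kind of page can carry different widget families, a bot usually has to tell them apart before choosing a solver method. The sketch below is one way to do that from raw HTML, assuming the widgets are rendered with their default container classes (g-recaptcha, h-captcha, cf-turnstile). The function name detect_widget and the returned labels are illustrative, and widgets that are rendered programmatically without those classes will not be found this way.

```python
import re
from typing import Optional

# Default container classes used by the common widgets (assumption: the page
# renders the widget declaratively and has not renamed the container).
_WIDGET_CLASSES = {
    "g-recaptcha": "recaptcha",
    "h-captcha": "hcaptcha",
    "cf-turnstile": "turnstile",
}

_CLASS_ATTR = re.compile(r'class\s*=\s*["\']([^"\']*)["\']', re.IGNORECASE)


def detect_widget(html: str) -> Optional[str]:
    """Return a label for the first known CAPTCHA container found, else None."""
    for match in _CLASS_ATTR.finditer(html):
        for cls in match.group(1).split():
            if cls in _WIDGET_CLASSES:
                return _WIDGET_CLASSES[cls]
    return None
```

In a browser-driven bot the same check could run against page.content() before deciding which solve function to call.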

Architecture: Where the Solver Fits

The CAPTCHA solver API sits between your bot's page interaction layer and form submission:

RPA Bot
  └─ Opens page
  └─ Detects CAPTCHA widget
  └─ Calls CaptchaAI API → receives token
  └─ Injects token into form field
  └─ Submits form
  └─ Continues workflow

The key requirement: the bot must detect CAPTCHA widgets, route to the solver, and resume only after token injection. In most RPA frameworks, this is implemented as a reusable function or activity.
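
The detect, solve, inject, submit sequence can be sketched as a single reusable function that takes each stage as a callable. This is a minimal illustration, not a framework API: run_with_captcha_gate and its four parameters are hypothetical names, and each callable stands in for whatever your RPA tool provides for that stage.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_captcha_gate(
    detect_site_key: Callable[[], Optional[str]],
    solve: Callable[[str], str],
    inject: Callable[[str], None],
    submit: Callable[[], T],
) -> T:
    """Detect, solve, and inject before submitting; skip the solver if no widget is found."""
    site_key = detect_site_key()
    if site_key:
        token = solve(site_key)
        if not token:
            # Never submit with an empty token: the form would fail server-side.
            raise RuntimeError("solver returned an empty token")
        inject(token)
    # Submission only happens after injection has completed (or was not needed).
    return submit()
```

Keeping the stages as parameters means the same gate can wrap a Playwright page, a UiPath-invoked script, or a test double.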

Python-Based RPA (Playwright)

Python RPA tools (Robot Framework, custom Playwright bots) integrate CaptchaAI most directly:

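import requests
import time
from playwright.sync_api import sync_playwright

CAPTCHAI_KEY = "YOUR_API_KEY"

def solve_recaptcha_v2(page_url: str, site_key: str) -> str:
    """Submit a reCAPTCHA v2 task and poll until the solver returns a token."""
    r = requests.post("https://ocr.captchaai.com/in.php", data={
        "key": CAPTCHAI_KEY,
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
        "json": 1,
    }, timeout=30)
    r.raise_for_status()
    submitted = r.json()
    # Assumes a rejected submit reports status != 1, as the poll response does;
    # without this check the error text would be polled as if it were a task id.
    if submitted.get("status") != 1:
        raise RuntimeError(f"CAPTCHA submit failed: {submitted.get('request')}")
    task_id = submitted["request"]
    time.sleep(5)  # give the solver a head start before the first poll
    for _ in range(24):  # 24 polls, 5 s apart: roughly two minutes in total
        r = requests.get("https://ocr.captchaai.com/res.php", params={
            "key": CAPTCHAI_KEY, "action": "get", "id": task_id, "json": 1
        }, timeout=30)
        d = r.json()
        if d.get("status") == 1:
            return d["request"]
        time.sleep(5)
    raise TimeoutError("reCAPTCHA solve timed out")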


def rpa_login_with_captcha(
    portal_url: str,
    username: str,
    password: str,
) -> None:
    """Login to a portal that uses reCAPTCHA v2."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(portal_url, wait_until="networkidle")

        page.fill('[name="username"], #username, input[type="email"]', username)
        page.fill('[name="password"], #password, input[type="password"]', password)

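        # Detect CAPTCHA. locator().count() returns immediately, whereas
        # page.get_attribute() waits for the selector and raises on a page
        # that has no widget.
        widget = page.locator("[data-sitekey]")
        if widget.count() > 0:
            site_key = widget.first.get_attribute("data-sitekey")
            token = solve_recaptcha_v2(portal_url, site_key)
            # Pass the token as an argument instead of splicing it into the script.
            page.evaluate(
                """token => document
                    .querySelectorAll('[name="g-recaptcha-response"]')
                    .forEach(el => { el.value = token; })""",
                token,
            )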

        page.click('button[type="submit"], input[type="submit"]')
        page.wait_for_load_state("networkidle")
        # Continue workflow...
        browser.close()
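
The polling loop above treats every response that is not status 1 as "still pending". A stricter variant separates a pending task from a hard error so the bot can stop early. The sketch below assumes the pending marker is the string CAPCHA_NOT_READY, which is the convention used by 2Captcha-style in.php/res.php APIs; confirm the exact value in the provider's documentation. token_from_poll is a hypothetical helper name.

```python
from typing import Optional

# Assumption: "pending" is signalled with this 2Captcha-style code.
NOT_READY = "CAPCHA_NOT_READY"


def token_from_poll(payload: dict) -> Optional[str]:
    """Return the token if solved, None if still pending, and raise on a hard error."""
    if payload.get("status") == 1:
        return payload["request"]
    if payload.get("request") == NOT_READY:
        return None
    # Anything else (bad key, zero balance, unsolvable task) will not fix itself.
    raise RuntimeError(f"solver error: {payload.get('request')}")
```

Inside the loop, a None result means sleep and poll again, while the RuntimeError propagates to whatever retry policy wraps the step.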

UiPath Integration

UiPath doesn't have a built-in CAPTCHA solver activity. The integration approach:

  1. Use UiPath HTTP Request activity to call the CaptchaAI API directly
  2. Or invoke a Python script via the Invoke Python Method activity

HTTP Request activity approach

In UiPath Studio, add an HTTP Request activity sequence:

  1. Submit task:
     • URL: https://ocr.captchaai.com/in.php
     • Method: POST
     • Body: key=YOUR_KEY&method=userrecaptcha&googlekey={SiteKey}&pageurl={PageURL}&json=1
     • Store response in: submitResponse (String)

  2. Parse submitResponse with a Deserialize JSON activity and extract the request field as taskId.

  3. Poll: add a Do While loop, capped at a fixed number of attempts so a stuck task cannot hang the workflow:
     • HTTP GET to https://ocr.captchaai.com/res.php?key=YOUR_KEY&action=get&id={taskId}&json=1
     • Parse the response; if status=1, store the request field in captchaToken and exit the loop; otherwise delay 5 seconds

  4. Inject: use the Inject Js Script activity:

document.querySelector('[name="g-recaptcha-response"]').value = '{{captchaToken}}';
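
Pasting the token between quotes, as in the line above, only works while the token contains no quotes, backslashes, or line breaks. If a Python helper prepares the script text for the inject step, one hedged option is to let json.dumps produce the string literal, since a JSON string is also a valid JavaScript string literal. build_injection_script is a hypothetical helper name, not part of UiPath or CaptchaAI.

```python
import json


def build_injection_script(token: str) -> str:
    """Return JavaScript that sets every g-recaptcha-response field to `token`."""
    # json.dumps yields a quoted, fully escaped literal that JavaScript accepts as-is.
    literal = json.dumps(token)
    return (
        "document.querySelectorAll('[name=\"g-recaptcha-response\"]')"
        f".forEach(function (el) {{ el.value = {literal}; }});"
    )
```

The returned string can be handed to whichever script-injection step the RPA tool offers.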

Invoke Python Method approach

For complex CAPTCHA types (GeeTest, FunCaptcha), call a Python helper module from UiPath using Invoke Python Method with the solve functions from this guide.

Power Automate (Desktop)

Power Automate Desktop supports custom JavaScript execution via the "Run JavaScript function on web page" action:

  1. Call the CaptchaAI API using the "Invoke web service" action (standard HTTP request).

  2. Parse the JSON response using "Convert JSON to custom object" and extract the token.

  3. Add a "Run JavaScript function on web page" action to inject the token after solving:

function ExecuteScript() {
    var field = document.querySelector('[name="g-recaptcha-response"]');
    if (field) field.value = '%CaptchaToken%';
    return 'OK';
}

For Power Automate Cloud, the same pattern works via HTTP connectors and Parse JSON actions.

Robot Framework Integration

*** Settings ***
Library    RequestsLibrary
Library    Browser

*** Variables ***
${CAPTCHAI_KEY}    YOUR_API_KEY
${CAPTCHA_SUBMIT_URL}    https://ocr.captchaai.com/in.php
${CAPTCHA_POLL_URL}      https://ocr.captchaai.com/res.php

*** Keywords ***
Solve reCAPTCHA V2
    [Arguments]    ${page_url}    ${site_key}
    ${params}=    Create Dictionary
    ...    key=${CAPTCHAI_KEY}
    ...    method=userrecaptcha
    ...    googlekey=${site_key}
    ...    pageurl=${page_url}
    ...    json=1
    ${response}=    POST    ${CAPTCHA_SUBMIT_URL}    data=${params}
    ${task_id}=    Set Variable    ${response.json()['request']}
    Sleep    5s
    FOR    ${i}    IN RANGE    24
        ${poll}=    GET    ${CAPTCHA_POLL_URL}    params=key=${CAPTCHAI_KEY}&action=get&id=${task_id}&json=1
        IF    ${poll.json()['status']} == 1
            RETURN    ${poll.json()['request']}
        END
        Sleep    5s
    END
    Fail    CAPTCHA solve timed out

Error Handling in RPA Contexts

RPA workflows need error handling at both the solver and browser layer:

import time

from playwright.sync_api import TimeoutError as PlaywrightTimeout

def rpa_step_with_captcha_retry(step_func, page, max_retries: int = 3):
    """Wrapper for any RPA step that may encounter a CAPTCHA."""
    for attempt in range(max_retries):
        try:
            # Check for a CAPTCHA before executing the step. count() does not
            # wait, so pages without a widget fall straight through.
            widget = page.locator("[data-sitekey]")
            if widget.count() > 0:
                site_key = widget.first.get_attribute("data-sitekey")
                # solve_recaptcha_v2 is the helper from the Playwright section.
                token = solve_recaptcha_v2(page.url, site_key)
                page.evaluate(
                    """token => document
                        .querySelectorAll('[name="g-recaptcha-response"]')
                        .forEach(el => { el.value = token; })""",
                    token,
                )

            return step_func(page)
        except (RuntimeError, TimeoutError, PlaywrightTimeout):
            if attempt == max_retries - 1:
                raise
            # Linear backoff: 5 s, then 10 s, and so on.
            time.sleep(5 * (attempt + 1))
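
Not every failure is worth retrying: a zero balance or an invalid key will fail the same way on every attempt. The sketch below shows one way to separate the two cases, assuming your solver layer raises a dedicated exception for unrecoverable errors. FatalSolverError and retry_with_backoff are hypothetical names introduced for illustration, and the sleep parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FatalSolverError(Exception):
    """Hypothetical marker for errors that retrying cannot fix."""


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying transient failures with linear backoff."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return func()
        except FatalSolverError:
            raise  # stop immediately: another attempt would only cost time
        except (RuntimeError, TimeoutError):
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * (attempt + 1))
    raise AssertionError("unreachable")
```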

Production Readiness Notes

Treat this guide as a decision and implementation aid, not as a one-time reference. The practical test of a CAPTCHA solver in an RPA workflow is whether the same approach behaves reliably when traffic is messy: rotating sessions, expired tokens, changing widget parameters, intermittent solver delays, and target pages that refresh without warning. For an RPA developer or automation engineer, the safest rollout is to start with a narrow fixture, record every submitted task, and compare the solver response with the browser state that finally submits the form. That makes failures explainable instead of mysterious, especially when a target alternates between visible challenges, invisible checks, and server-side verification.

Evaluation Criteria

A use-case guide should define where CAPTCHA handling sits in the workflow, what counts as a successful business outcome, and when the automation should stop trying. For automation workflows, the most useful scorecard combines technical acceptance with operational cost. A low nominal price is not enough if retries double the real cost per accepted token, and a fast median solve time is not enough if p95 latency stalls the queue. Track these criteria before you standardize the workflow:

  • The challenge subtype, sitekey, action, rqdata, blob, captchaId, or page URL used for each task.
  • Median and p95 solve time, separated by provider and target domain.
  • Accepted-token rate on the target page, not just successful API responses.
  • Retry count, timeout count, zero-balance incidents, and invalid-parameter errors.
  • The exact browser, proxy region, and user-agent that submitted the solved token.
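
The scorecard numbers above can be computed from a plain list of attempt records. The sketch below is illustrative only: the Attempt fields and the scorecard function are hypothetical names, and p95 uses the nearest-rank method, which is one of several common percentile definitions.

```python
import math
import statistics
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    solve_seconds: float
    accepted: bool  # token accepted by the target page, not just returned by the API
    cost: float     # amount charged for this attempt, in any consistent unit


def scorecard(attempts: List[Attempt]) -> dict:
    """Summarize solve time, acceptance rate, and real cost per accepted token."""
    if not attempts:
        raise ValueError("no attempts recorded")
    times = sorted(a.solve_seconds for a in attempts)
    accepted = sum(1 for a in attempts if a.accepted)
    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    p95 = times[max(1, math.ceil(0.95 * len(times))) - 1]
    total_cost = sum(a.cost for a in attempts)
    return {
        "median_s": statistics.median(times),
        "p95_s": p95,
        "accepted_rate": accepted / len(attempts),
        # Retries and rejected tokens raise this above the nominal per-task price.
        "cost_per_accepted": total_cost / accepted if accepted else None,
    }
```

With four attempts at one cost unit each and two accepted tokens, the cost per accepted token is two units, twice the nominal price.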

Rollout Checklist

Before this guidance moves into a production job, build a small acceptance suite around the pages that matter most. Run it with a fixed browser profile, then repeat with the proxy and concurrency settings you expect in production. Keep the first release conservative: bounded polling, clear timeout handling, and a fallback path when the solver cannot return a usable answer. For each automation workflow, tie the solver decision to queue depth, target value, allowed latency, compliance limits, and how easily the workflow can retry or pause. Judge the integration by end-to-end completion rather than by whether a code sample returned a string.

Monitoring Signals

Healthy CAPTCHA automation is observable. Log the task id, provider, challenge type, target host, queue time, solve time, final submit status, and normalized error code for every attempt. Review those logs in daily batches at first, then move to alerts once the baseline is stable. Sudden drops usually come from target-side changes: a new sitekey, a changed action name, a stricter hostname check, an added managed challenge, or a proxy pool that no longer matches the expected geography. When you can see those shifts quickly, provider switching becomes a controlled decision instead of a late-night rewrite.
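
One way to capture those fields is a single structured record per attempt, written as one JSON line. The sketch below is an assumption about shape, not a required schema: CaptchaAttemptLog, its field names, and to_log_line are all illustrative.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CaptchaAttemptLog:
    task_id: str
    provider: str
    challenge_type: str
    target_host: str
    queue_seconds: float
    solve_seconds: float
    submit_status: str                # e.g. "accepted" or "rejected" (illustrative values)
    error_code: Optional[str] = None  # normalized code, None when the attempt succeeded


def to_log_line(record: CaptchaAttemptLog) -> str:
    """Serialize one attempt as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Stable key order keeps the lines easy to diff and to load into whatever log pipeline reviews the daily batches.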

Maintenance Cadence

Revisit the setup whenever the target UI changes, when the solver provider changes task names or pricing, or when benchmark data shows a sustained latency or solve-rate shift. Keep one known-good fixture for each CAPTCHA subtype and rerun it after dependency upgrades, browser updates, and proxy changes. If this guide informs vendor selection, repeat the same fixture across at least two providers before renewing a balance or migrating the whole pipeline. That habit keeps CAPTCHA handling in RPA workflows aligned with real target behavior rather than with stale assumptions.
