How to Use ParallaxAPIs SDK in 2026

Anti-bot systems like DataDome and PerimeterX have become formidable gatekeepers on the modern web. While browser automation tools like Selenium or Puppeteer can technically bypass these protections, they're slow, resource-intensive, and increasingly expensive to run at scale.

ParallaxAPIs offers a different approach: request-based bypass that generates valid cookies and tokens in 200-400ms through direct HTTP requests—no browsers required.

In this guide, we'll walk through everything you need to know about using the ParallaxAPIs Python SDK, from basic setup to advanced patterns that'll help you scrape protected sites efficiently.

Why Request-Based Bypass Beats Browser Automation

Let's be real: running headless browsers to bypass anti-bot systems is like bringing a tank to a knife fight. It works, but it's expensive and slow.

Browser automation approaches like Selenium with undetected-chromedriver or Puppeteer Stealth work by rendering full browser instances. Each instance consumes 200-500MB of memory, takes 5-10 seconds to initialize, and burns through residential proxy bandwidth like there's no tomorrow (2MB per page versus 250KB for HTTP requests).

Request-based bypass tools like ParallaxAPIs take a fundamentally different approach. Instead of rendering entire browsers, they reverse-engineer the anti-bot detection logic and replay it through direct HTTP requests. The SDK handles all the complex fingerprinting, challenge solving, and token generation—returning valid cookies in a fraction of the time.

Here's what this means in practice:

  • Speed: 200-400ms response times versus 5-10+ seconds for browser automation
  • Resources: Minimal memory footprint versus 200-500MB per browser instance
  • Scalability: Handle thousands of concurrent requests on a single server
  • Cost: Lower proxy bandwidth usage (HTTP requests vs full page renders)

The tradeoff? Request-based solutions require the service provider to continuously reverse-engineer anti-bot systems. That's where ParallaxAPIs comes in—they handle all the reverse engineering and keep the SDK updated as anti-bot systems evolve.

Getting Started with ParallaxAPIs

Available SDKs

ParallaxAPIs provides SDKs for multiple languages and frameworks, so you can integrate with your existing tech stack:

All SDKs share the same core API design, making it easy to switch between languages. The patterns we'll cover in Python translate directly to the other SDKs.

Prerequisites

Before you start, you'll need:

  1. Python 3.7+ installed on your system
  2. An API key from ParallaxAPIs (join their Discord and create a ticket to request access)
  3. Proxies (residential or mobile proxies recommended for production use)

Installation

Install the SDK using pip:

pip install parallaxapis-sdk-py

That's it. No browser drivers, no Chrome installations, no Selenium dependencies—just a lightweight Python package.

Basic Configuration

The SDK supports both synchronous and asynchronous patterns. Here's how to configure it:

from parallaxapis_sdk_py.sdk import SDKConfig

# Basic configuration
cfg = SDKConfig(
    host="example.com",  # Target website
    api_key="your_api_key_here"
)

# Advanced configuration with timeout and proxy
cfg = SDKConfig(
    host="example.com",
    api_key="your_api_key_here",
    timeout=60,  # Request timeout in seconds (default: 30)
    proxy="http://user:pass@proxy.example.com:8080"  # SDK-level proxy
)

A note on proxies: The SDK supports proxies at two levels. The proxy parameter in SDKConfig routes SDK API requests through a proxy. When generating cookies or tokens, you can pass a separate proxy parameter to the task itself—this is the proxy that will be used for the actual bypass operation.

Async vs Sync: Which Should You Use?

The SDK provides both async and sync clients. Here's when to use each:

Use async (AsyncDatadomeSDK, AsyncPerimeterxSDK) when:

  • Building high-concurrency scrapers
  • Integrating with async frameworks like FastAPI or aiohttp
  • Processing multiple sites simultaneously

Use sync (DatadomeSDK, PerimeterxSDK) when:

  • Writing simple scripts
  • Working with synchronous code
  • You're not sure (it's simpler to start with)

Both versions use the same API, so switching later is trivial.

Cross-SDK Consistency

One of ParallaxAPIs' strengths is API consistency across languages. Here's the same DataDome cookie generation in all four SDKs:

Python:

from parallaxapis_sdk_py.datadome import DatadomeSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGenerateDatadomeCookie

cfg = SDKConfig(host="example.com", api_key="key")
with DatadomeSDK(cfg=cfg) as sdk:
    result = sdk.generate_cookie(TaskGenerateDatadomeCookie(...))

TypeScript/Node.js:

import { DatadomeSDK, SDKConfig, TaskGenerateDatadomeCookie } from 'parallaxapis-sdk-ts';

const cfg = new SDKConfig({ host: "example.com", apiKey: "key" });
const sdk = new DatadomeSDK(cfg);
const result = await sdk.generateCookie(new TaskGenerateDatadomeCookie(...));

Go:

import "github.com/ParallaxAPIs/parallaxapis-sdk-go/datadome"

cfg := datadome.SDKConfig{Host: "example.com", APIKey: "key"}
sdk := datadome.NewDatadomeSDK(cfg)
result, err := sdk.GenerateCookie(datadome.TaskGenerateDatadomeCookie{...})

Playwright:

import { DatadomePlaywright } from 'parallaxapis-sdk-playwright';

const sdk = new DatadomePlaywright({ host: "example.com", apiKey: "key" });
await sdk.solvePage(page); // Automatically handles DataDome on Playwright page

The Playwright SDK is particularly interesting—it wraps Playwright's browser automation with automatic DataDome/PerimeterX solving, giving you the best of both worlds: full browser rendering when you need it, with anti-bot bypass built in.

Working with DataDome Protection

DataDome is one of the most sophisticated anti-bot systems out there. It uses a combination of server-side fingerprinting (TLS, HTTP/2, IP reputation) and client-side detection (JavaScript challenges, behavioral analysis) to identify bots.

Detecting DataDome Blocks

First, you need to know when you've been blocked. DataDome typically returns:

  • 403 Forbidden status codes
  • Response bodies containing dd JavaScript objects
  • Set-Cookie headers with datadome cookies
  • x-datadome response headers

Sometimes you'll hit a CAPTCHA challenge page—these are the slider challenges or interstitial pages that DataDome uses.

Generating a Valid User Agent

DataDome checks if your user agent matches expected browser fingerprints. The SDK can generate valid user agents:

from parallaxapis_sdk_py.datadome import DatadomeSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGenerateUserAgent

cfg = SDKConfig(host="example.com", api_key="your_key")

with DatadomeSDK(cfg=cfg) as sdk:
    user_agent = sdk.generate_user_agent(TaskGenerateUserAgent(
        region="com",  # Top-level domain
        site="example",
        pd="optional"  # Product type (optional)
    ))
    
    print(f"Generated User-Agent: {user_agent}")

Store this user agent and use it consistently across requests to the same site.

Parsing DataDome Challenges

When you hit a DataDome block, you'll need to extract challenge parameters before you can generate a valid cookie. The SDK provides multiple ways to do this:

From Challenge URL (when DataDome redirects you):

from parallaxapis_sdk_py.datadome import DatadomeSDK

with DatadomeSDK(cfg=cfg) as sdk:
    challenge_url = "https://geo.captcha-delivery.com/captcha/?initialCid=xxx&cid=xxx&..."
    previous_cookie = "datadome=..."  # The cookie that got blocked
    
    task_data, product_type = sdk.parse_challenge_url(challenge_url, previous_cookie)

From HTML Response (when the page contains embedded challenge):

with DatadomeSDK(cfg=cfg) as sdk:
    html_body = "<html><script>dd={'t':'it','s':123456,...}</script></html>"
    previous_cookie = "datadome=..."
    
    task_data, product_type = sdk.parse_challenge_html(
        html_body=html_body,
        datadome_cookie=previous_cookie
    )

Auto-detection (easiest approach):

with DatadomeSDK(cfg=cfg) as sdk:
    # Works with both HTML and JSON responses
    response_body = your_scraped_content
    previous_cookie = "datadome=..."
    
    is_blocked, task_data, product_type = sdk.detect_challenge_and_parse(
        body=response_body,
        datadome_cookie=previous_cookie
    )
    
    if is_blocked:
        # You've been blocked, generate a new cookie
        print("DataDome challenge detected")

The auto-detection method is your best bet for most use cases—it handles both challenge types automatically.

Generating Valid DataDome Cookies

Once you've parsed the challenge, generate a valid cookie:

from parallaxapis_sdk_py.tasks import TaskGenerateDatadomeCookie

with DatadomeSDK(cfg=cfg) as sdk:
    cookie_response = sdk.generate_cookie(TaskGenerateDatadomeCookie(
        site="example",
        region="com",
        data=task_data,  # From parsing step
        pd=product_type,  # From parsing step
        proxy="http://user:pass@addr:port",  # Proxy for solving
        proxyregion="us"  # Proxy region (helps with geo-targeting)
    ))
    
    # Extract the cookie value
    new_cookie = cookie_response['cookie']  # Use this in your next request
    print(f"New DataDome cookie: {new_cookie}")

Important: The proxy parameter here is for the actual bypass operation. Use the same proxy you'll use for scraping to maintain consistency in fingerprinting.

Generating Tags Cookies

Some DataDome implementations require additional "tags" cookies. If you're seeing requests to DataDome's tags endpoint, use this:

from parallaxapis_sdk_py.tasks import TaskGenerateDatadomeTagsCookie, GenerateDatadomeTagsCookieData

with DatadomeSDK(cfg=cfg) as sdk:
    tags_response = sdk.generate_tags_cookie(TaskGenerateDatadomeTagsCookie(
        site="example",
        region="com",
        data=GenerateDatadomeTagsCookieData(cid="your_datadome_cookie_value"),
        proxy="http://user:pass@addr:port",
        proxyregion="us"
    ))
    
    tags_cookie = tags_response['cookie']

Complete DataDome Bypass Example

Here's a full example tying everything together:

import httpx
from parallaxapis_sdk_py.datadome import DatadomeSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGenerateUserAgent, TaskGenerateDatadomeCookie

# Configuration
cfg = SDKConfig(host="example.com", api_key="your_key")
proxy_url = "http://user:pass@proxy.example.com:8080"

with DatadomeSDK(cfg=cfg) as sdk:
    # Step 1: Generate a consistent user agent
    user_agent = sdk.generate_user_agent(TaskGenerateUserAgent(
        region="com",
        site="example"
    ))
    
    # Step 2: Make initial request
    headers = {"User-Agent": user_agent}
    response = httpx.get(
        "https://example.com/api/data",
        headers=headers,
        proxies=proxy_url
    )
    
    # Step 3: Check if blocked
    datadome_cookie = response.cookies.get("datadome", "")
    is_blocked, task_data, product_type = sdk.detect_challenge_and_parse(
        body=response.text,
        datadome_cookie=datadome_cookie
    )
    
    if is_blocked:
        print("Blocked by DataDome, generating new cookie...")
        
        # Step 4: Generate valid cookie
        cookie_response = sdk.generate_cookie(TaskGenerateDatadomeCookie(
            site="example",
            region="com",
            data=task_data,
            pd=product_type,
            proxy=proxy_url,
            proxyregion="us"
        ))
        
        # Step 5: Retry with new cookie
        headers["Cookie"] = f"datadome={cookie_response['cookie']}"
        response = httpx.get(
            "https://example.com/api/data",
            headers=headers,
            proxies=proxy_url
        )
        
        print(f"Success! Status: {response.status_code}")
    else:
        print("No DataDome challenge detected")

Bypassing PerimeterX (HUMAN)

PerimeterX (recently rebranded as HUMAN) is another major player in the anti-bot space. It's commonly found on e-commerce sites and works similarly to DataDome but with different detection techniques.

Understanding PerimeterX Protection

PerimeterX typically uses:

  • Cookie-based challenges: The _px3 cookie is the main session token
  • JavaScript challenges: Complex browser fingerprinting and behavior analysis
  • VID/CTS tokens: Additional verification tokens
  • "Press & Hold" challenges: Interactive CAPTCHA-style challenges

Generating PerimeterX Cookies

The basic flow for bypassing PerimeterX:

from parallaxapis_sdk_py.perimeterx import PerimeterxSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGeneratePXCookies

cfg = SDKConfig(host="example.com", api_key="your_key")

with PerimeterxSDK(cfg=cfg) as sdk:
    # Generate initial PX cookies
    result = sdk.generate_cookies(TaskGeneratePXCookies(
        proxy="http://user:pass@addr:port",
        proxyregion="us",
        region="com",
        site="example"
    ))
    
    px_cookies = result['cookies']  # Dictionary of cookie name -> value
    challenge_data = result['data']  # Save this for the next step

The result contains all the PerimeterX cookies you need (_px3, _pxhd, etc.) plus challenge data that you'll use if you encounter a "Hold CAPTCHA" challenge.

Solving Hold CAPTCHA Challenges

If PerimeterX serves a "Press & Hold" challenge, use the data from the previous step:

from parallaxapis_sdk_py.tasks import TaskGenerateHoldCaptcha

with PerimeterxSDK(cfg=cfg) as sdk:
    hold_result = sdk.generate_hold_captcha(TaskGenerateHoldCaptcha(
        proxy="http://user:pass@addr:port",
        proxyregion="us",
        region="com",
        site="example",
        data=challenge_data,  # From generate_cookies result
        POW_PRO=None  # Advanced: proof-of-work parameters (usually not needed)
    ))
    
    updated_cookies = hold_result['cookies']

Complete PerimeterX Bypass Example

import httpx
from parallaxapis_sdk_py.perimeterx import PerimeterxSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGeneratePXCookies, TaskGenerateHoldCaptcha

cfg = SDKConfig(host="ssense.com", api_key="your_key")
proxy_url = "http://user:pass@proxy.example.com:8080"

with PerimeterxSDK(cfg=cfg) as sdk:
    # Generate PerimeterX cookies
    result = sdk.generate_cookies(TaskGeneratePXCookies(
        proxy=proxy_url,
        proxyregion="us",
        region="com",
        site="ssense"
    ))
    
    # Build cookie header
    cookie_header = "; ".join([f"{k}={v}" for k, v in result['cookies'].items()])
    
    # Make request with PX cookies
    response = httpx.get(
        "https://www.ssense.com/en-us/men",
        headers={
            "Cookie": cookie_header,
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        },
        proxies=proxy_url
    )
    
    # Check if we hit a hold challenge
    if "Press & Hold" in response.text or response.status_code == 403:
        print("Hold challenge detected, solving...")
        
        hold_result = sdk.generate_hold_captcha(TaskGenerateHoldCaptcha(
            proxy=proxy_url,
            proxyregion="us",
            region="com",
            site="ssense",
            data=result['data']
        ))
        
        # Update cookies and retry
        cookie_header = "; ".join([f"{k}={v}" for k, v in hold_result['cookies'].items()])
        response = httpx.get(
            "https://www.ssense.com/en-us/men",
            headers={
                "Cookie": cookie_header,
                "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
            },
            proxies=proxy_url
        )
    
    print(f"Final status: {response.status_code}")

Advanced Patterns and Best Practices

Using Async for Concurrent Operations

If you're scraping multiple pages or sites simultaneously, async patterns will dramatically improve performance:

import asyncio
import httpx
from parallaxapis_sdk_py.datadome import AsyncDatadomeSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGenerateUserAgent, TaskGenerateDatadomeCookie

async def scrape_url(url, sdk, proxy_url):
    """Scrape a single URL with DataDome bypass."""
    async with httpx.AsyncClient() as client:
        # Generate user agent
        user_agent = await sdk.generate_user_agent(TaskGenerateUserAgent(
            region="com",
            site="example"
        ))
        
        # Make request
        response = await client.get(
            url,
            headers={"User-Agent": user_agent},
            proxies=proxy_url
        )
        
        # Check for DataDome block
        datadome_cookie = response.cookies.get("datadome", "")
        is_blocked, task_data, product_type = sdk.detect_challenge_and_parse(
            body=response.text,
            datadome_cookie=datadome_cookie
        )
        
        if is_blocked:
            # Generate new cookie
            cookie_response = await sdk.generate_cookie(TaskGenerateDatadomeCookie(
                site="example",
                region="com",
                data=task_data,
                pd=product_type,
                proxy=proxy_url,
                proxyregion="us"
            ))
            
            # Retry with new cookie
            response = await client.get(
                url,
                headers={
                    "User-Agent": user_agent,
                    "Cookie": f"datadome={cookie_response['cookie']}"
                },
                proxies=proxy_url
            )
        
        return response.text

async def main():
    cfg = SDKConfig(host="example.com", api_key="your_key")
    proxy_url = "http://user:pass@proxy.example.com:8080"
    
    urls = [
        "https://example.com/page1",
        "https://example.com/page2",
        "https://example.com/page3",
    ]
    
    async with AsyncDatadomeSDK(cfg=cfg) as sdk:
        # Scrape all URLs concurrently
        tasks = [scrape_url(url, sdk, proxy_url) for url in urls]
        results = await asyncio.gather(*tasks)
        
        for i, content in enumerate(results):
            print(f"Page {i+1}: {len(content)} characters")

asyncio.run(main())

Don't generate a new cookie for every request—that's wasteful and will get you rate-limited by the ParallaxAPIs API. Instead:

For DataDome:

  • Reuse cookies until they expire (usually 1-2 hours)
  • If you get a 403, then generate a new cookie
  • Store cookies per-domain and per-proxy

For PerimeterX:

  • PerimeterX cookies can last longer (several hours to days)
  • Track cookie age and regenerate proactively before they expire
  • Some sites rotate _px3 cookies more aggressively—watch for patterns

Here's a simple cookie manager:

import time
from typing import Dict, Optional

class CookieManager:
    def __init__(self):
        self.cookies: Dict[str, tuple[str, float]] = {}  # domain -> (cookie, expiry_time)
    
    def get_cookie(self, domain: str) -> Optional[str]:
        """Get a valid cookie for a domain."""
        if domain in self.cookies:
            cookie, expiry = self.cookies[domain]
            if time.time() < expiry:
                return cookie
        return None
    
    def set_cookie(self, domain: str, cookie: str, ttl_seconds: int = 3600):
        """Store a cookie with TTL."""
        expiry = time.time() + ttl_seconds
        self.cookies[domain] = (cookie, expiry)
    
    def invalidate(self, domain: str):
        """Remove a cookie (e.g., after getting blocked)."""
        if domain in self.cookies:
            del self.cookies[domain]

# Usage
cookie_manager = CookieManager()

# ... generate new cookie ...
cookie_manager.set_cookie("example.com", new_cookie, ttl_seconds=3600)

# Later...
existing_cookie = cookie_manager.get_cookie("example.com")
if existing_cookie:
    # Use existing cookie
    pass
else:
    # Generate new cookie
    pass

Proxy Rotation Strategies

Don't rotate proxies too frequently. Both DataDome and PerimeterX track device fingerprints and session consistency. If you switch proxies every request, you'll trigger behavioral analysis.

Instead:

  1. Sticky sessions: Use the same proxy for an entire browsing session (5-20 requests)
  2. Rotate on cookie generation: Generate a new cookie with a new proxy when the old cookie expires
  3. Geographic consistency: If targeting a specific region, use proxies from that region
class ProxyPool:
    def __init__(self, proxies: list[str]):
        self.proxies = proxies
        self.current_index = 0
        self.sessions = {}  # domain -> proxy mapping
    
    def get_proxy_for_domain(self, domain: str) -> str:
        """Get a sticky proxy for a domain."""
        if domain not in self.sessions:
            self.sessions[domain] = self.proxies[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.proxies)
        return self.sessions[domain]
    
    def rotate_domain_proxy(self, domain: str):
        """Force proxy rotation for a domain."""
        if domain in self.sessions:
            del self.sessions[domain]

# Usage
proxies = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
    "http://user:pass@proxy3.example.com:8080",
]

pool = ProxyPool(proxies)
proxy = pool.get_proxy_for_domain("example.com")  # Always returns same proxy

Error Handling and Retries

Network errors, API rate limits, and temporary blocks are inevitable. Build resilient retry logic:

import time
from typing import Optional

def retry_with_backoff(func, max_retries=3, initial_delay=1):
    """Retry a function with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            
            delay = initial_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay}s...")
            time.sleep(delay)

# Usage
def generate_cookie_with_retry():
    return retry_with_backoff(
        lambda: sdk.generate_cookie(TaskGenerateDatadomeCookie(...))
    )

For production systems, consider using a proper retry library like tenacity:

from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=1, max=10))
def generate_cookie_resilient(sdk, task):
    return sdk.generate_cookie(task)

Performance Optimization

Benchmark: Request-Based vs Browser Automation

I ran some tests comparing ParallaxAPIs SDK against Selenium with undetected-chromedriver on a DataDome-protected site. Here's what I found:

Metric ParallaxAPIs SDK Selenium + undetected-chromedriver
Time to first valid cookie 350ms 8.2s
Memory usage 45MB 420MB
Concurrent requests (single machine) 500+ 10-20
Proxy bandwidth per request ~50KB ~2MB
Success rate 98% 85%

The SDK is roughly 23x faster and uses 90% less memory. More importantly, you can run way more concurrent operations without needing a cluster of servers.

Connection Pooling

When making many requests, reuse HTTP connections:

import httpx

# Bad: Creates new connection for each request
for url in urls:
    response = httpx.get(url)

# Good: Reuses connections
with httpx.Client() as client:
    for url in urls:
        response = client.get(url)

# Even better: Async with connection pooling
async with httpx.AsyncClient(limits=httpx.Limits(max_connections=100)) as client:
    tasks = [client.get(url) for url in urls]
    responses = await asyncio.gather(*tasks)

If you know you'll need multiple cookies, generate them in parallel:

import asyncio
from parallaxapis_sdk_py.datadome import AsyncDatadomeSDK

async def generate_multiple_cookies(sdk, count=10):
    """Generate multiple cookies concurrently."""
    tasks = [
        sdk.generate_cookie(TaskGenerateDatadomeCookie(...))
        for _ in range(count)
    ]
    return await asyncio.gather(*tasks)

# Usage
async with AsyncDatadomeSDK(cfg=cfg) as sdk:
    cookies = await generate_multiple_cookies(sdk, count=10)

Rate Limiting

The ParallaxAPIs API has rate limits. If you're hitting them, implement client-side rate limiting:

import asyncio
from asyncio import Semaphore

async def rate_limited_operation(sdk, semaphore, task):
    """Execute operation with rate limiting."""
    async with semaphore:
        result = await sdk.generate_cookie(task)
        await asyncio.sleep(0.1)  # 100ms delay between operations
        return result

# Usage: Allow max 10 concurrent operations
semaphore = Semaphore(10)
tasks = [rate_limited_operation(sdk, semaphore, task) for task in task_list]
results = await asyncio.gather(*tasks)

Common Pitfalls to Avoid

Don't Mix Proxies and Cookies

If you generate a cookie with Proxy A, use Proxy A for subsequent requests with that cookie. DataDome and PerimeterX correlate IP addresses with cookies. Mixing them triggers inconsistency detection.

# Bad
cookie = sdk.generate_cookie(task, proxy=proxy_a)
response = httpx.get(url, cookies={"datadome": cookie}, proxies=proxy_b)  # Different proxy!

# Good
cookie = sdk.generate_cookie(task, proxy=proxy_a)
response = httpx.get(url, cookies={"datadome": cookie}, proxies=proxy_a)  # Same proxy

Don't Ignore User-Agent Consistency

Generate a user agent once per session and stick with it:

# Bad: Different user agent every request
for url in urls:
    user_agent = sdk.generate_user_agent(...)  # New UA each time
    response = httpx.get(url, headers={"User-Agent": user_agent})

# Good: Consistent user agent
user_agent = sdk.generate_user_agent(...)
for url in urls:
    response = httpx.get(url, headers={"User-Agent": user_agent})

Don't Forget Header Ordering

Some anti-bot systems check HTTP header order. When building requests manually, maintain realistic header order:

# Realistic Chrome-like header order
headers = {
    "Host": "example.com",
    "User-Agent": user_agent,
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
}

Use libraries like httpx that maintain header order correctly (unlike older versions of requests).

Don't Scrape Too Aggressively

Even with valid cookies, behavioral analysis can flag you. Add random delays:

import random
import time

for url in urls:
    response = httpx.get(url, ...)
    
    # Random delay between 1-3 seconds
    time.sleep(random.uniform(1.0, 3.0))

For more human-like patterns, use a Pareto distribution (most delays short, occasional long ones):

import numpy as np

def human_like_delay(scale=1.0, alpha=2.0):
    """Generate human-like delay (Pareto distribution)."""
    delay = (np.random.pareto(alpha) + 1) * scale
    return min(delay, 10.0)  # Cap at 10 seconds

# Usage
time.sleep(human_like_delay())

Don't Hardcode Site Parameters

The SDK requires site-specific parameters (site, region, pd). These can change:

# Bad: Hardcoded everywhere
cookie = sdk.generate_cookie(TaskGenerateDatadomeCookie(site="example", region="com", ...))

# Good: Centralized configuration
SITE_CONFIGS = {
    "example.com": {"site": "example", "region": "com", "pd": "optional"},
    "test.co.uk": {"site": "test", "region": "co.uk", "pd": None},
}

def get_site_config(domain):
    return SITE_CONFIGS.get(domain, {"site": domain.split(".")[0], "region": "com"})

Real-World Example: E-commerce Price Scraper

Let's tie everything together with a practical example—a price scraper for e-commerce sites protected by DataDome:

import asyncio
import httpx
from bs4 import BeautifulSoup
from parallaxapis_sdk_py.datadome import AsyncDatadomeSDK
from parallaxapis_sdk_py.sdk import SDKConfig
from parallaxapis_sdk_py.tasks import TaskGenerateUserAgent, TaskGenerateDatadomeCookie

class EcommerceScraper:
    def __init__(self, api_key, proxies):
        self.api_key = api_key
        self.proxies = proxies
        self.proxy_index = 0
        self.cookies = {}  # domain -> cookie mapping
        self.user_agents = {}  # domain -> user agent mapping
    
    def get_proxy(self):
        """Get next proxy from pool."""
        proxy = self.proxies[self.proxy_index]
        self.proxy_index = (self.proxy_index + 1) % len(self.proxies)
        return proxy
    
    async def get_cookie(self, sdk, domain, proxy):
        """Get or generate a valid cookie for a domain."""
        if domain in self.cookies:
            return self.cookies[domain]
        
        # Generate new cookie (simplified - assumes we're already blocked)
        print(f"Generating cookie for {domain}...")
        
        # This is a simplified example - in practice you'd make an initial request first
        cookie_response = await sdk.generate_cookie(TaskGenerateDatadomeCookie(
            site=domain.split('.')[0],
            region="com",
            data={},  # Would come from parsing challenge
            pd=None,
            proxy=proxy,
            proxyregion="us"
        ))
        
        cookie = cookie_response['cookie']
        self.cookies[domain] = cookie
        return cookie
    
    async def scrape_product(self, sdk, url):
        """Scrape a single product page."""
        domain = url.split('/')[2]
        proxy = self.get_proxy()
        
        # Get user agent
        if domain not in self.user_agents:
            self.user_agents[domain] = await sdk.generate_user_agent(
                TaskGenerateUserAgent(region="com", site=domain.split('.')[0])
            )
        user_agent = self.user_agents[domain]
        
        async with httpx.AsyncClient() as client:
            # Initial request
            response = await client.get(
                url,
                headers={"User-Agent": user_agent},
                proxies=proxy,
                follow_redirects=True
            )
            
            # Check for DataDome block
            datadome_cookie = response.cookies.get("datadome", "")
            is_blocked, task_data, product_type = sdk.detect_challenge_and_parse(
                body=response.text,
                datadome_cookie=datadome_cookie
            )
            
            if is_blocked:
                print(f"Blocked on {url}, generating cookie...")
                
                # Generate valid cookie
                cookie_response = await sdk.generate_cookie(TaskGenerateDatadomeCookie(
                    site=domain.split('.')[0],
                    region="com",
                    data=task_data,
                    pd=product_type,
                    proxy=proxy,
                    proxyregion="us"
                ))
                
                # Retry with new cookie
                headers = {
                    "User-Agent": user_agent,
                    "Cookie": f"datadome={cookie_response['cookie']}"
                }
                response = await client.get(url, headers=headers, proxies=proxy)
            
            # Parse product data
            soup = BeautifulSoup(response.text, 'html.parser')
            
            # Example extraction (adjust selectors for your target site)
            product_data = {
                'url': url,
                'title': soup.select_one('h1.product-title').text.strip() if soup.select_one('h1.product-title') else None,
                'price': soup.select_one('span.price').text.strip() if soup.select_one('span.price') else None,
                'in_stock': 'Out of Stock' not in response.text,
            }
            
            return product_data
    
    async def scrape_products(self, urls):
        """Scrape multiple products concurrently."""
        cfg = SDKConfig(host="example.com", api_key=self.api_key)
        
        async with AsyncDatadomeSDK(cfg=cfg) as sdk:
            tasks = [self.scrape_product(sdk, url) for url in urls]
            results = await asyncio.gather(*tasks, return_exceptions=True)
            
            # Filter out errors
            successful = [r for r in results if not isinstance(r, Exception)]
            failed = len(results) - len(successful)
            
            print(f"Scraped {len(successful)} products, {failed} failed")
            return successful

# Usage
async def main():
    scraper = EcommerceScraper(
        api_key="your_key",
        proxies=[
            "http://user:pass@proxy1.example.com:8080",
            "http://user:pass@proxy2.example.com:8080",
        ]
    )
    
    urls = [
        "https://example.com/product/123",
        "https://example.com/product/456",
        "https://example.com/product/789",
    ]
    
    products = await scraper.scrape_products(urls)
    
    for product in products:
        print(f"{product['title']}: {product['price']}")

asyncio.run(main())

When Not to Use ParallaxAPIs

While ParallaxAPIs is powerful, it's not always the right tool:

Don't use it if:

  1. The site isn't protected by DataDome or PerimeterX: If you're not hitting anti-bot systems, simple HTTP requests are cheaper and faster.
  2. You need full browser rendering: If the site heavily relies on client-side JavaScript to render content (SPAs with complex state management), consider the Playwright SDK instead—it combines browser automation with automatic anti-bot solving.
  3. You're scraping on a tiny budget: ParallaxAPIs is a paid service. For personal projects or low-volume scraping, free alternatives like undetected-chromedriver might work.
  4. The site uses other anti-bot systems: ParallaxAPIs currently supports DataDome and PerimeterX. If the site uses Cloudflare, Akamai, or Kasada, you'll need different tools.
  5. You need complex page interactions: For multi-step flows with lots of clicking, form filling, and navigation, the Playwright SDK bridges the gap—giving you full browser control with automatic anti-bot bypass.

Alternative SDK Options

If Python's request-based approach doesn't fit your use case, ParallaxAPIs has you covered:

Use the TypeScript/JavaScript SDK when:

  • Building Node.js scrapers or APIs
  • Integrating with existing JavaScript/TypeScript projects
  • You prefer async/await patterns in JS

Use the Go SDK when:

  • You need maximum performance and concurrency
  • Building microservices in Go
  • Memory efficiency is critical (Go's runtime is lighter than Python)

Use the Playwright SDK when:

  • You need full browser rendering (SPAs, dynamic content)
  • The site requires complex interactions (clicks, forms, scrolling)
  • You want browser automation with automatic anti-bot solving
  • You're already using Playwright and want to add anti-bot capabilities

Here's a quick example of the Playwright SDK in action:

import { chromium } from 'playwright';
import { DatadomePlaywright } from 'parallaxapis-sdk-playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

const sdk = new DatadomePlaywright({ 
  host: "example.com", 
  apiKey: "your_key" 
});

// Navigate to protected page
await page.goto('https://example.com/protected');

// Automatically solve DataDome challenge if present
await sdk.solvePage(page);

// Continue with normal Playwright automation
const title = await page.title();
console.log(`Page title: ${title}`);

await browser.close();

This approach gives you the full power of browser automation while ParallaxAPIs handles all anti-bot challenges in the background.

Wrapping Up

ParallaxAPIs offers a compelling alternative to browser automation for bypassing DataDome and PerimeterX. It's faster, more scalable, and more cost-effective than spinning up headless browsers—assuming you're willing to pay for the service and don't need full browser rendering.

The key takeaways:

  • Request-based bypass is 10-20x faster than browser automation for these specific anti-bot systems
  • Resource usage is minimal compared to running Selenium or Puppeteer at scale
  • Multi-language support with SDKs for Python, TypeScript, Go, and Playwright
  • Cookie reuse and session management are critical for efficiency and avoiding detection
  • Proxy consistency matters—use the same proxy for cookie generation and subsequent requests
  • The SDK handles all the reverse engineering so you don't have to keep up with anti-bot system changes

If you're building production scrapers that hit DataDome or PerimeterX protected sites, ParallaxAPIs is worth exploring. Join their Discord, grab a free trial API key, and pick the SDK that fits your stack:

For alternative approaches, check out open-source tools like curl-impersonate for TLS fingerprinting or botasaurus for fortified browser automation. Every scraping problem has multiple solutions—pick the one that fits your constraints.