
How to bypass Imperva Incapsula in 2026: 6 working methods

Your scraper runs perfectly on test pages. Then you point it at Glassdoor, Udemy, or any major retail site and hit a wall.

403 Forbidden. "Incapsula incident ID" in the response. Your requests die before reaching the server.

Imperva Incapsula blocks roughly 95% of automated requests according to their 2025 Bad Bot Report. If you're scraping eCommerce, job boards, or financial sites at scale, you'll encounter it constantly.

This guide covers six proven bypass methods with working Python code. You'll learn HTTP client approaches, browser automation, and advanced fingerprinting strategies that work against current Imperva protections.

Each method has trade-offs. I'll help you pick the right one.

What is Imperva Incapsula?

Imperva Incapsula is a cloud-based Web Application Firewall (WAF) that sits between users and websites. It analyzes every incoming request before it reaches the origin server.

When your scraper connects to an Incapsula-protected site, the WAF generates a trust score based on hundreds of client characteristics. Low score? You're blocked.

Here's what Imperva checks:

TLS Fingerprinting (JA3/JA4): During the TLS handshake, your client sends information about supported cipher suites, extensions, and curves. Imperva hashes this into a fingerprint. Standard Python libraries like requests produce fingerprints that scream "bot."
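
To make this concrete, here is a minimal sketch of how a JA3 hash is derived: the ClientHello fields are joined into a comma-separated string and MD5-hashed. The field values below are illustrative, not a real browser's.

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Derive a JA3 fingerprint: join ClientHello fields, then MD5."""
    ja3_string = ','.join([
        str(version),
        '-'.join(map(str, ciphers)),
        '-'.join(map(str, extensions)),
        '-'.join(map(str, curves)),
        '-'.join(map(str, point_formats)),
    ])
    return hashlib.md5(ja3_string.encode()).hexdigest()

# Illustrative ClientHello values; reordering the ciphers changes the hash
print(ja3_hash(771, [4865, 4866, 4867], [0, 11, 10], [29, 23, 24], [0]))
```

Two clients offering the same ciphers in a different order produce different hashes, which is how requests gets separated from Chrome even when the HTTP headers are identical.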

IP Reputation: Imperva maintains massive databases of IP metadata. Datacenter IPs get flagged immediately. Residential and mobile IPs pass through.

HTTP Analysis: Header ordering, values, and the presence of browser-specific headers like Sec-CH-UA and Sec-Fetch-* matter. HTTP/1.1 connections raise suspicion since real browsers use HTTP/2 or HTTP/3.

JavaScript Fingerprinting: Imperva collects 180+ encrypted values through client-side JavaScript. Canvas fingerprints, WebGL data, audio context, navigator properties. Everything.

Behavioral Analysis: ML models detect timing patterns, navigation sequences, and request cadences. Bots often request pages in patterns humans never would.

The reese84 Cookie: Advanced challenge requiring browser execution. The cookie contains encrypted fingerprint data that HTTP-only approaches cannot generate.

Standard scraping tools fail multiple checks simultaneously. That's why simple User-Agent spoofing doesn't work anymore.

How to Identify Imperva Protection

Before attempting a bypass, confirm you're actually dealing with Incapsula. Here's a detection function:

import requests

def detect_incapsula(url):
    """
    Detect if a website uses Imperva Incapsula protection.
    Returns dict with detection results.
    """
    try:
        response = requests.get(url, timeout=10)
        
        indicators = {
            'status_403': response.status_code == 403,
            'incapsula_text': 'incapsula' in response.text.lower(),
            'incident_id': 'incident id' in response.text.lower(),
            'powered_by': 'powered by incapsula' in response.text.lower(),
            'visid_cookie': 'visid_incap' in response.headers.get('Set-Cookie', ''),
            'incap_cookie': 'incap_ses' in response.headers.get('Set-Cookie', ''),
            'x_iinfo_header': 'X-Iinfo' in response.headers,
            'x_cdn_header': response.headers.get('X-CDN', '').lower() == 'imperva'
        }
        
        detected = any(indicators.values())
        
        return {
            'protected': detected,
            'indicators': indicators
        }
        
    except Exception as e:
        return {'error': str(e)}

# Test it
result = detect_incapsula('https://example.com')
print(f"Imperva detected: {result.get('protected', False)}")

Common block indicators include:

  • HTTP 403 Forbidden response
  • "Powered By Incapsula" text in HTML
  • "Incapsula incident ID" message
  • X-Iinfo response header
  • incap_ses_* and visid_incap cookies

6 Methods to Bypass Imperva Incapsula

Before diving in, here's a quick overview:

Method               Difficulty  Cost  Best For                      Success Rate
curl_cffi            Easy        Free  Basic protection, high speed  Medium
Residential Proxies  Easy        $$    IP-based blocking             High
Playwright Stealth   Medium      Free  JavaScript challenges         High
SeleniumBase UC      Medium      Free  Complex automation            High
nodriver             Hard        Free  Maximum stealth               Very High
Combined Approach    Hard        $$    Production at scale           Very High

Quick recommendation: Start with curl_cffi for basic scraping. For JavaScript-heavy sites or reese84 challenges, jump to Playwright or nodriver.

Basic Methods

1. curl_cffi: TLS Fingerprint Impersonation

curl_cffi solves the TLS fingerprinting problem without browser overhead. It impersonates real browser fingerprints at the network level.

Best for: Sites with basic Incapsula protection, high-speed scraping
Difficulty: Easy
Cost: Free
Success rate: Medium (works on ~60% of Incapsula sites)

How it works

Standard HTTP libraries like requests or httpx produce TLS fingerprints that Imperva immediately recognizes as non-browser traffic. The cipher suite ordering, extension list, and curve preferences all differ from real browsers.

curl_cffi wraps curl-impersonate, which replicates exact byte sequences from real Chrome, Firefox, and Safari handshakes. The JA3/JA4 fingerprint matches what Imperva expects from a legitimate browser.

Implementation

First, install the library:

pip install curl_cffi

Basic usage with Chrome impersonation:

from curl_cffi import requests

def scrape_with_curl_cffi(url):
    """
    Scrape URL using curl_cffi with Chrome TLS fingerprint.
    """
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
        'Accept-Encoding': 'gzip, deflate, br',
        'Sec-CH-UA': '"Google Chrome";v="136", "Chromium";v="136", "Not.A/Brand";v="99"',
        'Sec-CH-UA-Mobile': '?0',
        'Sec-CH-UA-Platform': '"Windows"',
        'Sec-Fetch-Dest': 'document',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Site': 'none',
        'Sec-Fetch-User': '?1',
        'Upgrade-Insecure-Requests': '1',
        'Connection': 'keep-alive'
    }
    
    response = requests.get(
        url,
        headers=headers,
        impersonate="chrome136",
        timeout=30
    )
    
    return response

# Usage
response = scrape_with_curl_cffi('https://target-site.com')
print(f"Status: {response.status_code}")

For session persistence and retry logic:

from curl_cffi.requests import Session
import time
import random

class IncapsulaBypass:
    """
    Advanced Incapsula bypass with session persistence and retry logic.
    """
    
    def __init__(self, proxy=None):
        self.session = Session(impersonate="chrome136")
        self.proxy = proxy
        
    def get(self, url, max_retries=3):
        """
        GET request with automatic retry on failure.
        """
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.9',
            'Sec-CH-UA': '"Google Chrome";v="136", "Chromium";v="136"',
            'Sec-Fetch-Dest': 'document',
            'Sec-Fetch-Mode': 'navigate',
            'Sec-Fetch-Site': 'none'
        }
        
        proxies = {'http': self.proxy, 'https': self.proxy} if self.proxy else None
        
        for attempt in range(max_retries):
            try:
                response = self.session.get(
                    url,
                    headers=headers,
                    proxies=proxies,
                    timeout=30
                )
                
                if response.status_code == 200:
                    return response
                    
                if response.status_code == 403:
                    # Blocked - wait and retry
                    time.sleep(random.uniform(2, 5))
                    continue
                    
            except Exception as e:
                print(f"Attempt {attempt + 1} failed: {e}")
                time.sleep(random.uniform(1, 3))
                
        return None

# Usage
bypass = IncapsulaBypass(proxy='http://user:pass@proxy.example.com:8080')
response = bypass.get('https://target-site.com')

Available impersonation profiles include: chrome99, chrome110, chrome120, chrome131, chrome136, safari17, safari18. Use the latest Chrome version for best results.
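
When a profile starts getting blocked, it can help to fall back through several. Here is a sketch of that loop; the `fetch` parameter exists only so the logic can be exercised without a live request, and defaults to curl_cffi's `requests.get`:

```python
# Newest profiles first, older ones as fallback
PROFILES = ["chrome136", "chrome131", "chrome120", "chrome110"]

def get_with_profile_fallback(url, fetch=None, profiles=PROFILES):
    """Try each impersonation profile until one returns HTTP 200."""
    if fetch is None:
        from curl_cffi import requests as curl_requests
        fetch = curl_requests.get
    for profile in profiles:
        try:
            resp = fetch(url, impersonate=profile, timeout=30)
        except Exception:
            continue  # network error - move on to the next profile
        if resp.status_code == 200:
            return profile, resp
    return None, None
```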

Pros and cons

Pros:

  • Fast execution (no browser overhead)
  • Low resource usage
  • Simple API similar to requests
  • Handles HTTP/2 automatically

Cons:

  • Cannot execute JavaScript challenges
  • Fails against reese84 cookie requirements
  • Some sites detect curl-impersonate patterns

When to use this method

Use curl_cffi when:

  • Target site doesn't require JavaScript execution
  • Speed is critical
  • You need to make thousands of requests quickly
  • Basic Incapsula protection without advanced challenges

Avoid this method if:

  • Site shows JavaScript challenge pages
  • You see reese84 cookie requirements
  • curl_cffi returns 403 after multiple attempts with good proxies

2. Residential Proxy Rotation

Datacenter IPs get blocked almost immediately by Imperva. Residential proxies are essential for consistent access.

Best for: IP-based blocking, scaling scraping operations
Difficulty: Easy
Cost: $$ (typically $5-15 per GB)
Success rate: High (when combined with other methods)

How it works

Imperva maintains databases of IP reputation. Every IP has metadata: datacenter vs residential, ASN, geographic location, historical behavior patterns.

Datacenter IPs from AWS, Google Cloud, or Digital Ocean get flagged immediately. Residential IPs appear to come from real ISP customers and pass IP reputation checks.
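
As a toy illustration of the principle (production systems use full commercial IP-intelligence feeds; the ASNs below are just well-known examples), the core check can be as simple as an ASN lookup:

```python
# Well-known cloud ASNs (illustrative subset, nowhere near complete)
DATACENTER_ASNS = {
    16509: 'Amazon AWS',
    15169: 'Google',
    14061: 'DigitalOcean',
    8075: 'Microsoft Azure',
}

def classify_asn(asn):
    """Flag an ASN as datacenter or presumed residential."""
    if asn in DATACENTER_ASNS:
        return f"datacenter ({DATACENTER_ASNS[asn]})"
    return "residential/unknown"

print(classify_asn(16509))  # datacenter (Amazon AWS)
print(classify_asn(7922))   # residential/unknown (Comcast, a residential ISP)
```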

Implementation

from curl_cffi.requests import Session
import random
import time

class ProxyRotator:
    """
    Manage rotating residential proxies for Incapsula bypass.
    """
    
    def __init__(self, proxy_list):
        """
        Initialize with list of residential proxy URLs.
        Format: http://user:pass@host:port
        """
        self.proxies = proxy_list
        self.current_index = 0
        self.failed_proxies = set()
        
    def get_next(self):
        """
        Get next working proxy from rotation.
        """
        attempts = 0
        while attempts < len(self.proxies):
            proxy = self.proxies[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.proxies)
            
            if proxy not in self.failed_proxies:
                return proxy
                
            attempts += 1
            
        # All proxies failed - reset and try again
        self.failed_proxies.clear()
        return self.proxies[0]
        
    def mark_failed(self, proxy):
        """
        Mark a proxy as failed.
        """
        self.failed_proxies.add(proxy)


def scrape_with_rotation(url, proxy_rotator, max_retries=3):
    """
    Scrape URL with automatic proxy rotation on failure.
    """
    session = Session(impersonate="chrome136")
    
    for attempt in range(max_retries):
        proxy = proxy_rotator.get_next()
        proxies = {'http': proxy, 'https': proxy}
        
        try:
            response = session.get(
                url,
                proxies=proxies,
                timeout=30
            )
            
            if response.status_code == 200:
                return response
                
            if response.status_code == 403:
                proxy_rotator.mark_failed(proxy)
                time.sleep(random.uniform(1, 3))
                continue
                
        except Exception as e:
            proxy_rotator.mark_failed(proxy)
            print(f"Proxy {proxy} failed: {e}")
            
    return None


# Usage
proxies = [
    'http://user:pass@resi1.proxy.com:8080',
    'http://user:pass@resi2.proxy.com:8080',
    'http://user:pass@resi3.proxy.com:8080',
]

rotator = ProxyRotator(proxies)
response = scrape_with_rotation('https://target-site.com', rotator)

For high-volume scraping, use sticky sessions to maintain the same IP for related requests:

class StickyProxySession:
    """
    Maintain sticky proxy session for related requests.
    """
    
    def __init__(self, proxy_endpoint, session_duration=300):
        """
        Args:
            proxy_endpoint: Residential proxy endpoint with session support
            session_duration: Seconds to maintain same IP
        """
        self.endpoint = proxy_endpoint
        self.duration = session_duration
        self.session_id = None
        self.session_start = 0
        
    def get_proxy(self):
        """
        Get proxy URL with sticky session ID.
        """
        current_time = time.time()
        
        # Generate new session if expired
        if self.session_id is None or (current_time - self.session_start) > self.duration:
            self.session_id = f"session_{random.randint(10000, 99999)}"
            self.session_start = current_time
            
        # Format depends on your proxy provider.
        # Common format: user-session-{id}:pass@host:port
        # (assumes the endpoint's username is literally 'user'; adjust the
        # replacement pattern to match your provider's credential scheme)
        return self.endpoint.replace('user:', f'user-session-{self.session_id}:')

Pros and cons

Pros:

  • Essential for bypassing IP reputation checks
  • Enables geographic targeting
  • Scales to high request volumes

Cons:

  • Ongoing cost per GB
  • Slower than direct connections
  • Doesn't solve TLS or JavaScript challenges alone

When to use this method

Residential proxies are nearly mandatory for serious Incapsula bypass. Use them in combination with other methods.

Avoid cheap datacenter proxies. They'll fail immediately.

Intermediate Methods

3. Playwright with Stealth Mode

Playwright handles JavaScript challenges natively. Modern versions (2025-2026) include built-in stealth features that hide automation markers.

Best for: JavaScript-heavy sites, reese84 challenges
Difficulty: Medium
Cost: Free
Success rate: High

How it works

Playwright launches a real Chromium browser. It executes JavaScript just like a human visitor, generating authentic fingerprints and handling challenges automatically.

The key advantage: Playwright solves reese84 cookie challenges without additional work. The browser collects fingerprint data and generates the required encrypted payload.
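
To confirm the challenge actually resolved before scraping, one option is to poll the browser context until the token cookie shows up. A sketch; the cookie name follows the reese84 naming described earlier, and the timeout is a starting point, not a tuned value:

```python
import time

def wait_for_cookie(context, name='reese84', timeout=15.0, poll=0.5):
    """Poll a Playwright browser context until a named cookie appears.

    Returns the cookie dict, or None if the timeout elapses.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        for cookie in context.cookies():
            if cookie['name'] == name:
                return cookie
        time.sleep(poll)
    return None
```

Call it right after `page.goto(...)`: if it returns None, treat the page as still blocked and retry with a fresh context.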

Implementation

Install Playwright:

pip install playwright
playwright install chromium

Basic stealth configuration:

from playwright.sync_api import sync_playwright
import random
import time

def scrape_with_playwright(url):
    """
    Scrape URL using Playwright with stealth settings.
    """
    with sync_playwright() as p:
        # Launch with stealth arguments
        browser = p.chromium.launch(
            headless=True,
            args=[
                '--disable-blink-features=AutomationControlled',
                '--disable-dev-shm-usage',
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-infobars',
                '--window-size=1920,1080',
                '--start-maximized'
            ]
        )
        
        # Create context with realistic settings
        context = browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
            locale='en-US',
            timezone_id='America/New_York'
        )
        
        page = context.new_page()
        
        # Remove automation indicators
        page.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', {
                get: () => undefined
            });
            
            // Override permissions
            const originalQuery = window.navigator.permissions.query;
            window.navigator.permissions.query = (parameters) => (
                parameters.name === 'notifications' ?
                    Promise.resolve({ state: Notification.permission }) :
                    originalQuery(parameters)
            );
        """)
        
        # Navigate with realistic timing
        page.goto(url, wait_until='networkidle')
        
        # Wait for any challenges to complete
        time.sleep(random.uniform(2, 4))
        
        content = page.content()
        
        browser.close()
        
        return content

# Usage
html = scrape_with_playwright('https://target-site.com')
print(f"Retrieved {len(html)} characters")

For proxy integration:

from urllib.parse import urlparse

def scrape_with_playwright_proxy(url, proxy_url):
    """
    Scrape using Playwright with a residential proxy.
    Accepts proxy URLs like http://user:pass@host:port
    """
    parsed = urlparse(proxy_url)
    proxy_config = {'server': f'{parsed.scheme}://{parsed.hostname}:{parsed.port}'}
    if parsed.username:
        proxy_config['username'] = parsed.username
        proxy_config['password'] = parsed.password or ''

    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            proxy=proxy_config,
            args=['--disable-blink-features=AutomationControlled']
        )

        context = browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        )

        page = context.new_page()
        page.goto(url, wait_until='networkidle')

        content = page.content()
        browser.close()

        return content

Pros and cons

Pros:

  • Executes JavaScript challenges automatically
  • Generates authentic fingerprints
  • Handles reese84 cookie natively
  • Cross-browser support (Chromium, Firefox, WebKit)

Cons:

  • Slower than HTTP clients (5-10x)
  • Higher resource usage (300-500MB per instance)
  • Requires browser binary installation

When to use this method

Use Playwright when:

  • curl_cffi returns challenge pages
  • Target requires JavaScript execution
  • You need to interact with dynamic content
  • Site uses reese84 cookies

Avoid this method if:

  • Speed is critical and basic protection only
  • Running on limited resources

4. SeleniumBase Undetected ChromeDriver

SeleniumBase with Undetected ChromeDriver patches automation markers at the driver level. It's well-maintained and handles many detection methods automatically.

Best for: Complex automation workflows, form submission
Difficulty: Medium
Cost: Free
Success rate: High

How it works

Standard Selenium exposes multiple automation indicators: navigator.webdriver is true, Chrome DevTools Protocol markers are visible, and driver executables leave traces.

SeleniumBase UC mode patches these at a low level. It modifies the ChromeDriver binary to remove telltale signs and configures Chrome to hide automation flags.

Implementation

Install SeleniumBase:

pip install seleniumbase

Basic usage with UC mode:

from seleniumbase import Driver
import time
import random

def scrape_with_seleniumbase(url):
    """
    Scrape URL using SeleniumBase in Undetected ChromeDriver mode.
    """
    # Initialize driver with UC mode
    driver = Driver(
        uc=True,
        headless=True,
        agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
    )
    
    try:
        driver.get(url)
        
        # Wait for page and any challenges
        time.sleep(random.uniform(3, 5))
        
        # Check for Incapsula challenge
        if 'incapsula' in driver.page_source.lower():
            # Wait longer for challenge resolution
            time.sleep(random.uniform(5, 10))
            
        content = driver.page_source
        
        return content
        
    finally:
        driver.quit()


def scrape_with_seleniumbase_proxy(url, proxy):
    """
    SeleniumBase with proxy support.
    """
    driver = Driver(
        uc=True,
        headless=True,
        proxy=proxy  # Format: host:port or user:pass@host:port
    )
    
    try:
        driver.get(url)
        time.sleep(random.uniform(3, 5))
        return driver.page_source
        
    finally:
        driver.quit()


# Usage
html = scrape_with_seleniumbase('https://target-site.com')
print(f"Retrieved {len(html)} characters")

For handling CAPTCHAs with SeleniumBase's built-in features:

from seleniumbase import SB

def scrape_with_captcha_handling(url):
    """
    SeleniumBase with automatic CAPTCHA handling.
    """
    with SB(uc=True, headless=True) as sb:
        sb.open(url)
        
        # SeleniumBase can auto-click CAPTCHA checkboxes
        if sb.is_element_visible('iframe[src*="captcha"]'):
            sb.uc_gui_click_captcha()
            
        sb.sleep(3)
        
        return sb.get_page_source()

Pros and cons

Pros:

  • Mature, well-maintained project
  • Built-in CAPTCHA handling
  • Familiar Selenium API
  • Good documentation

Cons:

  • Slower than Playwright
  • ChromeDriver version must match Chrome
  • Some detection methods still work against it

When to use this method

Use SeleniumBase when:

  • You need robust CAPTCHA handling
  • Team is familiar with Selenium
  • Running complex multi-step automation

Advanced Methods

5. nodriver: CDP-Minimal Automation

nodriver represents the cutting edge of stealth automation. It communicates directly with Chrome via CDP while avoiding the markers that standard automation leaves behind.

Best for: Sites with advanced bot detection
Difficulty: Hard
Cost: Free
Success rate: Very High

How it works

Standard automation tools (Selenium, Playwright, Puppeteer) control browsers through WebDriver protocol or heavy CDP usage. These protocols leave detectable traces.

nodriver takes a different approach. It uses minimal CDP communication and emulates real user behavior through native OS-level inputs. This makes it invisible to most detection methods.

Recent benchmarks show nodriver achieving a 25% success rate against major anti-bot systems in its default configuration. Its fork, zendriver, reaches 75% with additional optimizations.

Implementation

Install nodriver:

pip install nodriver

Basic usage:

import nodriver as uc
import asyncio

async def scrape_with_nodriver(url):
    """
    Scrape URL using nodriver for maximum stealth.
    """
    browser = await uc.start(
        headless=True,
        browser_args=[
            '--disable-blink-features=AutomationControlled',
            '--window-size=1920,1080'
        ]
    )
    
    try:
        page = await browser.get(url)
        
        # Wait for dynamic content
        await asyncio.sleep(3)
        
        # Get page content
        content = await page.get_content()
        
        return content
        
    finally:
        await browser.close()


# Run async function
html = asyncio.run(scrape_with_nodriver('https://target-site.com'))
print(f"Retrieved {len(html)} characters")

For proxy support with nodriver (requires SOCKS5):

import nodriver as uc
import asyncio

async def scrape_with_nodriver_proxy(url, socks5_proxy):
    """
    nodriver with SOCKS5 proxy.
    Format: socks5://user:pass@host:port
    """
    browser = await uc.start(
        headless=True,
        browser_args=[
            f'--proxy-server={socks5_proxy}',
            '--disable-blink-features=AutomationControlled'
        ]
    )
    
    try:
        page = await browser.get(url)
        await asyncio.sleep(3)
        return await page.get_content()
        
    finally:
        await browser.close()

Consider zendriver for higher success rates. Install it first:

pip install zendriver

Then use it much like nodriver:

import zendriver as zd
import asyncio

async def scrape_with_zendriver(url):
    """
    zendriver for enhanced stealth (75% success vs 25% nodriver).
    """
    browser = await zd.start(headless=True)
    
    try:
        page = await browser.get(url)
        await asyncio.sleep(3)
        return await page.get_content()
        
    finally:
        await browser.close()

Pros and cons

Pros:

  • Highest stealth of any automation framework
  • Async-first architecture
  • Minimal detection surface
  • Active development

Cons:

  • Async-only API (learning curve)
  • SOCKS5 proxy requirement
  • Less documentation than mainstream tools
  • Chromium-only

When to use this method

Use nodriver/zendriver when:

  • Other browser automation methods fail
  • Target has advanced behavioral analysis
  • Maximum stealth is required
  • You're comfortable with async Python

6. Combined Approach: Full-Stack Bypass

Production scraping against Incapsula requires multiple layers working together. Here's a complete solution combining the best methods.

Best for: Production systems, maximum reliability
Difficulty: Hard
Cost: $$ (proxies)
Success rate: Very High

Implementation

from curl_cffi.requests import Session as CurlSession
from playwright.sync_api import sync_playwright
import random
import time
from typing import Optional, Dict

class IncapsulaFullBypass:
    """
    Production-ready Incapsula bypass combining multiple methods.
    """
    
    def __init__(self, proxy_list: list):
        self.proxies = proxy_list
        self.proxy_index = 0
        self.curl_session = None
        
    def get_proxy(self) -> str:
        """Rotate through proxy list."""
        proxy = self.proxies[self.proxy_index]
        self.proxy_index = (self.proxy_index + 1) % len(self.proxies)
        return proxy
        
    def try_curl_cffi(self, url: str) -> Optional[str]:
        """
        Attempt 1: Fast HTTP client with TLS impersonation.
        """
        if self.curl_session is None:
            self.curl_session = CurlSession(impersonate="chrome136")
            
        proxy = self.get_proxy()
        
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.9',
            'Sec-CH-UA': '"Google Chrome";v="136"',
            'Sec-Fetch-Dest': 'document',
            'Sec-Fetch-Mode': 'navigate'
        }
        
        try:
            response = self.curl_session.get(
                url,
                headers=headers,
                proxies={'http': proxy, 'https': proxy},
                timeout=30
            )
            
            if response.status_code == 200:
                # Check for challenge pages
                if 'incapsula' not in response.text.lower():
                    return response.text
                    
        except Exception as e:
            print(f"curl_cffi failed: {e}")
            
        return None
        
    def try_playwright(self, url: str) -> Optional[str]:
        """
        Attempt 2: Full browser automation for JavaScript challenges.
        """
        proxy = self.get_proxy()
        
        # Parse proxy URL
        proxy_parts = proxy.replace('http://', '').replace('https://', '')
        
        with sync_playwright() as p:
            browser_args = ['--disable-blink-features=AutomationControlled']
            
            proxy_config = None
            if '@' in proxy_parts:
                auth, server = proxy_parts.rsplit('@', 1)
                user, password = auth.split(':', 1)
                proxy_config = {
                    'server': f'http://{server}',
                    'username': user,
                    'password': password
                }
            else:
                proxy_config = {'server': f'http://{proxy_parts}'}
                
            browser = p.chromium.launch(
                headless=True,
                proxy=proxy_config,
                args=browser_args
            )
            
            try:
                context = browser.new_context(
                    viewport={'width': 1920, 'height': 1080},
                    user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
                )
                
                page = context.new_page()
                
                # Remove webdriver indicator
                page.add_init_script("""
                    Object.defineProperty(navigator, 'webdriver', {
                        get: () => undefined
                    });
                """)
                
                page.goto(url, wait_until='networkidle', timeout=60000)
                
                # Wait for challenges to resolve
                time.sleep(random.uniform(3, 6))
                
                content = page.content()
                
                # Verify we got real content
                if 'incapsula' not in content.lower() and len(content) > 1000:
                    return content
                    
            except Exception as e:
                print(f"Playwright failed: {e}")
                
            finally:
                browser.close()
                
        return None
        
    def scrape(self, url: str, max_attempts: int = 3) -> Dict:
        """
        Main scraping method with fallback chain.
        
        Returns dict with content and method used.
        """
        for attempt in range(max_attempts):
            # Try fast method first
            content = self.try_curl_cffi(url)
            if content:
                return {
                    'success': True,
                    'method': 'curl_cffi',
                    'content': content,
                    'attempts': attempt + 1
                }
                
            # Fallback to browser automation
            content = self.try_playwright(url)
            if content:
                return {
                    'success': True,
                    'method': 'playwright',
                    'content': content,
                    'attempts': attempt + 1
                }
                
            # Wait before retry
            time.sleep(random.uniform(5, 10))
            
        return {
            'success': False,
            'method': None,
            'content': None,
            'attempts': max_attempts
        }


# Usage
proxies = [
    'http://user:pass@resi1.example.com:8080',
    'http://user:pass@resi2.example.com:8080',
    'http://user:pass@resi3.example.com:8080',
]

bypass = IncapsulaFullBypass(proxies)
result = bypass.scrape('https://target-site.com')

if result['success']:
    print(f"Success with {result['method']} after {result['attempts']} attempts")
    print(f"Content length: {len(result['content'])}")
else:
    print("All bypass methods failed")
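
One refinement worth layering on top (a sketch, not part of the class above): let Playwright solve the challenge once, then hand its cookies to the curl_cffi session so follow-up requests stay fast.

```python
def cookies_to_header(playwright_cookies):
    """Serialize Playwright's cookie list into a Cookie header value."""
    return '; '.join(f"{c['name']}={c['value']}" for c in playwright_cookies)

# Inside try_playwright, after the challenge resolves:
#     solved = context.cookies()
# Then pass them to subsequent curl_cffi requests:
#     headers['Cookie'] = cookies_to_header(solved)
```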

Which Method Should You Use?

Situation                         Best Method
Basic protection, speed critical  curl_cffi + residential proxies
JavaScript challenges present     Playwright with stealth
Complex automation needed         SeleniumBase UC
Maximum stealth required          nodriver/zendriver
Production at scale               Combined approach

Start simple. Try curl_cffi first. If blocked, escalate to browser automation.

Always use residential proxies. Datacenter IPs fail immediately regardless of method.

Common Errors and Solutions

"403 Forbidden" after successful initial request

Cause: IP got flagged after behavioral analysis or too many requests.

Fix: Rotate to new proxy. Implement random delays between requests (2-10 seconds). Distribute traffic across multiple IPs.

"Incapsula incident ID" in response

Cause: Request failed multiple detection checks.

Fix: Switch from HTTP client to browser automation. Verify residential proxy is working. Check TLS fingerprint with browserleaks.com.

JavaScript challenge loops forever

Cause: Browser automation detected, challenge keeps regenerating.

Fix: Use nodriver instead of Playwright. Clear cookies between attempts. Try different residential IP.

Challenge page returned instead of content

Cause: JavaScript fingerprinting blocked or incomplete.

Fix: Must use browser automation. Ensure JavaScript enabled. Wait longer for challenge completion (10+ seconds).

Rate limited (429 status)

Cause: Too many requests from same IP or session.

Fix: Implement exponential backoff. Rotate proxies more frequently. Reduce concurrent requests.

import time
import random

def exponential_backoff(attempt, base_delay=1, max_delay=60):
    """
    Calculate delay with exponential backoff and jitter.
    """
    delay = min(base_delay * (2 ** attempt), max_delay)
    jitter = random.uniform(0, delay * 0.1)
    return delay + jitter

# Usage in retry loop
for attempt in range(5):
    response = make_request()  # your HTTP call (e.g., session.get from earlier)
    
    if response.status_code == 429:
        delay = exponential_backoff(attempt)
        print(f"Rate limited. Waiting {delay:.1f} seconds...")
        time.sleep(delay)
        continue
        
    break

Ethical Considerations

Before bypassing Incapsula protection, consider:

Terms of Service: Most sites prohibit scraping in their ToS. Bypassing protection may violate these terms.

Legal implications: Laws vary by jurisdiction. The Computer Fraud and Abuse Act (US) and similar laws elsewhere may apply. Consult legal counsel for commercial projects.

Responsible use:

  • Only scrape public data
  • Respect rate limits even when you can bypass them
  • Cache data to minimize requests
  • Identify yourself with contact info in headers when appropriate
  • Don't overload target servers

When to use official APIs instead:

If a site offers an API, use it. APIs are faster, legal, and more reliable than scraping protected sites.

What's Next in Imperva Detection

Imperva's detection continues to evolve. Monitor these emerging methods:

JA4 Fingerprinting: The successor to JA3 provides more granular TLS analysis. Libraries must update impersonation signatures regularly.

HTTP/3 Analysis: QUIC protocol fingerprints are now analyzed. Current bypass tools need HTTP/3 support.

Behavioral ML Models: Machine learning detects subtle patterns in navigation timing and click sequences. Simple randomization may not suffice.

Device Attestation: Some implementations verify hardware characteristics through WebAuthn. This requires actual browser execution.

Canvas Fingerprint Verification: Beyond collection, systems verify render consistency across requests from the same session.
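
On the behavioral point above: uniform jitter produces a flat, easily modeled timing profile, while human inter-action times are heavy-tailed. A sketch using a log-normal distribution; the parameters are illustrative, not calibrated against any real detection model:

```python
import random

def human_delay(median=1.5, sigma=0.6, floor=0.2):
    """Heavy-tailed pause: mostly short waits, occasional long ones.

    lognormvariate(0, sigma) has median 1, so scaling by `median`
    sets the median delay directly; `floor` avoids implausibly
    instant actions.
    """
    return max(floor, median * random.lognormvariate(0.0, sigma))
```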

Stay updated with curl_cffi releases, Playwright updates, and anti-detection plugin changes. Join web scraping communities for early warnings about new detection methods.

Conclusion

Imperva Incapsula uses layered detection: TLS fingerprinting, IP reputation, HTTP analysis, JavaScript challenges, and behavioral monitoring. Effective bypass requires addressing multiple vectors simultaneously.

Start with curl_cffi for basic requests. The TLS impersonation handles fingerprint detection without browser overhead. Add residential proxies for IP reputation.

For JavaScript-heavy sites, use Playwright or nodriver. These execute challenges natively while hiding automation markers.

The reese84 cookie challenge requires browser execution. HTTP-only approaches cannot generate the required fingerprint payload. Plan for browser automation when targeting sites with this protection.

Match your method to the protection level:

Protection Level  Recommended Approach
Basic             curl_cffi + residential proxies
Medium            Playwright with stealth
Advanced          nodriver + rotating residential
Enterprise        Combined approach with fallbacks

Always test before scaling. What works today may need adjustment tomorrow as detection methods evolve.