Imperva Incapsula blocks roughly 95% of automated requests according to their 2025 Bad Bot Report. This cloud-based WAF uses multi-layered detection including TLS fingerprinting, behavioral analysis, and JavaScript challenges to identify scrapers.
This guide covers proven bypass techniques with working code examples. You'll learn HTTP client approaches, headless browser automation, and advanced fingerprinting strategies that work against current Imperva protections.
## What is the Main Challenge with Imperva Incapsula?
Imperva Incapsula uses a trust-score system that analyzes hundreds of client characteristics before allowing access. Unlike simple IP blocking, it combines TLS handshake analysis, HTTP header inspection, JavaScript execution tests, and behavioral monitoring into a unified detection engine.
The protection runs on CDN edge servers, meaning your request gets analyzed before reaching the actual website. A low trust score triggers blocks, CAPTCHAs, or JavaScript challenges that most scrapers can't handle.
## How to Identify Imperva Incapsula Protection
Before attempting any bypass, confirm you're dealing with Imperva. Here's a Python function that detects Incapsula protection markers:
```python
import requests

def detect_incapsula(url):
    """
    Detect if a website uses Imperva Incapsula protection.
    Returns dict with detection results.
    """
    markers = {
        'html_indicators': [],
        'header_indicators': [],
        'cookie_indicators': [],
        'is_protected': False
    }

    try:
        response = requests.get(url, timeout=10)
        html = response.text.lower()

        # Check HTML content for Incapsula markers
        html_markers = [
            '_incapsula_resource',
            'incapsula incident id',
            'powered by incapsula',
            'subject=waf block page',
            'visid_incap'
        ]
        for marker in html_markers:
            if marker in html:
                markers['html_indicators'].append(marker)

        # Check response headers
        headers = response.headers
        if 'X-Iinfo' in headers:
            markers['header_indicators'].append('X-Iinfo header present')
        if 'X-CDN' in headers:
            cdn_value = headers.get('X-CDN', '').lower()
            if 'incapsula' in cdn_value or 'imperva' in cdn_value:
                markers['header_indicators'].append(f'X-CDN: {cdn_value}')

        # Check cookies
        for cookie in response.cookies:
            if 'incap_ses' in cookie.name or 'visid_incap' in cookie.name:
                markers['cookie_indicators'].append(cookie.name)

        # Determine if protected
        markers['is_protected'] = bool(
            markers['html_indicators'] or
            markers['header_indicators'] or
            markers['cookie_indicators']
        )
        return markers

    except Exception as e:
        return {'error': str(e), 'is_protected': None}

# Usage
result = detect_incapsula('https://example-protected-site.com')
print(f"Protected: {result['is_protected']}")
print(f"Indicators found: {result}")
```
This function checks three detection vectors in a single request. HTML markers appear in block pages and challenge scripts, header indicators reveal the CDN configuration, and cookie patterns confirm active session tracking.
Run this check first to avoid wasting time on sites that use different protection systems.
## Method 1: curl_cffi with TLS Fingerprint Impersonation
Standard Python HTTP libraries expose their automation nature through TLS handshake characteristics. The curl_cffi library solves this by replicating exact browser TLS fingerprints.
### Installation

```bash
pip install curl_cffi
```

### Basic Implementation
```python
from curl_cffi import requests

def bypass_incapsula_basic(url):
    """
    Basic Incapsula bypass using curl_cffi TLS impersonation.
    Impersonates the Chrome 136 fingerprint.
    """
    headers = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
        'accept-language': 'en-US,en;q=0.9',
        'accept-encoding': 'gzip, deflate, br',
        'sec-ch-ua': '"Google Chrome";v="136", "Not-A.Brand";v="8", "Chromium";v="136"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
        'sec-fetch-dest': 'document',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-site': 'none',
        'sec-fetch-user': '?1',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
    }

    response = requests.get(
        url,
        headers=headers,
        impersonate="chrome136",  # Match Chrome 136's TLS fingerprint
        timeout=30
    )
    return response

# Usage
response = bypass_incapsula_basic('https://protected-site.com')
print(f"Status: {response.status_code}")
print(f"Content length: {len(response.text)}")
```
The impersonate parameter tells curl_cffi to match Chrome 136's exact JA3/JA4 fingerprint during TLS negotiation. This includes cipher suites, extensions, and curve preferences that Imperva checks.
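Before pointing the impersonated client at a protected site, it helps to confirm the fingerprint actually changed. A minimal sketch that compares JA3 hashes from plain `requests` and curl_cffi against a third-party TLS echo service (tls.browserleaks.com and its `ja3_hash` field are assumptions; substitute any JA3-reporting endpoint you trust):

```python
# Compare the TLS fingerprint of a plain Python client vs. curl_cffi.
# The echo endpoint and its response keys are assumptions; adjust as needed.
import requests as plain_requests
from curl_cffi import requests as cffi_requests

ECHO_URL = 'https://tls.browserleaks.com/json'  # assumed JA3 echo service

plain_ja3 = plain_requests.get(ECHO_URL, timeout=10).json().get('ja3_hash')
spoofed_ja3 = cffi_requests.get(
    ECHO_URL, impersonate="chrome136", timeout=10
).json().get('ja3_hash')

print(f"python-requests JA3: {plain_ja3}")
print(f"curl_cffi chrome136 JA3: {spoofed_ja3}")
# The hashes should differ, and the curl_cffi one should match real Chrome.
```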
### Advanced Implementation with Session Management
```python
from curl_cffi.requests import Session
import time
import random

class IncapsulaBypass:
    """
    Advanced Incapsula bypass with session persistence and retry logic.
    """

    def __init__(self, proxy=None):
        self.session = Session(impersonate="chrome136")
        self.proxy = proxy
        self.base_headers = self._get_browser_headers()

    def _get_browser_headers(self):
        """Generate realistic Chrome headers."""
        return {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
            'accept-language': 'en-US,en;q=0.9',
            'accept-encoding': 'gzip, deflate, br, zstd',
            'cache-control': 'max-age=0',
            'sec-ch-ua': '"Google Chrome";v="136", "Chromium";v="136", "Not-A.Brand";v="8"',
            'sec-ch-ua-mobile': '?0',
            'sec-ch-ua-platform': '"Windows"',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'sec-fetch-site': 'none',
            'sec-fetch-user': '?1',
            'upgrade-insecure-requests': '1',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
        }

    def _add_referer(self, url):
        """Add a referer header for subsequent requests."""
        from urllib.parse import urlparse
        parsed = urlparse(url)
        self.base_headers['referer'] = f"{parsed.scheme}://{parsed.netloc}/"
        self.base_headers['sec-fetch-site'] = 'same-origin'

    def get(self, url, max_retries=3):
        """
        Make a GET request with retry logic and delay randomization.
        """
        proxies = {'http': self.proxy, 'https': self.proxy} if self.proxy else None

        for attempt in range(max_retries):
            try:
                # Random delay between 1-3 seconds
                time.sleep(random.uniform(1.0, 3.0))

                response = self.session.get(
                    url,
                    headers=self.base_headers,
                    proxies=proxies,
                    timeout=30,
                    allow_redirects=True
                )

                # Check for an Incapsula challenge
                if self._is_challenge(response):
                    print(f"Challenge detected on attempt {attempt + 1}")
                    time.sleep(random.uniform(3.0, 5.0))
                    continue

                # Update headers for the next request
                self._add_referer(url)
                return response

            except Exception as e:
                print(f"Request failed (attempt {attempt + 1}): {e}")
                if attempt == max_retries - 1:
                    raise

        return None

    def _is_challenge(self, response):
        """Check if the response contains an Incapsula challenge."""
        if response.status_code == 403:
            return True
        text = response.text.lower()
        return '_incapsula_resource' in text or 'incapsula incident' in text

# Usage with a residential proxy
bypass = IncapsulaBypass(proxy="http://user:pass@residential-proxy:8080")
response = bypass.get('https://protected-site.com/data')
```

This class maintains session cookies across requests. That persistence helps build a trust score over multiple interactions, and the randomized delays between requests mimic human browsing patterns.
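A short usage sketch (the URLs are placeholders): reusing one instance across a crawl keeps the incap_ses and visid_incap cookies alive and the referer chain realistic:

```python
# Multi-page crawl reusing the IncapsulaBypass instance defined above,
# so session cookies persist across requests.
bypass = IncapsulaBypass(proxy="http://user:pass@residential-proxy:8080")

pages = [
    'https://protected-site.com/',
    'https://protected-site.com/category',
    'https://protected-site.com/category/item-1',
]

for page_url in pages:
    resp = bypass.get(page_url)
    if resp is not None:
        print(f"{page_url}: {resp.status_code}, {len(resp.text)} bytes")
    else:
        print(f"{page_url}: gave up after retries")
```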
## Method 2: Playwright with Stealth Plugin
For sites requiring JavaScript execution, headless browsers provide authentic fingerprints. Playwright combined with stealth modifications passes most Incapsula checks.
### Installation

```bash
pip install playwright playwright-stealth
playwright install chromium
```

### Python Implementation
```python
import asyncio
import random

from playwright.async_api import async_playwright
from playwright_stealth import stealth_async

async def bypass_with_playwright(url):
    """
    Bypass Incapsula using Playwright with stealth modifications.
    Handles JavaScript challenges automatically.
    """
    async with async_playwright() as p:
        # Launch browser with anti-detection flags
        browser = await p.chromium.launch(
            headless=True,
            args=[
                '--disable-blink-features=AutomationControlled',
                '--disable-dev-shm-usage',
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-infobars',
                '--window-size=1920,1080',
                '--start-maximized'
            ]
        )

        # Create context with realistic viewport and user agent
        context = await browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
            locale='en-US',
            timezone_id='America/New_York'
        )
        page = await context.new_page()

        # Apply stealth modifications
        await stealth_async(page)

        # Navigate with a generous timeout for challenge resolution
        await page.goto(url, wait_until='networkidle', timeout=60000)

        # Wait for potential challenge resolution
        await asyncio.sleep(random.uniform(2.0, 4.0))

        # Simulate human-like mouse movement
        await page.mouse.move(random.randint(100, 500), random.randint(100, 500))
        await page.mouse.move(random.randint(200, 600), random.randint(200, 600))

        # Check for iframe challenges
        iframe = await page.query_selector('iframe#main-iframe')
        if iframe:
            await handle_incapsula_iframe(page)

        # Capture results before closing the browser
        content = await page.content()
        cookies = await context.cookies()
        final_url = page.url

        await browser.close()

        return {
            'content': content,
            'cookies': cookies,
            'url': final_url
        }

async def handle_incapsula_iframe(page):
    """
    Handle the Incapsula iframe challenge by simulating interaction.
    """
    print("Incapsula iframe detected, attempting resolution...")

    # Move mouse over the iframe area
    await page.mouse.move(300, 300)
    await asyncio.sleep(random.uniform(0.5, 1.0))

    # Click to trigger the challenge
    await page.mouse.click(300, 300)

    # Type some characters (triggers keyboard detection)
    await page.keyboard.type('test')
    await page.keyboard.press('Tab')

    # Scroll to trigger scroll detection
    await page.evaluate('window.scrollBy(0, 100)')

    # Wait for the iframe to clear
    try:
        await page.wait_for_function(
            '() => !document.querySelector("iframe#main-iframe")',
            timeout=15000
        )
        print("Challenge resolved!")
    except Exception:
        print("Challenge timeout - may need manual intervention")

# Run the async function
result = asyncio.run(bypass_with_playwright('https://protected-site.com'))
print(f"Page URL: {result['url']}")
print(f"Content length: {len(result['content'])}")
```
The stealth plugin patches the fingerprinting vectors that headless browsers typically expose: it removes navigator.webdriver, fixes plugin enumeration, and adjusts related navigator and window properties.
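To spot-check that the patches took effect, evaluate a few of the properties anti-bot systems inspect. A minimal sketch using standard Playwright calls:

```python
import asyncio

from playwright.async_api import async_playwright
from playwright_stealth import stealth_async

async def check_stealth():
    """Print a few fingerprint properties that anti-bot systems inspect."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await stealth_async(page)

        webdriver = await page.evaluate('navigator.webdriver')
        plugins = await page.evaluate('navigator.plugins.length')
        languages = await page.evaluate('navigator.languages')
        await browser.close()

        print(f"navigator.webdriver: {webdriver}")     # should not be True
        print(f"navigator.plugins.length: {plugins}")  # should be > 0
        print(f"navigator.languages: {languages}")     # should be non-empty

asyncio.run(check_stealth())
```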
### Node.js Implementation with Puppeteer Stealth
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// Apply the stealth plugin with all evasions
puppeteer.use(StealthPlugin());

async function bypassIncapsula(url) {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      '--disable-blink-features=AutomationControlled',
      '--disable-dev-shm-usage',
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--window-size=1920,1080'
    ]
  });

  const page = await browser.newPage();

  // Set viewport and user agent
  await page.setViewport({ width: 1920, height: 1080 });
  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
  );

  // Navigate to the target
  await page.goto(url, {
    waitUntil: 'networkidle2',
    timeout: 60000
  });

  // Random delay for challenge resolution
  await new Promise(r => setTimeout(r, 2000 + Math.random() * 3000));

  // Simulate mouse movement
  await page.mouse.move(100 + Math.random() * 400, 100 + Math.random() * 400);
  await page.mouse.move(200 + Math.random() * 400, 200 + Math.random() * 400);

  // Handle the iframe challenge if present
  const iframeHandle = await page.$('iframe#main-iframe');
  if (iframeHandle) {
    console.log('Incapsula iframe detected');
    await page.mouse.click(300, 300);
    await page.keyboard.type('test');
    await page.evaluate(() => window.scrollBy(0, 100));
    await new Promise(r => setTimeout(r, 3000));
  }

  const content = await page.content();
  const cookies = await page.cookies();

  await browser.close();
  return { content, cookies };
}

// Usage
bypassIncapsula('https://protected-site.com')
  .then(result => console.log('Success:', result.content.length))
  .catch(err => console.error('Failed:', err));
```
Puppeteer Stealth handles Chrome-specific evasions including console.debug patching, iframe.contentWindow fixes, and WebGL vendor masking.
## Method 3: nodriver for Maximum Stealth
nodriver is the successor to undetected-chromedriver. It uses direct CDP (Chrome DevTools Protocol) communication without Selenium/WebDriver, eliminating common detection vectors.
### Installation

```bash
pip install nodriver
```

### Implementation
```python
import asyncio

import nodriver as uc

async def bypass_with_nodriver(url):
    """
    Bypass Incapsula using nodriver for maximum stealth.
    Uses direct CDP communication without WebDriver.
    """
    # Start the browser with a stealth configuration
    browser = await uc.start(
        headless=True,
        browser_args=[
            '--disable-blink-features=AutomationControlled',
            '--disable-dev-shm-usage',
            '--no-sandbox'
        ]
    )

    # Open the target URL in a tab
    page = await browser.get(url)

    # Wait for page load and potential challenges
    await asyncio.sleep(3)

    # Check for the Incapsula iframe
    iframe = await page.select('iframe#main-iframe')
    if iframe:
        print("Incapsula challenge detected")

        # Simulate human interaction
        await page.mouse.move(300, 300)
        await page.mouse.click(300, 300)
        await page.keyboard.send_keys('test')
        await page.scroll_down(100)

        # Wait for resolution
        await asyncio.sleep(5)

    # Get the page content
    content = await page.get_content()

    # Get cookies for session persistence
    cookies = await browser.cookies.get_all()

    await browser.stop()

    return {
        'content': content,
        'cookies': cookies
    }

# Run
result = asyncio.run(bypass_with_nodriver('https://protected-site.com'))
print(f"Content: {len(result['content'])} chars")
```
nodriver bypasses WebDriver detection entirely. It communicates directly with Chrome through the DevTools Protocol, avoiding the automation flags that Imperva Incapsula checks.
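A quick sanity check, sketched under the assumption that your nodriver version exposes Tab.evaluate and a synchronous Browser.stop():

```python
import asyncio

import nodriver as uc

async def check_nodriver_stealth():
    # With direct CDP control there is no WebDriver layer, so the
    # navigator.webdriver flag should not report automation.
    browser = await uc.start(headless=True)
    page = await browser.get('https://example.com')

    webdriver_flag = await page.evaluate('navigator.webdriver')
    print(f"navigator.webdriver: {webdriver_flag}")  # expect False or None

    browser.stop()  # assumed synchronous in recent nodriver releases

asyncio.run(check_nodriver_stealth())
```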
## Method 4: Handling the reese84 Cookie Challenge
Some Imperva implementations use the reese84 cookie for advanced fingerprinting. This cookie is generated after submitting an encrypted payload containing browser fingerprints.
### Understanding the reese84 Flow

The reese84 challenge works in three steps:

1. The initial request returns JavaScript that collects fingerprints
2. The script generates an encrypted payload with device characteristics
3. The payload is POSTed to receive the authenticated reese84 cookie
### Manual reese84 Resolution
```python
import re

from curl_cffi import requests

class Reese84Solver:
    """
    Handle Incapsula reese84 cookie challenges.
    """

    def __init__(self):
        self.session = requests.Session(impersonate="chrome136")

    def extract_script_url(self, html):
        """Extract the reese84 script URL from HTML."""
        pattern = r'src="(/_Incapsula_Resource\?[^"]+)"'
        match = re.search(pattern, html)
        if match:
            return match.group(1)
        return None

    def find_challenge_endpoint(self, html):
        """Find the endpoint where the fingerprint payload is POSTed."""
        # Look for an endpoint pattern like: /random-words-here?d=domain.com
        pattern = r'([a-zA-Z0-9\-]+\?d=[a-zA-Z0-9\.\-]+)'
        match = re.search(pattern, html)
        if match:
            return match.group(1)
        return None

    def solve(self, url):
        """
        Attempt to solve the reese84 challenge.
        Returns a session with valid cookies if successful.
        """
        # Initial request to get the challenge
        response = self.session.get(url)

        if response.status_code == 200 and '_incapsula_resource' not in response.text.lower():
            print("No challenge detected")
            return self.session

        # Extract the script URL
        script_url = self.extract_script_url(response.text)
        if not script_url:
            print("Could not find Incapsula script")
            return None

        # The script URL needs to be fetched and processed.
        # For complex reese84 challenges, a headless browser is recommended.
        print(f"reese84 script found: {script_url}")
        print("Note: Full reese84 resolution requires browser execution")
        return self.session

# Usage
solver = Reese84Solver()
session = solver.solve('https://protected-site.com')
```
For complete reese84 resolution, use Playwright or Puppeteer to execute the JavaScript challenge natively. The browser handles fingerprint collection and payload generation automatically.
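A hedged sketch of that browser-assisted path: let Playwright execute the challenge script, then poll the context for the reese84 cookie (the one-second polling interval and 30-second budget are arbitrary choices):

```python
import asyncio

from playwright.async_api import async_playwright
from playwright_stealth import stealth_async

async def get_reese84_cookie(url, timeout_s=30):
    """Run the fingerprinting script in a browser, then poll for reese84."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context()
        page = await context.new_page()
        await stealth_async(page)

        await page.goto(url, wait_until='networkidle', timeout=60000)

        # Poll until the challenge script sets the cookie or we time out
        for _ in range(timeout_s):
            cookies = await context.cookies()
            reese84 = next((c for c in cookies if c['name'] == 'reese84'), None)
            if reese84:
                await browser.close()
                return reese84['value']
            await asyncio.sleep(1)

        await browser.close()
        return None

token = asyncio.run(get_reese84_cookie('https://protected-site.com'))
print(f"reese84: {token[:40]}..." if token else "reese84 not issued")
```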
## Method 5: Residential Proxy Integration
Datacenter IPs get blocked almost immediately by Imperva Incapsula. Residential proxies are essential for consistent access.
### Implementation with Proxy Rotation
```python
import random
import time

from curl_cffi.requests import Session

class ProxyRotator:
    """
    Manage rotating residential proxies for Incapsula bypass.
    """

    def __init__(self, proxy_list):
        """
        Initialize with a list of residential proxy URLs.
        Format: ['http://user:pass@ip:port', ...]
        """
        self.proxies = proxy_list
        self.current_index = 0
        self.failed_proxies = set()

    def get_next(self):
        """Get the next working proxy from the rotation."""
        attempts = 0
        while attempts < len(self.proxies):
            proxy = self.proxies[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.proxies)
            if proxy not in self.failed_proxies:
                return proxy
            attempts += 1

        # Reset failed proxies if all are marked
        self.failed_proxies.clear()
        return self.proxies[0]

    def mark_failed(self, proxy):
        """Mark a proxy as failed for temporary exclusion."""
        self.failed_proxies.add(proxy)

class IncapsulaProxyScraper:
    """
    Scraper with integrated proxy rotation for Incapsula bypass.
    """

    def __init__(self, proxies):
        self.rotator = ProxyRotator(proxies)

    def _create_session(self, proxy):
        """Create a new session with the given proxy."""
        session = Session(impersonate="chrome136")
        return session, {'http': proxy, 'https': proxy}

    def scrape(self, url, max_retries=3):
        """
        Scrape a URL with automatic proxy rotation on failure.
        """
        headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'accept-language': 'en-US,en;q=0.9',
            'sec-ch-ua': '"Google Chrome";v="136", "Chromium";v="136"',
            'sec-ch-ua-mobile': '?0',
            'sec-ch-ua-platform': '"Windows"',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'sec-fetch-site': 'none',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
        }

        for attempt in range(max_retries):
            proxy = self.rotator.get_next()
            session, proxies = self._create_session(proxy)

            try:
                # Random delay
                time.sleep(random.uniform(1.0, 3.0))

                response = session.get(
                    url,
                    headers=headers,
                    proxies=proxies,
                    timeout=30
                )

                # Check for blocks
                if response.status_code == 403:
                    print(f"Blocked on proxy {proxy[:30]}...")
                    self.rotator.mark_failed(proxy)
                    continue

                if 'incapsula incident' in response.text.lower():
                    print(f"Challenge on proxy {proxy[:30]}...")
                    self.rotator.mark_failed(proxy)
                    continue

                return response

            except Exception as e:
                print(f"Error with proxy: {e}")
                self.rotator.mark_failed(proxy)

        return None

# Usage with Roundproxies residential proxies
proxies = [
    'http://user:pass@gate.roundproxies.com:10001',
    'http://user:pass@gate.roundproxies.com:10002',
    'http://user:pass@gate.roundproxies.com:10003',
]

scraper = IncapsulaProxyScraper(proxies)
response = scraper.scrape('https://protected-site.com/data')
```
The proxy rotator automatically switches IPs when blocks occur. Residential IPs from providers like Roundproxies have high trust scores because they originate from real ISP networks.
## Troubleshooting Common Errors

### 403 Forbidden Errors
This is the most common Incapsula block. Debug steps:
- Check TLS fingerprint - Ensure you're using curl_cffi with impersonation or a stealth browser
- Verify headers - Missing Sec-CH-UA headers trigger immediate blocks
- Switch proxy - Your current IP might be flagged
- Add delays - Rapid requests trigger rate limiting
```python
import random
import time
from urllib.parse import urlparse

def handle_403(response, session, url):
    """Handle a 403 response with a recovery attempt."""
    if response.status_code != 403:
        return response

    print("403 detected - attempting recovery")

    # Wait before retrying
    time.sleep(random.uniform(5.0, 10.0))

    # Clear session cookies and try again
    session.cookies.clear()

    # Request the homepage first to rebuild trust
    parsed = urlparse(url)
    homepage = f"{parsed.scheme}://{parsed.netloc}"
    session.get(homepage)
    time.sleep(random.uniform(2.0, 4.0))

    # Retry the original request
    return session.get(url)
```
### CAPTCHA Challenges
CAPTCHAs indicate borderline trust scores:
- Improve fingerprint - Switch to headless browser with stealth
- Use better proxies - Residential over datacenter
- Add mouse movements - Simulate human interaction patterns
- Consider CAPTCHA solving services - For critical requests (a detection sketch for deciding when to escalate follows below)
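Before paying for a solver, confirm the block page actually embeds a CAPTCHA. A minimal classifier sketch; the marker strings are assumptions based on common CAPTCHA embeds and should be tuned per target:

```python
def needs_captcha_escalation(response):
    """
    Decide whether to escalate from plain HTTP to a stealth browser or a
    solving service. Marker strings are assumptions; tune them per target.
    """
    if response.status_code not in (200, 403):
        return False

    text = response.text.lower()
    captcha_markers = (
        'recaptcha',   # Google reCAPTCHA embed
        'h-captcha',   # hCaptcha embed
        'captcha',     # generic fallback marker
    )
    return any(marker in text for marker in captcha_markers)
```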
### Session Expiry
Incapsula sessions expire after inactivity:
```python
import time

from curl_cffi.requests import Session

class SessionManager:
    """Manage the Incapsula session lifecycle."""

    def __init__(self, session_lifetime=300):
        self.session = None
        self.last_request = 0
        self.lifetime = session_lifetime

    def get_session(self):
        """Get a valid session, creating a new one if expired."""
        current_time = time.time()

        if self.session is None or (current_time - self.last_request) > self.lifetime:
            self.session = Session(impersonate="chrome136")
            print("Created new session")

        self.last_request = current_time
        return self.session
```
## Performance Comparison
| Method | Success Rate | Speed | Complexity | Best For |
|---|---|---|---|---|
| curl_cffi | 70-80% | Fast | Low | API endpoints |
| Playwright Stealth | 85-90% | Medium | Medium | JavaScript sites |
| nodriver | 90-95% | Medium | Medium | High security sites |
| Browser + Proxy | 95%+ | Slow | High | Maximum reliability |
curl_cffi works best for simpler Imperva configurations. Playwright handles JavaScript challenges. nodriver provides the highest bypass rates but requires more setup.
## Ethical Considerations
These techniques should only be used for:
- Security research on your own properties
- Accessing publicly available data
- Competitive analysis of public information
- Integration testing where APIs aren't available
Always check robots.txt and Terms of Service. Implement rate limiting to avoid server overload. Never scrape personal data or content behind authentication.
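Rate limiting is cheap to enforce. A minimal sketch of a per-host limiter (the two-second default interval is an arbitrary choice):

```python
import time

class RateLimiter:
    """Enforce a minimum interval between requests to one host."""

    def __init__(self, min_interval=2.0):
        self.min_interval = min_interval
        self.last_call = 0.0

    def wait(self):
        """Sleep just long enough to respect the minimum interval."""
        elapsed = time.time() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.time()

# Call limiter.wait() before each request to keep load predictable
limiter = RateLimiter(min_interval=2.0)
```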
## Advanced Technique: Combining HTTP and Browser Sessions
A powerful approach combines fast HTTP requests with browser-established trust. Use the browser to generate valid cookies, then switch to curl_cffi for speed.
### Hybrid Session Implementation
```python
import asyncio

from curl_cffi.requests import Session
from playwright.async_api import async_playwright
from playwright_stealth import stealth_async

class HybridIncapsulaBypass:
    """
    Combine browser trust establishment with fast HTTP requests.
    The browser handles the initial challenge; HTTP handles bulk scraping.
    """

    def __init__(self, proxy=None):
        self.proxy = proxy
        self.session = None
        self.cookies = {}

    async def establish_trust(self, url):
        """
        Use a browser to establish a trusted session with Incapsula.
        Returns cookies that can be reused in HTTP requests.
        """
        async with async_playwright() as p:
            browser_args = ['--disable-blink-features=AutomationControlled']

            if self.proxy:
                browser = await p.chromium.launch(
                    headless=True,
                    args=browser_args,
                    proxy={'server': self.proxy}
                )
            else:
                browser = await p.chromium.launch(
                    headless=True,
                    args=browser_args
                )

            context = await browser.new_context(
                viewport={'width': 1920, 'height': 1080},
                user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
            )
            page = await context.new_page()
            await stealth_async(page)

            # Navigate and wait for challenge resolution
            await page.goto(url, wait_until='networkidle', timeout=60000)
            await asyncio.sleep(3)

            # Simulate human behavior
            await page.mouse.move(200, 200)
            await page.mouse.move(400, 300)
            await page.evaluate('window.scrollBy(0, 200)')

            # Extract cookies
            cookies = await context.cookies()
            await browser.close()

            # Convert to dict format
            self.cookies = {c['name']: c['value'] for c in cookies}
            return self.cookies

    def create_http_session(self):
        """
        Create an HTTP session seeded with the browser-established cookies.
        """
        self.session = Session(impersonate="chrome136")

        # Apply cookies from the browser session
        for name, value in self.cookies.items():
            self.session.cookies.set(name, value)

        return self.session

    def http_get(self, url, headers=None):
        """
        Make a fast HTTP request using the established trust.
        """
        if not self.session:
            raise ValueError("Call establish_trust() and create_http_session() first")

        default_headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'accept-language': 'en-US,en;q=0.9',
            'sec-ch-ua': '"Google Chrome";v="136", "Chromium";v="136"',
            'sec-ch-ua-mobile': '?0',
            'sec-ch-ua-platform': '"Windows"',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
        }
        if headers:
            default_headers.update(headers)

        proxies = {'http': self.proxy, 'https': self.proxy} if self.proxy else None

        return self.session.get(
            url,
            headers=default_headers,
            proxies=proxies,
            timeout=30
        )

# Usage
async def main():
    bypass = HybridIncapsulaBypass(proxy="http://user:pass@residential-proxy:8080")

    # Establish trust with the browser
    cookies = await bypass.establish_trust('https://protected-site.com')
    print(f"Got {len(cookies)} cookies")

    # Create the HTTP session
    bypass.create_http_session()

    # Fast scraping with established trust
    urls = [
        'https://protected-site.com/page1',
        'https://protected-site.com/page2',
        'https://protected-site.com/page3'
    ]
    for url in urls:
        response = bypass.http_get(url)
        print(f"{url}: {response.status_code}")

asyncio.run(main())
```
This hybrid approach offers the best of both worlds. The browser handles JavaScript challenges and builds trust. HTTP requests then use that trust for fast bulk operations.
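Browser-issued cookies eventually expire, so a long-running job should fall back to the browser whenever the HTTP path loses trust. A sketch building on the class above; the challenge markers mirror the detection logic used earlier in this guide:

```python
async def scrape_with_refresh(bypass, urls):
    """Re-run the browser step whenever the fast HTTP path gets blocked."""
    await bypass.establish_trust(urls[0])
    bypass.create_http_session()

    results = {}
    for url in urls:
        response = bypass.http_get(url)
        blocked = (
            response.status_code == 403
            or 'incapsula incident' in response.text.lower()
        )
        if blocked:
            # Trust lost: redo the browser handshake and rebuild the session
            await bypass.establish_trust(url)
            bypass.create_http_session()
            response = bypass.http_get(url)
        results[url] = response.status_code
    return results

# results = asyncio.run(scrape_with_refresh(HybridIncapsulaBypass(), urls))
```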
## User Agent and Header Rotation
Static fingerprints get detected over time. Rotate user agents and headers to appear as different users.
### Header Rotation Implementation
```python
import random

class HeaderRotator:
    """
    Rotate browser fingerprints to avoid detection patterns.
    """

    # Chrome versions on Windows
    CHROME_WINDOWS = [
        {
            'user_agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
            'sec_ch_ua': '"Google Chrome";v="136", "Chromium";v="136", "Not-A.Brand";v="8"',
            'impersonate': 'chrome136'
        },
        {
            'user_agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36',
            'sec_ch_ua': '"Google Chrome";v="133", "Chromium";v="133", "Not-A.Brand";v="24"',
            'impersonate': 'chrome133a'
        },
        {
            'user_agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
            'sec_ch_ua': '"Google Chrome";v="131", "Chromium";v="131", "Not-A.Brand";v="24"',
            'impersonate': 'chrome131'
        }
    ]

    # Chrome versions on macOS
    CHROME_MACOS = [
        {
            'user_agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36',
            'sec_ch_ua': '"Google Chrome";v="136", "Chromium";v="136", "Not-A.Brand";v="8"',
            'sec_ch_ua_platform': '"macOS"',
            'impersonate': 'chrome136'
        }
    ]

    # Firefox versions (no sec-ch-ua headers)
    FIREFOX = [
        {
            'user_agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:134.0) Gecko/20100101 Firefox/134.0',
            'impersonate': 'firefox'
        }
    ]

    def __init__(self, prefer_chrome=True):
        if prefer_chrome:
            self.profiles = self.CHROME_WINDOWS + self.CHROME_MACOS
        else:
            self.profiles = self.CHROME_WINDOWS + self.CHROME_MACOS + self.FIREFOX

    def get_random(self):
        """Get a random browser profile."""
        return random.choice(self.profiles)

    def build_headers(self, profile=None):
        """Build complete headers for a profile."""
        if profile is None:
            profile = self.get_random()

        headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
            'accept-language': 'en-US,en;q=0.9',
            'accept-encoding': 'gzip, deflate, br',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'sec-fetch-site': 'none',
            'sec-fetch-user': '?1',
            'upgrade-insecure-requests': '1',
            'user-agent': profile['user_agent']
        }

        # Add Chrome-specific headers
        if 'sec_ch_ua' in profile:
            headers['sec-ch-ua'] = profile['sec_ch_ua']
            headers['sec-ch-ua-mobile'] = '?0'
            headers['sec-ch-ua-platform'] = profile.get('sec_ch_ua_platform', '"Windows"')

        return headers, profile.get('impersonate', 'chrome136')

# Usage
rotator = HeaderRotator()
for i in range(5):
    headers, impersonate = rotator.build_headers()
    print(f"Request {i+1}: {headers['user-agent'][:50]}... (impersonate: {impersonate})")
```
Rotating fingerprints prevents pattern recognition. Use different Chrome versions and operating systems across requests. Match the curl_cffi impersonate parameter to your user agent.
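Tying the rotator to curl_cffi keeps the TLS fingerprint aligned with the advertised user agent. A short sketch (the URL is a placeholder):

```python
from curl_cffi.requests import Session

# Build a matched header set and impersonation target, then use both
# for the request so the TLS and HTTP layers tell the same story.
rotator = HeaderRotator()
headers, impersonate = rotator.build_headers()

session = Session(impersonate=impersonate)
response = session.get(
    'https://protected-site.com',
    headers=headers,
    timeout=30
)
print(f"{headers['user-agent'][:40]}... -> {response.status_code}")
```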
## Async Scraping for High Volume
For large-scale scraping, async operations dramatically improve throughput while respecting rate limits.
```python
import asyncio
import random

from curl_cffi.requests import AsyncSession

class AsyncIncapsulaScraper:
    """
    High-performance async scraper for Incapsula-protected sites.
    """

    def __init__(self, proxies, concurrency=5):
        self.proxies = proxies
        self.concurrency = concurrency
        self.semaphore = asyncio.Semaphore(concurrency)

    async def scrape_url(self, session, url, proxy):
        """Scrape a single URL with rate limiting."""
        async with self.semaphore:
            headers = {
                'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                'accept-language': 'en-US,en;q=0.9',
                'sec-ch-ua': '"Google Chrome";v="136", "Chromium";v="136"',
                'sec-ch-ua-mobile': '?0',
                'sec-ch-ua-platform': '"Windows"',
                'sec-fetch-dest': 'document',
                'sec-fetch-mode': 'navigate',
                'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36'
            }

            # Random delay per request
            await asyncio.sleep(random.uniform(1.0, 3.0))

            try:
                response = await session.get(
                    url,
                    headers=headers,
                    proxy=proxy,
                    timeout=30
                )
                return {
                    'url': url,
                    'status': response.status_code,
                    'content': response.text if response.status_code == 200 else None,
                    'error': None
                }
            except Exception as e:
                return {
                    'url': url,
                    'status': None,
                    'content': None,
                    'error': str(e)
                }

    async def scrape_batch(self, urls):
        """Scrape multiple URLs concurrently."""
        async with AsyncSession(impersonate="chrome136") as session:
            tasks = []
            for i, url in enumerate(urls):
                # Rotate proxies across the batch
                proxy = self.proxies[i % len(self.proxies)]
                tasks.append(self.scrape_url(session, url, proxy))

            results = await asyncio.gather(*tasks)
            return results

# Usage
async def main():
    proxies = [
        'http://user:pass@residential1:8080',
        'http://user:pass@residential2:8080',
        'http://user:pass@residential3:8080'
    ]

    scraper = AsyncIncapsulaScraper(proxies, concurrency=3)
    urls = [f'https://protected-site.com/page/{i}' for i in range(10)]

    results = await scraper.scrape_batch(urls)
    successful = sum(1 for r in results if r['status'] == 200)
    print(f"Success rate: {successful}/{len(urls)}")

asyncio.run(main())
```
The semaphore controls concurrent requests to avoid overwhelming the target. Proxy rotation distributes load across IP addresses. Random delays prevent timing-based detection.
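Transient failures are common at volume, so it helps to re-queue misses for another pass. A sketch that builds on the AsyncIncapsulaScraper above; two rounds is an arbitrary default:

```python
async def scrape_with_retries(scraper, urls, rounds=2):
    """Re-run scrape_batch on failed URLs for a limited number of rounds."""
    pending = list(urls)
    collected = []

    for _ in range(rounds):
        if not pending:
            break
        results = await scraper.scrape_batch(pending)
        collected.extend(r for r in results if r['status'] == 200)
        pending = [r['url'] for r in results if r['status'] != 200]

    return collected, pending  # successes, still-failing URLs
```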
## 2026 Detection Updates to Watch
Imperva Incapsula continues evolving. Here are emerging detection methods to monitor:
JA4 Fingerprinting: The successor to JA3 provides more granular TLS analysis. Libraries must update impersonation to match new browser signatures.
HTTP/3 Analysis: As HTTP/3 adoption grows, Incapsula analyzes QUIC protocol fingerprints. Current bypass tools need HTTP/3 support.
Behavioral ML Models: Machine learning models detect subtle patterns in navigation timing, click sequences, and scroll behavior. Simple randomization may not suffice.
Device Attestation: Some implementations verify hardware characteristics through WebAuthn or similar APIs. This requires actual browser execution.
Canvas Fingerprint Verification: Beyond collection, some systems verify canvas render consistency across requests from the same session.
Stay updated on curl_cffi releases, Playwright updates, and anti-detection plugin changes. Join web scraping communities to learn about new detection methods early.
## Key Takeaways
Imperva Incapsula detection works through multiple layers: TLS fingerprinting, IP reputation, HTTP analysis, JavaScript challenges, and behavioral monitoring. Effective bypass requires addressing all vectors simultaneously.
Start with curl_cffi for basic requests. The TLS impersonation handles fingerprint detection without browser overhead. Add residential proxies to solve IP reputation issues.
For JavaScript-heavy sites, use Playwright or Puppeteer with stealth plugins. These execute challenges natively while hiding automation markers. nodriver provides even better stealth through direct CDP communication.
The reese84 cookie challenge requires browser execution. HTTP-only approaches cannot generate the required fingerprint payload. Plan for browser automation when targeting sites with reese84.
Distribute requests across multiple IPs and browser profiles. Imperva's behavioral analysis flags patterns from single sources. Residential proxy rotation combined with randomized timing produces the most consistent results.
Stay updated on detection changes. Imperva Incapsula evolves their protection regularly. What works today may trigger blocks next month. Monitor success rates and adjust techniques as needed.