You've built your scraper. It runs flawlessly on test pages. Then you point it at a protected enterprise site and everything falls apart.
Your requests return cryptic challenge pages. Your IP gets flagged. The rbzid cookie never validates.
Welcome to Reblaze.
Reblaze protects thousands of enterprise websites, APIs, and web applications globally. It's deployed on AWS, Azure, and Google Cloud as a full-stack security layer.
If you're scraping financial services, travel sites, or enterprise SaaS platforms, you'll encounter it eventually.
In this guide, I'll show you 7 proven methods to bypass Reblaze—from simple header adjustments to advanced behavioral emulation. Each method has trade-offs, so I'll help you pick the right one for your specific situation.
What is Reblaze and Why Does It Block Scrapers?
Reblaze is a cloud-based Web Application Firewall (WAF) and bot mitigation platform. It sits in front of web servers as a reverse proxy, analyzing every request before it reaches the origin.
Unlike simpler protection systems, Reblaze uses a multi-layered detection approach. It doesn't rely on a single technique—it combines several methods simultaneously.
When your scraper sends a request to a Reblaze-protected site, the platform analyzes multiple signals:
IP and Network Analysis: Reblaze checks IP reputation and detects VPNs, proxies, Tor exit nodes, and cloud-platform IPs. Known datacenter ranges get flagged immediately.
Browser Environment Detection: The platform injects JavaScript challenges that verify your browser environment. It checks for automation markers like navigator.webdriver and headless browser signatures.
Signature Detection: Request patterns, header configurations, and known bot fingerprints trigger instant blocks. Default Selenium or Puppeteer configurations fail here.
Behavioral Analysis: This is where Reblaze differs from competitors. It uses machine learning to build behavioral profiles, tracking mouse movements, click patterns, scroll behavior, typing speed, and session timing.
Cookie-Based Tracking: Reblaze sets an rbzid cookie to track sessions. Requests without valid authentication cookies face additional challenges.
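Before choosing a bypass strategy, it helps to confirm that Reblaze is actually in front of the target. Here is a minimal sketch; the rbzid cookie and the Reblaze Server header are the signals discussed in this guide, exact values vary by deployment, and target-site.com is a placeholder:

```python
import requests


def looks_like_reblaze(url):
    """Rough heuristic: look for a Reblaze Server header or an rbzid cookie."""
    response = requests.get(url, timeout=30)
    server = response.headers.get('Server', '')
    return 'reblaze' in server.lower() or 'rbzid' in response.cookies


print(looks_like_reblaze('https://target-site.com'))
```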
Why Standard Scrapers Fail
Standard scraping tools fail Reblaze checks because they lack legitimate browser characteristics.
A basic requests call doesn't execute JavaScript. Selenium exposes automation flags. Even headless browsers leak detectable signals through missing APIs and unnatural behavior patterns.
Reblaze identifies these gaps and blocks the request—sometimes silently, sometimes with a challenge page.
Reblaze Protection Levels
Reblaze offers different protection intensities:
ACL Filtering: Basic IP and network-based filtering. Easiest to bypass with good proxies.
Active Challenges: JavaScript redirects that require browser execution. Moderate difficulty.
Passive Challenges with Biometric Detection: Full behavioral analysis including mouse tracking and interaction patterns. Hardest to bypass.
The methods below address each protection level.
7 Methods to Bypass Reblaze
Before diving into implementations, here's a quick comparison:
| Method | Difficulty | Cost | Best For | Success Rate |
|---|---|---|---|---|
| Header Optimization | Easy | Free | Basic ACL filtering | Medium |
| Session & Cookie Management | Easy | Free | Maintaining auth state | Medium |
| Residential Proxy Rotation | Medium | $ | IP-based blocks | High |
| Puppeteer Stealth | Medium | Free | JS challenges | High |
| Nodriver | Medium | Free | Advanced detection | Very High |
| Behavioral Emulation | Hard | Free | Biometric checks | Very High |
| TLS Fingerprint Spoofing | Hard | Free | Advanced fingerprinting | High |
Quick recommendation: Start with Method 1 (headers) plus Method 3 (residential proxies). If you're still blocked, move to Method 4 or 5 for browser automation.
Basic Methods
1. Header Optimization
Optimize your HTTP headers to mimic legitimate browser traffic.
Difficulty: Easy
Cost: Free
Success rate: Medium (works against basic ACL filtering)
How it works
Reblaze analyzes HTTP headers to identify bot traffic. Default scraper headers are obvious red flags.
A real browser sends dozens of headers in a specific order. Missing headers, wrong values, or unusual ordering triggers suspicion.
The goal is making your requests indistinguishable from Chrome or Firefox traffic.
Implementation
```python
import requests

# Pin the Chrome version so the User-Agent and the Sec-Ch-Ua client hints
# below stay consistent -- Reblaze cross-checks them. If you rotate
# User-Agents (e.g. with fake_useragent), update the Sec-Ch-Ua values too.
CHROME_UA = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36'
)


def create_browser_headers():
    """
    Generate headers that mimic a real Chrome browser.
    Order matters - Reblaze checks header sequence.
    """
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'en-US,en;q=0.9',
        'Cache-Control': 'max-age=0',
        'Connection': 'keep-alive',
        'Sec-Ch-Ua': '"Chromium";v="122", "Not(A:Brand";v="24", "Google Chrome";v="122"',
        'Sec-Ch-Ua-Mobile': '?0',
        'Sec-Ch-Ua-Platform': '"Windows"',
        'Sec-Fetch-Dest': 'document',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Site': 'none',
        'Sec-Fetch-User': '?1',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': CHROME_UA,
    }
    return headers


def scrape_with_headers(url):
    """
    Make a request with optimized headers.
    """
    session = requests.Session()
    headers = create_browser_headers()

    response = session.get(
        url,
        headers=headers,
        timeout=30
    )
    return response
```
The code above creates a header set that matches Chrome 122. The Sec-Ch-Ua headers are client hints that modern browsers send.
Missing these headers immediately identifies your request as non-browser traffic.
Key headers explained
The Sec-Fetch-* headers tell the server about request context. Real browsers always include them.
Sec-Ch-Ua identifies the browser brand and version. Reblaze validates this against the User-Agent string.
Header order affects detection. Some WAFs flag requests where headers appear in unusual sequences.
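One practical consequence: python-requests silently adds its own default headers (User-Agent, Accept-Encoding, Accept, Connection), which end up ahead of yours. A minimal sketch of one way to keep the header set and order under your control, assuming the create_browser_headers() helper above (the transport still adds Host first):

```python
import requests


def strict_header_session(headers):
    """Send only the browser-like headers, in the order they were defined."""
    session = requests.Session()
    session.headers.clear()          # drop python-requests defaults
    session.headers.update(headers)  # dict insertion order is preserved
    return session


session = strict_header_session(create_browser_headers())
response = session.get('https://target-site.com', timeout=30)
```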
Pros and cons
Pros:
- Zero cost
- Easy to implement
- Works for basic protection
Cons:
- Fails against JavaScript challenges
- Headers alone won't pass behavioral analysis
- Requires regular updates as browser versions change
When to use this method
Use header optimization when:
- You're scraping sites with light protection
- Requests work in a browser but fail with basic scripts
- You want a quick fix before trying advanced methods
Avoid this method if:
- The site shows challenge pages
- You're getting 403 responses even with good headers
- The site requires JavaScript execution
2. Session and Cookie Management
Properly manage sessions and preserve the rbzid cookie across requests.
Difficulty: Easy
Cost: Free
Success rate: Medium (essential complement to other methods)
How it works
Reblaze sets an rbzid cookie after successful challenge completion. This cookie identifies your session as verified.
Subsequent requests must include this cookie. Without it, you'll face challenges repeatedly.
A persistent session also maintains other cookies and connection state that Reblaze tracks.
Implementation
```python
import requests
import pickle
import os


class ReblazeSessions:
    """
    Manage sessions with persistent cookie storage.
    Preserves rbzid and other auth cookies across requests.
    """

    def __init__(self, cookie_file='reblaze_cookies.pkl'):
        self.cookie_file = cookie_file
        self.session = requests.Session()
        self._load_cookies()

    def _load_cookies(self):
        """Load cookies from file if they exist."""
        if os.path.exists(self.cookie_file):
            with open(self.cookie_file, 'rb') as f:
                self.session.cookies.update(pickle.load(f))

    def _save_cookies(self):
        """Persist cookies to file for reuse."""
        with open(self.cookie_file, 'wb') as f:
            pickle.dump(self.session.cookies, f)

    def get(self, url, headers=None):
        """
        Make GET request with session persistence.
        """
        response = self.session.get(url, headers=headers, timeout=30)

        # Check if we received the rbzid cookie
        if 'rbzid' in self.session.cookies:
            print(f"[+] rbzid cookie acquired: {self.session.cookies['rbzid'][:20]}...")
            self._save_cookies()

        return response

    def has_valid_session(self):
        """Check if we have an rbzid cookie."""
        return 'rbzid' in self.session.cookies


# Usage example
scraper = ReblazeSessions()

# First request - may trigger challenge
response = scraper.get('https://target-site.com')

# If challenge passed, subsequent requests use saved cookies
if scraper.has_valid_session():
    response = scraper.get('https://target-site.com/data')
```
This code creates a session manager that persists cookies between runs. Once you pass a challenge (manually or through other methods), the session remains valid.
Handling cookie expiration
Reblaze cookies have expiration times. Your code should handle refreshes:
```python
import requests
from datetime import datetime, timedelta


class CookieManager:
    def __init__(self):
        self.session = requests.Session()
        self.last_refresh = None
        self.refresh_interval = timedelta(minutes=30)

    def needs_refresh(self):
        """Check if session needs refreshing."""
        if self.last_refresh is None:
            return True
        return datetime.now() - self.last_refresh > self.refresh_interval

    def refresh_session(self, url, browser_func):
        """
        Refresh session using browser automation.
        browser_func should return cookies from a real browser session.
        """
        if self.needs_refresh():
            new_cookies = browser_func(url)
            self.session.cookies.update(new_cookies)
            self.last_refresh = datetime.now()
```
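A minimal usage sketch: refresh_session only needs a callable that returns a dict of cookie names to values captured from a real browser run (Methods 4 and 5 below produce exactly that). The stub and its return value here are illustrative placeholders:

```python
def browser_cookies_stub(url):
    # In practice: drive Puppeteer/Nodriver against `url`, let the challenge
    # resolve, then return {cookie.name: cookie.value, ...}
    return {'rbzid': 'value-captured-from-a-real-browser-session'}


manager = CookieManager()
manager.refresh_session('https://target-site.com', browser_cookies_stub)
print(manager.needs_refresh())  # False immediately after a refresh
```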
Pros and cons
Pros:
- Reduces challenge frequency
- Works with any bypass method
- Simple to implement
Cons:
- Requires initial challenge bypass
- Cookies expire and need refreshing
- One session per IP/fingerprint
When to use this method
Use session management when:
- You've successfully passed a challenge once
- You're making multiple requests to the same site
- You want to reduce detection triggers
This method complements other techniques—it's rarely sufficient alone.
Intermediate Methods
3. Residential Proxy Rotation
Route requests through residential IPs to bypass IP-based detection.
Difficulty: Medium
Cost: $-$$
Success rate: High
How it works
Reblaze maintains databases of datacenter IPs, VPN endpoints, and known proxy ranges. Requests from these sources face extra scrutiny.
Residential proxies use IPs assigned to real home internet connections. They appear as legitimate user traffic.
Rotating IPs prevents rate limiting and makes your requests look like distinct users.
Implementation
For residential proxies, I recommend Roundproxies.com, which offers residential, datacenter, ISP, and mobile proxy options. Here's how to integrate rotating proxies:
```python
import requests
import random
import time


class ProxyRotator:
    """
    Rotate through residential proxies for each request.
    Supports authenticated proxy endpoints.
    """

    def __init__(self, proxy_endpoint, username, password):
        self.proxy_url = f"http://{username}:{password}@{proxy_endpoint}"
        self.session = requests.Session()
        self.request_count = 0

    def get_proxy_config(self):
        """Return proxy configuration for requests."""
        return {
            'http': self.proxy_url,
            'https': self.proxy_url
        }

    def make_request(self, url, headers=None):
        """
        Make request through rotating proxy.
        Each request gets a new IP from the pool.
        """
        proxies = self.get_proxy_config()

        try:
            response = self.session.get(
                url,
                headers=headers,
                proxies=proxies,
                timeout=30
            )
            self.request_count += 1
            return response
        except requests.exceptions.ProxyError as e:
            print(f"Proxy error: {e}")
            # Retry with delay
            time.sleep(random.uniform(2, 5))
            return self.make_request(url, headers)


# Usage
rotator = ProxyRotator(
    proxy_endpoint="gate.rproxies.com:10000",
    username="your_username",
    password="your_password"
)

response = rotator.make_request('https://target-site.com')
```
Geo-targeting for better results
Some sites are region-specific. Using proxies from the expected geography improves success rates:
```python
def get_geo_proxy(country_code):
    """
    Get proxy endpoint for specific country.
    Most providers support country targeting.
    """
    # Example format - varies by provider
    return f"http://user-country-{country_code}:pass@proxy.example.com:port"


# Target US traffic
us_proxy = get_geo_proxy('US')
```
Handling proxy failures
Residential proxies occasionally fail. Build in retry logic:
```python
import requests
import random
import time


def request_with_retry(url, proxies, max_retries=3):
    """Make request with automatic retry on proxy failure."""
    for attempt in range(max_retries):
        try:
            response = requests.get(
                url,
                proxies=proxies,
                timeout=30
            )

            # Check for block indicators
            if response.status_code == 403:
                print(f"Blocked on attempt {attempt + 1}, rotating...")
                time.sleep(random.uniform(1, 3))
                continue

            return response
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            time.sleep(random.uniform(2, 5))

    return None
```
Pros and cons
Pros:
- High success against IP-based detection
- IPs appear as legitimate users
- Scales well for large scraping jobs
Cons:
- Ongoing cost
- Slower than direct connections
- Still fails against JS/behavioral checks
When to use this method
Use residential proxies when:
- You're getting blocked despite good headers
- Target site heavily filters datacenter IPs
- You need to scrape at scale
Combine with header optimization for best results.
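For example, a short sketch that feeds the Method 1 headers into the ProxyRotator above (same illustrative endpoint and credentials as the earlier usage example):

```python
rotator = ProxyRotator(
    proxy_endpoint="gate.rproxies.com:10000",
    username="your_username",
    password="your_password"
)

headers = create_browser_headers()  # from Method 1
response = rotator.make_request('https://target-site.com', headers=headers)
```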
4. Puppeteer Stealth
Use fortified headless browsers to pass JavaScript challenges.
Difficulty: Medium
Cost: Free
Success rate: High
How it works
Reblaze injects JavaScript challenges that verify browser environments. Standard headless browsers fail these checks.
Puppeteer Stealth is a plugin that patches detection points. It modifies browser properties to match legitimate Chrome behavior.
The plugin handles navigator.webdriver, Chrome runtime objects, missing permissions APIs, and other giveaways.
Implementation
First install the required packages:
```bash
npm install puppeteer-extra puppeteer-extra-plugin-stealth
```
Then implement the stealth scraper:
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// Apply stealth plugin
puppeteer.use(StealthPlugin());

// page.waitForTimeout was removed in recent Puppeteer releases,
// so use a plain promise-based sleep instead.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function scrapeWithStealth(url) {
  // Launch browser with stealth configuration
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-blink-features=AutomationControlled',
      '--disable-features=site-per-process',
      '--window-size=1920,1080'
    ]
  });

  const page = await browser.newPage();

  // Set realistic viewport
  await page.setViewport({
    width: 1920,
    height: 1080,
    deviceScaleFactor: 1,
    hasTouch: false,
    isLandscape: true,
    isMobile: false
  });

  // Set user agent
  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36'
  );

  try {
    // Navigate and wait for network idle
    await page.goto(url, {
      waitUntil: 'networkidle2',
      timeout: 60000
    });

    // Wait for any challenge to resolve
    await sleep(3000);

    // Check for challenge indicators
    const content = await page.content();
    if (content.includes('window.rbzns')) {
      console.log('Challenge detected, waiting...');
      await sleep(5000);
    }

    // Get cookies including rbzid
    const cookies = await page.cookies();
    const rbzid = cookies.find(c => c.name === 'rbzid');

    if (rbzid) {
      console.log('Successfully obtained rbzid cookie');
    }

    // Extract page content
    const html = await page.content();

    return {
      html,
      cookies,
      success: true
    };
  } catch (error) {
    console.error('Scraping failed:', error.message);
    return { success: false, error: error.message };
  } finally {
    await browser.close();
  }
}

// Run scraper
scrapeWithStealth('https://target-site.com')
  .then(result => {
    if (result.success) {
      console.log('Page length:', result.html.length);
    }
  });
```
Adding proxy support
Combine stealth with residential proxies:
```javascript
async function scrapeWithProxy(url, proxyUrl) {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      `--proxy-server=${proxyUrl}`,
      '--no-sandbox',
      '--disable-blink-features=AutomationControlled'
    ]
  });

  const page = await browser.newPage();

  // Authenticate proxy if needed
  await page.authenticate({
    username: 'proxy_user',
    password: 'proxy_pass'
  });

  // Continue with scraping...
}
```
Pros and cons
Pros:
- Passes JavaScript challenges
- Handles dynamic content
- Active community maintaining evasions
Cons:
- Slower than HTTP requests
- Higher resource usage
- May still fail biometric checks
When to use this method
Use Puppeteer Stealth when:
- Simple requests return challenge pages
- Target site heavily uses JavaScript
- You need to interact with page elements
5. Nodriver
Use Nodriver for superior detection evasion compared to traditional headless browsers.
Difficulty: Medium
Cost: Free
Success rate: Very High
How it works
Nodriver is the successor to undetected-chromedriver. It takes a fundamentally different approach.
Instead of patching automation flags after they're set, Nodriver avoids setting them in the first place. It talks to Chrome directly, without a webdriver binary or Selenium layer, so the usual automation markers never appear.
This makes it significantly harder for Reblaze to identify automation.
Implementation
Install Nodriver:
```bash
pip install nodriver
```
Basic implementation:
```python
import nodriver as uc
import asyncio


async def scrape_with_nodriver(url):
    """
    Scrape using Nodriver for maximum stealth.
    Nodriver avoids CDP detection patterns.
    """
    # Launch browser
    browser = await uc.start(
        headless=False,  # Headed mode is more stealthy
        browser_args=[
            '--window-size=1920,1080',
            '--disable-blink-features=AutomationControlled'
        ]
    )

    try:
        # Create new tab
        page = await browser.get(url)

        # Wait for page to fully load
        await page.sleep(3)

        # Check for Reblaze challenge
        content = await page.get_content()

        if 'rbzns' in content or 'challenge' in content.lower():
            print("Challenge detected, waiting for resolution...")
            await page.sleep(5)
            content = await page.get_content()

        # Extract cookies
        cookies = await browser.cookies.get_all()
        rbzid_cookie = next(
            (c for c in cookies if c.name == 'rbzid'),
            None
        )

        if rbzid_cookie:
            print(f"rbzid acquired: {rbzid_cookie.value[:20]}...")

        return {
            'content': content,
            'cookies': cookies,
            'success': True
        }
    except Exception as e:
        print(f"Error: {e}")
        return {'success': False, 'error': str(e)}
    finally:
        browser.stop()


# Run the scraper
async def main():
    result = await scrape_with_nodriver('https://target-site.com')
    if result['success']:
        print(f"Content length: {len(result['content'])}")

asyncio.run(main())
```
Advanced configuration
For tougher sites, customize browser behavior:
```python
import random


async def advanced_nodriver_scrape(url):
    """
    Advanced Nodriver configuration for difficult targets.
    """
    browser = await uc.start(
        headless=False,
        browser_args=[
            '--window-size=1920,1080',
            '--start-maximized',
            '--disable-blink-features=AutomationControlled',
            '--disable-features=IsolateOrigins,site-per-process'
        ],
        lang='en-US'
    )

    page = await browser.get(url)

    # Simulate human-like behavior before interaction
    await page.sleep(2)

    # Scroll the page naturally
    await page.evaluate('''
        window.scrollTo({
            top: 300,
            behavior: 'smooth'
        });
    ''')

    await page.sleep(1)

    # Move mouse to a randomized position near the page centre
    await page.mouse.move(
        x=500 + random.uniform(-50, 50),
        y=300 + random.uniform(-50, 50)
    )

    await page.sleep(3)

    content = await page.get_content()
    return content
```
Combining with proxies
Route Nodriver through residential proxies:
```python
async def nodriver_with_proxy(url, proxy):
    """
    Use Nodriver with proxy rotation.
    """
    browser = await uc.start(
        headless=False,
        browser_args=[
            f'--proxy-server={proxy}',
            '--window-size=1920,1080'
        ]
    )

    page = await browser.get(url)
    # Continue scraping...
```
Pros and cons
Pros:
- Higher success rate than Puppeteer/Selenium
- Avoids CDP detection patterns
- Actively maintained
- Python-native (easier for data pipelines)
Cons:
- Requires GUI environment (or Xvfb)
- Newer tool with smaller community
- Still resource-intensive
When to use this method
Use Nodriver when:
- Puppeteer Stealth gets detected
- Target uses advanced fingerprinting
- You need the highest possible success rate
Advanced Methods
6. Behavioral Emulation
Simulate human-like interactions to pass biometric behavioral checks.
Difficulty: Hard
Cost: Free
Success rate: Very High
How it works
Reblaze's biometric detection tracks mouse movements, click patterns, scroll behavior, typing speed, and interaction timing.
Bots typically exhibit inhuman patterns—instant movements, perfect timing, lack of micro-movements.
Behavioral emulation generates realistic human patterns using libraries like ghost-cursor for mouse movements and randomized timing.
Implementation
Install dependencies:
```bash
npm install puppeteer-extra puppeteer-extra-plugin-stealth ghost-cursor
```
Implement behavioral emulation:
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const { createCursor } = require('ghost-cursor');

puppeteer.use(StealthPlugin());

async function humanLikeScrape(url) {
  const browser = await puppeteer.launch({
    headless: false, // Headed for behavioral tracking
    args: [
      '--window-size=1920,1080',
      '--disable-blink-features=AutomationControlled'
    ]
  });

  const page = await browser.newPage();

  // Create cursor instance for human-like movements
  const cursor = createCursor(page);

  await page.setViewport({ width: 1920, height: 1080 });

  // Navigate to page
  await page.goto(url, { waitUntil: 'networkidle2' });

  // Wait random time like a human would
  await randomDelay(1000, 3000);

  // Move mouse to random position with human-like curve
  await cursor.moveTo({
    x: randomInt(200, 800),
    y: randomInt(200, 600)
  });

  await randomDelay(500, 1500);

  // Scroll down naturally
  await smoothScroll(page, 300);

  await randomDelay(1000, 2000);

  // Move mouse again
  await cursor.moveTo({
    x: randomInt(400, 1000),
    y: randomInt(300, 700)
  });

  // Random click (if appropriate)
  await cursor.click();

  await randomDelay(2000, 4000);

  const content = await page.content();
  await browser.close();

  return content;
}

// Helper functions
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

function randomDelay(min, max) {
  return new Promise(resolve =>
    setTimeout(resolve, randomInt(min, max))
  );
}

async function smoothScroll(page, distance) {
  await page.evaluate((dist) => {
    return new Promise((resolve) => {
      let scrolled = 0;
      const step = 10;
      const interval = setInterval(() => {
        window.scrollBy(0, step);
        scrolled += step;
        if (scrolled >= dist) {
          clearInterval(interval);
          resolve();
        }
      }, 20 + Math.random() * 30);
    });
  }, distance);
}
```
Keyboard input emulation
For forms or search boxes:
```javascript
async function humanLikeType(page, selector, text) {
  /**
   * Type text with human-like delays between keystrokes.
   * Varies speed like a real typist.
   */
  await page.click(selector);
  await randomDelay(200, 500);

  for (const char of text) {
    await page.keyboard.type(char, {
      delay: randomInt(50, 150)
    });

    // Occasional longer pause (like thinking)
    if (Math.random() < 0.1) {
      await randomDelay(200, 500);
    }
  }
}
```
Session recording patterns
Study legitimate user patterns on the target site:
```javascript
async function recordSessionPatterns(page) {
  /**
   * Record mouse/keyboard events to analyze patterns.
   * Use this data to improve emulation.
   */
  await page.evaluate(() => {
    window.sessionEvents = [];

    document.addEventListener('mousemove', (e) => {
      window.sessionEvents.push({
        type: 'move',
        x: e.clientX,
        y: e.clientY,
        time: Date.now()
      });
    });

    document.addEventListener('click', (e) => {
      window.sessionEvents.push({
        type: 'click',
        x: e.clientX,
        y: e.clientY,
        time: Date.now()
      });
    });

    document.addEventListener('scroll', () => {
      window.sessionEvents.push({
        type: 'scroll',
        y: window.scrollY,
        time: Date.now()
      });
    });
  });
}
```
Pros and cons
Pros:
- Defeats biometric behavioral analysis
- Combined with stealth browsers, very effective
- No ongoing costs
Cons:
- Significantly slower
- Complex to implement well
- Requires headed browser (more resources)
When to use this method
Use behavioral emulation when:
- Other methods get blocked after initial success
- Target site uses passive challenges
- You see patterns suggesting behavioral analysis
7. TLS Fingerprint Spoofing
Spoof TLS fingerprints to match legitimate browsers.
Difficulty: Hard
Cost: Free
Success rate: High
How it works
Every HTTPS connection begins with a TLS handshake. The client sends a "ClientHello" message containing supported cipher suites, extensions, and version information.
Each browser (and HTTP library) has a unique TLS fingerprint. Python's requests library has a different fingerprint than Chrome.
Reblaze can identify non-browser connections through these fingerprints alone.
Libraries like curl_cffi and tls_client spoof browser TLS fingerprints.
Implementation
Install tls-client:
```bash
pip install tls-client
```
Use it to match Chrome's fingerprint:
```python
import tls_client


def scrape_with_tls_spoofing(url):
    """
    Use TLS fingerprint spoofing to match Chrome.
    This bypasses TLS-based bot detection.
    """
    # Create session with Chrome fingerprint
    session = tls_client.Session(
        client_identifier="chrome_120",  # Match Chrome 120
        random_tls_extension_order=True
    )

    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
        'Accept-Encoding': 'gzip, deflate, br',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        'Sec-Ch-Ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
        'Sec-Ch-Ua-Mobile': '?0',
        'Sec-Ch-Ua-Platform': '"Windows"',
    }

    response = session.get(url, headers=headers)
    return response


# Usage
response = scrape_with_tls_spoofing('https://target-site.com')
print(f"Status: {response.status_code}")
print(f"Content length: {len(response.text)}")
```
Using curl_cffi (alternative)
Another option with good browser impersonation:
```python
from curl_cffi import requests


def scrape_with_curl_cffi(url):
    """
    Use curl_cffi for browser-like TLS fingerprints.
    Impersonates various browser versions.
    """
    response = requests.get(
        url,
        impersonate="chrome120",  # Options: chrome, safari, firefox
        headers={
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.9',
        }
    )
    return response
```
Combining with proxies
TLS spoofing works well with proxy rotation:
```python
import tls_client


def tls_spoof_with_proxy(url, proxy_url):
    """
    Combine TLS spoofing with residential proxies.
    proxy_url format: http://user:pass@host:port
    """
    session = tls_client.Session(
        client_identifier="chrome_120"
    )

    # tls_client accepts the proxy directly on each request.
    # Keep the header versions consistent with the client_identifier above.
    response = session.get(
        url,
        proxy=proxy_url,
        headers=create_browser_headers()  # from Method 1
    )
    return response
```
Pros and cons
Pros:
- Bypasses TLS fingerprinting
- Fast (no browser overhead)
- Works for non-JS pages
Cons:
- Doesn't execute JavaScript
- Can't pass browser challenges
- Library support varies
When to use this method
Use TLS fingerprint spoofing when:
- Basic requests fail despite good headers
- You don't need JavaScript execution
- Speed is important
Combine with browser automation for complete coverage.
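One common way to combine them, sketched below under the assumption that the rbzid cookie alone is enough for the target: let a browser pass the challenge once using the Method 5 flow, then reuse the cookie over a TLS-spoofed session for fast follow-up requests. Keep both stages on the same exit IP, since Reblaze may also bind the session to your IP and fingerprint.

```python
import asyncio

import nodriver as uc
import tls_client


async def get_rbzid_with_browser(url):
    """Pass the challenge once (Method 5) and return the rbzid value, if any."""
    browser = await uc.start(headless=False)
    page = await browser.get(url)
    await page.sleep(5)  # give the JavaScript challenge time to resolve
    cookies = await browser.cookies.get_all()
    browser.stop()
    return next((c.value for c in cookies if c.name == 'rbzid'), None)


def fetch_with_tls_session(url, rbzid):
    """Reuse the verified cookie over a TLS-spoofed session (Method 7)."""
    session = tls_client.Session(client_identifier="chrome_120")
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Cookie': f'rbzid={rbzid}',  # hand over the verified session
    }
    return session.get(url, headers=headers)


rbzid = asyncio.run(get_rbzid_with_browser('https://target-site.com'))
if rbzid:
    response = fetch_with_tls_session('https://target-site.com/data', rbzid)
```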
Which Bypass Method Should You Use?
Choosing the right method depends on your target site's protection level and your requirements.
Decision flowchart
```
Is the site returning HTML with basic requests?
├── Yes → Use Method 1 (Headers) + Method 3 (Proxies)
└── No → Does the response contain JavaScript challenges?
    ├── Yes → Use Method 4 (Puppeteer) or Method 5 (Nodriver)
    └── No → Getting 403/blocked instantly?
        ├── Yes → Use Method 7 (TLS) + Method 3 (Proxies)
        └── No → Blocked after initial success?
            └── Yes → Add Method 6 (Behavioral Emulation)
```
Quick reference by situation
| Situation | Recommended Methods |
|---|---|
| Basic blocks, fast scraping needed | 1 + 3 + 7 |
| JavaScript challenges | 4 or 5 + 3 |
| Getting caught after passing challenge | 5 + 6 + 3 |
| Maximum stealth required | 5 + 6 + 3 + 7 |
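If you prefer to automate the first triage step, here is a rough sketch that maps an initial response to the method stacks above. The marker strings are the indicators mentioned in this guide (rbzns, the Reblaze Server header, generic challenge pages) and are not exhaustive; create_browser_headers() comes from Method 1:

```python
import requests

CHALLENGE_MARKERS = ('rbzns', 'reblaze', 'challenge')  # indicators discussed above


def suggest_methods(response):
    """Map a first response to a suggested method stack from the table above."""
    if response.status_code == 403:
        return "Method 7 (TLS spoofing) + Method 3 (residential proxies)"
    body = response.text.lower()
    if any(marker in body for marker in CHALLENGE_MARKERS):
        return "Method 4 (Puppeteer Stealth) or Method 5 (Nodriver) + Method 3"
    return "Method 1 (headers) + Method 3 (proxies) are likely enough"


response = requests.get('https://target-site.com',
                        headers=create_browser_headers(), timeout=30)
print(suggest_methods(response))
```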
Troubleshooting Common Issues
"403 Forbidden" immediately
Cause: IP blacklisted or TLS fingerprint flagged.
Fix:
- Switch to residential proxies
- Add TLS fingerprint spoofing
- Verify headers match browser exactly
Challenge page never resolves
Cause: JavaScript challenge failing or biometric check triggered.
Fix:
- Use Nodriver instead of Puppeteer
- Add behavioral emulation
- Run in headed mode (not headless)
Session invalidated between requests
Cause: rbzid cookie not persisting or expired.
Fix:
- Implement proper cookie persistence
- Refresh sessions before expiration
- Ensure cookies are sent with every request
Blocked after several successful requests
Cause: Rate limiting or behavioral anomaly detection.
Fix:
- Increase delays between requests
- Add random variation to timing
- Rotate IPs more frequently
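A minimal pacing sketch covering the first two fixes, reusing the Method 3 rotator and Method 1 headers (the URLs are placeholders):

```python
import random
import time


def human_pause(base=2.0, jitter=3.0):
    """Sleep a randomized interval so request timing is not perfectly uniform."""
    time.sleep(base + random.uniform(0, jitter))


for url in ['https://target-site.com/page1', 'https://target-site.com/page2']:
    response = rotator.make_request(url, headers=create_browser_headers())
    human_pause()
```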
"Server: Reblaze Secure Web Gateway" but page loads
Cause: Reblaze is present but monitoring only.
Fix:
- Continue with current method
- Monitor for changes
- The site may have light protection
Ethical Considerations
Before bypassing Reblaze or any protection system, consider:
Respect robots.txt and ToS
Most sites have Terms of Service that address scraping. Violating these may have legal implications.
Check robots.txt for scraping guidelines. Many sites explicitly allow certain scrapers.
Use responsibly
Do:
- Scrape public data for legitimate purposes
- Implement rate limiting even when you can avoid it
- Cache data to minimize requests
- Identify yourself with a contact email when appropriate
Don't:
- Scrape personal or private data without consent
- Overload servers with requests
- Resell scraped data without rights
- Bypass protection on government, healthcare, or financial sites inappropriately
When to use official APIs instead
If a site offers an API, use it. APIs are:
- Faster and more reliable
- Legal and ToS-compliant
- Often free for reasonable usage
Recommended Tools and Resources
Libraries and frameworks
| Tool | Language | Best For |
|---|---|---|
| Nodriver | Python | Maximum stealth browser automation |
| Puppeteer Stealth | Node.js | JavaScript challenge bypass |
| tls_client | Python | TLS fingerprint spoofing |
| curl_cffi | Python | Browser impersonation |
| ghost-cursor | Node.js | Human-like mouse movements |
| fake-useragent | Python | User-Agent rotation |
Proxy providers
For residential proxies, consider providers that offer:
- Large IP pools
- Geographic targeting
- Session control
- Competitive pricing
Conclusion
Reblaze is a sophisticated WAF with multi-layered detection. Bypassing it requires combining multiple techniques.
Start simple: Headers + residential proxies work for many sites.
Escalate when needed: Add Nodriver or Puppeteer Stealth for JavaScript challenges.
Go advanced sparingly: Behavioral emulation is powerful but slow—save it for tough targets.
Quick reference
| Protection Level | Solution Stack |
|---|---|
| Light (ACL only) | Headers + Proxies |
| Medium (JS challenges) | Nodriver + Proxies |
| Heavy (Biometric) | Nodriver + Behavioral + Proxies |
The key is starting with the simplest effective method and escalating only when necessary.