Playwright vs. Selenium in 2025

The main difference between Playwright and Selenium is their underlying architecture. Playwright uses persistent WebSocket connections via the Chrome DevTools Protocol (CDP) for direct browser communication. Selenium relies on the WebDriver API over HTTP requests, adding an intermediary translation layer. This architectural gap translates to 35-45% faster execution for Playwright, lower memory consumption, and native network interception capabilities.

This isn't a features comparison—it's a deep technical breakdown. We'll cover protocol-level differences, real benchmark data, stealth techniques for scraping, and when to skip browsers entirely.

Why Protocol Architecture Actually Matters

Protocol choices determine everything about your automation stack: speed, reliability under load, network interception capabilities, and what anti-bot defenses can detect.

Selenium's approach: Your test script sends HTTP requests to a WebDriver server (ChromeDriver on port 9515, for example). That driver then translates those commands into CDP or equivalent browser-specific protocols.

Playwright's approach: Your script maintains a persistent WebSocket connection directly to the browser using CDP for Chromium, plus custom integrations for Firefox and WebKit.

Every additional hop adds latency. More importantly, it creates state drift opportunities and expands your detectable surface area.

In practice, this determines whether your scraper finishes in 30 minutes or 3 hours.

Selenium's Communication Chain Explained

Selenium's four-step communication chain looks simple on the surface. But trace a single click through the entire flow:

  1. Test Script → WebDriver API
  2. WebDriver API → Browser Driver (ChromeDriver, GeckoDriver)
  3. Browser Driver → Browser (translated to CDP)
  4. Browser → Driver → Script (response returns)

Here's what a simple click actually triggers:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.find_element(By.ID, "submit").click()

Under the hood, this generates:

{
  "method": "POST",
  "url": "/session/xxx/element/yyy/click"
}

That JSON command travels via HTTP to ChromeDriver. ChromeDriver translates it to CDP. Chrome executes. Response bubbles back through all layers.

Measured latency: A simple element click averages ~536ms with Selenium versus ~290ms with Playwright on identical hardware.

That's nearly 2x slower per action.
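
If you want to reproduce that kind of number yourself, a minimal timing sketch looks like the following (the URL and the #submit selector are placeholders; results depend on hardware and network):

import time
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/form")  # placeholder target

    start = time.perf_counter()
    page.click("#submit")  # placeholder selector
    print(f"click latency: {(time.perf_counter() - start) * 1000:.0f} ms")

    browser.close()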

Selenium 4.33: What's New in 2025-2026

Selenium 4.33 introduced several improvements worth noting:

  • Live node previews in Grid UI
  • BiDi webExtension module for deeper browser access
  • Improved Docker/Kubernetes integration for Dynamic Grid
  • W3C WebDriver BiDi support expansion

The BiDi protocol brings Selenium closer to Playwright's capabilities. You can now intercept console messages and network traffic natively.

But the fundamental HTTP-based architecture remains unchanged.


Playwright's Direct CDP Approach

Playwright maintains an always-on WebSocket to the browser. Commands bypass the middleman entirely.

// Playwright - direct WebSocket message
await page.click('#submit');
// No HTTP overhead, no driver translation

This architecture enables capabilities that are difficult or impossible to replicate in Selenium:

  • Native route interception without plugins
  • Request blocking at the protocol level
  • Context isolation without launching new browser instances
  • Auto-waiting built into every action

Playwright 1.57: The 2025-2026 Updates

Playwright 1.57 brought major changes:

  • Chrome for Testing builds instead of Chromium (better detection profiles)
  • Playwright Agents for LLM-driven test generation
  • ARIA snapshot assertions for accessibility testing
  • Trace grouping for visual debugging
  • fail-on-flaky-tests CLI flag

The shift to Chrome for Testing is significant for scraping. Your automation now uses the same browser binaries as real users.

Speed Benchmarks: Real Numbers from Production

These benchmarks ran on identical hardware (16GB RAM, 2.6GHz) against the same dynamic e-commerce site. 100 iterations per tool.

Metric | Selenium 4.33 | Playwright 1.57
Page Load (avg) | 2.8s | 1.9s
Element Click | 536ms | 290ms
Form Fill (10 fields) | 4.2s | 2.1s
Full Page Screenshot | 1.1s | 0.6s
Memory per Instance | 380MB | 215MB

JavaScript-Heavy SPA Results

The gap widens dramatically on React/Vue/Angular applications:

Test Suite | Selenium | Playwright | Playwright + Route Blocking
500 Pages | ~60 min | 35 min | 18 min
Memory Peak | 2.8GB | 1.6GB | 1.2GB
Flaky Tests | 12% | 3% | 2%

Network interception alone cuts execution time by 50% on media-heavy sites.

Bypassing Bot Detection in 2026

Modern anti-bot systems don't just check navigator.webdriver. They correlate hundreds of signals:

  • CDP command patterns
  • WebSocket fingerprints
  • Timing fingerprints
  • GPU/codec characteristics
  • TLS fingerprints
  • Mouse movement patterns

Why Default Selenium Gets Blocked

Out-of-the-box Selenium leaks obvious fingerprints:

// Detection vectors in default Selenium
navigator.webdriver === true;  // Dead giveaway
'cdc_adoQpoasnfa76pfcZLmcfl_Array' in window;  // ChromeDriver-injected property
navigator.plugins.length === 0;  // Common headless marker

These leaks exist for different reasons: the W3C WebDriver spec requires navigator.webdriver to be true in automated sessions, while the cdc_ property is simply an artifact of how ChromeDriver injects its helper scripts.

Undetected ChromeDriver for Selenium

import undetected_chromedriver as uc

options = uc.ChromeOptions()
options.add_argument("--disable-blink-features=AutomationControlled")

driver = uc.Chrome(options=options)
driver.get("https://target-site.com")

This patches most detection vectors automatically.

Success rates (approximate):

Protection Level | Success Rate
Basic bot detection | ~95%
Cloudflare (standard) | ~70%
DataDome | ~35%
PerimeterX | ~30%

Playwright Stealth Mode

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    stealth_sync(page)  # Applies evasion patches
    page.goto("https://target-site.com")

The playwright-stealth library ports puppeteer-extra-plugin-stealth evasion modules to Playwright.

The CDP Detection Problem

Here's what most guides miss: CDP itself is detectable.

Advanced anti-bot systems watch for:

  • Runtime.enable command patterns
  • CDP connection signatures
  • WebSocket handshake characteristics

Opening Chrome DevTools on a detection test site often triggers a bot flag on its own, because DevTools issues the same CDP commands (Runtime.enable among them) that these systems watch for in automation.

Patchright: The CDP Patching Fork

Patchright modifies Playwright internals to avoid sending Runtime.enable:

from patchright.async_api import async_playwright

async def stealth_browse():
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            channel="chrome",  # Real Chrome, not Chromium
            headless=False,
            args=["--disable-blink-features=AutomationControlled"]
        )
        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080}
        )
        page = await context.new_page()
        await page.goto("https://cloudflare-protected-site.com")

This reduces detection on CreepJS from 100% to approximately 67%.

Still not bulletproof—but significantly better.

Network Interception: The Performance Hack Nobody Uses

Playwright's route interception is the single biggest optimization for JavaScript-heavy scraping.

Block Non-Essential Assets

// Block images, CSS, fonts - instant 40% speed boost
await page.route('**/*.{png,jpg,jpeg,gif,css,woff,woff2}', route => route.abort());

Keep Only API Responses

await page.route('**/*', route => {
    const type = route.request().resourceType();
    return ['document', 'xhr', 'fetch'].includes(type) 
        ? route.continue() 
        : route.abort();
});

This approach:

  • Reduces page load times by 30-50%
  • Cuts bandwidth by 60-80%
  • Improves parallel execution efficiency
  • Reduces memory pressure

Selenium's CDP Workaround (Limited)

Selenium 4 added CDP hooks, but they're clunky:

# Block images via CDP in Selenium (LOCAL ONLY)
driver.execute_cdp_cmd('Network.enable', {})
driver.execute_cdp_cmd('Network.setBlockedURLs', {
    'urls': ['*.jpg', '*.png', '*.gif', '*.css']
})

Critical limitation: This only works with local ChromeDriver. Remote WebDriver/Grid loses CDP access.

The HTTP-Only Alternative (Skip Browsers Entirely)

The fastest browser automation is no browser at all.

If the target exposes JSON endpoints or renders server-side, go straight to HTTP:

import asyncio
import httpx
from selectolax.parser import HTMLParser

# 10x faster than any browser automation
async def fetch_products():
    async with httpx.AsyncClient() as client:
        response = await client.get(
            'https://api.example.com/products',
            headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)...'}
        )

    html = HTMLParser(response.text)
    return html.css('.product-card')

products = asyncio.run(fetch_products())

When to Skip Browsers

Skip browser automation when:

  • API endpoints are accessible or easily reverse-engineered
  • Content is server-rendered (not React/Vue/Angular)
  • No complex client-side interactions needed
  • Cost or scale constraints exist

HTTPX vs Requests

HTTPX offers advantages over the classic requests library:

Feature | Requests | HTTPX
HTTP/2 Support | No | Yes
Async Support | No | Yes
Connection Pooling | Basic | Advanced
Timeout Handling | Basic | Granular

HTTP/2 support alone can reduce block rates. Many anti-bot systems flag HTTP/1.1 connections from suspicious IPs.
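
Turning on HTTP/2 in HTTPX is a one-line change; note it needs the optional h2 extra (pip install "httpx[http2]"). A minimal sketch:

import httpx

# Negotiates HTTP/2 over TLS when the server supports it
client = httpx.Client(http2=True)
response = client.get("https://example.com")
print(response.http_version)  # "HTTP/2" on supporting servers, else "HTTP/1.1"
client.close()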

curl_cffi for TLS Fingerprinting

When even HTTPX gets blocked, curl_cffi mimics real browser TLS fingerprints:

from curl_cffi import requests

# Mimics Chrome TLS fingerprint
response = requests.get(
    "https://protected-site.com",
    impersonate="chrome"
)

This bypasses TLS fingerprinting defenses that flag Python HTTP clients.

Real-World Decision Matrix

Use Playwright When:

  • Speed is critical: 35-45% faster execution
  • JavaScript-heavy targets: React, Vue, Angular SPAs
  • Resource constraints: 44% less memory per instance
  • Parallel scraping: Better efficiency and fewer timeouts
  • Network manipulation needed: First-class route interception

Stick with Selenium When:

  • Legacy browser support: Older Safari, IE11 scenarios
  • Existing infrastructure: Selenium Grid already deployed
  • Enterprise mandates: Org-wide Selenium standardization
  • Real device testing: Mobile device farms
  • Team expertise: Years of Selenium utilities built

Skip Both (Use HTTP) When:

  • API access possible: Reverse-engineered endpoints
  • Static HTML: Server-rendered content
  • Extreme scale: Cost-sensitive, thousands of pages
  • Simple data: No interactions needed

Advanced Techniques That Actually Work

Hybrid Approach: Browser Login, HTTP Scrape

Log in with Playwright, scrape with requests. Best of both worlds.

from playwright.async_api import async_playwright
import httpx

async def hybrid_scrape():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        
        # Handle login with browser
        await page.goto('https://example.com/login')
        await page.fill('#username', 'user')
        await page.fill('#password', 'pass')
        await page.click('#submit')
        
        # Extract cookies
        cookies = await page.context.cookies()
        await browser.close()
    
    # Switch to httpx for actual scraping (10x faster)
    async with httpx.AsyncClient() as client:
        for cookie in cookies:
            client.cookies.set(cookie['name'], cookie['value'])
        
        response = await client.get('https://example.com/api/data')
        return response.json()

Browsers handle the hard parts (auth, CAPTCHA, JS rendering). HTTP handles volume.

The CDP Bridge: Selenium + Playwright

Combine Selenium's familiarity with Playwright's network superpowers:

from selenium import webdriver
from playwright.sync_api import sync_playwright

# Start with Selenium, exposing Chrome's DevTools endpoint
options = webdriver.ChromeOptions()
options.add_argument('--remote-debugging-port=9222')
driver = webdriver.Chrome(options=options)
driver.get('https://example.com')

# Connect Playwright to the same browser via CDP
# (connect_over_cdp needs the DevTools port, not ChromeDriver's WebDriver port)
playwright = sync_playwright().start()
browser = playwright.chromium.connect_over_cdp("http://localhost:9222")

# Use Playwright's superior API on Selenium's browser
page = browser.contexts[0].pages[0]
page.route('**/*.png', lambda route: route.abort())

Useful during migrations or when Grid infrastructure constraints exist.

Fingerprint Rotation at Scale

Rotate high-signal traits across browser contexts:

const contexts = [];
const userAgents = [/* array of real UAs */];
const locales = ['en-US', 'en-GB', 'de-DE', 'fr-FR'];
const timezones = ['America/New_York', 'Europe/London', 'Asia/Tokyo'];

for (let i = 0; i < 10; i++) {
    const context = await browser.newContext({
        viewport: { 
            width: 1920 + Math.floor(Math.random() * 100), 
            height: 1080 + Math.floor(Math.random() * 100)
        },
        userAgent: userAgents[Math.floor(Math.random() * userAgents.length)],
        locale: locales[Math.floor(Math.random() * locales.length)],
        timezoneId: timezones[Math.floor(Math.random() * timezones.length)],
    });
    contexts.push(context);
}

Pair with randomized delays and request throttling to mimic organic traffic.
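
On the pacing side, even a simple jittered delay before each navigation helps avoid a metronomic request cadence. A minimal sketch (the delay range is an arbitrary example; tune it per target):

import asyncio
import random

async def polite_goto(page, url, min_delay=1.5, max_delay=6.0):
    # Random pause so requests don't arrive on a fixed rhythm
    await asyncio.sleep(random.uniform(min_delay, max_delay))
    return await page.goto(url)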

Performance Optimization Tips

Playwright Optimization Checklist

  1. Block non-essential assets (images, fonts, analytics)
  2. Disable CSS animations via context settings
  3. Use page.waitForResponse instead of time.sleep()
  4. Prefer locators over CSS selectors for retry efficiency
  5. Minimize full page reloads using SPA transitions

// Optimized route blocking
await page.route('**/*', route => {
    const r = route.request();
    const type = r.resourceType();
    const url = r.url();

    if (type === 'document') return route.continue();
    if (['xhr', 'fetch'].includes(type)) {
        if (url.includes('/api/') || url.includes('/graphql')) {
            return route.continue();
        }
    }
    return route.abort();
});
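
For item 3 in the checklist, the Python spelling of waitForResponse is page.expect_response (or page.wait_for_response). A minimal sketch, assuming the data you need arrives via an /api/ call triggered by a click on a hypothetical #load-more button:

# Wait for the exact response you need instead of sleeping a fixed duration
with page.expect_response(lambda r: "/api/" in r.url and r.status == 200) as resp_info:
    page.click("#load-more")  # hypothetical trigger element
data = resp_info.value.json()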

Selenium Optimization Checklist

  1. Use CDP locally to block heavy assets
  2. Adopt undetected-chromedriver for stealth
  3. Run headful when headless gets blocked
  4. Spread load with realistic pacing
  5. Replace sleeps with explicit waits

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC

# Explicit wait cuts flaky retries
wait = WebDriverWait(driver, 15)
element = wait.until(EC.element_to_be_clickable((By.ID, "submit")))
element.click()

Proxy Integration for Scale

When scraping at scale, proxies become mandatory. If you're using Roundproxies.com, here's integration for both tools:

Playwright with Residential Proxies

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        proxy={
            "server": "http://proxy.roundproxies.com:port",
            "username": "user",
            "password": "pass"
        }
    )
    page = browser.new_page()
    page.goto("https://target-site.com")

Selenium with Rotating Proxies

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--proxy-server=http://proxy.roundproxies.com:port')

driver = webdriver.Chrome(options=options)

Residential and mobile proxies reduce block rates by 60-80% compared to datacenter IPs.

Common Pitfalls (And How to Avoid Them)

Chasing Feature Lists Instead of Architecture

A shiny API doesn't change the physics of your network path. Evaluate the underlying protocol before comparing feature checkboxes.

Ignoring Detection Vectors

Even "stealth" plugins leave trails. Treat evasion as probabilistic, never guaranteed. Test against detection sites like BrowserScan and CreepJS regularly.

Over-Fetching Assets

Loading images, fonts, and CSS on a headless scraper torpedoes throughput. Block everything non-essential by default.

Global Sleep Statements

Replace time.sleep(2) with event-driven waits:

# Bad - wastes 2 seconds every time
time.sleep(2)
element.click()

# Good - waits only until ready
await page.wait_for_selector('#element', state='visible')
await page.click('#element')

One-Size-Fits-All Stacks

Mix Playwright, Selenium, and HTTP clients tactically. Different tools for different phases of the same workflow.

Browser Context Management

Playwright's browser contexts are a game-changer for parallel scraping. Each context is an isolated session with its own cookies, localStorage, and cache.

Creating Isolated Contexts

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    
    # Each context is fully isolated
    context1 = browser.new_context()
    context2 = browser.new_context()
    
    page1 = context1.new_page()
    page2 = context2.new_page()
    
    # Different sessions, same browser instance
    page1.goto('https://site.com/user1')
    page2.goto('https://site.com/user2')

Context creation takes milliseconds. Browser launch takes seconds. This architectural difference enables efficient parallel workflows.
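
You can confirm the gap on your own machine with a quick timing check; absolute numbers vary by hardware, but the ratio is what matters. A minimal sketch:

import time
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    start = time.perf_counter()
    browser = p.chromium.launch()
    print(f"browser launch: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    context = browser.new_context()
    print(f"new context: {(time.perf_counter() - start) * 1000:.0f}ms")

    browser.close()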

Memory Comparison

Approach | Memory per "Session"
New Selenium Browser | ~380MB
New Playwright Browser | ~215MB
New Playwright Context | ~15MB

For 50 parallel sessions:

  • Selenium: ~19GB memory
  • Playwright (new browsers): ~10.7GB
  • Playwright (contexts): ~750MB + browser overhead

Context-based parallelism is 10-25x more memory efficient.
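
Putting that efficiency to work is straightforward with the async API: one browser, one context per task, gathered concurrently. A minimal sketch (the URLs and the extraction step are placeholders):

import asyncio
from playwright.async_api import async_playwright

async def scrape_one(browser, url):
    context = await browser.new_context()  # ~15MB instead of a full browser launch
    try:
        page = await context.new_page()
        await page.goto(url)
        return await page.title()  # placeholder for real extraction
    finally:
        await context.close()

async def scrape_all(urls):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            return await asyncio.gather(*(scrape_one(browser, u) for u in urls))
        finally:
            await browser.close()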

Handling Authentication at Scale

Session Storage Extraction

# Save auth state after login
storage = await context.storage_state(path="auth.json")

# Reuse auth state in new contexts
new_context = await browser.new_context(storage_state="auth.json")

This eliminates login overhead for subsequent scraping runs.

Multi-Account Rotation

from playwright.async_api import async_playwright

auth_states = ["user1.json", "user2.json", "user3.json"]

async def scrape_with_rotation(urls):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        
        for i, url in enumerate(urls):
            auth_file = auth_states[i % len(auth_states)]
            context = await browser.new_context(storage_state=auth_file)
            page = await context.new_page()
            
            await page.goto(url)
            # Extract data
            
            await context.close()

Rotate accounts to distribute rate limits across multiple authenticated sessions.

Error Handling and Retry Logic

Playwright Error Handling

import asyncio
from playwright.async_api import async_playwright, TimeoutError as PlaywrightTimeoutError

async def resilient_scrape(url, max_retries=3):
    for attempt in range(max_retries):
        try:
            async with async_playwright() as p:
                browser = await p.chromium.launch()
                try:
                    page = await browser.new_page()
                    response = await page.goto(url, timeout=30000)

                    if response.status == 403:
                        # Likely blocked - rotate proxy
                        raise Exception("Blocked - rotate proxy")

                    return await page.content()
                finally:
                    await browser.close()

        except PlaywrightTimeoutError:
            if attempt < max_retries - 1:
                await asyncio.sleep(2 ** attempt)  # Exponential backoff
                continue
            raise

Selenium Retry Pattern

from selenium import webdriver
from selenium.common.exceptions import TimeoutException, WebDriverException
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    retry=retry_if_exception_type((TimeoutException, WebDriverException)),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
def resilient_selenium_scrape(url):
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # quit exactly once; tenacity retries the whole function

CI/CD Integration Patterns

GitHub Actions with Playwright

name: Scraping Pipeline
on:
  schedule:
    - cron: '0 */6 * * *'  # Every 6 hours

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: npm run scrape
      - uses: actions/upload-artifact@v4
        with:
          name: scraped-data
          path: output/

Docker Deployment

FROM mcr.microsoft.com/playwright/python:v1.57.0-noble

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
CMD ["python", "scraper.py"]

Playwright's official Docker images include all browser dependencies pre-installed.

Debugging and Monitoring

Playwright Trace Viewer

Playwright's trace viewer captures every action, network request, and DOM snapshot:

# Enable tracing
await context.tracing.start(screenshots=True, snapshots=True)

# Your scraping code
await page.goto('https://example.com')
await page.click('#button')

# Save trace
await context.tracing.stop(path="trace.zip")

Open the trace file at trace.playwright.dev for visual debugging.
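
If you prefer to stay offline, the bundled CLI opens the same file locally: npx playwright show-trace trace.zip.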

Selenium Logging

from selenium.webdriver.remote.remote_connection import LOGGER
import logging

LOGGER.setLevel(logging.DEBUG)

# Now see all WebDriver commands in logs
driver = webdriver.Chrome()

Monitoring Scraper Health

Track these metrics in production:

Metric | Warning Threshold | Critical Threshold
Success Rate | <95% | <85%
Avg Response Time | >5s | >10s
Memory Usage | >80% | >95%
Error Rate | >5% | >15%

Set up alerts when thresholds trigger. Early detection prevents cascading failures.
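
A minimal sketch of the success-rate check, assuming you already count attempts and failures per run (the alert() call is a hypothetical stand-in for your paging or Slack integration):

def check_health(attempts: int, failures: int, warn_at=0.95, crit_at=0.85):
    # Compare the observed success rate to the thresholds in the table above
    success_rate = (attempts - failures) / attempts if attempts else 1.0
    if success_rate < crit_at:
        alert(f"CRITICAL: success rate {success_rate:.1%}")  # hypothetical alert hook
    elif success_rate < warn_at:
        alert(f"WARNING: success rate {success_rate:.1%}")
    return success_rate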

Real-World Case Studies

E-commerce Price Monitoring

Challenge: Monitor 50,000 product prices daily across 12 retailers with varying anti-bot protection.

Solution Stack:

  • Playwright with route blocking for JavaScript-heavy sites
  • HTTPX for static HTML retailers
  • Rotating residential proxies
  • Redis queue for URL distribution

Results:

  • 4-hour total scrape time (down from 18 hours with Selenium)
  • 97% success rate
  • $340/month infrastructure cost

Real Estate Data Aggregation

Challenge: Extract listings from 200+ local MLS sites, many with CAPTCHA protection.

Solution Stack:

  • Selenium with undetected-chromedriver for authenticated sections
  • Playwright for public listing pages
  • 2Captcha integration for CAPTCHA solving
  • PostgreSQL for deduplication

Results:

  • 2.3M listings processed weekly
  • 89% automation rate (11% required manual CAPTCHA solving)

Quick Reference: Copy-Paste Snippets

Playwright: Block Heavy Assets

await page.route('**/*.{png,jpg,jpeg,gif,css,woff,woff2}', r => r.abort());

Playwright: Keep Only API + Document

await page.route('**/*', r => {
    const t = r.request().resourceType();
    return (t === 'document' || t === 'xhr' || t === 'fetch')
        ? r.continue()
        : r.abort();
});

Selenium 4: Block Assets via CDP (Local)

driver.execute_cdp_cmd('Network.enable', {})
driver.execute_cdp_cmd('Network.setBlockedURLs', {
    'urls': ['*.png', '*.jpg', '*.gif', '*.css']
})

Hybrid: Login with Playwright, Scrape with HTTPX

# (See full example in Advanced Techniques section)
session = httpx.AsyncClient()
# Set cookies from Playwright context
response = await session.get('https://example.com/api/data')

Bridge: Connect Playwright over Selenium's CDP

# Chrome must be launched with --remote-debugging-port=9222 (see Advanced Techniques)
browser = playwright.chromium.connect_over_cdp("http://localhost:9222")

FAQ

Is Playwright faster than Selenium for SPAs?

Yes. Benchmarks consistently show 35-45% faster execution on React/Vue/Angular applications. The WebSocket-based architecture eliminates HTTP overhead per action.

How does Playwright's WebSocket Protocol improve speed?

It removes the WebDriver HTTP translation layer. Commands go directly to the browser via CDP. Fewer hops means lower latency and better reliability under load.

Can Selenium 4 match Playwright's request blocking?

Partially, via execute_cdp_cmd() on local ChromeDriver. Remote WebDriver and Selenium Grid don't support CDP commands, limiting this capability in distributed setups.

What about detection by Cloudflare, DataDome, PerimeterX?

Undetected ChromeDriver helps with light-to-medium protection. For aggressive anti-bot systems, expect an ongoing arms race. Consider specialized tools like Patchright, or evaluate if HTTP-based scraping can bypass the browser entirely.

When should I skip browser automation entirely?

When the target has accessible API endpoints, server-rendered HTML, or easily reverse-engineered calls. HTTPX/requests with proper headers will be 10x faster and significantly cheaper at scale.

Final Verdict: It's Not Either/Or

The practical approach in 2026 isn't tribal—it's compositional:

  • Playwright for scraping: Speed, modern JS sites, native network interception, smaller memory footprint
  • Selenium for testing: Cross-browser breadth, entrenched Grid infrastructure, legacy compatibility
  • HTTPX/requests for APIs: When you can bypass the browser entirely
  • Specialized tools: When anti-bot pressure demands CDP patching or managed browser services

The biggest mistake? Choosing by features instead of architecture.

Playwright's WebSocket-first design isn't just "faster." It reshapes what's possible: reliable request shaping, higher concurrency, smarter evasion.

Selenium remains valuable where standards, org mandates, and device labs matter.

Smart teams stitch them together, measure ruthlessly, and let the workload decide.