How to Use Nodriver for Web Scraping in 7 Steps

Looking to scrape modern websites that block traditional web scraping tools? You're not alone. As websites implement increasingly sophisticated anti-bot measures like Cloudflare, traditional tools like Selenium often fail to get past even basic protection.

This is where Nodriver comes in - the official successor to Undetected-Chromedriver that's revolutionizing how we approach web scraping in 2025.

I've personally used Nodriver to successfully scrape data from heavily protected sites that blocked every other tool I tried. In this guide, I'll show you the exact process that helped me bypass detection and extract data reliably.

What you'll learn:

  • How to install and set up Nodriver properly
  • Essential configuration for avoiding detection
  • Techniques for scraping both static and dynamic content
  • Methods to handle pagination and infinite scrolling
  • Best practices for staying undetected
  • How to handle common challenges and errors

Why You Can Trust This Method

Problem: Modern websites use sophisticated anti-bot systems that detect and block traditional scrapers. Selenium with ChromeDriver leaves obvious fingerprints that are easily detected.

Solution: Nodriver provides direct browser communication without the traditional webdriver, making it much harder to detect while offering better performance.

Proof: Nodriver's own documentation describes it as "optimized to stay undetected for most anti-bot solutions" and as the official "successor to Undetected-Chromedriver". In practice it gets past protection from Cloudflare, Imperva, and other major WAFs that routinely block traditional scrapers.
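
If you want a quick sanity check of that claim on your own machine, here's a minimal sketch using the same Nodriver calls introduced later in this guide. It evaluates navigator.webdriver, the classic flag that WebDriver-driven browsers expose as true; with Nodriver it should come back false or undefined. Treat it as a rough indicator only - fingerprinting systems look at far more than this one property.

import nodriver as uc

async def check_fingerprint():
    browser = await uc.start()
    tab = await browser.get('https://example.com')

    # navigator.webdriver is the classic giveaway that WebDriver-based
    # automation (e.g. Selenium) exposes as true
    flag = await tab.evaluate('navigator.webdriver')
    print(f"navigator.webdriver reports: {flag}")

    await browser.close()

if __name__ == '__main__':
    uc.loop().run_until_complete(check_fingerprint())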

Step 1: Install Nodriver and Set Up Your Environment

First, let's get Nodriver installed and prepare your development environment.

Prerequisites

  • Python 3.12 or higher
  • Chrome browser installed (preferably in the default location)
  • Basic knowledge of Python and asyncio

Installation

Install Nodriver using pip:

pip install nodriver

Set up your project structure

Create a new directory for your project and set up a virtual environment:

mkdir nodriver-scraper
cd nodriver-scraper
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

Common pitfall to avoid

When running on a headless machine - for example on AWS or any other environment without a display - it's best to use a virtual display tool such as Xvfb to emulate a screen. This prevents display-related errors.
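
One low-friction option is the third-party pyvirtualdisplay package, a Python wrapper around Xvfb (you'll also need Xvfb itself installed through your system package manager - neither is part of Nodriver). A rough sketch:

# pip install pyvirtualdisplay   (requires Xvfb on the system)
from pyvirtualdisplay import Display
import nodriver as uc

async def main():
    browser = await uc.start()
    tab = await browser.get('https://example.com')
    await tab.wait(2)
    await browser.close()

if __name__ == '__main__':
    # Start a virtual display so Chrome has a screen to render into
    display = Display(visible=0, size=(1920, 1080))
    display.start()
    try:
        uc.loop().run_until_complete(main())
    finally:
        display.stop()

Alternatively, leave your script untouched and launch it with xvfb-run -a python scraper.py.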

Step 2: Create Your First Nodriver Browser Instance

Now let's create a basic Nodriver script that launches a browser.

import nodriver as uc

async def main():
    # Start the browser
    browser = await uc.start()
    
    # Create a new tab
    tab = await browser.get('https://example.com')
    
    # Wait a moment for the page to load
    await tab.wait(2)
    
    # Take a screenshot to verify it worked
    await tab.save_screenshot('example.png')
    
    # Close the browser
    await browser.close()

if __name__ == '__main__':
    # Run the async function
    uc.loop().run_until_complete(main())

Tips for this step

  • Nodriver talks to the browser directly, so there's no Selenium or ChromeDriver binary to install or manage
  • The browser runs in non-headless mode by default, which is less detectable
  • Use the headless=True parameter in uc.start() only when necessary

Advanced configuration options

browser = await uc.start(
    headless=False,  # Run with GUI (more stealthy)
    user_data_dir="/path/to/profile",  # Use existing browser profile
    browser_args=['--disable-blink-features=AutomationControlled'],
    lang="en-US"
)

Step 3: Navigate and Extract Basic Data

Let's scrape some actual data from a webpage.

import nodriver as uc

async def scrape_products():
    browser = await uc.start()
    
    # Navigate to the target page
    tab = await browser.get('https://scrapingcourse.com/ecommerce/')
    
    # Wait for products to load
    await tab.wait(2)
    
    # Extract product information
    products = []
    
    # Find all product containers
    product_elements = await tab.select_all('li.product')
    
    for product in product_elements:
        # Extract product details
        name_elem = await product.query_selector('h2')
        price_elem = await product.query_selector('.price')
        
        if name_elem and price_elem:
            name = await name_elem.text
            price = await price_elem.text
            
            products.append({
                'name': name,
                'price': price
            })
    
    print(f"Found {len(products)} products")
    for product in products:
        print(f"- {product['name']}: {product['price']}")
    
    await browser.close()
    return products

if __name__ == '__main__':
    uc.loop().run_until_complete(scrape_products())

Best practices for element selection

  • Use CSS selectors for finding elements: tab.select() for single elements and tab.select_all() for multiple
  • Always check if elements exist before trying to extract data
  • Use await element.text to get text content

Common pitfall to avoid

Don't use overly specific selectors that might break if the website structure changes slightly. Prefer class names and data attributes over complex hierarchical selectors.
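
For example, both selectors below might match the same price element on a hypothetical page today, but only the class-based one is likely to survive a redesign. The selectors are illustrative and assume the tab object from the earlier examples:

# Brittle: breaks as soon as the markup hierarchy shifts
fragile = await tab.select('div > div:nth-child(3) > ul > li:nth-child(2) > span')

# Robust: targets stable class names instead of page structure
stable = await tab.select('li.product .price')

if stable:
    print("Found the price element")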

Step 4: Handle Dynamic Content and JavaScript

Many modern websites load content dynamically. Here's how to handle that:

import nodriver as uc

async def scrape_dynamic_content():
    browser = await uc.start()
    tab = await browser.get('https://example.com/dynamic-page')
    
    # Wait for specific element to appear
    await tab.wait_for('div.dynamic-content', timeout=10)
    
    # Execute JavaScript if needed
    result = await tab.evaluate('document.querySelector(".counter").innerText')
    print(f"Counter value: {result}")
    
    # Interact with the page
    button = await tab.select('button.load-more')
    if button:
        await button.click()
        # Wait for new content to load
        await tab.wait(2)
    
    # Scroll to trigger lazy loading
    await tab.scroll_down(500)
    
    await browser.close()

if __name__ == '__main__':
    uc.loop().run_until_complete(scrape_dynamic_content())

Tips for handling dynamic content

  • Use wait_for() to wait for specific elements to appear (a timeout-guard sketch follows this list)
  • Nodriver can execute JavaScript via tab.evaluate(), which is handy for pulling values out of dynamic pages
  • Always add appropriate waits after interactions
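
Since wait_for() gives up after its timeout, it's worth guarding the call so a slow page doesn't crash the whole run. A rough sketch, assuming it raises asyncio.TimeoutError when the element never appears (verify the behavior against your installed Nodriver version):

import asyncio

async def wait_for_content(tab, selector='div.dynamic-content'):
    """Wait for a dynamic element, but fail gracefully if it never shows up."""
    try:
        await tab.wait_for(selector, timeout=10)
        return True
    except asyncio.TimeoutError:
        # Element never appeared - log it and let the caller decide what to do
        print(f"Timed out waiting for {selector}")
        return False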

Step 5: Implement Pagination and Scrolling

Here's how to handle pagination and infinite scrolling:

Pagination Example

async def scrape_with_pagination():
    browser = await uc.start()
    tab = await browser.get('https://scrapingcourse.com/ecommerce/')
    
    all_products = []
    page_num = 1
    
    while True:
        print(f"Scraping page {page_num}...")
        
        # Extract products from current page
        products = await tab.select_all('li.product')
        for product in products:
            name = await product.query_selector('h2')
            if name:
                all_products.append(await name.text)
        
        # Check for next page button
        next_button = await tab.select('a.next')
        if not next_button:
            print("No more pages")
            break
            
        # Click next page
        await next_button.click()
        await tab.wait(2)  # Wait for page to load
        page_num += 1
    
    print(f"Total products scraped: {len(all_products)}")
    await browser.close()
    return all_products

Infinite Scrolling Example

async def scrape_infinite_scroll():
    browser = await uc.start()
    tab = await browser.get('https://scrapingcourse.com/infinite-scrolling')
    
    products = []
    last_height = 0
    
    while True:
        # Scroll to bottom
        await tab.evaluate('window.scrollTo(0, document.body.scrollHeight)')
        await tab.wait(2)  # Wait for new content to load
        
        # Check if new content was loaded
        new_height = await tab.evaluate('document.body.scrollHeight')
        if new_height == last_height:
            print("No more content to load")
            break
        
        last_height = new_height
        
        # Extract newly loaded products
        product_elements = await tab.select_all('.product-item')
        current_count = len(product_elements)
        print(f"Products loaded: {current_count}")
    
    # Extract all product data
    for elem in product_elements:
        name = await elem.query_selector('.product-name')
        price = await elem.query_selector('.product-price')
        if name and price:
            products.append({
                'name': await name.text,
                'price': await price.text
            })
    
    await browser.close()
    return products

Best practice tip

Ensure you wait for newly loaded elements to appear before scraping them; otherwise you'll miss dynamically loaded content.
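
One way to do that, instead of relying on a fixed sleep, is to poll until the element count stops growing. Here's a small helper sketch built only on the select_all() and wait() calls used above; the selector and timings are illustrative:

async def wait_for_new_items(tab, selector='.product-item', previous_count=0, attempts=5):
    """Poll until more items than previous_count are present, or give up."""
    for _ in range(attempts):
        elements = await tab.select_all(selector)
        if len(elements) > previous_count:
            return elements  # new content has arrived
        await tab.wait(1)    # brief pause before checking again
    return await tab.select_all(selector)  # return whatever is there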

Step 6: Save and Export Your Data

Once you've scraped the data, you need to save it properly:

import csv
import json
import nodriver as uc
from datetime import datetime

async def scrape_and_save():
    browser = await uc.start()
    tab = await browser.get('https://scrapingcourse.com/ecommerce/')
    
    # Scrape data
    products = []
    product_elements = await tab.select_all('li.product')
    
    for elem in product_elements:
        name = await elem.query_selector('h2')
        price = await elem.query_selector('.price')
        image = await elem.query_selector('img')
        
        if all([name, price, image]):
            products.append({
                'name': await name.text,
                'price': await price.text,
                'image_url': await image.get_attribute('src'),
                'scraped_at': datetime.now().isoformat()
            })
    
    # Save to CSV
    with open('products.csv', 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=['name', 'price', 'image_url', 'scraped_at'])
        writer.writeheader()
        writer.writerows(products)
    
    # Save to JSON
    with open('products.json', 'w', encoding='utf-8') as f:
        json.dump(products, f, indent=2, ensure_ascii=False)
    
    print(f"Saved {len(products)} products to CSV and JSON files")
    
    await browser.close()

if __name__ == '__main__':
    uc.loop().run_until_complete(scrape_and_save())

Tips for data export

  • Always include timestamps in your scraped data
  • Use UTF-8 encoding to handle special characters
  • Consider using pandas for more complex data manipulation (see the sketch after this list)
  • Implement error handling for file operations
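
If you do reach for pandas, a minimal sketch could look like this. It assumes the products list has the same shape as in the example above, and pandas is an extra dependency:

# pip install pandas
import pandas as pd

def export_with_pandas(products):
    """Deduplicate, sort, and export the scraped products."""
    df = pd.DataFrame(products)
    df = df.drop_duplicates(subset='name').sort_values('name')
    df.to_csv('products_clean.csv', index=False, encoding='utf-8')
    df.to_json('products_clean.json', orient='records', indent=2)
    return df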

Step 7: Scale Your Scraper with Best Practices

To build a production-ready scraper, implement these essential practices:

Error Handling and Retries

import asyncio
import nodriver as uc
from typing import Optional, List, Dict

async def safe_scrape(url: str, max_retries: int = 3) -> Optional[List[Dict]]:
    """Scrape with automatic retry on failure"""
    for attempt in range(max_retries):
        browser = None
        try:
            browser = await uc.start()
            tab = await browser.get(url)

            # Your scraping logic here
            await tab.wait_for('.product', timeout=10)
            products = await extract_products(tab)
            return products

        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt < max_retries - 1:
                await asyncio.sleep(2 ** attempt)  # Exponential backoff
            else:
                print("Max retries reached")
                return None
        finally:
            # Always close the browser, even when the attempt failed
            if browser:
                try:
                    await browser.close()
                except Exception:
                    pass

Rate Limiting and Delays

import random

async def respectful_scraper(urls: List[str]):
    """Scrape multiple URLs with random delays"""
    browser = await uc.start()
    results = []
    
    for url in urls:
        tab = await browser.get(url)
        
        # Random delay between requests (1-3 seconds)
        delay = random.uniform(1, 3)
        await tab.wait(delay)
        
        # Extract data
        data = await extract_data(tab)
        results.append(data)
        
        # Close tab to free memory
        await tab.close()
    
    await browser.close()
    return results

Session Management and Cookies

async def scrape_with_session():
    """Use persistent session with saved cookies"""
    browser = await uc.start(
        user_data_dir="./browser_profile"  # Saves cookies and session
    )
    
    # First visit - might need to log in
    tab = await browser.get('https://example.com/login')
    
    # Check if already logged in
    if not await tab.select('.user-dashboard'):
        # Perform login
        await login(tab)
    
    # Now scrape protected content
    await tab.get('https://example.com/protected-data')
    data = await extract_protected_data(tab)
    
    await browser.close()
    return data

Common pitfalls to avoid

  1. Don't name your script "nodriver.py" - this will cause import errors
  2. Avoid aggressive scraping - respect robots.txt and implement delays
  3. Don't ignore errors - proper error handling prevents data loss
  4. Check for Cloudflare challenges - even Nodriver may encounter Cloudflare captchas on heavily protected sites (a simple detection sketch follows this list)
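
On that last point, a simple heuristic can tell you whether you've landed on a challenge page instead of real content. This sketch assumes Cloudflare's usual "Just a moment..." interstitial title, which can change at any time, and it only helps with the automatic JavaScript challenge - a real captcha still needs a solving service or manual intervention:

async def hit_cloudflare_challenge(tab) -> bool:
    """Rough heuristic: Cloudflare's interstitial usually titles the page 'Just a moment...'."""
    title = await tab.evaluate('document.title')
    return 'just a moment' in str(title).lower()

async def get_with_challenge_check(browser, url, wait_seconds=10):
    tab = await browser.get(url)
    if await hit_cloudflare_challenge(tab):
        # Give the browser a chance to clear the JavaScript challenge on its own
        await tab.wait(wait_seconds)
    return tab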

Pro tips for staying undetected

  • Nodriver's browser.get() conveniently works on the main tab by default, so you aren't opening a new window for every request
  • Rotate user agents and browser profiles
  • Use residential proxies for heavily protected sites (see the proxy sketch after this list)
  • Implement human-like behavior (random delays, mouse movements)
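
For the proxy tip, Chrome accepts a --proxy-server flag that you can pass through browser_args. A minimal sketch with a placeholder proxy address - note that this flag does not carry credentials, so authenticated proxies typically need a local forwarder or a proxy extension:

import nodriver as uc

async def start_with_proxy():
    # Placeholder endpoint - replace with your own proxy
    proxy = 'http://proxy.example.com:8080'
    browser = await uc.start(
        browser_args=[f'--proxy-server={proxy}'],
    )
    tab = await browser.get('https://example.com')
    await tab.wait(2)
    await browser.close()

if __name__ == '__main__':
    uc.loop().run_until_complete(start_with_proxy())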

Final thoughts

You've now learned how to use Nodriver for web scraping, from basic setup to advanced techniques. The key to successful scraping with Nodriver is understanding its asynchronous nature and leveraging its built-in anti-detection features.
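
To make that asynchronous nature concrete, you can drive several tabs from a single browser and gather them concurrently. The sketch below assumes browser.get() accepts a new_tab argument (check your installed version) and uses placeholder URLs:

import asyncio
import nodriver as uc

async def scrape_one(browser, url):
    # new_tab=True opens each URL in its own tab instead of reusing the main one
    tab = await browser.get(url, new_tab=True)
    await tab.wait(2)
    title = await tab.evaluate('document.title')
    await tab.close()
    return title

async def scrape_many(urls):
    browser = await uc.start()
    titles = await asyncio.gather(*(scrape_one(browser, u) for u in urls))
    await browser.close()
    return titles

if __name__ == '__main__':
    urls = ['https://example.com', 'https://example.org']
    print(uc.loop().run_until_complete(scrape_many(urls)))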

Remember that what sets this package apart from other well-known packages is its optimization to stay undetected by most anti-bot solutions. However, always scrape responsibly and respect website terms of service.

Next steps

  • Explore advanced features: Try using Nodriver's xpath selectors and CDP (Chrome DevTools Protocol) access
  • Build a monitoring system: Create scrapers that run on schedule and alert you to changes
  • Scale with cloud deployment: Deploy your scrapers on AWS or similar platforms using Docker
  • Handle complex scenarios: Learn to bypass more sophisticated protections and handle multi-step processes

Ready to start scraping? Download the complete code examples from this tutorial and begin extracting data from even the most protected websites. If you need to handle extremely complex anti-bot systems, consider using a web scraping API like ZenRows or ScrapingBee as a fallback option.

Marius Bernard

Marius Bernard is a Product Advisor, Technical SEO, & Brand Ambassador at Roundproxies. He was the lead author for the SEO chapter of the 2024 Web and a reviewer for the 2023 SEO chapter.