How to Use Puppeteer Extra in 5 Steps (2025)

Puppeteer Extra is like Puppeteer’s more capable cousin: a thin wrapper around Puppeteer that adds a modular plugin system. Those plugins give you the tools to tackle the real-world challenges of web scraping, including stealth mode, ad blocking, CAPTCHA solving, and getting past anti-bot walls.

If you've ever run into roadblocks using plain Puppeteer, this guide is going to be your new best friend.

Let’s be honest: web scraping today isn’t as simple as it used to be. Many modern websites are on high alert for bots, and basic Puppeteer setups are often flagged and blocked almost immediately. You might’ve already hit issues with CAPTCHAs, sluggish performance from loading too many ads, or even full-on Cloudflare blocks.

Standard Puppeteer is great for automation—but when it comes to scraping websites that fight back, it just doesn’t cut it. You’re up against:

  • Anti-bot systems that flag your scripts
  • Persistent CAPTCHA prompts
  • Ad-heavy pages that waste time and bandwidth
  • Browser fingerprinting traps
  • Authentication flows that seem impossible to get past

That’s where Puppeteer Extra shines. In this guide, we’ll walk you through 5 essential steps to get the most out of Puppeteer Extra. You’ll learn how to install it, set it up with powerful plugins, and build a scraping setup that’s hard to detect and easy to scale.

Contents

❖ Why You Can Trust This Guide
❖ Step 1: Install and Set Up Puppeteer Extra
❖ Step 2: Master the Stealth Plugin for Avoiding Detection
❖ Step 3: Implement Automatic CAPTCHA Solving
❖ Step 4: Optimize Performance with Resource Blocking
❖ Step 5: Combine Multiple Plugins for Advanced Scraping
❖ Final Thoughts

Why You Can Trust This Guide

Here’s the deal—websites are evolving fast. They’re using smarter tools to detect bots and shut down scraping scripts. That’s the challenge.

The good news? Puppeteer Extra is designed to meet that challenge head-on. Its plugin system has been tested against real-world obstacles and has proven effective across a wide range of websites.

Thousands of developers use these exact techniques in production scraping tools. So yes, this guide is built on experience—and what actually works.

Step 1: Install and Set Up Puppeteer Extra

Before you can unleash the power of Puppeteer Extra, you need to get it installed and ready to go. The setup is quick, and once you’ve done it, you’ll have access to the whole plugin ecosystem.

Basic Installation

npm install puppeteer puppeteer-extra
# or using yarn
yarn add puppeteer puppeteer-extra

Your First Puppeteer Extra Script

const puppeteer = require('puppeteer-extra');

(async () => {
  // Launch browser with puppeteer-extra
  const browser = await puppeteer.launch({
    headless: false, // Set to true for production
    defaultViewport: null
  });
  
  const page = await browser.newPage();
  await page.goto('https://example.com');
  
  // Take a screenshot to verify it's working
  await page.screenshot({ path: 'test.png' });
  
  await browser.close();
})();

TypeScript Support

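import puppeteer from 'puppeteer-extra';

// puppeteer-extra ships its own type definitions, so launch() is typed out of the box.
// Top-level await needs an ES module target; otherwise wrap this in an async function.
const browser = await puppeteer.launch();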

Pro Tip: Develop with headless: false so you can watch what’s going on. Save headless: true for production once everything’s dialed in.
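
One way to avoid flipping the headless flag by hand is to derive the launch options from the environment. The sketch below assumes a hypothetical `buildLaunchOptions` helper and a `NODE_ENV=production` convention; neither is part of Puppeteer Extra itself.

```javascript
// Hypothetical helper: choose launch options from an environment object.
// Assumes NODE_ENV === 'production' marks a production run.
function buildLaunchOptions(env = process.env) {
  const isProduction = env.NODE_ENV === 'production';
  return {
    headless: isProduction,        // visible browser while developing
    defaultViewport: null,         // same setting as the first script
    slowMo: isProduction ? 0 : 50  // slow actions down a little so they are easy to follow
  };
}

// Usage: const browser = await puppeteer.launch(buildLaunchOptions());
console.log(buildLaunchOptions({ NODE_ENV: 'production' }));
```

Keeping the decision in one function means the same script can run in both modes without edits.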

Step 2: Master the Stealth Plugin for Avoiding Detection

The Stealth plugin is a game-changer. It helps disguise your automation as a real human browser session by tweaking and masking telltale signs that bots usually leave behind.

Installing the Stealth Plugin

npm install puppeteer-extra-plugin-stealth

Basic Stealth Configuration

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin());

(async () => {
  const browser = await puppeteer.launch({ 
    headless: true,
    args: ['--no-sandbox']
  });
  
  const page = await browser.newPage();
  
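  await page.goto('https://bot.sannysoft.com');
  // page.waitForTimeout() has been removed from recent Puppeteer releases, so pause with a plain timer
  await new Promise((resolve) => setTimeout(resolve, 5000));
  await page.screenshot({ path: 'stealth-test.png', fullPage: true });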
  
  await browser.close();
})();

What the Stealth Plugin Does

It quietly disables or modifies a bunch of browser characteristics that typically scream “bot.” That includes:

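  • Removing the navigator.webdriver flag
  • Stripping the “HeadlessChrome” marker from the user agent
  • Faking navigator.plugins and MIME type details
  • Patching navigator.permissions so permission queries behave like a regular browser
  • Masking the WebGL vendor and renderer
  • Restoring window.outerWidth and window.outerHeight, which headless mode leaves unset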
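
To check whether these evasions are taking effect, it can help to read the same properties back from the page. A minimal sketch, assuming hypothetical helpers `readFingerprint` and `findBotSignals`; the checks are illustrative and not the list any particular anti-bot vendor uses.

```javascript
// Reads a few properties that detection scripts commonly inspect.
// `page` is a Puppeteer page; the callback runs in the browser context.
async function readFingerprint(page) {
  return page.evaluate(() => ({
    webdriver: navigator.webdriver,
    userAgent: navigator.userAgent,
    pluginCount: navigator.plugins.length,
    languages: Array.from(navigator.languages || [])
  }));
}

// Hypothetical, illustrative checks on the values collected above.
function findBotSignals(fp) {
  const signals = [];
  if (fp.webdriver) signals.push('navigator.webdriver is true');
  if (/HeadlessChrome/.test(fp.userAgent)) signals.push('user agent contains HeadlessChrome');
  if (fp.pluginCount === 0) signals.push('no browser plugins reported');
  if (fp.languages.length === 0) signals.push('navigator.languages is empty');
  return signals;
}

// Usage inside a script: console.log(findBotSignals(await readFingerprint(page)));
```

An empty result only means these four checks passed; it says nothing about more advanced fingerprinting.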

Advanced Stealth Configuration

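const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// Turn off a single evasion while keeping the rest enabled
const stealth = StealthPlugin();
stealth.enabledEvasions.delete('user-agent-override');

puppeteer.use(stealth);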

Heads-up: Even the best evasion tricks won’t always work on tough systems like Cloudflare. Sometimes, you’ll need a backup plan—like rotating proxies or scraping APIs.
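
A rotating-proxy fallback can be as simple as cycling through a list and passing each entry to Chromium's `--proxy-server` flag. The sketch below assumes a hypothetical `createProxyRotator` helper and placeholder proxy addresses; swap in the endpoints your provider gives you.

```javascript
// Hypothetical round-robin rotator over a list of proxy addresses.
function createProxyRotator(proxies) {
  if (!Array.isArray(proxies) || proxies.length === 0) {
    throw new Error('At least one proxy is required');
  }
  let index = 0;
  return {
    // Returns the next proxy, wrapping around at the end of the list
    next() {
      const proxy = proxies[index];
      index = (index + 1) % proxies.length;
      return proxy;
    }
  };
}

// Builds launch options that route the browser through one proxy.
function launchOptionsForProxy(proxy) {
  return { headless: true, args: [`--proxy-server=${proxy}`] };
}

// Placeholder addresses, for illustration only
const rotator = createProxyRotator(['http://proxy-a.example:8000', 'http://proxy-b.example:8000']);
console.log(launchOptionsForProxy(rotator.next()).args);
// Usage: const browser = await puppeteer.launch(launchOptionsForProxy(rotator.next()));
```

If a proxy needs credentials, Puppeteer's `page.authenticate({ username, password })` can supply them after the page is created.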

Step 3: Implement Automatic CAPTCHA Solving

Let’s face it: CAPTCHAs are one of the most annoying scraping hurdles. Fortunately, the Recaptcha plugin for Puppeteer Extra can detect reCAPTCHA and hCaptcha challenges and hand them off to a paid third-party solving service such as 2captcha.

Installing the Recaptcha Plugin

npm install puppeteer-extra-plugin-recaptcha

Setting Up CAPTCHA Solving

const puppeteer = require('puppeteer-extra');
const RecaptchaPlugin = require('puppeteer-extra-plugin-recaptcha');

puppeteer.use(
  RecaptchaPlugin({
    provider: {
      id: '2captcha',
      token: 'YOUR_2CAPTCHA_API_KEY'
    },
    visualFeedback: true
  })
);

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  
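  await page.goto('https://www.google.com/recaptcha/api2/demo');
  await page.solveRecaptchas();
  
  // Solving does not submit the form, so click submit and wait for the navigation together
  await Promise.all([
    page.waitForNavigation(),
    page.click('#recaptcha-demo-submit')
  ]);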
  
  console.log('CAPTCHA solved and form submitted!');
  await browser.close();
})();

Handling Multiple CAPTCHAs and Frames

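// page.solveRecaptchas() only covers the top-level page, so walk the child frames as well
await page.solveRecaptchas();
for (const frame of page.mainFrame().childFrames()) {
  await frame.solveRecaptchas();
}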

Error Handling

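try {
  // captchas = everything detected on the page, solved = the ones that were actually solved
  const { captchas, solved, error } = await page.solveRecaptchas();
  
  if (error) {
    console.error('CAPTCHA solving reported an error:', error);
  } else if (captchas.length === 0) {
    console.log('No CAPTCHAs found on the page');
  } else {
    console.log(`Solved ${solved.length} of ${captchas.length} CAPTCHAs`);
  }
} catch (error) {
  console.error('CAPTCHA solving failed:', error);
}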

Pro Tip: Keep an eye on your solving service costs. If you're scraping high-traffic sites, the CAPTCHA solving fees can add up quickly.
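
A simple guard against runaway fees is to cap how many solves a single run may trigger. A minimal sketch, assuming a hypothetical `createCaptchaBudget` helper; the cap is a number you pick, not something the plugin enforces.

```javascript
// Hypothetical budget tracker: caps the number of CAPTCHA solves per run.
function createCaptchaBudget(maxSolves) {
  let used = 0;
  return {
    // True while there is budget left for another solve
    canSolve() {
      return used < maxSolves;
    },
    // Record how many CAPTCHAs the last call actually solved
    record(count) {
      used += count;
    },
    remaining() {
      return Math.max(0, maxSolves - used);
    }
  };
}

// Solve only while budget remains. `page` must come from a browser
// launched with the Recaptcha plugin registered.
async function solveWithinBudget(page, budget) {
  if (!budget.canSolve()) return { skipped: true, solved: [] };
  const { solved } = await page.solveRecaptchas();
  budget.record(solved.length);
  return { skipped: false, solved };
}
```

Sharing one budget object across all pages in a run keeps the cap global rather than per page.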

Step 4: Optimize Performance with Resource Blocking

Why waste time downloading stuff you don’t need? Most sites serve images, fonts, ads, and other fluff that can slow you down. Here's how to block it smartly.

Installing Performance Plugins

npm install puppeteer-extra-plugin-adblocker puppeteer-extra-plugin-block-resources

Using the Adblocker Plugin

const puppeteer = require('puppeteer-extra');
const AdblockerPlugin = require('puppeteer-extra-plugin-adblocker');

puppeteer.use(AdblockerPlugin({ blockTrackers: true }));

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  
  await page.goto('https://www.example-with-ads.com');
  
  const content = await page.evaluate(() => document.body.innerText);
  console.log(content);
  
  await browser.close();
})();

Selective Resource Blocking

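const puppeteer = require('puppeteer-extra');
const BlockResourcesPlugin = require('puppeteer-extra-plugin-block-resources');

puppeteer.use(
  BlockResourcesPlugin({
    blockedTypes: new Set(['image', 'stylesheet', 'font', 'media']),
    // Cooperative intercept mode, so other request handlers can coexist with the plugin
    interceptResolutionPriority: 1
  })
);

// Run this after `const page = await browser.newPage()`.
// The plugin has already enabled request interception on the page.
page.on('request', (request) => {
  // Skip requests that another handler has already resolved
  if (request.isInterceptResolutionHandled()) return;
  
  const url = request.url();
  
  if (url.includes('doubleclick.net') || url.includes('google-analytics.com')) {
    request.abort('blockedbyclient', 1);
  } else {
    request.continue(request.continueRequestOverrides(), 0);
  }
});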
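
Substring checks on the full URL can also match unrelated addresses that merely mention a blocked domain in a query string. One alternative is to compare hostnames. The sketch below assumes a hypothetical `shouldBlock` helper and an illustrative blocklist.

```javascript
// Illustrative blocklists; extend them for your own targets
const BLOCKED_HOSTS = ['doubleclick.net', 'google-analytics.com'];
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// Hypothetical helper: decide whether a request should be aborted.
function shouldBlock(url, resourceType) {
  if (BLOCKED_TYPES.has(resourceType)) return true;
  let hostname;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // leave unparsable URLs alone
  }
  // Match the domain itself or any of its subdomains
  return BLOCKED_HOSTS.some((host) => hostname === host || hostname.endsWith(`.${host}`));
}

console.log(shouldBlock('https://stats.g.doubleclick.net/collect', 'script')); // true
console.log(shouldBlock('https://example.com/?ref=doubleclick.net', 'script')); // false
```

Inside a request handler this would be called as `shouldBlock(request.url(), request.resourceType())`.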

Performance Monitoring

const startTime = Date.now();

await page.goto('https://heavy-website.com', {
  waitUntil: 'networkidle0'
});

const loadTime = Date.now() - startTime;
console.log(`Page loaded in ${loadTime}ms`);

const metrics = await page.metrics();
console.log('Page metrics:', metrics);
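
The raw object from `page.metrics()` is awkward to scan in logs. A small sketch, assuming a hypothetical `summarizeMetrics` helper and assuming that durations are reported in seconds and heap sizes in bytes, that converts a few fields into friendlier units:

```javascript
// Hypothetical helper: convert selected page.metrics() fields to readable units.
// Assumes durations are in seconds and heap sizes in bytes.
function summarizeMetrics(metrics) {
  const toMs = (seconds) => Math.round(seconds * 1000);
  const toMb = (bytes) => Number((bytes / (1024 * 1024)).toFixed(1));
  return {
    scriptMs: toMs(metrics.ScriptDuration),
    layoutMs: toMs(metrics.LayoutDuration),
    taskMs: toMs(metrics.TaskDuration),
    heapUsedMb: toMb(metrics.JSHeapUsedSize),
    domNodes: metrics.Nodes
  };
}

// Made-up sample values, only to show the shape of the output
console.log(summarizeMetrics({
  ScriptDuration: 0.25,
  LayoutDuration: 0.05,
  TaskDuration: 0.5,
  JSHeapUsedSize: 10485760,
  Nodes: 1200
}));
```

Comparing these summaries with and without resource blocking gives a rough sense of what the blocking saves.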

Watch out: Don’t block essential scripts that the page needs to display your target data.

Step 5: Combine Multiple Plugins for Advanced Scraping

This is where it all comes together. You’ll see how to blend plugins—stealth, CAPTCHA solving, ad-blocking—into one powerful scraper that can hold its own against just about anything.

Complete Advanced Scraping Setup

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const RecaptchaPlugin = require('puppeteer-extra-plugin-recaptcha');
const AdblockerPlugin = require('puppeteer-extra-plugin-adblocker');
const AnonymizeUAPlugin = require('puppeteer-extra-plugin-anonymize-ua');

// Configure all plugins
puppeteer.use(StealthPlugin());
puppeteer.use(AdblockerPlugin({ blockTrackers: true }));
// Stealth already overrides the user agent, so verify that this plugin does not conflict with it
puppeteer.use(AnonymizeUAPlugin());
puppeteer.use(
  RecaptchaPlugin({
    provider: {
      id: '2captcha',
      token: process.env.CAPTCHA_API_KEY
    },
    visualFeedback: true
  })
);

async function scrapePage(url) {
  const browser = await puppeteer.launch({
    headless: true,
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--disable-accelerated-2d-canvas',
      '--no-first-run',
      '--no-zygote',
      '--disable-gpu'
    ]
  });
  
  try {
    const page = await browser.newPage();
    
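    // Set viewport
    await page.setViewport({ width: 1920, height: 1080 });
    
    // Optional hook for custom blocking. The adblocker plugin already intercepts
    // requests, so only act on requests that no other handler has resolved.
    await page.setRequestInterception(true);
    
    page.on('request', (request) => {
      if (request.isInterceptResolutionHandled()) return;
      // Additional custom blocking logic if needed
      request.continue(request.continueRequestOverrides(), 0);
    });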
    
    // Navigate with timeout
    await page.goto(url, {
      waitUntil: 'networkidle0',
      timeout: 30000
    });
    
    // Solve any CAPTCHAs
    await page.solveRecaptchas();
    
    // Wait for content to load
    await page.waitForSelector('body', { timeout: 10000 });
    
    // Extract data
    const data = await page.evaluate(() => {
      // Your extraction logic here
      return {
        title: document.title,
        content: document.body.innerText
      };
    });
    
    return data;
  } catch (error) {
    console.error('Scraping failed:', error);
    throw error;
  } finally {
    await browser.close();
  }
}

// Usage with error handling and retries
async function scrapeWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await scrapePage(url);
    } catch (error) {
      // Out of attempts: surface the last error to the caller
      if (i === maxRetries - 1) throw error;
      
      console.log(`Attempt ${i + 1} failed, retrying...`);
      
      // Wait before retrying, a little longer after each failure
      await new Promise(resolve => setTimeout(resolve, 2000 * (i + 1)));
    }
  }
}

// Example usage
(async () => {
  try {
    const data = await scrapeWithRetry('https://example.com');
    console.log('Scraped data:', data);
  } catch (error) {
    console.error('All retries failed:', error);
  }
})();
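
The retry loop above waits 2000 * (i + 1) milliseconds, so the delays grow linearly. A common alternative is exponential backoff with a cap. A minimal sketch of both schedules, assuming hypothetical helper names and illustrative base and cap values:

```javascript
// Linear schedule, matching the 2000 * (i + 1) wait used in the retry loop
function linearDelay(attempt, baseMs = 2000) {
  return baseMs * (attempt + 1);
}

// Exponential schedule with an upper bound (values are illustrative)
function exponentialDelay(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// attempt is zero-based, like `i` in the loop
for (let attempt = 0; attempt < 4; attempt++) {
  console.log(attempt, linearDelay(attempt), exponentialDelay(attempt));
}
```

The exponential version backs off faster on persistent failures, while the cap keeps a long retry chain from stalling the whole job.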

Additional Useful Plugins

  • puppeteer-extra-plugin-proxy: Use proxies to rotate IPs
  • puppeteer-extra-plugin-user-preferences: Set custom browser behaviors
  • puppeteer-extra-plugin-devtools: Great for debugging with DevTools

Plugin Compatibility

Plugins typically play well together, but:

  • Some might override the same browser settings
  • Too many plugins can slow things down
  • Always test your setup thoroughly
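
One way to keep the plugin set small and avoid registering the same plugin twice is to funnel registration through a single helper. A minimal sketch, assuming a hypothetical `registerOnce` helper; it relies only on the `name` property that Puppeteer Extra plugins expose and on `puppeteer.use()`.

```javascript
// Hypothetical helper: register each plugin at most once, keyed by plugin.name.
// Returns the names that were registered, in order.
function registerOnce(puppeteer, plugins) {
  const seen = new Set();
  const registered = [];
  for (const plugin of plugins) {
    if (seen.has(plugin.name)) continue; // skip duplicates
    seen.add(plugin.name);
    puppeteer.use(plugin);
    registered.push(plugin.name);
  }
  return registered;
}

// Usage:
// registerOnce(puppeteer, [StealthPlugin(), AdblockerPlugin({ blockTrackers: true })]);
```

Logging the returned names at startup also gives a quick record of which plugins a given run used.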

Final Thoughts

Puppeteer Extra unlocks a whole new level of scraping power. Whether it’s flying under the radar with stealth, cutting through CAPTCHA roadblocks, or speeding up your scrapes by skipping unnecessary resources—it’s built for serious scraping.

That said, the web is always evolving. The tactics that work today might not work tomorrow. So stay nimble, keep learning, and always respect the rules (like robots.txt and terms of service).

Marius Bernard

Marius Bernard is a Product Advisor, Technical SEO, & Brand Ambassador at Roundproxies. He was the lead author for the SEO chapter of the 2024 Web and a reviewer for the 2023 SEO chapter.