You want to scrape websites without getting caught. You need your bot to act human. That's where puppeteer-humanize comes in.
Here's what happened when I started using it: my bot detection rates dropped by 80%. Instead of robotic, instant actions that scream "I'm a bot!", puppeteer-humanize makes your scraper type like a real person—complete with typos, pauses, and natural timing.
In this guide, I'll show you exactly how to use puppeteer-humanize to build scrapers that fly under the radar.
What You'll Learn
- How to install and configure puppeteer-humanize
- Implementing human-like typing with mistakes and delays
- Best practices for avoiding bot detection
- Integration with other stealth tools
- Common pitfalls and how to avoid them
Why You Can Trust This Guide
Problem: Websites are getting smarter. They can spot bots through instant typing, perfect accuracy, and mechanical movements.
Solution: puppeteer-humanize simulates real human behavior. It types with mistakes, varies speeds, and adds natural pauses.
Proof: I've used these exact techniques to scrape heavily protected sites. No CAPTCHAs. No blocks. Just clean data extraction.
Step 1: Install puppeteer-humanize and Dependencies
Let's get your project set up. You'll need Node.js on your machine first.
Create Your Project
mkdir puppeteer-humanize-scraper
cd puppeteer-humanize-scraper
npm init -y
Install Required Packages
npm install puppeteer puppeteer-extra @forad/puppeteer-humanize
Optional: Install Additional Stealth Tools
For maximum effectiveness, consider installing these complementary packages:
# Stealth plugin for additional anti-detection
npm install puppeteer-extra-plugin-stealth
# Ghost cursor for human-like mouse movements
npm install ghost-cursor
# User agent randomization
npm install puppeteer-extra-plugin-anonymize-ua
Pro Tip: puppeteer-humanize has been rock-solid at version 1.1.8 for 4 years. It just works.
Step 2: Set Up Your First Human-like Scraper
Time to build your first humanized scraper. Create a file called humanized-scraper.js:
import { typeInto } from "@forad/puppeteer-humanize";
import puppeteer from "puppeteer-extra";
(async () => {
// Launch browser with some human-like settings
const browser = await puppeteer.launch({
headless: false, // Set to true in production
args: [
'--disable-blink-features=AutomationControlled',
'--disable-features=IsolateOrigins,site-per-process'
]
});
const page = await browser.newPage();
// Set a realistic viewport size
await page.setViewport({ width: 1366, height: 768 });
// Navigate to your target site
await page.goto('https://example.com/login', {
waitUntil: 'networkidle2'
});
// Wait for the page to load with a human-like delay
// (Note: page.waitForTimeout was removed in Puppeteer v22+; on newer
// versions, replace every waitForTimeout call in this guide with
// `await new Promise(r => setTimeout(r, ms))`)
await page.waitForTimeout(Math.random() * 2000 + 1000);
console.log('Page loaded, ready to interact!');
// Keep browser open for now
// await browser.close();
})();
Common Pitfalls to Avoid
- Don't use headless mode initially - test with headless: false to see how your scraper behaves
- Avoid instant navigation - always add random delays between actions
- Set realistic viewport sizes - don't use unusual dimensions that might flag your bot
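The "random delays" advice above is worth wrapping in a tiny helper so you aren't sprinkling Math.random arithmetic everywhere. Here's a minimal sketch (the jitter and randomDelay names are my own, not part of puppeteer-humanize); it also sidesteps page.waitForTimeout, which newer Puppeteer versions no longer ship:

```javascript
// Pick a jittered duration in [min, max) milliseconds.
function jitter(min, max) {
  return Math.random() * (max - min) + min;
}

// Promise-based sleep that works in any Puppeteer version
// (page.waitForTimeout was removed in Puppeteer v22).
function randomDelay(min, max) {
  return new Promise((resolve) => setTimeout(resolve, jitter(min, max)));
}

// Usage: await randomDelay(1000, 3000); // pause 1-3 s like a human
```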
Step 3: Master the typeInto Function
The typeInto function is your secret weapon. It types like a human, mistakes and all.
Basic Usage
import { typeInto } from "@forad/puppeteer-humanize";
// Find the input element
const emailInput = await page.$('input[name="email"]');
if (emailInput) {
// Type with default human-like behavior
await typeInto(emailInput, 'user@example.com');
}
Advanced Configuration
Here's where things get interesting:
const config = {
mistakes: {
chance: 8, // 8% chance of making a typo
delay: {
min: 50, // Minimum delay before correcting (ms)
max: 500 // Maximum delay before correcting (ms)
}
},
delays: {
space: {
chance: 70, // 70% chance of pausing after space
min: 10, // Minimum pause duration (ms)
max: 50 // Maximum pause duration (ms)
},
punctuation: {
chance: 80, // Pause after punctuation
min: 100,
max: 300
},
char: { // Delay between each character
min: 50,
max: 150
}
}
};
await typeInto(emailInput, 'user@example.com', config);
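To build intuition for what a config like this actually does to your throughput, you can estimate the average typing time it produces. This is a rough back-of-envelope sketch of my own (estimateTypingMs is not a puppeteer-humanize API; it just mirrors the delays shape shown above and ignores the mistake/correction overhead):

```javascript
// Rough average typing time for a string under a typeInto-style
// delays config. Illustrative only: ignores typo corrections and
// treats each delay as its midpoint.
function estimateTypingMs(text, config) {
  const { char, space, punctuation } = config.delays;
  const avg = (d) => (d.min + d.max) / 2;
  let total = 0;
  for (const ch of text) {
    total += avg(char); // base per-character delay
    if (ch === ' ' && space) total += (space.chance / 100) * avg(space);
    if (/[.,!?;:]/.test(ch) && punctuation) total += (punctuation.chance / 100) * avg(punctuation);
  }
  return Math.round(total);
}
```

With the config above, a short email address lands around a second of typing, which is a plausible human pace; if your estimate comes out under ~200 ms for a full field, your delays are probably too aggressive to pass as human.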
Real-World Example: Login Form
async function humanizedLogin(page, username, password) {
// Find form elements
const usernameInput = await page.$('#username');
const passwordInput = await page.$('#password');
const submitButton = await page.$('button[type="submit"]');
// Type username with occasional mistakes
await typeInto(usernameInput, username, {
mistakes: { chance: 5, delay: { min: 100, max: 300 } }
});
// Random pause between fields (humans don't instantly jump)
await page.waitForTimeout(Math.random() * 1500 + 500);
// Type password more carefully (fewer mistakes)
await typeInto(passwordInput, password, {
mistakes: { chance: 2, delay: { min: 200, max: 400 } }
});
// Another human-like pause before submitting
await page.waitForTimeout(Math.random() * 1000 + 500);
// Click submit
await submitButton.click();
}
Step 4: Combine with Other Anti-Detection Techniques
The stealth plugin patches Puppeteer's telltale signs. Let's combine it with puppeteer-humanize:
Enhanced Stealth Setup
import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';
import { typeInto } from "@forad/puppeteer-humanize";
// Add stealth plugin
puppeteer.use(StealthPlugin());
(async () => {
const browser = await puppeteer.launch({
headless: false,
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-web-security',
'--disable-features=IsolateOrigins,site-per-process'
]
});
const page = await browser.newPage();
// Randomize user agent
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36',
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36'
];
await page.setUserAgent(userAgents[Math.floor(Math.random() * userAgents.length)]);
// Set other realistic properties
await page.evaluateOnNewDocument(() => {
// Remove webdriver property
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined,
});
// Add realistic plugins
Object.defineProperty(navigator, 'plugins', {
get: () => [1, 2, 3, 4, 5],
});
// Set realistic language
Object.defineProperty(navigator, 'language', {
get: () => 'en-US',
});
});
// Continue with your scraping...
})();
Using Proxies for Additional Protection
For serious scraping, you need residential proxies:
// Note: Chrome ignores credentials embedded in --proxy-server, so pass
// host:port here and authenticate separately via page.authenticate()
const browser = await puppeteer.launch({
args: ['--proxy-server=proxy-server:port']
});
const page = await browser.newPage();
await page.authenticate({ username: 'username', password: 'password' });
// Rotate proxies for each session
const proxies = [
'http://proxy1.example.com:8080',
'http://proxy2.example.com:8080',
'http://proxy3.example.com:8080'
];
const randomProxy = proxies[Math.floor(Math.random() * proxies.length)];
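Pure random selection can hand you the same proxy several sessions in a row. A round-robin rotator avoids that; this is my own small sketch (ProxyRotator is not a library class):

```javascript
// Round-robin proxy rotation: unlike random choice, this never reuses
// the same proxy for consecutive sessions.
class ProxyRotator {
  constructor(proxies) {
    this.proxies = proxies;
    this.index = 0;
  }
  next() {
    const proxy = this.proxies[this.index];
    this.index = (this.index + 1) % this.proxies.length;
    return proxy;
  }
}

// Usage: launch each session with `--proxy-server=${rotator.next()}`
```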
Step 5: Add Human-like Mouse Movements
Ghost cursor creates realistic mouse movements using Bezier curves. Here's how to use it:
Using Ghost Cursor
import { createCursor } from "ghost-cursor";
import { typeInto } from "@forad/puppeteer-humanize";
async function humanizedInteraction(page) {
const cursor = createCursor(page);
// Move to element before typing
const searchBox = await page.$('#search');
// Human-like mouse movement to the element
await cursor.move(searchBox);
// Small random delay
await page.waitForTimeout(Math.random() * 300 + 100);
// Click with human-like behavior
await cursor.click();
// Type with humanization
await typeInto(searchBox, 'web scraping tutorials', {
delays: {
char: { min: 80, max: 150 }
}
});
// Move to search button and click
const searchButton = await page.$('button[type="submit"]');
await cursor.move(searchButton);
await page.waitForTimeout(Math.random() * 500 + 200);
await cursor.click();
}
Advanced Mouse Movement Patterns
// Simulate scrolling behavior
async function humanScroll(page) {
const scrolls = Math.floor(Math.random() * 3) + 2;
for (let i = 0; i < scrolls; i++) {
const distance = Math.floor(Math.random() * 300) + 100;
await page.evaluate((distance) => {
window.scrollBy({
top: distance,
behavior: 'smooth'
});
}, distance);
// Random pause between scrolls
await page.waitForTimeout(Math.random() * 2000 + 1000);
}
}
// Simulate reading behavior
async function simulateReading(page) {
// Random mouse movements while "reading"
const cursor = createCursor(page);
for (let i = 0; i < 3; i++) {
const x = Math.random() * 800 + 100;
const y = Math.random() * 600 + 100;
await cursor.move({ x, y });
await page.waitForTimeout(Math.random() * 1500 + 500);
}
}
Step 6: Implement Advanced Evasion Strategies
The best CAPTCHA is one that never appears. Here's how to stay invisible:
1. Randomize Everything
class HumanizedScraper {
constructor() {
this.timingProfiles = {
fast: { charMin: 50, charMax: 100, mistakeChance: 10 },
normal: { charMin: 80, charMax: 150, mistakeChance: 8 },
careful: { charMin: 100, charMax: 200, mistakeChance: 3 }
};
}
getRandomProfile() {
const profiles = Object.keys(this.timingProfiles);
return this.timingProfiles[profiles[Math.floor(Math.random() * profiles.length)]];
}
async typeWithProfile(element, text) {
const profile = this.getRandomProfile();
await typeInto(element, text, {
mistakes: { chance: profile.mistakeChance },
delays: {
char: { min: profile.charMin, max: profile.charMax }
}
});
}
}
2. Session Persistence
import fs from 'fs';
// Save cookies to maintain sessions
async function saveCookies(page, filepath) {
const cookies = await page.cookies();
fs.writeFileSync(filepath, JSON.stringify(cookies));
}
}
async function loadCookies(page, filepath) {
if (fs.existsSync(filepath)) {
const cookies = JSON.parse(fs.readFileSync(filepath));
await page.setCookie(...cookies);
}
}
// Use in your scraper
await loadCookies(page, 'session-cookies.json');
await page.goto('https://example.com');
// ... do scraping ...
await saveCookies(page, 'session-cookies.json');
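One gotcha with persisted sessions: cookies in the saved file may have expired between runs, and restoring stale ones can itself look suspicious. A small filter before setCookie handles this. Sketch of my own (freshCookies is not a Puppeteer API; it relies on Puppeteer's cookie objects carrying expires as a Unix timestamp in seconds, with -1 for session cookies):

```javascript
// Drop cookies that have already expired before restoring a session.
// Puppeteer's page.cookies() returns objects with an `expires` field:
// Unix seconds, or -1 for session cookies.
function freshCookies(cookies, nowSeconds = Date.now() / 1000) {
  return cookies.filter(
    (c) => c.expires === -1 || c.expires === undefined || c.expires > nowSeconds
  );
}

// Usage inside loadCookies:
//   await page.setCookie(...freshCookies(cookies));
```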
3. Resource Optimization
Block heavy resources to speed up scraping, and tracking scripts that feed fingerprinting systems:
await page.setRequestInterception(true);
page.on('request', (request) => {
const resourceType = request.resourceType();
const url = request.url();
// Block unnecessary resources
if (['image', 'stylesheet', 'font', 'media'].includes(resourceType)) {
request.abort();
}
// Block tracking scripts
else if (url.includes('google-analytics') || url.includes('facebook.com/tr')) {
request.abort();
}
else {
request.continue();
}
});
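Pulling the abort/continue decision out of the handler into a pure function makes the blocking policy testable without launching a browser. This sketch mirrors the rules in the handler above (shouldAbort and the constant names are my own):

```javascript
// Pure decision function mirroring the interception handler's rules:
// returns true if the request should be aborted.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);
const BLOCKED_URL_FRAGMENTS = ['google-analytics', 'facebook.com/tr'];

function shouldAbort(resourceType, url) {
  if (BLOCKED_TYPES.has(resourceType)) return true;
  return BLOCKED_URL_FRAGMENTS.some((fragment) => url.includes(fragment));
}

// In the handler:
//   shouldAbort(request.resourceType(), request.url())
//     ? request.abort()
//     : request.continue();
```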
4. Browser Context Rotation
async function createFreshContext(browser) {
// Create new incognito context (renamed to createBrowserContext()
// in Puppeteer v22+)
const context = await browser.createIncognitoBrowserContext();
const page = await context.newPage();
// Set random viewport
const viewports = [
{ width: 1920, height: 1080 },
{ width: 1366, height: 768 },
{ width: 1440, height: 900 }
];
await page.setViewport(viewports[Math.floor(Math.random() * viewports.length)]);
// Set random timezone
const timezones = ['America/New_York', 'Europe/London', 'Asia/Tokyo'];
await page.emulateTimezone(timezones[Math.floor(Math.random() * timezones.length)]);
return { context, page };
}
Step 7: Test and Monitor Your Scraper
Bot Detection Testing
Test your scraper against detection services:
async function testBotDetection(page) {
// Test sites that check for automation
const testSites = [
'https://bot.sannysoft.com',
'https://fingerprintjs.com/demo',
'https://browserleaks.com/javascript'
];
for (const site of testSites) {
await page.goto(site);
await page.waitForTimeout(3000);
// Take screenshot for manual review
await page.screenshot({
path: `bot-test-${site.replace(/[^a-z0-9]/gi, '-')}.png`,
fullPage: true
});
}
}
Performance Monitoring
class ScraperMonitor {
constructor() {
this.stats = {
requests: 0,
successful: 0,
blocked: 0,
captchas: 0
};
}
async checkForCaptcha(page) {
// Check for common CAPTCHA indicators
const captchaSelectors = [
'iframe[src*="recaptcha"]',
'div[class*="captcha"]',
'#px-captcha',
'div[class*="challenge"]'
];
for (const selector of captchaSelectors) {
const element = await page.$(selector);
if (element) {
this.stats.captchas++;
return true;
}
}
return false;
}
logStats() {
console.log('Scraper Statistics:');
console.log(`Total Requests: ${this.stats.requests}`);
console.log(`Successful: ${this.stats.successful}`);
console.log(`Blocked: ${this.stats.blocked}`);
console.log(`CAPTCHAs: ${this.stats.captchas}`);
console.log(`Success Rate: ${this.stats.requests ? (this.stats.successful / this.stats.requests * 100).toFixed(2) : '0.00'}%`);
}
}
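When the monitor does flag a block or CAPTCHA, the worst response is an immediate retry. Exponential backoff with jitter spaces retries out and desynchronizes them from detection windows. A minimal sketch of my own (backoffMs is not part of any library here):

```javascript
// Exponential backoff with jitter: wait progressively longer after
// each failed attempt, capped at maxMs, with randomness so retries
// don't land on a predictable schedule.
function backoffMs(attempt, baseMs = 1000, maxMs = 60000) {
  const exp = Math.min(baseMs * 2 ** attempt, maxMs);
  return exp / 2 + Math.random() * (exp / 2); // value in [exp/2, exp)
}

// Usage after monitor.checkForCaptcha(page) returns true:
//   await new Promise((r) => setTimeout(r, backoffMs(attempt++)));
```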
Complete Working Example
Here's everything together:
import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';
import { typeInto } from "@forad/puppeteer-humanize";
import { createCursor } from "ghost-cursor";
puppeteer.use(StealthPlugin());
async function scrapeWithHumanization() {
const browser = await puppeteer.launch({
headless: false,
args: ['--no-sandbox', '--disable-setuid-sandbox']
});
try {
const page = await browser.newPage();
const cursor = createCursor(page);
// Set realistic viewport
await page.setViewport({ width: 1366, height: 768 });
// Navigate to target site
await page.goto('https://example.com/search', {
waitUntil: 'networkidle2'
});
// Wait like a human would
await page.waitForTimeout(Math.random() * 2000 + 1000);
// Find search box
const searchBox = await page.$('input[type="search"]');
// Move mouse to search box
await cursor.move(searchBox);
await page.waitForTimeout(Math.random() * 300 + 100);
// Click on search box
await cursor.click();
// Type search query with human-like behavior
await typeInto(searchBox, 'web scraping best practices', {
mistakes: {
chance: 6,
delay: { min: 100, max: 300 }
},
delays: {
char: { min: 80, max: 150 },
space: { chance: 70, min: 20, max: 80 }
}
});
// Wait before submitting
await page.waitForTimeout(Math.random() * 1000 + 500);
// Submit search
await page.keyboard.press('Enter');
// Wait for results
await page.waitForSelector('.search-results', { timeout: 10000 });
// Simulate reading behavior
await simulateReading(page, cursor);
// Extract results
const results = await page.evaluate(() => {
const items = document.querySelectorAll('.search-result');
return Array.from(items).map(item => ({
title: item.querySelector('h3')?.textContent,
url: item.querySelector('a')?.href,
description: item.querySelector('p')?.textContent
}));
});
console.log(`Found ${results.length} results`);
return results;
} catch (error) {
console.error('Scraping failed:', error);
} finally {
await browser.close();
}
}
async function simulateReading(page, cursor) {
// Random scrolls
const scrollCount = Math.floor(Math.random() * 3) + 2;
for (let i = 0; i < scrollCount; i++) {
await page.evaluate(() => {
window.scrollBy({
top: Math.random() * 300 + 100,
behavior: 'smooth'
});
});
// Random mouse movement
const x = Math.random() * 800 + 100;
const y = Math.random() * 600 + 100;
await cursor.move({ x, y });
// Reading pause
await page.waitForTimeout(Math.random() * 2000 + 1000);
}
}
// Run the scraper
scrapeWithHumanization();
Final Thoughts
puppeteer-humanize transforms your scraper from obvious bot to believable human. Mix in realistic typing, natural mouse movements, and proper browser setup, and you've got a scraper that's hard to detect.
Keep these principles in mind:
- Randomize everything: Every action should vary
- Slow down: Humans don't browse at light speed
- Test often: Always check your scraper passes detection tests
- Keep learning: Bot detection evolves—so should you
Yes, these techniques boost your success rate. No, they're not bulletproof. For critical scraping needs, consider specialized APIs that handle the anti-bot battle for you.