How to Set Up Airtop for AI Web Automation

Airtop is a cloud browser automation platform. You control remote browsers with natural language commands, and it gives you tools for getting through complex authentication such as OAuth, 2FA, and CAPTCHAs.

This guide walks through setting up Airtop from scratch, including proxy configuration for getting past anti-bot detection and handling authenticated sessions.

What You'll Learn

By the end, you'll know how to:

  • Set up Airtop with proper API authentication
  • Configure residential proxies for stealth scraping
  • Handle complex authentication flows (OAuth, 2FA)
  • Build automated scrapers with natural language commands
  • Bypass anti-bot measures effectively

Step 1: Get Your API Key and Initial Setup

First, you need an Airtop account and an API key.

Setting Up Your Account

  1. Head to portal.airtop.ai and create a free account
  2. Navigate to the API Keys section in your dashboard
  3. Click "+ Create new key" and give it a descriptive name
  4. Copy the generated key immediately (you won't see it again)

Installing the SDK

Airtop provides SDKs for both TypeScript/Node.js and Python. Pick your weapon:

For Node.js/TypeScript:

# dotenv loads the .env file used in the examples below
npm install @airtop/sdk dotenv
# or with yarn
yarn add @airtop/sdk dotenv

For Python:

# python-dotenv provides load_dotenv, used in the example below
pip install airtop python-dotenv

Environment Configuration

Create a .env file in your project root:

AIRTOP_API_KEY=your_api_key_here
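
A missing or misspelled variable in .env otherwise only surfaces later as an authentication error. As a minimal sketch, a small guard can fail fast at startup instead. `requireEnv` is a hypothetical helper, not part of the Airtop SDK; it takes the environment as a plain object so it is easy to test.

```typescript
// Hypothetical helper (not part of the Airtop SDK): read a required
// setting from an environment-like object and fail early if it is missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
    const raw = env[name];
    const value = raw === undefined ? "" : raw.trim();
    if (!value) {
        // Name the variable so the fix is obvious from the error message
        throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
}
```

Once the .env file has been loaded, `requireEnv(process.env, "AIRTOP_API_KEY")` returns the trimmed key or throws an error that names the missing variable.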

Step 2: Initialize Your First Browser Session

Let's create a basic browser session to confirm everything works:

TypeScript Example:

import { AirtopClient } from "@airtop/sdk";
import * as dotenv from 'dotenv';

dotenv.config();

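// Fail fast if the key is missing rather than sending unauthenticated requests
const apiKey = process.env.AIRTOP_API_KEY;
if (!apiKey) {
    throw new Error("AIRTOP_API_KEY is not set");
}

const client = new AirtopClient({ apiKey });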

async function createSession() {
    try {
        // Create a browser session
        const session = await client.sessions.create({
            configuration: {
                // End the session automatically after 10 minutes
                timeoutMinutes: 10
            }
        });
        
        console.log(`Session created: ${session.id}`);
        
        // Create a window and navigate
        const window = await client.windows.create(session.id, {
            url: "https://example.com"
        });
        
        console.log(`Window created: ${window.id}`);
        
        return { session, window };
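    } catch (error) {
        console.error("Failed to create session:", error);
        throw error; // Rethrow so callers don't receive undefined
    }
}

createSession().catch(() => {
    process.exitCode = 1; // Already logged above; signal failure to the shell
});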

Python Example:

import asyncio
import os
from airtop import AsyncAirtop
from dotenv import load_dotenv

load_dotenv()

# Use the async client, since the calls below are awaited
client = AsyncAirtop(
    api_key=os.getenv("AIRTOP_API_KEY")
)

async def create_session():
    try:
        # Create a browser session
        session = await client.sessions.create(
            configuration={
                "timeoutMinutes": 10
            }
        )
        
        print(f"Session created: {session.id}")
        
        # Create a window
        window = await client.windows.create(
            session_id=session.id,
            url="https://example.com"
        )
        
        print(f"Window created: {window.id}")
        
        return session, window
        
    except Exception as e:
        print(f"Failed to create session: {e}")
        raise  # Re-raise so callers don't receive None

# Run the coroutine; defining it alone does nothing
asyncio.run(create_session())
Step 3: Configure Residential Proxies

Airtop has built-in residential proxy support with over 100 million IPs from 100+ countries. Residential IPs look like ordinary home connections, so sites that block datacenter IP ranges are less likely to flag your traffic.

Using Airtop's Integrated Proxy

const sessionWithProxy = await client.sessions.create({
    configuration: {
        proxy: {
            type: "residential",
            country: "US",  // ISO 3166-1 alpha-2 country code
            sticky: true     // Keep the same IP for 30 minutes
        }
    }
});

Using Your Own Proxy

If you already have a proxy provider (like Bright Data, Oxylabs, or SmartProxy), you can bring your own:

const sessionWithCustomProxy = await client.sessions.create({
    configuration: {
        proxy: {
            server: "http://proxy.example.com:8080",
            username: "your_username",
            password: "your_password"
        }
    }
});

Domain-Specific Proxy Routing

You can also route only specific domains through the proxy while letting others connect directly:

const sessionWithSelectiveProxy = await client.sessions.create({
    configuration: {
        proxy: [
            {
                domainPattern: "*.wikipedia.org",
                relay: {
                    type: "residential",
                    country: "GB"  // ISO 3166-1 alpha-2 code for the United Kingdom
                }
            },
            {
                domainPattern: "*",  // All other domains
                relay: null          // No proxy
            }
        ]
    }
});
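
With a rule list like this, the order of the rules and the exact meaning of the wildcard matter. The sketch below illustrates generic first-match glob routing. It is not Airtop's matcher, which may behave differently; `ProxyRule`, `patternToRegExp`, and `resolveRelay` are hypothetical names, and relays are reduced to string labels to keep the example small.

```typescript
// Illustration only: generic first-match glob routing, NOT Airtop's own logic.
interface ProxyRule {
    domainPattern: string;
    relay: string | null; // null means "connect directly"
}

// Turn a glob such as "*.wikipedia.org" into an anchored regular expression
function patternToRegExp(pattern: string): RegExp {
    const escaped = pattern
        .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
        .replace(/\*/g, ".*");                 // "*" matches any run of characters
    return new RegExp(`^${escaped}$`, "i");
}

// Return the relay of the first rule whose pattern matches the hostname
function resolveRelay(rules: ProxyRule[], hostname: string): string | null {
    for (const rule of rules) {
        if (patternToRegExp(rule.domainPattern).test(hostname)) {
            return rule.relay;
        }
    }
    return null; // no rule matched: connect directly
}
```

Under these assumptions, a catch-all `*` rule placed first would shadow every later rule, and `*.wikipedia.org` would match `en.wikipedia.org` but not the bare `wikipedia.org`. Both are worth testing against the real service before relying on them.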

Step 4: Handle Authentication Like a Pro

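One of Airtop's most useful features is its support for complex authentication flows. In the example below, you log in once by hand through a live view (including 2FA if needed), save the authenticated browser profile, and reuse it in later sessions to scrape data behind the login wall: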

async function scrapeWithAuth() {
    const session = await client.sessions.create();
    
    // Create a window
    const window = await client.windows.create(session.id, {
        url: "https://linkedin.com"
    });
    
    // Create a live view for manual login
    const liveView = await client.windows.createLiveView(
        session.id, 
        window.id
    );
    
    console.log(`Login here: ${liveView.url}`);
    console.log("Complete the login process (including 2FA if needed)");
    
    // Wait for user to complete login
    await new Promise(resolve => {
        setTimeout(resolve, 60000); // Wait 60 seconds
    });
    
    // Save the authenticated profile
    const profile = await client.profiles.create({
        sessionId: session.id,
        name: "linkedin-authenticated"
    });
    
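    // Now you can reuse this profile for future sessions
    const newSession = await client.sessions.create({
        profileId: profile.id
    });
    
    // Windows belong to a session, so open a new one here instead of
    // reusing the window from the login session
    const newWindow = await client.windows.create(newSession.id, {
        url: "https://linkedin.com"
    });
    
    // Extract data using natural language
    const data = await client.windows.pageQuery(
        newSession.id,
        newWindow.id,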
        {
            prompt: "Extract all job postings with company names, titles, and locations",
            configuration: {
                outputSchema: {
                    type: "array",
                    items: {
                        type: "object",
                        properties: {
                            company: { type: "string" },
                            title: { type: "string" },
                            location: { type: "string" }
                        }
                    }
                }
            }
        }
    );
    
    return data;
}
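
A fixed 60-second wait either wastes time when the login is quick or cuts off a slow 2FA step. As a sketch, polling a condition with a timeout is one alternative. `waitUntil` is a hypothetical helper, and the `check` callback is an assumption: it could, for example, ask the page whether the user appears to be signed in.

```typescript
// Hypothetical helper: poll an async check until it returns true or the
// timeout elapses. Resolves to true on success and false on timeout.
async function waitUntil(
    check: () => Promise<boolean>,
    timeoutMs: number,
    intervalMs: number
): Promise<boolean> {
    const deadline = Date.now() + timeoutMs;
    while (true) {
        if (await check()) {
            return true;
        }
        // Stop if another sleep would take us past the deadline
        if (Date.now() + intervalMs > deadline) {
            return false;
        }
        await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
    }
}
```

For a manual login, something like `waitUntil(isLoggedIn, 5 * 60 * 1000, 5000)` gives the user up to five minutes while continuing as soon as the check passes; `isLoggedIn` is a placeholder for whatever check suits the target site.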

Step 5: Leverage Natural Language Commands

Instead of writing complex selectors, use Airtop's AI to interact with pages naturally:

async function naturalLanguageAutomation() {
    const session = await client.sessions.create({
        configuration: {
            proxy: { type: "residential", country: "US" }
        }
    });
    
    const window = await client.windows.create(session.id, {
        url: "https://producthunt.com"
    });
    
    // Extract structured data with a simple prompt
    const products = await client.windows.pageQuery(
        session.id,
        window.id,
        {
            prompt: `Find all new product launches from today. 
                     For each product, extract:
                     - Product name
                     - Description
                     - Vote count
                     - Maker name
                     Ignore sponsored listings`,
            configuration: {
                followPagination: true,  // Automatically handle pagination
                maxPages: 5
            }
        }
    );
    
    // Interact with the page
    await client.windows.act(session.id, window.id, {
        action: "Click on the first product that has more than 100 votes"
    });
    
    // Take a screenshot
    const screenshot = await client.windows.screenshot(
        session.id, 
        window.id
    );
    
    return products;
}
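
Output from a natural-language extraction is worth validating before it enters a pipeline, because the model may omit fields or return text that is not valid JSON. The sketch below assumes the result arrives as a JSON string containing an array; the `Product` field names are assumptions chosen to match the prompt above, and `parseProducts` is a hypothetical helper.

```typescript
// Assumed shape, chosen to mirror the fields requested in the prompt
interface Product {
    name: string;
    description: string;
    votes: number;
    maker: string;
}

// Type guard: true only if every expected field is present with the right type
function isProduct(value: unknown): value is Product {
    if (typeof value !== "object" || value === null) {
        return false;
    }
    const v = value as Record<string, unknown>;
    const votes = v.votes;
    return typeof v.name === "string"
        && typeof v.description === "string"
        && typeof votes === "number" && isFinite(votes)
        && typeof v.maker === "string";
}

// Parse a raw JSON string and keep only well-formed product entries
function parseProducts(raw: string): Product[] {
    let parsed: unknown;
    try {
        parsed = JSON.parse(raw);
    } catch (error) {
        return []; // not JSON at all
    }
    if (!Array.isArray(parsed)) {
        return [];
    }
    return parsed.filter(isProduct);
}
```

Dropping malformed entries silently is a design choice that keeps the example short; a production scraper would more likely log or count them so extraction regressions are noticed.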

Step 6: Build a Production-Ready Scraper

Let's put it all together with a real-world example: monitoring competitor pricing.

import { AirtopClient } from "@airtop/sdk";
import * as fs from 'fs';

class CompetitorPriceMonitor {
    private client: AirtopClient;
    private profileId?: string;
    
    constructor(apiKey: string) {
        this.client = new AirtopClient({ apiKey });
    }
    
    async initialize() {
        // Check if we have a saved profile
        const profilePath = './competitor-profile.json';
        if (fs.existsSync(profilePath)) {
            const profile = JSON.parse(fs.readFileSync(profilePath, 'utf-8'));
            this.profileId = profile.id;
        }
    }
    
    async monitorPricing(competitorUrl: string) {
        // Tracked outside the try block so the session can always be cleaned up
        let sessionId: string | undefined;
        try {
            // Create session with proxy rotation
            const session = await this.client.sessions.create({
                profileId: this.profileId,
                configuration: {
                    proxy: {
                        type: "residential",
                        country: "US",
                        sticky: false  // Rotate IP for each request
                    }
                }
            });
            sessionId = session.id;
            
            const window = await this.client.windows.create(session.id, {
                url: competitorUrl
            });
            
            // Wait for page to load
            await this.client.windows.waitForLoad(session.id, window.id);
            
            // Extract pricing data
            const pricingData = await this.client.windows.pageQuery(
                session.id,
                window.id,
                {
                    prompt: `Extract all pricing plans with:
                            - Plan name
                            - Monthly price
                            - Annual price
                            - Top 3 features
                            - Any discounts or promotions`,
                    configuration: {
                        outputSchema: {
                            type: "object",
                            properties: {
                                plans: {
                                    type: "array",
                                    items: {
                                        type: "object",
                                        properties: {
                                            name: { type: "string" },
                                            monthlyPrice: { type: "number" },
                                            annualPrice: { type: "number" },
                                            features: {
                                                type: "array",
                                                items: { type: "string" }
                                            },
                                            discount: { type: "string" }
                                        }
                                    }
                                },
                                lastUpdated: { type: "string" }
                            }
                        }
                    }
                }
            );
            
            // Compare with previous data
            const previousData = this.loadPreviousData(competitorUrl);
            const changes = this.detectChanges(previousData, pricingData);
            
            if (changes.length > 0) {
                await this.notifyChanges(competitorUrl, changes);
            }
            
            // Save current data
            this.savePricingData(competitorUrl, pricingData);
            
            return pricingData;
            
        } catch (error) {
            console.error(`Error monitoring ${competitorUrl}:`, error);
            throw error;
        } finally {
            // Terminate even when extraction fails, so sessions don't leak
            if (sessionId) {
                await this.client.sessions.terminate(sessionId);
            }
        }
    }
    
    private detectChanges(previous: any, current: any): any[] {
        // Implementation for change detection
        const changes = [];
        // Compare pricing, features, etc.
        return changes;
    }
    
    private async notifyChanges(url: string, changes: any[]) {
        // Send notifications (email, Slack, etc.)
        console.log(`Price changes detected for ${url}:`, changes);
    }
    
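    private dataPath(url: string): string {
        return `./pricing-data/${url.replace(/[^a-z0-9]/gi, '_')}.json`;
    }
    
    private loadPreviousData(url: string): any {
        // Return the last saved snapshot, or null on the first run
        const path = this.dataPath(url);
        return fs.existsSync(path)
            ? JSON.parse(fs.readFileSync(path, 'utf-8'))
            : null;
    }
    
    private savePricingData(url: string, data: any) {
        // Create the output directory on first use; writeFileSync will not
        fs.mkdirSync('./pricing-data', { recursive: true });
        fs.writeFileSync(this.dataPath(url), JSON.stringify(data, null, 2));
    }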
}

// Usage
async function main() {
    const monitor = new CompetitorPriceMonitor(process.env.AIRTOP_API_KEY!);
    await monitor.initialize();
    
    const competitors = [
        "https://competitor1.com/pricing",
        "https://competitor2.com/plans",
        "https://competitor3.com/pricing"
    ];
    
    for (const url of competitors) {
        await monitor.monitorPricing(url);
        // Add delay to avoid rate limiting
        await new Promise(resolve => setTimeout(resolve, 5000));
    }
}

main().catch(console.error);
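
The `detectChanges` method above is left as a stub. One possible way to fill it in, as a sketch, is to compare plans by name and report added, removed, and repriced plans. The `Plan` fields mirror the output schema used above, but whether the extraction really returns that shape is an assumption; `PriceChange` and `detectPlanChanges` are hypothetical names.

```typescript
// Mirrors the pricing fields of the output schema; an assumption about
// what the extraction actually returns.
interface Plan {
    name: string;
    monthlyPrice: number;
    annualPrice: number;
}

interface PriceChange {
    plan: string;
    kind: "added" | "removed" | "changed";
    field?: "monthlyPrice" | "annualPrice";
    before?: number;
    after?: number;
}

function detectPlanChanges(previous: Plan[], current: Plan[]): PriceChange[] {
    const changes: PriceChange[] = [];
    const before = new Map<string, Plan>();
    for (const plan of previous) {
        before.set(plan.name, plan);
    }
    const seen = new Set<string>();

    for (const plan of current) {
        seen.add(plan.name);
        const old = before.get(plan.name);
        if (!old) {
            changes.push({ plan: plan.name, kind: "added" });
            continue;
        }
        // Report each price field that differs from the previous snapshot
        for (const field of ["monthlyPrice", "annualPrice"] as const) {
            if (old[field] !== plan[field]) {
                changes.push({
                    plan: plan.name,
                    kind: "changed",
                    field,
                    before: old[field],
                    after: plan[field]
                });
            }
        }
    }

    // Anything in the old snapshot that was not seen again has been removed
    for (const plan of previous) {
        if (!seen.has(plan.name)) {
            changes.push({ plan: plan.name, kind: "removed" });
        }
    }
    return changes;
}
```

Matching by plan name means a renamed plan shows up as one removal plus one addition, which is usually acceptable for alerting but worth knowing when reading the notifications.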

Advanced Tips and Tricks

1. Bypass Cloudflare and Other Anti-Bot Systems

const stealthSession = await client.sessions.create({
    configuration: {
        proxy: {
            type: "residential",
            country: "US"
        },
        viewport: {
            width: 1920,
            height: 1080
        },
        userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        // Airtop automatically handles fingerprinting
        stealth: true
    }
});

2. Handle Dynamic Content and Infinite Scroll

const scrollAndExtract = async (sessionId: string, windowId: string) => {
    // Scroll to load dynamic content
    await client.windows.act(sessionId, windowId, {
        action: "Scroll to the bottom of the page slowly over 5 seconds"
    });
    
    // Extract data after scrolling
    const data = await client.windows.pageQuery(sessionId, windowId, {
        prompt: "Extract all product cards that are now visible",
        configuration: {
            waitForStable: true  // Wait for content to stop changing
        }
    });
    
    return data;
};
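
A single scroll may not reach the end of an infinite feed. A common pattern, sketched here under assumptions, is to repeat a scroll step until the number of loaded items stops growing or a round limit is hit. `scrollUntilStable` is a hypothetical helper, and `step` stands for any function that scrolls once and returns the current item count.

```typescript
// Hypothetical helper: call `step` (scroll once, return the item count)
// until two consecutive rounds report the same count, or maxRounds is hit.
async function scrollUntilStable(
    step: () => Promise<number>,
    maxRounds: number = 10
): Promise<number> {
    let previous = -1;
    for (let round = 0; round < maxRounds; round++) {
        const count = await step();
        if (count === previous) {
            return count; // nothing new loaded since the last round
        }
        previous = count;
    }
    return previous; // gave up at the round limit
}
```

The round limit matters on feeds that never end; without it the loop would keep scrolling, and consuming session time, indefinitely.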

3. Smart CAPTCHA Handling

While Airtop handles many CAPTCHAs automatically, for complex ones you can use the live view:

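async function handleCaptcha(session: any, window: any) {
    const answer = await client.windows.pageQuery(
        session.id,
        window.id,
        {
            prompt: "Is there a CAPTCHA on this page? Answer only true or false"
        }
    );
    
    // pageQuery resolves to a response object, which is always truthy,
    // so compare the answer text instead. Assumption: the text is in
    // data.modelResponse; adjust to the response shape of your SDK version.
    const text = String((answer as any)?.data?.modelResponse ?? "");
    const captchaDetected = text.trim().toLowerCase().startsWith("true");
    
    if (captchaDetected) {
        const liveView = await client.windows.createLiveView(
            session.id,
            window.id
        );
        
        console.log(`Manual intervention needed: ${liveView.url}`);
        // Wait for human to solve CAPTCHA
        await new Promise(resolve => setTimeout(resolve, 30000));
    }
}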
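
A natural-language answer such as "True." or "Yes, there is one" is not a JavaScript boolean, so a check like `if (captchaDetected)` needs the text interpreted first. The sketch below shows one tolerant way to do that. `parseBooleanAnswer` is a hypothetical helper, and it assumes you have already pulled the answer text out of the response.

```typescript
// Hypothetical helper: interpret a model's yes/no style answer.
// Returns null when the answer is not clearly affirmative or negative.
function parseBooleanAnswer(text: string): boolean | null {
    const normalized = text
        .trim()
        .toLowerCase()
        .replace(/^[^a-z]+/, ""); // drop leading quotes, punctuation, digits
    const match = normalized.match(/^[a-z]+/);
    if (!match) {
        return null;
    }
    const firstWord = match[0];
    if (firstWord === "true" || firstWord === "yes") {
        return true;
    }
    if (firstWord === "false" || firstWord === "no") {
        return false;
    }
    return null;
}
```

Treating `null` as "unknown" lets the caller decide what is safer for the job at hand, for example opening the live view anyway when the answer is ambiguous.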

Pricing and Alternatives

Airtop Pricing Tiers

  • Free Plan: 5,000 credits, 1 simultaneous session
  • Starter: $29/month, 3 simultaneous sessions, integrated proxy
  • Professional: $89/month, 30 simultaneous sessions, custom proxy support
  • Enterprise: $380+/month, 100+ sessions, SOC 2 compliance, dedicated support

When to Use Airtop vs Alternatives

Choose Airtop when:

  • You need to handle complex authentication (OAuth, 2FA)
  • Natural language commands appeal to you
  • You're scraping JavaScript-heavy sites with anti-bot measures
  • You need reliable proxy rotation

Consider Playwright/Puppeteer when:

  • You need fine-grained control over browser behavior
  • You're building complex test suites
  • Budget is extremely tight (both are free)
  • You have existing infrastructure for proxy management

Consider Selenium when:

  • You need support for legacy browsers
  • Your team uses languages like Java or C#
  • You have existing Selenium infrastructure

Common Pitfalls to Avoid

  1. Not rotating proxies enough: Some sites track IP patterns. Use sticky: false for aggressive sites.
  2. Not saving authenticated profiles: Authentication is expensive. Always save profiles for reuse.
  3. Over-relying on natural language: Sometimes explicit selectors are more reliable for production systems.
  4. Ignoring rate limits: Even with proxies, respect rate limits. Add a randomized delay between requests:

await new Promise(resolve => setTimeout(resolve, Math.random() * 3000 + 2000));
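
When a request does fail, for instance because a rate limit was hit, retrying immediately tends to make things worse. As a sketch, the randomized delay above can be extended into exponential backoff with jitter. `backoffDelayMs` and `withRetry` are hypothetical helpers and the default values are arbitrary assumptions, not Airtop recommendations.

```typescript
// Hypothetical helper: delay that doubles with each attempt, capped at maxMs.
// Half of the delay is fixed and half is random, to spread out retries.
function backoffDelayMs(
    attempt: number,
    baseMs: number = 2000,
    maxMs: number = 30000,
    random: () => number = Math.random
): number {
    const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt));
    return Math.floor(exponential / 2 + random() * (exponential / 2));
}

// Hypothetical helper: run a task, retrying failures up to maxAttempts times
async function withRetry<T>(
    task: () => Promise<T>,
    maxAttempts: number = 3,
    delayFor: (attempt: number) => number = backoffDelayMs
): Promise<T> {
    let lastError: unknown;
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        try {
            return await task();
        } catch (error) {
            lastError = error;
            // Sleep before the next attempt, but not after the final one
            if (attempt < maxAttempts - 1) {
                const delay = delayFor(attempt);
                await new Promise<void>(resolve => setTimeout(resolve, delay));
            }
        }
    }
    throw lastError;
}
```

With the defaults, the first retry waits between 1 and 2 seconds and the second between 2 and 4 seconds. Passing `random` and `delayFor` as parameters keeps the timing deterministic in tests.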

Next Steps

Now that you have Airtop configured, consider these advanced implementations:

  • Build a distributed scraping system with queue management
  • Integrate with data pipelines using Apache Airflow
  • Create a monitoring dashboard with real-time alerts
  • Implement automatic proxy health checking and rotation
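
As a first step toward queue management, the sketch below runs a list of jobs with a fixed concurrency limit inside a single process, which is useful for staying within a plan's simultaneous-session limit. It is an in-process illustration only, not a distributed queue, and `runWithConcurrency` is a hypothetical helper.

```typescript
// Hypothetical helper: process items with at most `limit` workers in
// flight at once. Results are returned in the same order as the input.
async function runWithConcurrency<T, R>(
    items: T[],
    limit: number,
    worker: (item: T) => Promise<R>
): Promise<R[]> {
    const results: R[] = new Array(items.length);
    let next = 0;

    // Each runner repeatedly claims the next unprocessed index
    async function runner(): Promise<void> {
        while (next < items.length) {
            const index = next++;
            results[index] = await worker(items[index]);
        }
    }

    const runnerCount = Math.min(Math.max(1, limit), items.length);
    const runners: Promise<void>[] = [];
    for (let i = 0; i < runnerCount; i++) {
        runners.push(runner());
    }
    await Promise.all(runners);
    return results;
}
```

If any worker rejects, the returned promise rejects as well, so wrapping each job in its own error handling is advisable when one failing URL should not abort the rest of the batch.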

Remember, Airtop shines when you need to interact with complex, modern web applications that would typically require human intervention. The combination of cloud browsers, natural language processing, and integrated proxy support makes it a powerful tool for scenarios where traditional scraping fails.

Marius Bernard

Marius Bernard is a Product Advisor, Technical SEO, & Brand Ambassador at Roundproxies. He was the lead author for the SEO chapter of the 2024 Web and a reviewer for the 2023 SEO chapter.