Cloudscraper is a Python tool that helps you bypass certain anti-bot protection challenges. It can be especially handy if you want to scrape web pages for data or test how a site behaves behind Cloudflare's challenges.

When used correctly and ethically, Cloudscraper can save you time. And since it’s 2025, things have moved on a bit from older scripts and dependencies.

In this guide, we’ll jump into the basics. By the end, you should be able to install, configure, and make your first Cloudscraper request using Python.

Let’s get started.


What Is Cloudscraper?

Cloudscraper is a Python library. It works as a near drop-in replacement for requests (the popular Python HTTP client) but adds workarounds for some Cloudflare (and other anti-bot) challenges.

In simple terms:

  • It looks like a normal user’s browser, with the correct headers and behavior patterns.
  • It can handle Cloudflare’s JavaScript challenges automatically in most situations.
  • It saves time when you need to gather data from a site protected by Cloudflare.

It’s worth saying: always scrape responsibly. You must obey a site’s terms of service and the law. We use Cloudscraper for legitimate tasks, such as retrieving data we legally own or verifying that our own Cloudflare setup is working as expected.

Prerequisites

Before diving into the code, you’ll need:

• Python 3.8 or above. Python 3.11+ is recommended for better performance in 2025.
• pip and, ideally, a virtual environment to keep your dependencies tidy (quick setup below).
• Some familiarity with HTTP requests in Python (requests, urllib, etc.).
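
If you want that isolated environment, a quick setup with Python’s built-in venv module looks like this (adjust the activate command to your shell):

python -m venv .venv
source .venv/bin/activate    # on Windows: .venv\Scripts\activate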

Let’s get coding.


Step 1: Installation

We’ll begin by installing the package. Fire up your terminal.

If you’re using pip:

pip install cloudscraper

Alternatively, if you prefer Poetry or pipenv, you can add “cloudscraper” to your dependencies:

poetry add cloudscraper

Done. That’s the simplest part.
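
To sanity-check the install, you can print the library’s version from the terminal:

python -c "import cloudscraper; print(cloudscraper.__version__)"

If that prints a version number, you’re set.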


Step 2: Creating a Basic Session

With plain requests, a basic fetch looks like this:

import requests

response = requests.get("https://example.com")
print(response.text)

But a typical Cloudflare-protected site might throw you a challenge. Let’s see how Cloudscraper changes that:

import cloudscraper

scraper = cloudscraper.create_scraper()
response = scraper.get("https://example.com")
print(response.text)

One short snippet. But there’s more going on behind the scenes.

With create_scraper(), Cloudscraper:
• Configures the right session headers.
• Attempts to solve Cloudflare’s JavaScript challenges automatically (CAPTCHA-based challenges generally require a third-party solver).
• Returns a response you can parse, just like classic requests.
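
A detail worth knowing: the object create_scraper() returns is a subclass of requests.Session, so session features (persistent cookies, default headers, mounted adapters) carry over unchanged. A quick check:

import cloudscraper
import requests

scraper = cloudscraper.create_scraper()

# CloudScraper extends requests.Session, so session behavior carries over
print(isinstance(scraper, requests.Session))  # True

# Cloudflare clearance cookies collect here once a challenge is solved
print(scraper.cookies)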


Step 3: Customizing Headers and Browser Emulation

Sometimes you want to look like a specific browser or alter the User-Agent string. Let’s do that:

import cloudscraper

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 "
                  "Safari/537.36"
}

scraper = cloudscraper.create_scraper(browser={
    'browser': 'chrome',
    'platform': 'windows',
    'mobile': False
})

# Update the session headers:
scraper.headers.update(headers)

response = scraper.get("https://example.com/special-page")
print(response.status_code)

Here’s what we did:
• We passed emulation options for a Chrome-on-Windows environment.
• We replaced the default User-Agent with a custom string. Be careful here: Cloudscraper picks a User-Agent consistent with the browser profile it emulates, so overriding it can introduce a fingerprint mismatch. Only do this when you have a reason.
• We tested a GET request.

If you see a 200 status code, success.
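
One related convenience: if you need the clearance cookies in another tool (curl, a different HTTP client), Cloudscraper’s get_tokens() helper returns them alongside the User-Agent that earned them. The two must travel together:

import cloudscraper

# Returns a (cookie_dict, user_agent_string) pair after solving the challenge
tokens, user_agent = cloudscraper.get_tokens("https://example.com")
print(tokens)       # e.g. {'cf_clearance': '...'}
print(user_agent)   # reuse this exact User-Agent with those cookies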


Step 4: Handling Form Submissions

Many sites rely on form submissions. Say you have a login form or a POST endpoint with extra parameters. You can handle it with Cloudscraper:

payload = {
    "username": "my_user",
    "password": "my_password"
}

login_url = "https://example.com/login"

login_response = scraper.post(login_url, data=payload)

# Now let's see if there was a redirect or cookies
if login_response.ok:
    # Possibly we got a session cookie
    content = login_response.text
    print("Logged in!")
else:
    print("Login failed with status:", login_response.status_code)

At its core, Cloudscraper behaves like requests, so your code looks nearly identical. Behind the scenes, it works to get your requests past Cloudflare’s checks (where possible).
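
The same goes for JSON APIs. Because the interface mirrors requests, json= on the way in and .json() on the way out work as you’d expect (the endpoint below is made up for illustration):

# Hypothetical JSON endpoint, for illustration only
api_response = scraper.post("https://example.com/api/search", json={"query": "test"})
if api_response.ok:
    print(api_response.json())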


Step 5: Handling Retries and Exceptions

Sometimes Cloudflare’s challenges are too strict, or the site is rate-limiting you. Plan for that with exception handling:

import time
import cloudscraper
from requests.exceptions import HTTPError

scraper = cloudscraper.create_scraper()

url_list = [
    "https://example-1.com", 
    "https://example-2.com",
    "https://example-3.com"
]

for url in url_list:
    for attempt in range(3):
        try:
            resp = scraper.get(url, timeout=10)
            resp.raise_for_status()
            # If we get this far, we can break out of the retry loop
            print(f"SUCCESS: {url}")
            break
        except HTTPError as e:
            print(f"HTTPError on {url}: {e}")
            time.sleep(2)  # small backoff
        except Exception as ex:
            print(f"Other error on {url}: {ex}")
            time.sleep(2)

For more advanced retry logic, you might prefer a library like Tenacity or backoff. But this snippet is enough to show how you might handle repeated attempts.
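
For reference, here’s a minimal sketch of the same idea with Tenacity (assuming pip install tenacity); treat it as a starting point, not a drop-in:

import cloudscraper
from tenacity import retry, stop_after_attempt, wait_exponential

scraper = cloudscraper.create_scraper()

# Retry up to 3 times, backing off exponentially between attempts
@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=2, max=10))
def fetch(url):
    resp = scraper.get(url, timeout=10)
    resp.raise_for_status()  # an HTTPError here triggers the next retry
    return resp

print(fetch("https://example.com").status_code)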


Step 6: Putting It All Together

Let’s combine a few of these steps into a single script. This script:
1. Initializes a Cloudscraper session with custom options.
2. Logs in via a fictitious form.
3. Loops through a list of pages to gather data.
4. Saves each page to a local file.

#!/usr/bin/env python
# my_cloudscraper_script.py

import cloudscraper
import logging

def main():
    # Set up logging
    logging.basicConfig(
        level=logging.INFO, 
        format='%(asctime)s [%(levelname)s] %(message)s'
    )

    # Create scraper with a custom browser
    scraper = cloudscraper.create_scraper(
        browser={
            'browser': 'chrome',
            'platform': 'windows',
            'mobile': False
        }
    )

    # Suppose we do a login POST
    login_data = {"username": "my_user", "password": "my_pass"}
    login_endpoint = "https://example.com/login"
    
    response = scraper.post(login_endpoint, data=login_data)
    
    if not response.ok:
        logging.error("Login failed. Status code: %d", response.status_code)
        return

    # Suppose we have endpoints to scrape
    endpoints = ["/account", "/account/settings", "/data/export"]

    for ep in endpoints:
        full_url = f"https://example.com{ep}"
        ep_resp = scraper.get(full_url)
        if ep_resp.ok:
            logging.info("Fetched %s (length: %d)", ep, len(ep_resp.text))
            # Save content to a file; flatten the path so nested endpoints
            # like /account/settings don't point at missing subdirectories
            filename = f"output_{ep.strip('/').replace('/', '_')}.html"
            with open(filename, "w", encoding="utf-8") as f:
                f.write(ep_resp.text)
        else:
            logging.warning("Could not fetch %s. Status: %d", ep, ep_resp.status_code)

if __name__ == "__main__":
    main()

With a bit of expansion, you can easily integrate this approach into your data pipelines, your web-scraping logic, or your QA validation tasks.


Tips for 2025

Cloudflare continues to evolve, and so do other anti-bot systems. In 2025, you might see:

• More advanced JavaScript checks.
• Browser fingerprinting that looks beyond your user-agent (TLS and HTTP/2 fingerprints, for example).
• Frequent changes to challenge-solving rules.

If Cloudscraper fails on a particular site, you may need to:

• Use real browsers (tools like Playwright or Puppeteer with stealth plugins; a minimal Playwright sketch follows this list).
• Slow down your request rate.
• Double-check site permissions and TOS.
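
If you end up needing a real browser, a minimal Playwright sketch looks like this (assuming pip install playwright and playwright install chromium; stealth plugins are separate add-ons):

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # A full Chromium instance executes JavaScript challenges natively
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()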


Final Thoughts

Cloudscraper remains a solid choice for basic Cloudflare challenges. It’s light, easy to use, and integrates seamlessly with existing Python code.

Just remember that your usage must be legitimate. Always respect intellectual property rights, robots.txt guidelines, and site-specific terms. Overstepping these can lead to blocked IPs, or in worst cases, legal consequences.

But if you’re authorized to test your site’s Cloudflare security, or gather crucial data from your own web properties, consider Cloudscraper as your go-to tool. Its updates each year seem to keep pace with Cloudflare’s new hurdles.

Go forth and scrape responsibly. May your HTTP 200s be plentiful, and your Cloudflare blocks be minimal.

Marius Bernard

Marius Bernard is a Product Advisor, Technical SEO, & Brand Ambassador at Roundproxies. He was the lead author for the SEO chapter of the 2024 Web Almanac and a reviewer for the 2023 SEO chapter.