Cloak is a pure PHP HTTP client that provides low-level control over TLS fingerprints, allowing you to mimic real browser behavior and bypass anti-bot systems. In this guide, we'll show you how to leverage Cloak to evade TLS fingerprinting detection in your PHP web scraping projects.
Ever had your web scraper blocked even though you perfectly spoofed all the HTTP headers? You're not alone. The culprit is likely TLS fingerprinting, a technique that identifies automated clients during the TLS handshake, before any HTTP data is exchanged.
Here's the thing: when your PHP script connects to an HTTPS website, its ClientHello message advertises a TLS version, cipher suites, extensions, and elliptic curves in a specific order. That combination is characteristic of the TLS library making the request and is commonly condensed into a fingerprint such as JA3. Anti-bot systems like Cloudflare compare these fingerprints against those of real browsers. Most PHP HTTP clients sit on top of the same TLS stack (typically OpenSSL via cURL or PHP streams), so their fingerprints scream "bot!"
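To make the idea of a fingerprint concrete, here is a minimal sketch of how a JA3 string and hash are derived from ClientHello fields. It does not use Cloak at all; the function names (`isGrease`, `ja3String`, `ja3Hash`) are made up for this example, and the input values are illustrative rather than a capture from a real browser.

```php
<?php
// GREASE values (0x0a0a, 0x1a1a, ... 0xfafa) are random placeholders that
// browsers insert; JA3 ignores them so the hash stays stable.
function isGrease(int $value): bool
{
    return ($value & 0x0f0f) === 0x0a0a
        && (($value >> 8) & 0xff) === ($value & 0xff);
}

// JA3 string: five comma-separated fields, each a dash-separated list of
// decimal values in the order they appeared in the ClientHello.
function ja3String(int $tlsVersion, array $ciphers, array $extensions, array $curves, array $pointFormats): string
{
    $join = static fn(array $values): string => implode(
        '-',
        array_filter($values, static fn(int $v): bool => !isGrease($v))
    );

    return implode(',', [
        (string) $tlsVersion,
        $join($ciphers),
        $join($extensions),
        $join($curves),
        $join($pointFormats),
    ]);
}

// The JA3 hash is simply the MD5 of the JA3 string.
function ja3Hash(string $ja3): string
{
    return md5($ja3);
}

// Illustrative input: 2570 (0x0a0a) is a GREASE value and gets dropped.
$ja3 = ja3String(771, [2570, 4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0]);
echo $ja3, PHP_EOL;
echo ja3Hash($ja3), PHP_EOL;
```

Because the order of ciphers and extensions feeds directly into the string, two clients that support the same features but list them differently end up with different hashes.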
That's where Cloak comes in. Unlike traditional PHP HTTP clients, Cloak gives you granular control over your TLS configuration, letting you impersonate real browser fingerprints and fly under the radar.
Step 1: Install Cloak via Composer
First, you'll need to install Cloak through Composer. Since this is a cutting-edge library for TLS control, make sure you have PHP 8.4 or higher installed.
composer require exe/cloak
Pro tip: If you're running an older PHP version, consider using Docker or a separate environment to run PHP 8.4 alongside your existing setup, since Cloak's TLS control features depend on that version.
Step 2: Create a basic HTTP client instance
Let's start with a simple example to test if Cloak is working correctly:
<?php
require_once 'vendor/autoload.php';
use Cloak\Http\Client;
// Initialize the HTTP client
$client = new Client();
// Make a simple GET request
$response = $client->get('https://tls.peet.ws/api/all');
// Check the response
echo "Status: " . $response->getStatus() . PHP_EOL;
echo "Headers: " . json_encode($response->getHeaders()) . PHP_EOL;
echo "Body: " . substr($response->getBody(), 0, 200) . "..." . PHP_EOL;
This endpoint (tls.peet.ws) returns your TLS fingerprint details, so you can verify what fingerprint you're currently presenting.
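The full response body is long, so it helps to pull out only the fields you want to compare. The sketch below works on a JSON string and does not depend on Cloak. The key names (`tls.ja3`, `tls.ja3_hash`, `http_version`) are assumptions about the endpoint's response shape, and `summarizeFingerprint` is a hypothetical helper; check them against the actual body before relying on them.

```php
<?php
// Extract a few fields from a fingerprint-echo JSON document.
// Missing keys come back as null instead of raising errors.
function summarizeFingerprint(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        $data = [];
    }

    return [
        'http_version' => $data['http_version'] ?? null,
        'ja3'          => $data['tls']['ja3'] ?? null,
        'ja3_hash'     => $data['tls']['ja3_hash'] ?? null,
    ];
}

// Placeholder document standing in for a real response body.
$sample = '{"http_version":"h2","tls":{"ja3":"771,4865-4866,0-23,29,0","ja3_hash":"placeholder-hash"}}';
print_r(summarizeFingerprint($sample));
```

In practice you would pass `$response->getBody()` from the previous example instead of `$sample`, then compare the result with what the same endpoint reports when visited in a real browser. Recent Chrome versions shuffle the order of TLS extensions, so expect the JA3 hash of a real Chrome to vary between connections.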
Step 3: Configure TLS fingerprint settings
Now for the secret sauce: configuring Cloak to mimic specific browser fingerprints. Here's how to impersonate Chrome:
<?php
require_once 'vendor/autoload.php';

use Cloak\Http\Client;
use Cloak\TLS\BrowserProfile;

// Create a client with Chrome fingerprint
$client = new Client([
    'tls_profile' => BrowserProfile::CHROME_120,
    'http_version' => '2', // Use HTTP/2 like modern browsers
    'headers_order' => [ // Match Chrome's header order
        // 'host' and 'connection' only exist in HTTP/1.1; over HTTP/2 the host
        // travels in the :authority pseudo-header and Connection is not sent
        'host',
        'connection',
        'cache-control',
        'sec-ch-ua',
        'sec-ch-ua-mobile',
        'sec-ch-ua-platform',
        'upgrade-insecure-requests',
        'user-agent',
        'accept',
        'sec-fetch-site',
        'sec-fetch-mode',
        'sec-fetch-user',
        'sec-fetch-dest',
        'accept-encoding',
        'accept-language'
    ]
]);

// Set Chrome-like headers. The values must agree with the TLS profile:
// a Firefox-style Accept or Accept-Language next to a Chrome handshake is a giveaway.
$headers = [
    'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language' => 'en-US,en;q=0.9',
    'Accept-Encoding' => 'gzip, deflate, br',
    'Sec-Ch-Ua' => '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'Sec-Ch-Ua-Mobile' => '?0',
    'Sec-Ch-Ua-Platform' => '"Windows"',
    'Sec-Fetch-Dest' => 'document',
    'Sec-Fetch-Mode' => 'navigate',
    'Sec-Fetch-Site' => 'none',
    'Sec-Fetch-User' => '?1'
];

$response = $client->get('https://example.com', $headers);
Common pitfall: Don't just copy random cipher suites from the internet. Each browser version has specific cipher suite orders and supported extensions. Use predefined browser profiles or capture real browser traffic with Wireshark to get accurate configurations.
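Header order is easy to get wrong when headers are assembled from several places. As a minimal sketch, the helper below reorders an associative header array to follow a given order list, matching names case-insensitively and appending anything not listed. `orderHeaders` is a hypothetical function written for this example, not part of Cloak, and whether a given client sends headers in array order is something to verify against an echo endpoint.

```php
<?php
// Reorder $headers to follow $order (case-insensitive names).
// Headers not mentioned in $order keep their relative order at the end.
function orderHeaders(array $headers, array $order): array
{
    // Index by lowercase name while remembering the original spelling.
    $byLower = [];
    foreach ($headers as $name => $value) {
        $byLower[strtolower((string) $name)] = [$name, $value];
    }

    $sorted = [];
    foreach ($order as $wanted) {
        $key = strtolower($wanted);
        if (isset($byLower[$key])) {
            [$name, $value] = $byLower[$key];
            $sorted[$name] = $value;
            unset($byLower[$key]);
        }
    }

    // Append whatever was not covered by the order list.
    foreach ($byLower as [$name, $value]) {
        $sorted[$name] = $value;
    }

    return $sorted;
}

$ordered = orderHeaders(
    ['Accept' => '*/*', 'X-Extra' => '1', 'User-Agent' => 'example-agent'],
    ['user-agent', 'accept']
);
echo implode(', ', array_keys($ordered)), PHP_EOL;
```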
Step 4: Implement request rotation and fingerprint randomization
Static fingerprints can still get you blocked. Here's a more advanced approach that rotates between different browser profiles:
<?php
require_once 'vendor/autoload.php';

use Cloak\Http\Client;
use Cloak\TLS\BrowserProfile;

class StealthClient {
    // Each entry pairs a TLS profile with the User-Agent that belongs to it
    private $profiles = [
        'chrome' => [
            'profile' => BrowserProfile::CHROME_120,
            'ua' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
        ],
        'firefox' => [
            'profile' => BrowserProfile::FIREFOX_121,
            'ua' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0'
        ],
        'edge' => [
            'profile' => BrowserProfile::EDGE_120,
            'ua' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0'
        ]
    ];
    private $proxies = [];
    private $currentProfileIndex = 0;

    public function __construct(array $proxies = []) {
        $this->proxies = $proxies;
    }

    public function get($url) {
        // Pick the current browser profile (round-robin)
        $profileKeys = array_keys($this->profiles);
        $currentProfile = $this->profiles[$profileKeys[$this->currentProfileIndex]];

        // Create client with rotated profile
        $config = [
            'tls_profile' => $currentProfile['profile'],
            'http_version' => '2'
        ];

        // Add a randomly chosen proxy if any were supplied
        if (!empty($this->proxies)) {
            $config['proxy'] = $this->proxies[array_rand($this->proxies)];
        }

        $client = new Client($config);

        // Build headers that match the selected profile
        $headers = [
            'User-Agent' => $currentProfile['ua'],
            'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language' => $this->getRandomAcceptLanguage(),
            'Accept-Encoding' => 'gzip, deflate, br'
        ];

        // Advance to the next profile for the next request
        $this->currentProfileIndex = ($this->currentProfileIndex + 1) % count($profileKeys);

        // Add random delay to mimic human behavior
        usleep(random_int(500000, 2000000)); // 0.5-2 seconds

        return $client->get($url, $headers);
    }

    private function getRandomAcceptLanguage() {
        $languages = [
            'en-US,en;q=0.9',
            'en-GB,en;q=0.9',
            'en-US,en;q=0.9,es;q=0.8',
            'en-US,en;q=0.9,fr;q=0.8',
            'en-US,en;q=0.9,de;q=0.8'
        ];
        return $languages[array_rand($languages)];
    }
}

// Usage
$scraper = new StealthClient([
    'http://proxy1.com:8080',
    'http://proxy2.com:8080'
]);
$response = $scraper->get('https://example.com/data');
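The rotation logic in `StealthClient` is tangled up with network calls, which makes it awkward to test. One option, sketched below, is to pull the round-robin part into a tiny class of its own. `RoundRobin` is a hypothetical helper, and the profile labels are placeholder strings rather than real `BrowserProfile` constants.

```php
<?php
// Cycles through a fixed list of items, wrapping around at the end.
final class RoundRobin
{
    /** @var array */
    private $items;
    /** @var int */
    private $index = 0;

    public function __construct(array $items)
    {
        if ($items === []) {
            throw new InvalidArgumentException('RoundRobin needs at least one item');
        }
        // Re-index so positions are always 0..n-1.
        $this->items = array_values($items);
    }

    public function next()
    {
        $item = $this->items[$this->index];
        $this->index = ($this->index + 1) % count($this->items);
        return $item;
    }
}

// Keeping the profile label and its User-Agent in one entry ensures
// they are always rotated together.
$rotation = new RoundRobin([
    ['profile' => 'chrome-placeholder',  'ua' => 'UA string for Chrome'],
    ['profile' => 'firefox-placeholder', 'ua' => 'UA string for Firefox'],
]);

for ($i = 0; $i < 3; $i++) {
    echo $rotation->next()['profile'], PHP_EOL;
}
```

Keeping the TLS profile and its User-Agent in a single entry is the important design choice: a Firefox handshake paired with a Chrome User-Agent is itself a detectable mismatch.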
Step 5: Handle advanced anti-bot systems
For sites with sophisticated protection (Cloudflare, PerimeterX, DataDome), you'll need additional techniques:
<?php
require_once 'vendor/autoload.php';

use Cloak\Http\Client;
use Cloak\TLS\BrowserProfile;

class AdvancedScraper {
    private $client;
    private $cookieJar = [];

    public function __construct() {
        // Initialize with HTTP/2 settings close to what Chrome advertises
        $this->client = new Client([
            'tls_profile' => BrowserProfile::CHROME_120,
            'http_version' => '2',
            'enable_push' => false, // Chrome advertises SETTINGS_ENABLE_PUSH = 0
            'window_size' => 65535, // HTTP/2 flow-control window, not the browser viewport
            'header_table_size' => 65536,
            'enable_connect_protocol' => true,
            'initial_stream_window_size' => 6291456,
            'max_header_list_size' => 262144
        ]);
    }

    public function scrapeWithJSChallenge($url) {
        // First request - expect challenge
        $response = $this->makeRequest($url);

        // Anything other than 403/503 is treated as a normal response
        if ($response->getStatus() !== 403 && $response->getStatus() !== 503) {
            return $response;
        }

        // Check if it's a JS challenge
        $body = $response->getBody();
        if (strpos($body, 'challenge-platform') === false) {
            return $response;
        }

        // Parse challenge (simplified example). The jschl_* fields belong to
        // Cloudflare's legacy challenge form; pages that need real JavaScript
        // execution may not contain them, and the response is returned as-is.
        preg_match('/name="jschl_vc" value="([^"]+)"/', $body, $vcMatch);
        preg_match('/name="pass" value="([^"]+)"/', $body, $passMatch);
        if (!$vcMatch || !$passMatch) {
            return $response;
        }

        // Wait for challenge delay
        sleep(4);

        // Submit challenge solution
        $origin = parse_url($url, PHP_URL_SCHEME) . '://' . parse_url($url, PHP_URL_HOST);
        $challengeUrl = $origin . '/cdn-cgi/l/chk_jschl';
        $challengeData = [
            'jschl_vc' => $vcMatch[1],
            'pass' => $passMatch[1],
            'jschl_answer' => $this->solveChallenge($body)
        ];
        $challengeResponse = $this->client->post($challengeUrl, $challengeData, [
            'Referer' => $url,
            'Origin' => $origin
        ]);

        // Extract cookies from challenge response
        $this->extractCookies($challengeResponse);

        // Retry original request with cookies
        return $this->makeRequest($url);
    }

    private function makeRequest($url) {
        $headers = [
            // Full Chrome 120 User-Agent so it agrees with the TLS profile
            'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
            'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
        ];

        // Browsers omit the Cookie header entirely when there is nothing to send
        $cookieHeader = $this->buildCookieHeader();
        if ($cookieHeader !== '') {
            $headers['Cookie'] = $cookieHeader;
        }

        $response = $this->client->get($url, $headers);

        // Keep any cookies the server sets, as a browser would
        $this->extractCookies($response);

        return $response;
    }

    private function solveChallenge($html) {
        // Implement challenge solving logic
        // This is a simplified placeholder
        return '12345';
    }

    private function extractCookies($response) {
        $setCookieHeaders = $response->getHeader('set-cookie');
        if (is_array($setCookieHeaders)) {
            foreach ($setCookieHeaders as $cookie) {
                // Capture "name=value" up to the first semicolon; attributes are ignored
                preg_match('/^([^=]+)=([^;]+)/', $cookie, $matches);
                if ($matches) {
                    $this->cookieJar[$matches[1]] = $matches[2];
                }
            }
        }
    }

    private function buildCookieHeader() {
        $cookies = [];
        foreach ($this->cookieJar as $name => $value) {
            $cookies[] = "$name=$value";
        }
        return implode('; ', $cookies);
    }
}
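The regex in `extractCookies` skips cookies with an empty value and ignores attributes, so a server that clears a cookie with `Max-Age=0` would leave a stale entry in the jar. A slightly more careful version might look like the sketch below. `parseSetCookie` and `applySetCookie` are hypothetical helpers that work on plain strings and arrays; they ignore `Domain`, `Path`, and `Expires` matching, so treat them as a starting point rather than a complete cookie jar.

```php
<?php
// Split one Set-Cookie header value into name, value, and attributes.
// Returns null when the header has no usable name=value pair.
function parseSetCookie(string $header): ?array
{
    $parts = explode(';', $header);
    $pair = array_shift($parts);

    $eq = strpos($pair, '=');
    if ($eq === false) {
        return null;
    }
    $name = trim(substr($pair, 0, $eq));
    if ($name === '') {
        return null;
    }

    $attributes = [];
    foreach ($parts as $part) {
        $part = trim($part);
        if ($part === '') {
            continue;
        }
        $kv = explode('=', $part, 2);
        // Flag attributes such as HttpOnly have no value; store true.
        $attributes[strtolower(trim($kv[0]))] = isset($kv[1]) ? trim($kv[1]) : true;
    }

    return [
        'name' => $name,
        'value' => trim(substr($pair, $eq + 1)),
        'attributes' => $attributes,
    ];
}

// Apply a Set-Cookie header to a name => value jar and return the new jar.
function applySetCookie(array $jar, string $header): array
{
    $cookie = parseSetCookie($header);
    if ($cookie === null) {
        return $jar;
    }

    // A numeric Max-Age of zero or less means "delete this cookie".
    $maxAge = $cookie['attributes']['max-age'] ?? null;
    if (is_string($maxAge) && preg_match('/^-?\d+$/', $maxAge) && (int) $maxAge <= 0) {
        unset($jar[$cookie['name']]);
        return $jar;
    }

    $jar[$cookie['name']] = $cookie['value'];
    return $jar;
}

$jar = applySetCookie([], 'sid=abc123; Path=/; HttpOnly');
$jar = applySetCookie($jar, 'theme=dark; Max-Age=3600');
$jar = applySetCookie($jar, 'sid=; Max-Age=0');
print_r($jar);
```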
Alternative approach: If you're dealing with extremely sophisticated protection, consider a headless browser driven through Puppeteer or Playwright bindings for PHP instead. Cloak handles TLS fingerprint impersonation well, but some sites also check JavaScript execution and DOM behavior, which an HTTP client cannot reproduce.
Final thoughts
Cloak provides PHP developers with unprecedented control over TLS fingerprints, making it possible to scrape sites that would normally block traditional HTTP clients. The key to success is understanding that modern anti-bot systems look at multiple signals:
- TLS fingerprint (cipher suites, extensions, order)
- HTTP/2 settings and behavior
- Header order and values
- Request timing patterns
- IP reputation
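By carefully configuring all these aspects with Cloak, you can create scrapers that are much harder to distinguish from real browsers at the network level. Sites that also test JavaScript execution will still call for the headless browser approach mentioned above.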