Ever found yourself blocked from accessing websites due to network restrictions at work or school? Or maybe you're a developer trying to build a web scraping application that keeps getting rate-limited? Look no further – Node-Unblocker is your solution.
Node-Unblocker was originally a web proxy for evading internet censorship, similar to CGIproxy / PHProxy / Glype but written in node.js. It's since morphed into a general-purpose library for proxying and rewriting remote webpages. What makes it special is its lightning-fast performance – all data is processed and relayed to the client on the fly without unnecessary buffering, making unblocker one of the fastest web proxies available.
I've personally used Node-Unblocker to build proxy servers for various projects, from simple website unblockers to sophisticated web scraping applications. In this guide, I'll walk you through everything you need to know to get started.
What You'll Learn:
- Install and set up Node-Unblocker
- Configure custom middleware for advanced functionality
- Deploy your proxy server to the cloud
- Implement best practices for web scraping
- Troubleshoot common issues
Why You Can Trust This Guide
Problem: Traditional proxy solutions are often slow, complicated to set up, or expensive to maintain. Many developers struggle with rate limiting, geo-restrictions, and anti-bot measures when building web applications.
Solution: Node-Unblocker provides a fast, customizable, and open-source proxy solution that can be deployed quickly and scaled easily.
Proof: I've successfully deployed Node-Unblocker servers that handle thousands of requests daily for web scraping projects. Whatever site you are scraping, Node-Unblocker adds a layer that passes all incoming requests through a proxy. This can help with rate limits and IP-based restrictions, although it does not work with every website (see Common Issues and Solutions).
Step 1: Install Node.js and Create Your Project
First, ensure you have Node.js installed on your system. If not, download it from nodejs.org. Once installed, create a new directory for your project:
mkdir my-proxy-server
cd my-proxy-server
npm init -y
This initializes a new Node.js project with default settings. The package.json file it creates will manage your project dependencies.
Pro tip: A version management tool like nvm lets you install Node.js and npm and switch between versions easily. The official guide on the Node.js website covers the other installation options.
Step 2: Install Required Dependencies
Now let's install the necessary packages. Express lets you create a web server quickly, and unblocker is the npm package name for Node-Unblocker.
npm install express unblocker
These two packages are all you need to get started:
- Express: A minimal web framework for Node.js
- Unblocker: The proxy library itself
Step 3: Create Your First Proxy Server
Create a new file called server.js and add the following code:
const express = require('express');
const Unblocker = require('unblocker');

const app = express();
const unblocker = new Unblocker({ prefix: '/proxy/' });

// This must be one of the first app.use() calls
app.use(unblocker);

// Create a simple homepage
app.get('/', function(req, res) {
  res.send(`
    <h1>My Proxy Server</h1>
    <p>To use the proxy, append the URL after /proxy/</p>
    <p>Example: /proxy/https://example.com</p>
  `);
});

// Start the server; the upgrade handler lets unblocker proxy WebSockets
const port = process.env.PORT || 8080;
app.listen(port).on('upgrade', unblocker.onUpgrade);
console.log(`Server running on port ${port}`);
Run your server:
node server.js
Visit http://localhost:8080 to see your proxy in action!
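Since every proxied address is just the target URL appended to the /proxy/ prefix, client code can build these addresses with a small helper. The following is a minimal sketch; buildProxyUrl is a hypothetical name, not part of Unblocker, and it assumes the '/proxy/' prefix configured in server.js.

```javascript
// Hypothetical helper: builds the URL a client should request on the proxy.
// Assumes the '/proxy/' prefix configured in server.js.
function buildProxyUrl(proxyOrigin, targetUrl, prefix = '/proxy/') {
  // Validate the target early; new URL() throws on malformed input
  const target = new URL(targetUrl);
  if (target.protocol !== 'http:' && target.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${target.protocol}`);
  }
  // Strip trailing slashes from the origin so the prefix is not doubled
  return proxyOrigin.replace(/\/+$/, '') + prefix + targetUrl;
}

// prints http://localhost:8080/proxy/https://example.com
console.log(buildProxyUrl('http://localhost:8080', 'https://example.com'));
```

Validating the target up front means a typo fails in your own code rather than as a confusing error from the proxy.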
Step 4: Configure Custom Middleware
Unblocker "middleware" are small functions that let you inspect and modify requests and responses. Here's how to add custom functionality; the configuration below replaces the new Unblocker(...) call from Step 3:
// Custom middleware to set user agent
function setUserAgent(data) {
  data.headers["user-agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36";
}

// Custom middleware to add authentication
function addAuth(data) {
  if (data.url.includes('api.example.com')) {
    data.headers["authorization"] = "Bearer YOUR_TOKEN";
  }
}

// Configure unblocker with middleware
const unblocker = new Unblocker({
  prefix: '/proxy/',
  requestMiddleware: [
    setUserAgent,
    addAuth
  ]
});
Common middleware use cases:
- Rotating user agents
- Adding authentication headers
- Filtering requests
- Logging and monitoring
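The use cases in this list can be sketched as plain functions that take the same data object as setUserAgent. The user-agent pool, the allowlist, and the function names below are hypothetical. The filtering example assumes Unblocker runs under Express, so that data.clientResponse has status() and send(), and it relies on my understanding that Unblocker stops processing a request once middleware has sent a response.

```javascript
// Hypothetical pool of user agents; replace with strings you maintain.
const USER_AGENTS = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15',
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36'
];

// Rotating user agents: pick a random entry for every request
function rotateUserAgent(data) {
  const index = Math.floor(Math.random() * USER_AGENTS.length);
  data.headers['user-agent'] = USER_AGENTS[index];
}

// Logging: record the time and target URL of each proxied request
function logRequest(data) {
  console.log(`[${new Date().toISOString()}] ${data.url}`);
}

// Filtering: only allow hosts on a (hypothetical) allowlist
const ALLOWED_HOSTS = new Set(['example.com', 'httpbin.org']);

function filterRequest(data) {
  const { hostname } = new URL(data.url);
  if (!ALLOWED_HOSTS.has(hostname)) {
    // Assumption: sending a response here keeps the request from being
    // proxied. status() and send() are Express response methods.
    data.clientResponse.status(403).send('Host not allowed.');
  }
}

// Wiring (requires the unblocker package):
// const unblocker = new Unblocker({
//   prefix: '/proxy/',
//   requestMiddleware: [filterRequest, rotateUserAgent, logRequest]
// });
```

Putting the filter first means rejected requests skip the remaining middleware.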
Step 5: Add Advanced Features
Let's enhance our proxy with more sophisticated features. As in Step 4, this configuration replaces the earlier unblocker definition:
const { Transform } = require('stream');

// Response middleware: modify HTML as it streams through the proxy
function modifyResponse(data) {
  if (data.contentType === 'text/html') {
    // Inject custom CSS or JavaScript
    const newStream = new Transform({
      transform(chunk, encoding, callback) {
        let content = chunk.toString();
        // Modify content here
        callback(null, content);
      }
    });
    data.stream = data.stream.pipe(newStream);
  }
}

const unblocker = new Unblocker({
  prefix: '/proxy/',
  host: null, // Automatically determine from request
  requestMiddleware: [],
  // Response middleware is registered here, through the config
  responseMiddleware: [modifyResponse],
  standardMiddleware: true, // Use built-in middleware
  clientScripts: true, // Inject scripts so WebSockets and XMLHttpRequest go through the proxy
  processContentTypes: [
    'text/html',
    'application/xml+xhtml',
    'application/xhtml+xml',
    'text/css'
  ]
});
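One caveat with the transform above: calling chunk.toString() on each chunk separately can corrupt a multi-byte UTF-8 character that is split across two chunks. One way around this, sketched below, is Node's built-in StringDecoder. The names createTextTransform and modify are hypothetical, and the sketch assumes the response body is UTF-8.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Returns a Transform stream that decodes UTF-8 safely across chunk
// boundaries and passes each decoded piece of text through modify().
function createTextTransform(modify) {
  const decoder = new StringDecoder('utf8');
  return new Transform({
    transform(chunk, encoding, callback) {
      // write() holds back any incomplete multi-byte sequence
      const text = decoder.write(chunk);
      if (text) this.push(modify(text));
      callback();
    },
    flush(callback) {
      // Emit whatever the decoder was still holding at end of stream
      const rest = decoder.end();
      if (rest) this.push(modify(rest));
      callback();
    }
  });
}

// Inside response middleware it could be used like this:
// data.stream = data.stream.pipe(createTextTransform((html) => html));
```

This only fixes character decoding. A pattern you search for, such as a closing tag, can still straddle two chunks, so a per-chunk search-and-replace may miss it.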
Step 6: Deploy to the Cloud
Popular choices include Render with its generous free tier, DigitalOcean's affordable droplets starting at $4/month, Railway's developer-friendly platform starting at $5/month, and Heroku's reliable infrastructure.
Deploying to Heroku:
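- Update package.json so it includes the engines and scripts entries below. Keep the dependencies block that npm install added in Step 2; it is omitted here for brevity:
{
  "name": "my-proxy-server",
  "version": "1.0.0",
  "engines": {
    "node": "18.x"
  },
  "scripts": {
    "start": "node server.js"
  }
}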
- Create a .gitignore file:
node_modules/
.env
- Deploy to Heroku:
git init
git add .
git commit -m "Initial commit"
heroku create your-app-name
git push heroku main
Important: Check a hosting platform's Acceptable Use Policy (AUP) before deploying any scraping or proxy software on it.
Step 7: Test and Monitor Your Proxy
Once deployed, test your proxy thoroughly. The script below uses axios, so install it first with npm install axios:
// Test script using axios
const axios = require('axios');

async function testProxy() {
  try {
    const response = await axios.get('http://your-proxy.herokuapp.com/proxy/https://httpbin.org/ip');
    console.log('Proxy response:', response.data);
  } catch (error) {
    console.error('Error:', error.message);
  }
}

testProxy();
Best Practices for Production:
- Implement rate limiting to prevent abuse
- Add authentication for private use
- Monitor performance and errors
- Rotate IP addresses for web scraping
- Respect robots.txt and terms of service
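The first two practices in this list can be sketched as ordinary Express middleware. This is a minimal in-memory illustration, not a production implementation. The names createRateLimiter and createApiKeyCheck, the x-api-key header, and the PROXY_API_KEY environment variable are all assumptions made for the example.

```javascript
// Minimal in-memory fixed-window rate limiter in Express middleware form.
// `now` is injectable so the window logic can be tested without waiting.
function createRateLimiter({ windowMs = 60000, max = 60, now = Date.now } = {}) {
  const hits = new Map(); // client IP -> { count, windowStart }

  return function rateLimit(req, res, next) {
    const key = req.ip;
    const time = now();
    const entry = hits.get(key);

    // Start a new window for first-time clients or when the old one expired
    if (!entry || time - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: time });
      return next();
    }

    entry.count += 1;
    if (entry.count > max) {
      return res.status(429).send('Too many requests.');
    }
    next();
  };
}

// Shared-secret check for private use (hypothetical header name)
function createApiKeyCheck(expectedKey) {
  return function requireApiKey(req, res, next) {
    if (!expectedKey || req.get('x-api-key') !== expectedKey) {
      return res.status(401).send('Unauthorized.');
    }
    next();
  };
}

// Wiring in server.js, registered before app.use(unblocker):
// app.use(createApiKeyCheck(process.env.PROXY_API_KEY));
// app.use(createRateLimiter({ windowMs: 60000, max: 60 }));
// app.use(unblocker);
```

A header-based key suits scripted clients such as scrapers; browsers navigating through the proxy will not send it. The Map also never evicts old entries, so a long-running server would want periodic cleanup or a shared store.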
Common Issues and Solutions
Issue: Complex websites don't work properly
Popular but complex websites like Discord, Twitter, or YouTube generally won't work correctly through the proxy. They rely on heavy client-side JavaScript and OAuth login flows, which the proxy's URL rewriting cannot fully handle.
Solution: For these sites, consider using headless browsers like Puppeteer or Playwright instead.
Issue: Getting blocked or rate-limited
Even with a proxy, websites can detect and block scrapers.
Solution: Implement request throttling, rotate user agents, and use multiple proxy servers.
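On the client side, throttling and spreading requests over several proxy servers might look like the sketch below. The proxy origins are placeholders, createProxyRotator and fetchThrottled are hypothetical helper names, and the HTTP client is passed in as fetchFn so the sketch does not depend on a particular library.

```javascript
// Round-robin over several proxy deployments
function createProxyRotator(proxyOrigins) {
  if (proxyOrigins.length === 0) throw new Error('Need at least one proxy');
  let index = 0;
  return function nextProxiedUrl(targetUrl) {
    const origin = proxyOrigins[index];
    index = (index + 1) % proxyOrigins.length;
    return `${origin}/proxy/${targetUrl}`;
  };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetch URLs one at a time with a fixed pause between requests
async function fetchThrottled(targetUrls, nextProxiedUrl, fetchFn, delayMs = 1000) {
  const results = [];
  for (let i = 0; i < targetUrls.length; i++) {
    if (i > 0) await sleep(delayMs);
    results.push(await fetchFn(nextProxiedUrl(targetUrls[i])));
  }
  return results;
}

// Example with axios (placeholder origins):
// const rotate = createProxyRotator(['https://proxy-a.example', 'https://proxy-b.example']);
// fetchThrottled(urls, rotate, (u) => axios.get(u).then((r) => r.data), 2000);
```

Sequential requests with a fixed delay are the simplest form of throttling. Randomizing the delay or adding retries with backoff are natural next steps.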
Issue: SSL/HTTPS errors
Some websites have strict SSL requirements.
Solution: Configure the httpsAgent option in your Unblocker configuration with proper SSL settings.
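As a rough sketch, the httpsAgent option takes a standard Node.js https.Agent. The specific settings below (keep-alive and a minimum TLS version) are illustrative assumptions; the right values depend on the sites you are proxying.

```javascript
const https = require('https');

// Custom agent for outgoing HTTPS requests. These settings are
// illustrative assumptions, not recommendations from the Unblocker docs.
const httpsAgent = new https.Agent({
  keepAlive: true,      // reuse connections to the same host
  minVersion: 'TLSv1.2' // refuse older TLS versions
});

const config = {
  prefix: '/proxy/',
  httpsAgent // agent used for requests to https:// targets
};

// In server.js (requires the unblocker package):
// const unblocker = new Unblocker(config);
```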
Remember: Always check for a robots.txt file of the target website and respect its guidelines. Ethical web scraping ensures the sustainability of your projects and maintains good relationships with website owners.
Final Thoughts
Node-Unblocker is a powerful tool for bypassing restrictions and building web scraping applications. It is easy to set up, but it works well only for relatively simple sites. For complex websites with advanced anti-bot measures, you might need additional tools or services.
The key to success with Node-Unblocker is understanding its capabilities and limitations. Start with simple use cases, then gradually add complexity as you become more comfortable with the library.