smartprog-puppeteer-page-proxy
v1.3.2
An additional Node.js module to use with puppeteer for setting proxies on a per-page basis.
It forwards intercepted requests from the browser to Node.js, which reissues them through a proxy and returns the responses to the browser.
Features
- Proxy per page and proxy per request
- Supports http, https, socks4 and socks5 proxies
- Supports authentication
- Handles cookies
Installation
npm i smartprog-puppeteer-page-proxy
Usage
Importing:
const useProxy = require('smartprog-puppeteer-page-proxy');
Proxy per page:
await useProxy(page, 'http://127.0.0.1:80');
To remove the proxy, omit the second argument or pass in a falsy value (e.g. null):
await useProxy(page, null);
Proxy per request:
await page.setRequestInterception(true);
page.on('request', async request => {
    await useProxy(request, 'https://127.0.0.1:443');
});
The request object itself is passed as the first argument. The individual request will be tunneled through the specified proxy.
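Because the proxy is chosen per request, different destinations can be sent through different proxies. The following is a minimal sketch of one way to pick a proxy by hostname. The `proxyRules` table and the `pickProxy` helper are hypothetical names introduced for illustration, not part of this module, and the addresses are placeholders.

```javascript
// Hypothetical routing table: hostname suffix -> proxy URL.
// The addresses are placeholders, not real proxies.
const proxyRules = [
    { suffix: 'example.com', proxy: 'socks5://127.0.0.1:1080' },
    { suffix: 'example.org', proxy: 'http://127.0.0.1:8080' }
];

// Returns the proxy of the first rule whose suffix matches the request's
// hostname (exactly or as a subdomain), or null when no rule matches.
function pickProxy(requestUrl, rules) {
    const { hostname } = new URL(requestUrl);
    const match = rules.find(({ suffix }) =>
        hostname === suffix || hostname.endsWith('.' + suffix));
    return match ? match.proxy : null;
}
```

Inside the 'request' handler, the result of pickProxy(request.url(), proxyRules) could be passed to useProxy(request, proxy) when it is not null, with request.continue() as a likely fallback for unmatched requests.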
Using it together with other interception methods:
await page.setRequestInterception(true);
page.on('request', async request => {
    if (request.resourceType() === 'image') {
        request.abort();
    } else {
        await useProxy(request, 'socks4://127.0.0.1:1080');
    }
});
Overriding requests:
await page.setRequestInterception(true);
page.on('request', async request => {
    await useProxy(request, {
        proxy: 'socks5://127.0.0.1:1080',
        url: 'https://example.com',
        method: 'POST',
        postData: '404',
        headers: {
            accept: 'text/html'
        }
    });
});
NOTE: Request interception must be enabled with page.setRequestInterception(true) when setting proxies per request; otherwise the function will fail.
Authenticating:
const proxy = 'https://user:pass@host:port';
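Credentials that contain characters such as '@' or ':' would make the URL above ambiguous. A minimal sketch of building the string with percent-encoded credentials follows. The `buildProxyUrl` helper is a hypothetical name, and the sketch assumes the module (or the proxy agent beneath it) decodes percent-encoded credentials, which this README does not confirm.

```javascript
// Hypothetical helper: assembles 'protocol://user:pass@host:port' and
// percent-encodes the credentials so reserved characters stay unambiguous.
// Assumption: the consumer of the URL decodes the credentials again.
function buildProxyUrl({ protocol, host, port, username, password }) {
    const credentials =
        `${encodeURIComponent(username)}:${encodeURIComponent(password)}`;
    return `${protocol}://${credentials}@${host}:${port}`;
}

const proxy = buildProxyUrl({
    protocol: 'https',
    host: '127.0.0.1',
    port: 8080,
    username: 'user',
    password: 'p@ss:word'
});
```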
IP lookup:
// 1. Waits until done, 'then' continues
const data = await useProxy.lookup(page1);
console.log(data.ip);
// 2. Executes and 'comes back' once done
useProxy.lookup(page2).then(data => {
    console.log(data.ip);
});
In case of any CORS errors, use the --disable-web-security launch flag:
const browser = await puppeteer.launch({
    args: ['--disable-web-security']
});
FAQ
How does this module work?
It takes over the task of requesting content from the browser and performs it internally through a request library instead. Requests that would normally be made by the browser are thus made by Node. The IP addresses are changed by routing the requests through the specified proxy servers using *-proxy-agent packages. When Node receives a response from the server, it forwards it to the browser for completion and rendering.
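The flow described above can be sketched roughly as follows. This is a conceptual illustration only, not the module's actual source: `relayThroughNode` and `fetchViaProxy` are hypothetical names, and `fetchViaProxy` stands in for whatever request library and proxy agent perform the real network call.

```javascript
// Conceptual sketch only; not the module's actual implementation.
// `fetchViaProxy` is a hypothetical caller-supplied function that performs
// the HTTP request through a proxy and resolves to { status, headers, body }.
async function relayThroughNode(request, fetchViaProxy) {
    // 1. Read the details of the request the browser was about to make.
    const outgoing = {
        url: request.url(),
        method: request.method(),
        headers: request.headers(),
        body: request.postData()
    };

    // 2. Node performs the request instead of the browser.
    const response = await fetchViaProxy(outgoing);

    // 3. Hand the response back to the browser for completion/rendering.
    await request.respond({
        status: response.status,
        headers: response.headers,
        body: response.body
    });
}
```

The `request` argument is expected to be an intercepted Puppeteer HTTPRequest, so request interception must already be enabled on the page.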
Why am I getting "Request is already handled!"?
This happens when there is an attempt to handle the same request more than once. An intercepted request is handled by one of the HTTPRequest.abort(), HTTPRequest.continue() or HTTPRequest.respond() methods. Each of these methods 'sends' the request to its destination. A request that has already reached its destination cannot be intercepted or handled again.
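When several 'request' listeners are registered on the same page, one way to avoid this error is to check whether the request has already been resolved before handling it. The sketch below assumes a Puppeteer version that provides HTTPRequest.isInterceptResolutionHandled(); `handleIfPending` is a hypothetical helper name, not part of this module.

```javascript
// Runs `handler` only if nothing else has resolved the interception yet.
// Resolves to true when the handler ran, false when the request was skipped.
// Assumption: request.isInterceptResolutionHandled() is available; when it
// is missing, the handler is run unconditionally.
async function handleIfPending(request, handler) {
    const alreadyHandled =
        typeof request.isInterceptResolutionHandled === 'function' &&
        request.isInterceptResolutionHandled();
    if (alreadyHandled) {
        return false;
    }
    await handler(request);
    return true;
}
```

Inside a 'request' listener this could be called as handleIfPending(request, req => useProxy(req, 'socks5://127.0.0.1:1080')), so a request that another listener has already aborted or continued is left alone.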
Why does the browser show "Your connection to this site is not secure"?
Because direct requests from the browser to the server are intercepted by Node, a secure connection cannot be established between the browser and the server. However, the requests aren't made by the browser; they are made by Node. All https requests made through Node using this module are secure. This is evidenced by the connection property of the response object:
connection: TLSSocket {
    _tlsOptions: {
        secureContext: [SecureContext],
        requestCert: true,
        rejectUnauthorized: true,
    },
    _secureEstablished: true,
    authorized: true,
    encrypted: true,
}
The warning can be thought of as a false positive.