GetScraping Node.js Client
This is the official Node.js client library for GetScraping.com, a powerful web scraping API service.
Installation
You can install the GetScraping client library using npm, yarn, or pnpm:
# Using npm
npm install get-scraping
# Using yarn
yarn add get-scraping
# Using pnpm
pnpm add get-scraping
Usage
To use the GetScraping client, you'll need an API key from GetScraping.com. Once you have your API key, you can start using the client as follows:
import { GetScrapingClient } from 'get-scraping';

const client = new GetScrapingClient('YOUR_API_KEY');

async function scrapeWebsite() {
  const result = await client.scrape({
    url: 'https://example.com',
    method: 'GET'
  });
  const html = await result.text();
  console.log(html);
}

scrapeWebsite();
Features
The GetScraping client supports a wide range of features, including:
- Basic web scraping
- JavaScript rendering
- Custom headers and cookies
- Proxy support (ISP, residential, and mobile)
- Retrying requests
- Programmable browser actions
API Reference
GetScrapingClient
The main class for interacting with the GetScraping API.
const client = new GetScrapingClient(api_key: string);
scrape(params: GetScrapingParams)
The primary method for scraping websites.
const result = await client.scrape(params);
GetScrapingParams
The GetScrapingParams object supports the following options:
export type GetScrapingParams = {
  /**
   * The url to scrape - should include http:// or https://
   */
  url: string;
  /**
   * The method to use when requesting this url.
   * Can be GET or POST.
   */
  method: 'GET' | 'POST';
  /**
   * The payload to include in a POST request.
   * Only used when method = 'POST'.
   */
  body?: string;
  /**
   * When defined, your GetScraping deployment will route the request through a browser
   * with the ability to render javascript and do certain actions on the webpage.
   */
  js_rendering_options?: JavascriptRenderingOptions;
  /**
   * Define any cookies you need included in your request.
   * ex: `cookies: ['SID=1234', 'SUBID=abcd', 'otherCookie=5678']`
   */
  cookies?: Array<string>;
  /**
   * The headers to attach to the scrape request. We fill in missing/common headers
   * by default; if you want only the headers defined here to be part of the request,
   * set 'omit_default_headers' to true.
   */
  headers?: Record<string, string>;
  /**
   * omit_default_headers will pass only the headers you define in the scrape request.
   * Defaults to false.
   */
  omit_default_headers?: boolean;
  /**
   * Set to true to route requests through our ISP proxies.
   * Note this may incur additional API credit usage.
   */
  use_isp_proxy?: boolean;
}
For more detailed information on these parameters, please refer to the GetScraping documentation.
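The POST-related options (`body`, `headers`, `cookies`, `omit_default_headers`) are documented above but not shown in the examples below. Here is a minimal sketch of how they might be combined; the endpoint URL, header values, and payload are illustrative placeholders, not part of the API.

```typescript
// Sketch: a POST scrape with a JSON body, custom headers, and cookies.
// Because omit_default_headers is true, only the headers listed here are sent.
const postParams = {
  url: 'https://example.com/api/search', // placeholder URL
  method: 'POST' as const,
  body: JSON.stringify({ query: 'shoes', page: 1 }), // placeholder payload
  headers: {
    'Content-Type': 'application/json',
    'User-Agent': 'my-scraper/1.0', // illustrative value
  },
  cookies: ['SID=1234', 'SUBID=abcd'],
  omit_default_headers: true,
};

// const result = await client.scrape(postParams);
// const responseBody = await result.text();
```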
Examples
Basic Scraping
const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET'
});
const html = await result.text();
console.log(html);
Scraping with JavaScript Rendering
Render JavaScript to scrape dynamic sites. Note that rendering JS incurs an additional cost (5 requests).
const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  js_rendering_options: {
    render_js: true,
    wait_millis: 5000
  }
});
const html = await result.text();
console.log(html);
Using Various Proxies
The proxy types most effective at bypassing tough anti-bot measures are, in descending order, mobile, residential, ISP, and finally our default proxies.
We recommend starting requests with the default proxies and working your way up as necessary, since non-default proxies incur additional costs (1 request for default proxies, 5 for ISP, 25 for residential, and 50 for mobile).
const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  use_residential_proxy: true
});
const html = await result.text();
console.log(html);
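The escalation strategy described above (start cheap, upgrade only on failure) can be sketched as a small helper. This is not part of the library: it assumes the result object exposes a fetch-style `status` property alongside `text()`, and it uses only the `use_isp_proxy` and `use_residential_proxy` flags that appear in this README.

```typescript
// Sketch: try the default proxies first, then escalate to pricier tiers
// only when the cheaper tier fails to return a 200.
type ScrapeFn = (params: Record<string, unknown>) =>
  Promise<{ status: number; text(): Promise<string> }>;

async function scrapeWithEscalation(scrape: ScrapeFn, url: string): Promise<string> {
  // Cheapest first: default (1 credit), ISP (5), residential (25).
  const tiers: Record<string, unknown>[] = [
    {},
    { use_isp_proxy: true },
    { use_residential_proxy: true },
  ];
  for (const tier of tiers) {
    const result = await scrape({ url, method: 'GET', ...tier });
    if (result.status === 200) {
      return result.text();
    }
  }
  throw new Error(`All proxy tiers failed for ${url}`);
}
```

With the real client you would pass `(p) => client.scrape(p)` as the `scrape` argument.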
Retrying Requests
const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  retry_config: {
    num_retries: 3,
    success_status_codes: [200]
  }
});
const html = await result.text();
console.log(html);
Advanced Usage
For more advanced usage, including programmable browser actions and intercepting requests, please refer to the GetScraping documentation.
Support
If you encounter any issues or have questions, please visit our support page or open an issue in the GitHub repository.
License
This project is licensed under the ISC License. See the LICENSE file for details.