
get-scraping v1.1.12

GetScraping Node.js Client

This is the official Node.js client library for GetScraping.com, a powerful web scraping API service.

Installation

You can install the GetScraping client library using npm, yarn, or pnpm:

# Using npm
npm install get-scraping

# Using yarn
yarn add get-scraping

# Using pnpm
pnpm add get-scraping

Usage

To use the GetScraping client, you'll need an API key from GetScraping.com. Once you have your API key, you can start using the client as follows:

import { GetScrapingClient } from 'get-scraping';

const client = new GetScrapingClient('YOUR_API_KEY');

async function scrapeWebsite() {
  const result = await client.scrape({
    url: 'https://example.com',
    method: 'GET'
  });

  const html = await result.text();
  console.log(html);
}

scrapeWebsite();

Features

The GetScraping client supports a wide range of features, including:

  • Basic web scraping
  • JavaScript rendering
  • Custom headers and cookies (see the sketch after this list)
  • Proxy support (ISP, residential, and mobile)
  • Retrying requests
  • Programmable browser actions
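
For example, here is a minimal sketch of a request that sends custom headers and cookies, using the headers, cookies, and omit_default_headers parameters documented in the API reference below (the header and cookie values are illustrative):

const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  // With omit_default_headers set to true, only the headers
  // listed here are sent with the request.
  omit_default_headers: true,
  headers: {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
    'Accept-Language': 'en-US,en;q=0.9'
  },
  // Cookies are passed as an array of 'name=value' strings.
  cookies: ['session=abc123', 'theme=dark']
});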

API Reference

GetScrapingClient

The main class for interacting with the GetScraping API.

new GetScrapingClient(api_key: string)

scrape(params: GetScrapingParams)

The primary method for scraping websites.

const result = await client.scrape(params);

GetScrapingParams

The GetScrapingParams object supports the following options:

export type GetScrapingParams = {
    /**
     * The url to scrape - should include http:// or https://
     */
    url: string;

    /**
     * The method to use when requesting this url
     * Can be GET or POST
     */
    method: 'GET' | 'POST';

    /**
     * The payload to include in a post request.
     * Only used when method = 'POST'
     */
    body?: string;

    /**
     * When defined, your GetScraping deployment will route the request through a browser
     * with the ability to render javascript and do certain actions on the webpage. 
     */
    js_rendering_options?: JavascriptRenderingOptions;

    /**
     * Define any cookies you need included in your request.
     * ex: `cookies: ['SID=1234', 'SUBID=abcd', 'otherCookie=5678']`
     */
    cookies?: Array<string>;

    /**
     * The headers to attach to the scrape request. We fill in missing/common
     * headers by default; if you want only the headers defined below to be
     * sent with the request, set 'omit_default_headers' to true.
     */
    headers?: Record<string, string>;

    /**
     * omit_default_headers will pass only the headers you define in the scrape request
     * Defaults to false.
     */
    omit_default_headers?: boolean;

    /**
     * Set to true to route requests through our ISP proxies.
     * Note this may incur additional API credit usage.
     */
    use_isp_proxy?: boolean;

    /**
     * Set to true to route requests through our residential proxies.
     * Note this may incur additional API credit usage.
     */
    use_residential_proxy?: boolean;

    /**
     * Configure automatic retries, e.g. retry until the response
     * has one of the listed status codes.
     * (Shape inferred from the Retrying Requests example below.)
     */
    retry_config?: {
        num_retries: number;
        success_status_codes?: number[];
    };
}

For more detailed information on these parameters, please refer to the GetScraping documentation.
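
One parameter worth a quick illustration is body, which is only used when method is 'POST'. A minimal sketch (the endpoint and payload here are illustrative):

const result = await client.scrape({
  url: 'https://example.com/api/search',
  method: 'POST',
  // The payload is sent as the raw request body, so include a
  // Content-Type header that matches its format.
  body: JSON.stringify({ query: 'example' }),
  headers: {
    'Content-Type': 'application/json'
  }
});

const response = await result.text();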

Examples

Basic Scraping

const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET'
});

const html = await result.text();
console.log(html);

Scraping with JavaScript Rendering

Render JavaScript to scrape dynamic sites. Note that rendering JS incurs an additional cost (5 requests).

const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  js_rendering_options: {
    render_js: true,
    wait_millis: 5000
  }
});

const html = await result.text();
console.log(html);
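
Here render_js enables the browser-based rendering described under js_rendering_options, and wait_millis: 5000 waits five seconds for the page's scripts to run before the HTML is captured (our reading of the parameter names; see the GetScraping documentation for the full JavascriptRenderingOptions shape).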

Using Various Proxies

Typically, the proxy types most effective at bypassing tough anti-bot measures are, in descending order: mobile, then residential, then ISP, and lastly our default proxies.

We recommend starting with the default proxies and working your way up as necessary, since non-default proxies incur additional costs: 1 request for default proxies, 5 for ISP, 25 for residential, and 50 for mobile.

const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  use_residential_proxy: true
});

const html = await result.text();
console.log(html);
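
If a site blocks the cheaper tiers, one way to structure that escalation is sketched below. It assumes the returned result exposes an HTTP status code the way a standard fetch Response does; scrapeWithEscalation is our own helper name, not part of the library:

// Proxy tiers in order of increasing cost.
const proxyTiers = [
  {},                              // default proxies (1 request)
  { use_isp_proxy: true },         // ISP proxies (5 requests)
  { use_residential_proxy: true }  // residential proxies (25 requests)
];

async function scrapeWithEscalation(url: string): Promise<string> {
  for (const tier of proxyTiers) {
    const result = await client.scrape({ url, method: 'GET', ...tier });
    // Stop escalating as soon as a tier succeeds.
    if (result.status === 200) {
      return result.text();
    }
  }
  throw new Error(`All proxy tiers failed for ${url}`);
}

const html = await scrapeWithEscalation('https://example.com');
console.log(html);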

Retrying Requests
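
Use retry_config to have GetScraping retry the request until it returns one of the status codes you list as successful: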

const result = await client.scrape({
  url: 'https://example.com',
  method: 'GET',
  retry_config: {
    num_retries: 3,
    success_status_codes: [200]
  }
});

const html = await result.text();
console.log(html);
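
With num_retries: 3, the request above is retried up to three times until a 200 response comes back.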

Advanced Usage

For more advanced usage, including programmable browser actions and intercepting requests, please refer to the GetScraping documentation.

Support

If you encounter any issues or have questions, please visit our support page or open an issue in the GitHub repository.

License

This project is licensed under the ISC License. See the LICENSE file for details.