scrapingbee v1.7.5 · 47,405 downloads

ScrapingBee Node SDK

ScrapingBee is a web scraping API that handles headless browsers and rotates proxies for you. The Node SDK makes it easier to interact with ScrapingBee's API.

Installation

You can install the ScrapingBee Node SDK with npm:

npm install scrapingbee

Usage

The ScrapingBee Node SDK is a wrapper around the axios library. ScrapingBee supports GET and POST requests.

Sign up for ScrapingBee to get your API key and some free credits to get started.

Making a GET request

const scrapingbee = require('scrapingbee');

async function get(url) {
    var client = new scrapingbee.ScrapingBeeClient('REPLACE-WITH-YOUR-API-KEY');
    var response = await client.get({
        // The URL you want to scrape
        url: url,
        params: {
            // Block ads on the page you want to scrape
            block_ads: false,
            // Block images and CSS on the page you want to scrape
            block_resources: true,
            // Premium proxy geolocation
            country_code: '',
            // Control the device the request will be sent from
            device: 'desktop',
            // Use some data extraction rules
            extract_rules: { title: 'h1' },
            // Wrap response in JSON
            json_response: false,
            // JavaScript scenario to execute (clicking on button, scrolling ...)
            js_scenario: {
                instructions: [
                    { wait_for: '#slow_button' },
                    { click: '#slow_button' },
                    { scroll_x: 1000 },
                    { wait: 1000 },
                    { scroll_x: 1000 },
                    { wait: 1000 },
                ],
            },
            // Use premium proxies to bypass difficult-to-scrape websites (10-25 credits/request)
            premium_proxy: false,
            // Execute JavaScript code with a headless browser (5 credits/request)
            render_js: true,
            // Return the original HTML before the JavaScript rendering
            return_page_source: false,
            // Return a screenshot of the page as a PNG image
            screenshot: false,
            // Take a full-page screenshot without the window limitation
            screenshot_full_page: false,
            // Transparently return the same HTTP code as the page requested
            transparent_status_code: false,
            // Wait, in milliseconds, before returning the response
            wait: 0,
            // Wait for a CSS selector before returning the response, e.g. ".title"
            wait_for: '',
            // Set the browser window width in pixels
            window_width: 1920,
            // Set the browser window height in pixels
            window_height: 1080,
        },
        headers: {
            // Forward custom headers to the target website
            key: 'value',
        },
        cookies: {
            // Forward custom cookies to the target website
            name: 'value',
        },
        // `timeout` specifies the number of milliseconds before the request times out.
        // If the request takes longer than `timeout`, the request will be aborted.
        timeout: 10000, // here 10sec, default is `0` (no timeout)
    });

    var decoder = new TextDecoder();
    var text = decoder.decode(response.data);
    console.log(text);
}

get('https://httpbin-scrapingbee.cleverapps.io/html').catch((e) => console.log('A problem occurred: ' + e.message));

/* -- output
    <!DOCTYPE html><html lang="en"><head>...
*/
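The decoding step at the end of the example deserves a note: the client returns the response body as bytes in `response.data`, and `TextDecoder` converts those bytes back into a string. The sketch below isolates that step, using `TextEncoder` to stand in for the bytes the API would return:

```javascript
// Stand-in bytes: in the real example these come back from client.get()
// as response.data.
const bytes = new TextEncoder().encode('<!DOCTYPE html><html lang="en"></html>');

// Decode the bytes back into an HTML string, as the example above does.
const text = new TextDecoder().decode(bytes);

console.log(text.startsWith('<!DOCTYPE html>')); // true
```

`TextEncoder` and `TextDecoder` are global in modern Node.js, so no extra import is needed.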

ScrapingBee takes various parameters to render JavaScript, execute a custom JavaScript script, use a premium proxy from a specific geolocation and more.

You can find all the supported parameters on ScrapingBee's documentation.

You can send custom cookies and headers just as you would with axios, via the cookies and headers options shown above.

Screenshot

Here is a short example of how to retrieve and store a screenshot of a page at a mobile resolution.

const fs = require('fs');
const scrapingbee = require('scrapingbee');

async function screenshot(url, path) {
    var client = new scrapingbee.ScrapingBeeClient('REPLACE-WITH-YOUR-API-KEY');
    var response = await client.get({
        url: url,
        params: {
            screenshot: true, // Take a screenshot
            screenshot_full_page: true, // Specify that we need the full height
            window_width: 375, // Specify a mobile width in pixels
        },
    });

    fs.writeFileSync(path, response.data);
}

screenshot('https://httpbin-scrapingbee.cleverapps.io/html', './httpbin.png').catch((e) =>
    console.log('A problem occurred: ' + e.message)
);

Retries

The client includes a retry mechanism for 5XX responses.

const spb = require('scrapingbee');

async function get(url) {
    let client = new spb.ScrapingBeeClient('REPLACE-WITH-YOUR-API-KEY');
    let resp = await client.get({ url: url, params: { render_js: false }, retries: 5 });

    let decoder = new TextDecoder();
    let text = decoder.decode(resp.data);
    console.log(text);
}

get('https://httpbin-scrapingbee.cleverapps.io/html').catch((e) => console.log('A problem occurred: ' + e.message));
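To make the retry behavior concrete, here is a minimal, self-contained sketch of retry-on-5XX logic. It is hypothetical and not the SDK's actual implementation: it simply re-invokes a request function until it gets a non-5XX response or exhausts the allowed attempts.

```javascript
// Hypothetical retry helper: `fn` is any async function returning an
// object with a `status` field (like an axios response).
async function withRetries(fn, retries) {
    let lastError = new Error('no attempts made');
    for (let attempt = 0; attempt <= retries; attempt++) {
        try {
            const resp = await fn();
            if (resp.status < 500) return resp; // success or client error: stop retrying
            lastError = new Error('Server error: ' + resp.status);
        } catch (e) {
            lastError = e; // network failure: try again
        }
    }
    throw lastError;
}

// Usage with a fake request that fails twice with 503, then succeeds:
let calls = 0;
withRetries(async () => ({ status: ++calls < 3 ? 503 : 200 }), 5)
    .then((resp) => console.log(resp.status, 'after', calls, 'calls')); // 200 after 3 calls
```

The SDK's `retries` option plays the role of the `retries` argument here; passing it to `client.get` as in the example above is all that is needed in practice.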