chrome-web-store-scraper

v1.1.0

A scraper for the Google Chrome extension metadata on the chrome web store

A Node.js package for scraping the Chrome Web Store.

Requirements

This project requires Selenium, a web browser automation tool. The latest version of the Selenium Standalone Server can be downloaded from seleniumhq.

Selenium Server must also be installed as selenium on the system PATH. For Linux, a selenium bash script is included that can be paired with selenium.jar for ease of use.

The selenium-webdriver npm package has some details on what is required.

Selenium Setup

The Selenium server must be on the system PATH as 'selenium'. The easiest way to set it up to work with the Chrome Web Store scraper is to make the selenium bash script (included in this project) executable with chmod +x selenium, and then copy that file, along with the Selenium server .jar file, to /bin/ or somewhere similar.

When copying the Selenium server .jar, make sure it is renamed from selenium-server-standalone-3.14.0.jar, or whatever it is currently called, to just selenium.jar.

chromedriver

As well as Selenium, you're going to need the latest chromedriver installed.

chrome-browser-stable

A Chrome browser is also required. You can get the latest chrome-browser-stable from chromium.
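
Before running the scraper, it may help to confirm that the requirements above are actually reachable. The sketch below is a minimal pre-flight check, assuming a Linux-style setup where the commands are named selenium and chromedriver. The findOnPath helper is hypothetical and not part of this package; it only walks the PATH looking for an executable file.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the full path of the first executable file
// named `command` found in `pathEnv`, or null if none is found.
function findOnPath(command, pathEnv = process.env.PATH || '') {
  for (const dir of pathEnv.split(path.delimiter)) {
    if (!dir) continue; // skip empty PATH entries
    const candidate = path.join(dir, command);
    try {
      if (fs.statSync(candidate).isFile()) {
        fs.accessSync(candidate, fs.constants.X_OK); // throws if not executable
        return candidate;
      }
    } catch (err) {
      // Missing or not executable in this directory; keep looking.
    }
  }
  return null;
}

// Command names are assumptions based on the setup described above.
for (const command of ['selenium', 'chromedriver']) {
  const found = findOnPath(command);
  console.log(found ? `${command}: ${found}` : `${command}: not found on PATH`);
}
```

If either command is reported as not found, revisit the Selenium Setup steps before using the scraper.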

How To Use

You can use this to scrape search results for chrome extensions, or to scrape store information for a specific extension.

To include the scraper in your project:

const ChromeWebScraper = require('chrome-web-store-scraper');
const scraper = new ChromeWebScraper();

Search

The most basic search just requires you to provide a search term.

    scraper.search('some-search-term').then(
        (res) => console.log(res),
        (err) => console.log(err)
    );

Example Response

[
  {
    "title": "Data Scraper - Easy Web Scraping",
    "description": "Data Scraper extracts data out of HTML web pages and imports it into Microsoft Excel spreadsheets",
    "author": "",
    "category": "Productivity",
    "rating": 4.107231920199501,
    "numberOfRatings": 401,
    "storeURL": "https://chrome.google.com/webstore/detail/data-scraper-easy-web-scr/nndknepjnldbdbepjfgmncbggmopgden"
  },
  ...
]
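
Because the search resolves to a plain array, the results can be post-processed with ordinary array methods. The following is a minimal sketch that keeps entries with enough ratings and sorts them by rating. The field names come from the example response above; the topRated helper, its minRatings and limit options, and the two "Hypothetical Extension" entries are made up for illustration.

```javascript
// Hypothetical helper, not part of the package: pick the best-rated results.
function topRated(results, { minRatings = 0, limit = 5 } = {}) {
  return results
    .filter((item) => item.numberOfRatings >= minRatings) // drop thinly rated entries
    .sort((a, b) => b.rating - a.rating)                  // highest rating first
    .slice(0, limit)
    .map(({ title, rating, numberOfRatings }) => ({
      title,
      rating: Number(rating.toFixed(2)), // round for display
      numberOfRatings,
    }));
}

// Sample data: the first entry mirrors the example response, the rest are invented.
const sampleResults = [
  { title: 'Data Scraper - Easy Web Scraping', rating: 4.107231920199501, numberOfRatings: 401 },
  { title: 'Hypothetical Extension A', rating: 4.8, numberOfRatings: 3 },
  { title: 'Hypothetical Extension B', rating: 3.9, numberOfRatings: 950 },
];

console.log(topRated(sampleResults, { minRatings: 100 }));
```

With a real search, the same helper could be applied inside the then callback, for example topRated(res, { minRatings: 100 }).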

There are additional options that can be used to perform a more directed search.

Search categories

You can provide a category, which the scraper will then use when building a search request. Only one category can be provided.

Valid Categories

all
accessibility
blogging
byGoogle
developerTools
fun
newsAndWeather
photos
productivity
searchTools
shopping
socialAndCommunication
sports

Categories can be provided in an options JSON object as demonstrated below:

const options =  {searchCategory : 'newsAndWeather'}

scraper.search('searchString', options).then(
    (res) => console.log(res),
    (err) => console.log(err)
);

Search Features

Search features can be provided as a means of specifying select features that a chrome extension must have.

Passed as an array of strings in an options JSON object, the features can be any combination of the following:

offline
byGoogle
free
android
googleDrive

Features can be provided in an options JSON object as demonstrated below:

const options =  {
    searchFeatures : ['free', 'offline','byGoogle']
}

scraper.search('searchString', options).then(
    (res) => console.log(res),
    (err) => console.log(err)
);

Features and Categories

Searching can be performed with categories and features together in the same options JSON object.

const options =  {
    searchCategory : 'newsAndWeather',
    searchFeatures : ['free', 'byGoogle']
}

scraper.search('searchString', options).then(
    (res) => console.log(res),
    (err) => console.log(err)
);
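
Since only the category and feature names listed above are valid, it can be useful to check an options object before handing it to the scraper. This is a minimal sketch of such a check. The buildSearchOptions helper and its category and features parameters are hypothetical; how the package itself reacts to an invalid value is not documented here, so the helper simply throws early.

```javascript
// Valid values copied from the lists above.
const VALID_CATEGORIES = [
  'all', 'accessibility', 'blogging', 'byGoogle', 'developerTools', 'fun',
  'newsAndWeather', 'photos', 'productivity', 'searchTools', 'shopping',
  'socialAndCommunication', 'sports',
];
const VALID_FEATURES = ['offline', 'byGoogle', 'free', 'android', 'googleDrive'];

// Hypothetical helper: builds an options object for scraper.search().
function buildSearchOptions({ category, features = [] } = {}) {
  const options = {};
  if (category !== undefined) {
    if (!VALID_CATEGORIES.includes(category)) {
      throw new RangeError(`Unknown search category: ${category}`);
    }
    options.searchCategory = category; // only one category is allowed
  }
  const unknown = features.filter((feature) => !VALID_FEATURES.includes(feature));
  if (unknown.length > 0) {
    throw new RangeError(`Unknown search features: ${unknown.join(', ')}`);
  }
  if (features.length > 0) {
    options.searchFeatures = [...new Set(features)]; // drop duplicates
  }
  return options;
}

console.log(buildSearchOptions({ category: 'newsAndWeather', features: ['free', 'byGoogle'] }));
```

The returned object has the same shape as the options shown above, so it can be passed directly as the second argument to scraper.search.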

Search Options

As well as the categories and feature filters, additional options in the form of locale and scrollAttempts can also be used.

The scrollAttempts option is used to specify how many attempts are made to retrieve additional search results by scrolling down the page loaded by Selenium. The larger the number, the more the page will be scrolled down. It is worth noting that each scroll attempt is paired with a 50ms wait, so an excessively large number will result in longer processing times.

The locale option can be set by passing a locale string as an option. The locale string is used as the hl URL parameter by the Chrome extension store. To scrape search results in French, Danish, or Italian, the locale strings 'fr', 'da', or 'it' could be used.

Example Search with locale and scrollAttempts

scraper.search('scraper',{scrollAttempts:200, locale:'da'}).then(
    (res) => console.log(res[0]),
    (err) => console.log(err)
);
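
To get a feel for the cost of a given scrollAttempts value, the 50ms wait per attempt can be multiplied out. The sketch below does only that arithmetic; the minScrollWaitMs helper is hypothetical, and the figure is a lower bound, since page loading and scraping time are not included.

```javascript
// Wait per scroll attempt, as stated above.
const WAIT_PER_SCROLL_MS = 50;

// Hypothetical helper: minimum time spent waiting for a given scrollAttempts value.
function minScrollWaitMs(scrollAttempts) {
  if (!Number.isInteger(scrollAttempts) || scrollAttempts < 0) {
    throw new RangeError('scrollAttempts must be a non-negative integer');
  }
  return scrollAttempts * WAIT_PER_SCROLL_MS;
}

// The example above uses scrollAttempts: 200, which means at least 10 seconds of waiting.
console.log(`${minScrollWaitMs(200) / 1000}s`);
```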

Extension Scraping

In order to scrape the store page for a specific Chrome extension, this scraper requires a direct URL to that page. Pass the URL as a parameter to the scrapeApp function.

scraper.scrapeApp('url-to-some-app').then(
    (res) => console.log(res),
    (err) => console.log(err)
);
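
Because scrapeApp returns a promise, it can also be used with async/await. The following is a minimal sketch for scraping several store pages one after another while collecting failures instead of stopping at the first error. The scrapeAll helper is hypothetical, and running the requests sequentially is a cautious assumption, since it is not stated whether one scraper instance can handle concurrent calls.

```javascript
// Hypothetical helper: scrape each URL in turn with the given scraper instance.
// `scraper` only needs a scrapeApp(url) method that returns a promise.
async function scrapeAll(scraper, urls) {
  const scraped = [];
  const failed = [];
  for (const url of urls) {
    try {
      const data = await scraper.scrapeApp(url); // one page at a time
      scraped.push({ url, data });
    } catch (error) {
      failed.push({ url, error }); // record the failure and continue
    }
  }
  return { scraped, failed };
}

// Usage with the real scraper (requires Selenium, chromedriver and Chrome):
// const ChromeWebScraper = require('chrome-web-store-scraper');
// scrapeAll(new ChromeWebScraper(), ['url-to-some-app']).then(
//     ({ scraped, failed }) => console.log(scraped.length, failed.length)
// );
```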

Example Response

{
    "header": {
        "title": "Autosave webpage",
        "offeredBy": "offered by mtcutler1",
        "userCount": "48",
        "rating": "3.5",
        "ratingCount": 4,
        "imgURL": "https://lh3.googleusercontent.com/4jyS9mGYDUFs2KL52Xfg_I9EzkUIzlCboTp5Dvqv-vKrUWhoz9tNCWR4lPfNFneM2JFmgNrkCkc=w26-h26-e365"
    },
    "overview": {
        "summary": "Save ... a scheduled…",
        "description": "Save ... stay updated",
        "version": "0.1",
        "lastUpdatedDate": "January 24, 2018",
        "size": "178KiB",
        "language": "English (United States)",
        "screenshotURLs": [
            "https://lh3.googleusercontent.com/nBXzgn-La5s3HyynhHWmnJwAasC1KUMK8GfqCVnOqL-CEGhLOcVNGaNPYUQBv180-ypWPQN2xc8=w640-h400-e365",
            "https://lh3.googleusercontent.com/nBXzgn-La5s3HyynhHWmnJwAasC1KUMK8GfqCVnOqL-CEGhLOcVNGaNPYUQBv180-ypWPQN2xc8=w640-h400-e365",
            "https://lh3.googleusercontent.com/nBXzgn-La5s3HyynhHWmnJwAasC1KUMK8GfqCVnOqL-CEGhLOcVNGaNPYUQBv180-ypWPQN2xc8=w120-h90-e365"
        ],
        "additionalInfo": []
    },
    "reviews": [
        {
            "displayName": "Jeffrey",
            "profileImageURL": "//www.gstatic.com/s2/contacts/images/NoPicture.gif",
            "displayNameURL": "https://plus.google.com/110338040265199312388",
            "timestamp": "Modified Mar 21, 2018",
            "ratingString": "4 stars (Liked it)",
            "rating": 4,
            "comment": "Seemed to only work with one tab...would be perfect if it works on multiple tabs simultaneously"
        }
    ]
}
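
Note that in the example response some numeric values, such as userCount and rating in the header, are strings, while ratingCount is a number. The sketch below flattens a response into a small summary with consistent number types. The summarizeExtension helper is hypothetical, and stripping non-digit characters from userCount is an assumption about how larger counts might be formatted.

```javascript
// Hypothetical helper: reduce a scrapeApp response to a few typed fields.
function summarizeExtension(res) {
  const { header, overview, reviews = [] } = res;
  return {
    title: header.title,
    version: overview.version,
    // Assumption: larger counts may contain separators, so keep digits only.
    userCount: Number.parseInt(String(header.userCount).replace(/\D/g, ''), 10),
    rating: Number.parseFloat(header.rating),
    ratingCount: header.ratingCount,
    reviewCount: reviews.length,
  };
}

// Trimmed copy of the example response above.
const exampleResponse = {
  header: { title: 'Autosave webpage', userCount: '48', rating: '3.5', ratingCount: 4 },
  overview: { version: '0.1' },
  reviews: [{ displayName: 'Jeffrey', rating: 4 }],
};

console.log(summarizeExtension(exampleResponse));
```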