siteweb

v1.5.3

crawl a site to generate a sitemap

siteweb

siteweb is a tool that quickly gathers stats about every page on your website. Give it one or more start URLs and it fetches all of the linked pages and records info about each one. This is useful for testing a website, for instance to make sure nothing breaks after a deploy, and for identifying the slowest (or fastest) pages.

  • easy to use, just start with a URL
  • runs quickly using isomorphic-fetch and cheeriojs
  • runs on the client and the server
  • concurrency control
  • returns a promise
  • cli args parsed with yargs
  • option to add a delay between requests
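The concurrency control and request delay listed above can be pictured with a small worker-pool sketch. This is not siteweb's implementation; runLimited and sleep are hypothetical names, and the tasks stand in for page fetches.

```javascript
// Minimal sketch: run async tasks with a concurrency cap and an optional
// delay before each task starts. Names are hypothetical, not siteweb's API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

async function runLimited (tasks, { concurrency = 6, delay = 0 } = {}) {
  const results = new Array(tasks.length)
  let next = 0

  // Each worker claims the next unclaimed task until none remain.
  async function worker () {
    while (next < tasks.length) {
      const index = next++
      if (delay > 0) await sleep(delay)
      results[index] = await tasks[index]()
    }
  }

  // Never start more workers than there are tasks.
  const workerCount = Math.min(concurrency, tasks.length)
  await Promise.all(Array.from({ length: workerCount }, worker))
  return results
}
```

Results come back in task order, regardless of which worker finished first. With concurrency set to 1 and a non-zero delay, requests are issued strictly one after another.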

Demo

Note: The online demo is limited by the same-origin policy, so it may not work with many websites. The node version does not have this limitation. Feel free to try the demo against my blog, since it responds with the Access-Control-Allow-Origin: * header.

The demo also has a file input for visualizing the JSON structure it generates.

Try it online

Getting Started

npm install -g siteweb
npm install --save-dev siteweb

Usage

Use it via the CLI:

siteweb http://blog.timscanlin.net
node ./cli http://blog.timscanlin.net

Or use it with the JS API:

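const siteweb = require('siteweb') // assumes the main export exposes run

// `options` is an object shaped like the defaults under Default Options.
siteweb.run(options, (err, data) => {
  if (err) {
    // Keep the original error (and its stack) when one is provided.
    throw err instanceof Error ? err : new Error(err)
  }
  // Write the collected page data to stdout as JSON.
  process.stdout.write(JSON.stringify(data))
})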

Currently the API exposes a single method, run.
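If you prefer async/await, a callback-style call like the one above can be wrapped with Node's util.promisify. The sketch below assumes run follows the error-first callback convention the usage example suggests. fakeRun is a stand-in so the sketch runs without the package installed, and its result shape is invented for illustration.

```javascript
const { promisify } = require('node:util')

// Stand-in for siteweb.run (hypothetical result shape, illustration only).
function fakeRun (options, callback) {
  setImmediate(() => callback(null, { started: options.startUrls }))
}

// promisify turns run(options, callback) into a function returning a promise.
const runAsync = promisify(fakeRun)

async function main () {
  const data = await runAsync({ startUrls: ['http://blog.timscanlin.net'] })
  return JSON.stringify(data)
}
```

With the real package you would pass siteweb.run to promisify instead of fakeRun. If the method turns out to depend on its receiver, bind it first with siteweb.run.bind(siteweb).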

Default Options

module.exports = {
  // Urls to start from.
  startUrls: [
    'http://blog.timscanlin.net'
  ],
  // Limit the number of concurrent requests.
  concurrency: 6,
  // Max queue size.
  maxQueue: 500,
  // Whether to include any external URLs in output.
  includeExternal: true,
  // Whether to fetch the external pages (depends on `includeExternal`)
  fetchExternal: false,
  // Limit of pages to fetch.
  maxPages: 500,
  // Delay between requests in ms.
  delay: 0,
  // Pre fetch callback.
  preFetchCallback: () => {},
  // Post fetch callback.
  postFetchCallback: () => {},
}
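One way to customize a run is to spread your overrides onto a copy of these defaults and pass the complete object along. This is a sketch, not part of siteweb: withDefaults is a hypothetical helper, http://localhost:8080 is a placeholder URL, and nothing here assumes that siteweb merges partial options for you.

```javascript
// Defaults copied from the listing above (callbacks omitted for brevity).
const defaults = {
  startUrls: ['http://blog.timscanlin.net'],
  concurrency: 6,
  maxQueue: 500,
  includeExternal: true,
  fetchExternal: false,
  maxPages: 500,
  delay: 0
}

// Hypothetical helper: shallow-merge caller overrides onto the defaults
// without mutating the defaults object itself.
function withDefaults (overrides = {}) {
  return { ...defaults, ...overrides }
}

// A gentler crawl of a local site: fewer parallel requests, fewer pages,
// and a quarter-second pause between requests.
const options = withDefaults({
  startUrls: ['http://localhost:8080'], // placeholder URL
  concurrency: 2,
  maxPages: 50,
  delay: 250
})
```

The merge is shallow, so an overriding startUrls array replaces the default array rather than being appended to it.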

Warning

Be careful! This tool recursively fetches every link it finds on a website. By default maxPages is set to 500 and concurrency to 6. Both values are configurable, as is the boolean fetchExternal option, which also checks external pages (though not recursively). Raising these limits can make siteweb consume a lot of resources, both on your computer and on the websites it crawls, so please use it with care.
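To see how these limits bound a crawl, here is a toy breadth-first crawler over an in-memory link graph. It is only an illustration of the behavior described above, not siteweb's code. The crawl function, the links table and the a.test / b.test URLs are all made up for the example.

```javascript
// Toy link graph standing in for real pages: URL -> links found on that page.
const links = {
  'http://a.test/': ['http://a.test/one', 'http://a.test/two', 'http://b.test/'],
  'http://a.test/one': ['http://a.test/two', 'http://a.test/three'],
  'http://a.test/two': ['http://a.test/'],
  'http://a.test/three': [],
  'http://b.test/': ['http://b.test/deep']
}

// Breadth-first crawl that stops after maxPages visits. External pages are
// visited only when fetchExternal is true, and their links are never followed.
function crawl (startUrl, { maxPages = 500, fetchExternal = false } = {}) {
  const origin = new URL(startUrl).origin
  const seen = new Set([startUrl])
  const queue = [startUrl]
  const visited = []

  while (queue.length > 0 && visited.length < maxPages) {
    const url = queue.shift()
    const internal = new URL(url).origin === origin
    if (!internal && !fetchExternal) continue
    visited.push(url)
    if (!internal) continue // external pages are not crawled recursively
    for (const link of links[url] || []) {
      if (!seen.has(link)) {
        seen.add(link)
        queue.push(link)
      }
    }
  }
  return visited
}
```

The seen set keeps cycles (such as /two linking back to the start page) from causing repeat visits, and the maxPages check is what keeps the crawl bounded on a large site.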

TODO

  • demo page with visualization (more detail)
  • more output options / data?
  • make a similar project using nightmare that can run js