npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

mapsite

v2.1.0

Published

A module to parse urls from a local or remote sitemap.xml

Downloads

499

Readme

Note: Version 2 of this package may differ in results from version 1.x. Mainly because the parser is now using Cheerio

Getting Started

npm install mapsite

or

yarn add mapsite

Usage

const { SitemapParser } = require("mapsite");

const options = {
  rejectInvalidContentType: true,
  userAgent: "customUA",
  maximumRetries: 1,
  maximumDepth: 5,
  timeout: 3000,
  debug: false,
};

const parser = new SitemapParser(options);

With proxy

const { SitemapParser } = require("mapsite");

const parser = new SitemapParser({
  proxy: 'https://username:[email protected]:3000'
});

options

All options are optional, with default fallbacks encoded.

rejectInvalidContentType: boolean;

Checks that the response content-type header MUST be:

  • application/xml
  • application/rss+xml
  • text/xml

default: true


userAgent: string;

Adds a custom User-Agent string to the requests.

default: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36 mapsite/1.0


maximumRetries: number;

How many times a url in the <loc> tag of an XML index file should be requested when response status is not < 400.

default: 1


maximumDepth: string;

How many levels deep should XML index files be traversed. E.g. if index files are nested 3 levels and maximum depth is 2. The last response will not crawl the URLs in the <loc> tag further.

default: 2


timeout: number;

The number of milliseconds allowed for a request to complete, both headers or body will timeout at this point.


debug: boolean;

Logs info, warning and error messages as the parser runs (WIP).



proxy: string;

A URL of a proxy server to proxy the request through.


Methods

run

const parser = new SitemapParser();
const result = await parser.run("https://example.com/sitemap.xml");

result: MapsiteResponse;

The result shape looks as follows:

const result = {
  type: "sitemap",
  urls: ["https://example.com"],
  errors: [
    {
      url: "https://example.com/sitemap-index.xml",
      reason: "Brief description of what went wrong",
    },
  ],
};

fromBuffer

const { readFileSync } = require("fs");
const parser = new SitemapParser();
const buffer = Buffer.from(readFileSync("./sitemap.xml")); // Or a buffer from an uploaded file
const result = await parser.fromBuffer(buffer);

result: MapsiteResponse;

The result shape looks as follows:

const result = {
  type: "sitemap", // or 'index'
  urls: ["https://example.com"],
  errors: [
    {
      url: "buffer",
      reason: "Brief description of what went wrong",
    },
  ],
};