typesense-sitemap

v2.0.0

This is a node library allowing you to generate sitemaps from a Typesense collection. It is inspired by [algolia-sitemap](https://github.com/algolia/algolia-sitemap).

Typesense Sitemap Generator

This is a node library allowing you to generate sitemaps from a Typesense collection. It originally started as a fork of algolia-sitemap, but ended up becoming a complete rewrite due to the ways Typesense and Algolia differ in exporting data.

Requires node v12+.

It creates sitemaps in a directory of your choosing (for example /sitemaps). For collections with fewer than 50,000 records, it generates a single sitemap at /<output-dir>/sitemap.xml. For collections with more than 50,000 records, it generates multiple sitemaps (/<output-dir>/sitemap.1.xml, /<output-dir>/sitemap.2.xml, etc.) plus a root sitemap at /<output-dir>/sitemap.xml that you can give to Google to index all of the pages.

How does it work?

  1. The entire Typesense collection is exported in a stream
  2. Every 50,000 documents, a sitemap.n.xml is generated in the chosen directory (where n is the sitemap number)
  3. Once all of the sitemaps have been generated, a final sitemap.xml is generated that links to all of the other xml files
  4. Let search engines know about sitemap.xml, either by submitting it to them directly or by referencing it in robots.txt

This process is a script that should be run periodically to keep the sitemaps up to date.
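
To make the file layout from the steps above concrete, here is a minimal sketch of which files a run would produce for a given collection size. planSitemapFiles is a hypothetical helper written for illustration only; it is not part of this library's API, and it assumes that a collection which fits within the limit produces a single file.

```javascript
// Hypothetical helper: not part of typesense-sitemap's API.
// Lists the file names a run would produce for a given number of documents.
function planSitemapFiles(docCount, linksPerSitemap = 50000, sitemapName = 'sitemap') {
    if (!Number.isInteger(docCount) || docCount < 0) {
        throw new RangeError('docCount must be a non-negative integer');
    }
    if (!Number.isInteger(linksPerSitemap) || linksPerSitemap < 1) {
        throw new RangeError('linksPerSitemap must be a positive integer');
    }
    // Assumption: a collection that fits in one file gets a single sitemap.
    if (docCount <= linksPerSitemap) {
        return [`${sitemapName}.xml`];
    }
    const chunkCount = Math.ceil(docCount / linksPerSitemap);
    const files = [];
    for (let n = 1; n <= chunkCount; n += 1) {
        files.push(`${sitemapName}.${n}.xml`);
    }
    // The root sitemap comes last and links to every numbered file.
    files.push(`${sitemapName}.xml`);
    return files;
}

console.log(planSitemapFiles(120000));
// [ 'sitemap.1.xml', 'sitemap.2.xml', 'sitemap.3.xml', 'sitemap.xml' ]
```

For step 4, a robots.txt entry would look like "Sitemap: https://www.mywebsite.com/<output-dir>/sitemap.xml", pointing at the root sitemap.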

Usage

Getting Started

First install the module from npm (or with yarn):

$ npm install typesense-sitemap --save[-dev]
$ yarn add typesense-sitemap [--dev]

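Then bring the library into your project, using whichever module style your project uses:

// ES modules
import typesenseSitemap from 'typesense-sitemap';
// CommonJS (use this line instead of the import above, not alongside it)
// const typesenseSitemap = require('typesense-sitemap').default;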

typesenseSitemap({
    typesenseConfig: {
        apiKey: 'xxxxx', // must be an admin key
        nodes: [
            {
                host: 'xxxxx',
                port: 443,
                protocol: 'https',
            },
        ],
    },
    collectionName: 'xxxxx',
    sitemapLoc: 'https://www.mywebsite.com/<output-dir>',
    outputFolder: '<output-dir>',
    docTransformer: (doc) => {
        return {
            loc: `https://www.mywebsite.com/categories/${doc.name}`,
            lastmod: new Date().getTime(),
            priority: 0.8,
            changefreq: 'monthly',
        };
    },
    ////// optional params ///////
    params: {
        filter_by: '<any-filters-you-want>',
        include_fields: '<fields-to-include>',
        exclude_fields: '<fields-to-exclude>',
    },
    sitemapName: 'my-awesome-sitemap', // defaults to "sitemap"
    minifyOutput: true, // defaults to true
    emitLogs: true, // defaults to true
    linksPerSitemap: 50000, // default is 50,000. This value should not be greater than 50,000 as per Google's sitemap guidelines
});

Transforming Documents

The docTransformer parameter is a function that transforms a collection document into a sitemap item. This function can return a sitemap item or false. Returning false will exclude that document from the sitemap. When returning a sitemap item the only required field is loc.

The alternates.langToURL parameter is a function that takes a language code and returns the URL string for that language's version of the page. It can also return false to skip a particular language.

function docTransformer(doc) {
    const loc = `https://www.yoursite.com/objects/${doc.id}`;
    const lastmod = new Date().toISOString();
    const priority = Math.random();
    return {
        loc, // only required field
        lastmod,
        priority,
        alternates: {
            languages: ['fr', 'pt-BR', 'zh-Hans'],
            langToURL: (lang) =>
                `https://www.yoursite.com/${lang}/objects/${doc.id}`,
        },
        images: [
            {
                loc: doc.photo,
                title: doc.photoTitle,
            },
        ],
    };
}
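
The example above always returns an item. As a complementary sketch, the transformer below shows the two ways of skipping described earlier: returning false for the whole document, and returning false from langToURL for a single language. The fields published, slug and translations are hypothetical document fields chosen for illustration; substitute whatever your collection actually stores.

```javascript
// Sketch only: `published`, `slug` and `translations` are hypothetical fields.
function publishedOnlyTransformer(doc) {
    // Returning false leaves this document out of the sitemap entirely.
    if (!doc.published || !doc.slug) {
        return false;
    }
    const slug = encodeURIComponent(doc.slug);
    return {
        loc: `https://www.yoursite.com/objects/${slug}`,
        alternates: {
            languages: ['fr', 'pt-BR'],
            // Returning false here skips only that one language.
            langToURL: (lang) =>
                Array.isArray(doc.translations) && doc.translations.includes(lang)
                    ? `https://www.yoursite.com/${lang}/objects/${slug}`
                    : false,
        },
    };
}
```

Encoding the slug with encodeURIComponent keeps spaces and other reserved characters from producing an invalid loc.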

SitemapItem Fields

The docTransformer method should always return a SitemapItem or false (to skip that document in the sitemap). You can find the type definition for SitemapItem below.

interface SitemapItem {
    // URL of the page
    loc: string;
    // when the page was last modified. Accepts a date, ISO date string, or milliseconds
    lastmod?: Date | number | string;
    // how often the page changes
    changefreq?:
        | 'always'
        | 'hourly'
        | 'daily'
        | 'weekly'
        | 'monthly'
        | 'yearly'
        | 'never';
    // this page's priority (goes from 0 to 1). Default is 0.5
    priority?: number;
    // array of images related to this page (see typedef below)
    images?: SitemapItemImage[];
    // alternative versions of this page (useful for multi-lingual sites)
    alternates?: {
        // array of language codes that are enabled
        languages: string[];
        // function that takes a language code and returns the url of the translated page
        langToURL: (language: string) => string;
    };
}

// Typedef for SitemapItemImage
// For more information on images in sitemaps
// see https://support.google.com/webmasters/answer/178636?hl=en
interface SitemapItemImage {
    loc: string;
    title?: string;
    caption?: string;
    geo_location?: string;
    license?: string;
}
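
If you want to catch malformed items before they reach the sitemap, a small check against the fields above can help. The following is a minimal sketch under the constraints stated in the type definition; checkSitemapItem is a hypothetical helper, not something this library exports.

```javascript
const CHANGE_FREQUENCIES = [
    'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never',
];

// Hypothetical helper, not exported by typesense-sitemap.
// Returns a list of problems; an empty list means the item looks valid.
function checkSitemapItem(item) {
    const problems = [];
    if (typeof item.loc !== 'string' || item.loc.length === 0) {
        problems.push('loc is required and must be a non-empty string');
    }
    // The negated range test also rejects NaN and non-numeric values.
    if (item.priority !== undefined &&
        !(typeof item.priority === 'number' && item.priority >= 0 && item.priority <= 1)) {
        problems.push('priority must be a number between 0 and 1');
    }
    if (item.changefreq !== undefined && !CHANGE_FREQUENCIES.includes(item.changefreq)) {
        problems.push(`changefreq must be one of: ${CHANGE_FREQUENCIES.join(', ')}`);
    }
    // new Date() accepts a Date, an ISO string, or milliseconds.
    if (item.lastmod !== undefined && Number.isNaN(new Date(item.lastmod).getTime())) {
        problems.push('lastmod must be a Date, ISO date string, or milliseconds');
    }
    return problems;
}
```

One way to use it is to call it at the end of docTransformer and return false (or log the problems) when the list is not empty.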

Filtering Results

You can pass a params parameter to typesenseSitemap. This lets you narrow down the exported results. For a complete list of available params, visit: https://typesense.org/docs/0.21.0/api/documents.html#export-documents

typesenseSitemap({
    // rest of config
    params: {
        filter_by: 'rating:>50',
        // be aware that this will affect data that is passed into docTransformer
        exclude_fields: 'photo,description',
    },
});
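
Because excluded fields never reach docTransformer, any transformer that reads them needs a guard. The sketch below assumes that an excluded field such as photo is simply absent from the exported document; transformWithOptionalImage is an illustrative name, not part of the library.

```javascript
// Sketch: with exclude_fields: 'photo,description', doc.photo is assumed to
// be missing from the exported document, so images are added only if present.
function transformWithOptionalImage(doc) {
    const item = { loc: `https://www.yoursite.com/objects/${doc.id}` };
    if (typeof doc.photo === 'string' && doc.photo.length > 0) {
        item.images = [{ loc: doc.photo, title: doc.photoTitle }];
    }
    return item;
}
```

Without the guard, an item would carry an image entry whose loc is undefined.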

Memory Considerations

While this script is running, up to 50,000 records are held in memory at a time. For most users this is a non-issue, since records are typically small. However, Typesense places no limit on record size, so very large records can cause memory problems. If you run into this, use the exclude_fields or include_fields params to reduce the size of the data that is exported from Typesense.

typesenseSitemap({
    /// rest of config
    params: {
        // only grab what you need for producing a sitemap
        include_fields: 'id,name,description',
    },
});

License

MIT