synchronous-autocomplete

v3.0.0

Fast, simple autocompletion.

Downloads

295,821

Fast, simple autocompletion. Also supports Levenshtein-based fuzzy search. Uses precomputed indexes to be fast.

ISC-licensed.

Installing

npm install synchronous-autocomplete

Usage

Let's build a simple search for our fruit stand. We give each fruit a weight property, because some are bought more often than others and we want to boost their ranking in the search results.

const items = [ {
	id: 'apple',
	name: 'Juicy sour Apple.',
	weight: 3
}, {
	id: 'banana',
	name: 'Sweet juicy Banana!',
	weight: 2
}, {
	id: 'pome',
	name: 'Sour Pomegranate',
	weight: 5
} ]

Let's understand the terminology used by this tool:

  • item: A thing to search for. In our example, apple, banana and pomegranate are each an item.
  • weight: How important an item is.
  • token: A word from the fully normalized item name. For example, to find an item named Hey There!, you may process its name into the tokens hey & there.
  • fragment: A word from the normalized search query, which may partially match a token. E.g. the fragment ther (from the search query Hey Ther) partially matches the token there.
  • relevance: How well an item fits the search query.
  • score: A combination of an item's weight and relevance. Used to rank search results.
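To make the token/fragment distinction concrete, here is a minimal sketch. It assumes that "partially matches" means the fragment is a prefix of the token; the helper partiallyMatches is hypothetical and not part of the package.

```javascript
// Tokens of an item named "Hey There!", after normalization.
const itemTokens = ['hey', 'there']
// Fragments of the search query "Hey Ther", after normalization.
const queryFragments = ['hey', 'ther']

// Assumption: a fragment partially matches a token if it is a prefix of it.
const partiallyMatches = (fragment, token) => token.startsWith(fragment)

// For each fragment, collect the tokens it (partially) matches.
const matchedTokens = queryFragments.map((fragment) => ({
	fragment,
	tokens: itemTokens.filter((token) => partiallyMatches(fragment, token)),
}))

console.log(matchedTokens)
// [ { fragment: 'hey', tokens: [ 'hey' ] }, { fragment: 'ther', tokens: [ 'there' ] } ]
```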

In order to be as fast and disk-space-efficient as possible, synchronous-autocomplete requires five indexes to be prebuilt from the list of items. Check the example code for more details on how to build them. For our example, they would look like this:

const tokens = { // internal item IDs, by token
	juicy: [0, 1],
	sour: [0, 2],
	apple: [0],
	sweet: [1],
	banana: [1],
	pomegranate: [2]
}
const weights = [ // item weights, by internal item ID
	3, // apple
	2, // banana
	5 // pome
]
const nrOfTokens = [ // nr of tokens, by internal item ID
	3, // apple
	3, // banana
	2 // pome
]
const scores = { // "uniqueness" of each token, by token
	juicy: 2 / 3, // 2 out of 3 items have the token "juicy"
	sour: 2 / 3,
	apple: 1 / 3,
	sweet: 1 / 3,
	banana: 1 / 3,
	pomegranate: 1 / 3
}
// In order to create smaller search indexes, we use numerical item IDs
// internally and maintain a mapping to their "real"/original IDs.
const originalIds = [
	'apple',
	'banana',
	'pome'
]
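The following sketch shows one way such indexes could be derived from the items by hand. It is not the package's own build code: buildIndexesByHand and simpleTokenize are hypothetical names, the tokenizer is a simplified stand-in, and the token score is assumed to be the share of items that contain the token, as the comment in the scores example suggests.

```javascript
const fruitItems = [
	{id: 'apple', name: 'Juicy sour Apple.', weight: 3},
	{id: 'banana', name: 'Sweet juicy Banana!', weight: 2},
	{id: 'pome', name: 'Sour Pomegranate', weight: 5},
]

// Simplified stand-in tokenizer: lowercase, strip punctuation, split on whitespace.
const simpleTokenize = (str) => {
	return str.toLowerCase().replace(/[^\w\s]/g, '').split(/\s+/).filter(Boolean)
}

// Hypothetical helper, NOT the package's buildIndex().
const buildIndexesByHand = (tokenize, items) => {
	const tokens = Object.create(null)
	const weights = []
	const nrOfTokens = []
	const originalIds = []

	// The position in the items array serves as the internal item ID.
	items.forEach((item, internalId) => {
		const itemTokens = [...new Set(tokenize(item.name))]
		for (const token of itemTokens) {
			if (!tokens[token]) tokens[token] = []
			tokens[token].push(internalId)
		}
		weights[internalId] = item.weight
		nrOfTokens[internalId] = itemTokens.length
		originalIds[internalId] = item.id
	})

	// Assumed scoring: share of items that contain the token.
	const scores = Object.create(null)
	for (const token of Object.keys(tokens)) {
		scores[token] = tokens[token].length / items.length
	}

	return {tokens, weights, nrOfTokens, scores, originalIds}
}

const handBuilt = buildIndexesByHand(simpleTokenize, fruitItems)
console.log(handBuilt.tokens.juicy) // [ 0, 1 ]
console.log(handBuilt.originalIds) // [ 'apple', 'banana', 'pome' ]
```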

Next, we must define a function that normalizes search input into a list of fragments. Consider using this simple function:

import normalize from 'normalize-for-search'

const tokenize = (str) => {
	return normalize(str).replace(/[^\w\s]/g, '').split(/\s+/g)
}
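One detail worth checking in a tokenizer like this is leading or trailing whitespace, because String.prototype.split then yields empty strings. The sketch below uses a plain lowercase stand-in (normalizeStandIn, a hypothetical name) instead of normalize-for-search, whose exact behavior is not described here, so treat it as an illustration of the regex steps only.

```javascript
// Stand-in for normalize(); the real normalize-for-search may behave differently.
const normalizeStandIn = (str) => str.toLowerCase()

// Same regex steps as the tokenize function above.
const naiveTokenize = (str) => {
	return normalizeStandIn(str).replace(/[^\w\s]/g, '').split(/\s+/g)
}

// Variant that drops the empty strings caused by surrounding whitespace.
const strictTokenize = (str) => {
	return naiveTokenize(str).filter((word) => word.length > 0)
}

console.log(naiveTokenize(' Juicy  Apple! ')) // [ '', 'juicy', 'apple', '' ]
console.log(strictTokenize(' Juicy  Apple! ')) // [ 'juicy', 'apple' ]
```

Whether empty fragments matter depends on how the query function treats them, so filtering them out is a defensive choice rather than a documented requirement.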

Of course, you don't have to compute these indexes by hand. Instead, use buildIndex to generate them from the items:

import {buildIndex} from 'synchronous-autocomplete/build.js'

const index = buildIndex(tokenize, items)

Now, we can query our index:

import {createAutocomplete} from 'synchronous-autocomplete'

const autocomplete = createAutocomplete(index, tokenize)

autocomplete('bana')
// [ {
// 	id: 'banana',
// 	relevance: 0.6666665555555555,
// 	score: 0.8399472266053544,
// 	weight: 2,
// } ]

autocomplete('sour')
// [ {
// 	id: 'pome',
// 	relevance: 1.8333335,
// 	score: 3.134956187236602,
// 	weight: 5,
// }, {
// 	id: 'apple',
// 	relevance: 1.2222223333333333,
// 	score: 1.762749635070118,
// 	weight: 3,
// } ]

autocomplete('aplle', 3, true) // note the typo
// [ {
// 	id: 'apple',
// 	relevance: 0.22222216666666667,
// 	score: 0.3204998243877813,
// 	weight: 3,
// } ]
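The sample results above happen to be consistent with the score being the relevance multiplied by the cube root of the weight. This is only an observation inferred from the printed numbers, not a documented formula, and guessScore is a hypothetical name. The sketch checks the hypothesis against the values shown above.

```javascript
// relevance, weight and score copied from the sample results above.
const sampleResults = [
	{label: 'bana', relevance: 0.6666665555555555, weight: 2, score: 0.8399472266053544},
	{label: 'sour (pome)', relevance: 1.8333335, weight: 5, score: 3.134956187236602},
	{label: 'sour (apple)', relevance: 1.2222223333333333, weight: 3, score: 1.762749635070118},
	{label: 'aplle, fuzzy', relevance: 0.22222216666666667, weight: 3, score: 0.3204998243877813},
]

// Hypothesis, inferred from the numbers only: score = relevance * cbrt(weight)
const guessScore = ({relevance, weight}) => relevance * Math.cbrt(weight)

// Largest deviation between the guessed and the printed score.
const maxError = Math.max(
	...sampleResults.map((result) => Math.abs(guessScore(result) - result.score)),
)
console.log(maxError < 1e-9) // true
```

If the hypothesis holds in general, weight influences the ranking only mildly: an item with eight times the weight would get twice the score at equal relevance.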

API

const index = buildIndex(tokenize, items)
const {tokens, scores, weights, nrOfTokens, originalIds} = index

  • tokenize must be a function that, given a search query, returns an array of fragments.
  • items must be an array of objects, each with id, name & weight.

const autocomplete = createAutocomplete(index, tokenize)
autocomplete(query, limit = 6, fuzzy = false, completion = true)

index must be an object with the following fields, in the shape that buildIndex() returns:

  • tokens must be an object with an array of internal item IDs per token.
  • scores must be an object with a token score per token.
  • weights must be an array with an item weight per internal item ID.
  • nrOfTokens must be an array with the number of tokens per internal item ID.
  • originalIds must be an array with the (real) item ID per internal item ID.
  • tokenize is the same as with buildIndex().
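If you assemble or modify an index yourself, a small consistency check can catch mistakes such as a token that points to a non-existent internal item ID. The sketch below is a hypothetical helper (checkIndexShape is not part of the package) that only verifies the constraints listed above.

```javascript
// Hypothetical helper, not part of synchronous-autocomplete.
// Returns a list of human-readable problems; an empty list means the index looks consistent.
const checkIndexShape = (index) => {
	const {tokens, scores, weights, nrOfTokens, originalIds} = index
	const problems = []
	const nrOfItems = originalIds.length

	// weights and nrOfTokens are both indexed by internal item ID.
	if (weights.length !== nrOfItems) problems.push('weights has the wrong length')
	if (nrOfTokens.length !== nrOfItems) problems.push('nrOfTokens has the wrong length')

	for (const [token, ids] of Object.entries(tokens)) {
		if (typeof scores[token] !== 'number') {
			problems.push(`token "${token}" has no score`)
		}
		for (const id of ids) {
			if (!Number.isInteger(id) || id < 0 || id >= nrOfItems) {
				problems.push(`token "${token}" refers to unknown internal ID ${id}`)
			}
		}
	}
	return problems
}

const sampleIndex = {
	tokens: {sweet: [0], banana: [0], sour: [1], pomegranate: [1]},
	scores: {sweet: 1 / 2, banana: 1 / 2, sour: 1 / 2, pomegranate: 1 / 2},
	weights: [2, 5],
	nrOfTokens: [2, 2],
	originalIds: ['banana', 'pome'],
}
console.log(checkIndexShape(sampleIndex)) // []
```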

Storing the index as a protocol buffer

Protocol buffers (a.k.a. protobufs) are a compact binary format for structured data serialization.

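import {encodeIndex} from 'synchronous-autocomplete/encode.js'
// Assumed counterpart of encodeIndex; verify the export name against the package.
import {decodeIndex} from 'synchronous-autocomplete/decode.js'
import {writeFileSync, readFileSync} from 'node:fs'

// encode & write the index
const encoded = encodeIndex(index)
writeFileSync('index.pbf', encoded)

// read & decode the index
const decoded = decodeIndex(readFileSync('index.pbf'))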

Contributing

If you have a question or have difficulties using synchronous-autocomplete, please double-check your code and setup first. If you think you have found a bug or want to propose a feature, refer to the issues page.