npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

lazy-aho-corasick

v1.2.2

Published

A simple Aho-Corasick library using the latest ES6 modules and allowing user-specifiable options. The fastest string-matching algoithm for a fixed number of pre-known substrings.

Downloads

20

Readme

LAZYahocorasick

The Aho-Corasick algorithm is the fastest string-matching algorithm for a set of substrings of a fixed number that are known beforehand. Its runs at O(n) for each new string processed, where n indicates the length of the string in number of characters. This is accomplished by first building a 'trie' or 'tree' from the number of substrings that are to be matched to the string. Each substring is broken down into constituent characters and a trie built up. The trie includes the 'child' characters that follow (stored in the 'dict' object of the script) and the 'suffix' characters that list the identical suffixes at any point on the trie (stored in the 'suffix' object of the script). You can read more about the algorithm here

This build of the Aho-Corasick algorithm has three main priorities:

  • Ease of use for the user
  • Number of features built in for the user
  • Speed using the latest ES6 methods
  • Support for typescript (written in typescript, built with index.d.ts files)

To use the library, simply build the trie using the new keyword, then use the .search() method to begin searching within a string. A typical implementation might look like this:

const lazyac = require('lazy-aho-corasick');

let substrings = [
    'a',
    'ab',
    'bab',
    'bc',
    'bca',
    'c',
    'caa'
];
let string = 'abccab';

const trie = new lazyac(substrings)
console.log(trie.search(string)); //[ 'a', 'ab', 'bc', 'c', 'c', 'a', 'ab']

This build also lets you specify a number of options with the initial build, which so far are:

  • allowDuplicates: (boolean) set to false to skip duplicate substrings on the input. A substring inputted twice that appears twice will only be listed twice. A substring inputted 3 times that appears 5 times will be listed 5 times. Should be used on the trie when dealing with a large number of substrings - only checking the start vastly improves trie build time.
  • startOnly: (boolean) a reduce from of Aho-Corasick, which just checks for substrings at the start of the string. As soon as a failure is introduced, the substring becomes null (no matches can be found from the middle of the string).

Specify options using an object as the second parameter, ex:

const lazyac = require('lazy-aho-corasick');

const trie = new lazyac(['b', 'c', 'aa', 'd', 'b'], {
    "allowDuplicates": false,
    "startOnly": false
})

console.log(trie.search('caaab')); //[ 'c', 'aa', 'aa', 'b' ]

Options can also be specified on an individual level for incoming strings:

  • positions: (boolean) returns each found substring with its position (of last character) in the string. The found substrings are returned as nested Arrays, where the index of each represents the character position at which that set of substrings was found.
  • startOnly: (boolean) only checks the start of the string. Does not improve runtime or build time; it is for running multiple strings past a trie, some of which require a start-only parameter to be true
  • return: (string: all|first|last|shortest|longest; def: all) whether to reduce the output value. all returns all substrings found in the string, the rest only return one.

Do this by using an object as the second parameter on the .search() method, ex:

const lazyac = require('./main');
let trie = new lazyac(['a', 'b', 'c', 'aa', 'd', 'b'])

console.log(trie.search('caaab', {
    "positions": true
}));    //[ [ 'c' ], [ 'a' ], [ 'aa', 'a' ], [ 'aa', 'a' ], [ 'b', 'b' ] ]

v.1.0.0