npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

dawg-set

v0.0.0

Published

Javascript directed acyclic word graph (DAWG)

Downloads

22

Readme

Dawg-Set

Javascript directed acyclic word graph (DAWG) library. DAWGs allow efficient storage of large sets of strings by compressing the storage graph.

$ npm install dawg-set --save
import Dawg from 'dawg-set';

Documentation

A DAWG stores a set of strings of letters in a graph, reusing nodes where possible. Dawg-set can be used to store regular strings where each node in the graph is a single character or, more generally, "strings" where each component of the string is another string. The latter approach is useful because it allows you to store sentences in the DAWG at a word level, instead of storing them as individual characters.

When constructing the DAWG, you must insert items in lexicographic order. For simple strings, this is just dictionary order. You can .sort() a Javascript array to put it in lexicographic order if you are not sure.

new Dawg(stringIterator, options)

Top level DAWG class.

  • stringIterator - Optional iterator of strings used to populate initial set. Strings must be in lexicographic order.
  • options - Optional object specifying behavior of the data structure. Must defined a single member, compare which, performs a lexicographic comparison of two strings.
const empty_dawg = new Dawg();

const basic_char_dawg = new Dawg(['poodle', 'puli', 'pug']);

const basic_string_dawg = new Dawg([
    ['bull', 'dog'],
    ['bull', 'terrier']]);

Dawg.from(stringIterator, options)

Same as new Dawg(stringIterator, options), but freezes the dawg before returning. See Dawg.prototype.freeze

Dawg.prototype.count()

Get the number of entries stored in the DAWG. DAWGs cannot contain duplicate entires.

Dawg.from([]).count() === 0

Dawg.from(['poodle', 'puli', 'pug']).count() === 3

new Dawg([
    ['bull', 'dog'],
    ['bull', 'terrier']]
).count() === 2

Dawg.prototype.add(string)

Add a string to the DAWG. Returns the DAWG.

add must be called in lexicographic order. If you fail to do so, an error is thrown. You cannot update a frozen DAWG.

const d = new Dawg()
d.add('collie').add('corgi')

d.count() === 2
d.has('collie') === true
d.has('lab') === false
d.has('co') === false

d.add('lab')
d.count() === 3
d.has('lab') === true

// Trying to insert the same key again has no effect
d.add('lab')
d.count() === 3

// Trying to insert out of order throws an error
d.add('afghan hound') === throws since 'Afghan Hound' comes before 'lab'

// Trying to add to a frozen DAWG throws an error
const set = Dawg.from([])
set.add('collie') === throws since `set` is frozen by `from`.

Dawg.prototype.freeze()

Locks the DAWG and prevents further mutation. Once you are done mutating the DAWG, freezing it can free up some memory.

Dawg.from automatically freezes its result.

const d = new Dawg()
d.add('collie').add('corgi').freeze()

d.count() === 2
d.has('collie') === true

d.add('lab') === throws since graph is frozen now

Dawg.prototype.has(string)

Does the set contain an entry for string?

const d = Dawg.from(['poodle', 'puli', 'pug']);

d.has('puli') === true
d.has('mastiff') === false

// Storage is sensitive
d.has('PULI') === false

Dawg.prototype.match(string)

Finds the longest string in the DAWG that is part of string. The result will be between 0 string.length long. Only matches complete words in the DAWG.

const d = Dawg.from(['a', 'abc', 'abcde']);

d.match('a') === 'a'
d.match('x') === ''
d.match('abx') === 'a'
d.match('abcd') === 'abc'
d.match('abcdx') === 'abc'
d.match('abcdefg') === 'abcde'

Dawg.prototype.longest(string)

Same behavior as match, but returns the length of the longest match instead of the match itself.

One use of this is to get the matched vs unmatched parts of the input string:

const d = Dawg.from(['a', 'abc', 'abcde'])

const path = 'abcdx'
const i = d.longest(path)
const matched = path.slice(0, i)
const unmatched = path.slice(i)

matched === 'abc'
unmatched === 'dx'

Dawg.prototype.values(joiner = '') Dawg.prototype[Symbol.iterator]

Get an iterator to all strings in the dawg.

  • joiner - Optional. String or function used to join the letters of the strings together. If a string, this is inserted between each letter. If this is a function, it is invoked with the left and right parts of the string to join and must return a string.
const d = Dawg.from(['a', 'abc', 'abcde'])

Array.from(d.values()) === Array.from(d) === ['a', 'abc', 'abcde']

// Using a custom joiner
Array.from(d.values('|')) === ['a', 'a|b|c', 'a|b|c|d|e']

// Using with word level DAWG
const wd = Dawg.from([
    ['bull', 'dog'],
    ['bull', 'terrier']]);

Array.from(wd.values()) === Array.from(wd) === ['bulldog', 'bullterrier']

Array.from(wd.values(' ')) === ['bull dog', 'bull terrier']

Dawg.prototype.paths()

Get an iterator to all letter paths in the dawg.

Unlike values(), this returns arrays of characters.

const d = Dawg.from(['a', 'abc', 'abcde'])

Array.from(d.paths()) === [['a'], ['a', 'b', 'c'], ['a', 'b', 'c', 'd', 'e']]

// Using with word level DAWG
const wd = Dawg.from([
    ['bull', 'dog'],
    ['bull', 'terrier']]);

Array.from(wd.paths()) === [['bull', 'dog'], ['bull', 'terrier']];

Dawg.prototype.valuesStartingWith(string, joiner = '')

Same basic behavior as values() but only returns values that start with string. string itself does not have to be in the set.

const d = Dawg.from(['a', 'abc', 'abcde', 'abe', 'd'])

Array.from(d.valuesStartingWith('abc')) ===  ['abc', 'abcde']

Credits

Implementation based on https://gist.github.com/smhanov/94230b422c2100ae4218