
verbal-id v0.0.4 · 17 downloads

verbal-id

JS library to generate unique identifiers consisting of words that are pronounceable, distinguishable and inclusive.

Getting Started

Installation

$ npm install verbal-id

Importing

const verbalid = require('verbal-id');

Usage


> verbalid.create()
'vacant brand orchestra kiwi'

> verbalid.parse("vacant brand orchestra kiwi")
'8aab9b999'

> verbalid.create("8aab9b999")
'vacant brand orchestra kiwi'

> verbalid.parse("vacant brand orchestra dragon")    // valid words, but checksum is incorrect
undefined

Introduction

Why a new library?

I wanted a unique identifier that my project's users could transfer verbally to each other without ambiguity. Existing projects like WCodes provide a similar idea, but their word lists contain homophones and words with negative connotations.

I also wanted the system to be tolerant of minor typos, which required a word list that is internally unique: every word has a distinct phonetic code, generated with Metaphone.

How many bits are each ID?

The library generates four-word identifiers that represent a 36-bit number plus 4 bits of parity (calculated with the BSD checksum algorithm), for a total of 40 bits, 10 per word.
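For reference, the classic BSD checksum is a 16-bit rotate-and-add over bytes. How verbal-id folds that down to 4 parity bits isn't specified here, so the reduction at the end of this sketch is an assumption, not the package's actual code:

```javascript
// Classic BSD checksum: rotate the 16-bit accumulator right by one bit,
// then add the next byte, keeping the result in 16 bits.
function bsdChecksum(bytes) {
  let sum = 0;
  for (const b of bytes) {
    sum = ((sum >> 1) | ((sum & 1) << 15)) & 0xffff; // rotate right by one
    sum = (sum + b) & 0xffff;
  }
  return sum;
}

// Hypothetical 4-bit reduction (illustrative only; verbal-id's actual
// derivation of its 4 parity bits may differ):
const parity4 = (bytes) => bsdChecksum(bytes) & 0xf;
```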

What's the methodology you used to create the word list?

From a starting point of the public domain WCodes list, I:

  • Manually removed inappropriate or difficult-to-spell words.
  • Filtered out homophones using the comp.speech homophones-1.01.txt list.
  • Used the natural NLP library to ensure every word on the list has a unique Metaphone code, removing words with similar pronunciations.
  • Manually added ~400 new words from the 12dicts 6of12 list to replenish the (significant) removals back up to 1024 entries.

I discarded a few limitations WCodes adopted; most notably, this list permits common compound words that aren't typically hyphenated like 'landscape' and 'moonlight'.
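The phonetic-uniqueness step above can be sketched as follows. The function name is hypothetical; in practice `phoneticCode` would be something like natural's Metaphone (`natural.Metaphone.process`), and the stub used here is for illustration only:

```javascript
// Keep only the first word seen for each phonetic code, so no two words
// on the list share a pronunciation.
function dedupeByPhoneticCode(words, phoneticCode) {
  const seen = new Set();
  const kept = [];
  for (const word of words) {
    const code = phoneticCode(word);
    if (seen.has(code)) continue; // a similar-sounding word is already kept
    seen.add(code);
    kept.push(word);
  }
  return kept;
}
```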

Why is the word list order randomized?

If an inappropriate or confusing word has slipped through the filter process, I plan to deprecate the word in future revisions of the package. This package guarantees backwards compatibility when reading previously generated codes, so any deprecated words will be kept as alternates that are still recognized when decoding an ID.

This implies that the words shouldn't be presented in alphabetical order, in case we need to make a swap later! It also reduces the chance that a small off-by-one error lands on a lexically similar word.

How are bits apportioned to the words?

  //   "word1"   "word2"   "word3"   "word4"
  // | chunk1 || chunk2 || chunk3 || chunk4 |
  // 543210987654321098765432109876543210cccc
  // |         36-bit ID number         |{  } <- 4-bit checksum
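The layout above can be sketched as code. This is a hypothetical illustration of the bit apportionment (the 10-bit indexing into a 1024-word list is inferred from the numbers above), not the package's actual implementation. BigInt sidesteps JavaScript's 32-bit limit on bitwise operators:

```javascript
// Split a 36-bit ID plus a 4-bit checksum into four 10-bit word indices.
function toChunks(id36, checksum4) {
  // Append the checksum to the low end, forming a 40-bit value.
  const bits40 = (BigInt(id36) << 4n) | BigInt(checksum4);
  const chunks = [];
  for (let i = 3; i >= 0; i--) {
    chunks.push(Number((bits40 >> BigInt(i * 10)) & 0x3ffn)); // 10 bits per word
  }
  return chunks; // four indices into the 1024-word list
}
```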

This is English-centric.

Yup, for now, at least. Send me a pull request including a word list with similar qualities for another language, and I'll cheerfully include it in the package, though.

When translating, don't try to directly translate the words from the English list; instead, choose words in the target language that are distinct in pronunciation.

I have questions or suggestions.

[email protected] or open an issue!