npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

haberdasher

v1.0.0

Published

A consistent-hash ring based on farmhash and redblack.js

Downloads

23

Readme

haberdasher

A consistent-hash ring based on farmhash and redblack.js.

The expectation is that you will store key/values but look them up by another value that you want an evenly random distributions across.

It also includes a hashed queue implementation to shard tasks across queues so that you can guarantee mutual exclusion of asynchronous task execution based on id.

Build Status Coverage Status Version npm npm Downloads Dependencies

Concepts

haberdasher uses a red-black tree to keep insert, lookup, and delete operations O(log n). It defaults to 100 virtual nodes per key/value pair inserted into the ring. The more virtual nodes you use per key, the more evenly distributed gets will be across key/value pairs. The trade-off will be slightly slower access times.

haberdasher uses farmhash to hash keys quickly and with little-to-no collision. In practicality, in the unlikely event collisions do occur they have impercetible impact on distribution.

Keys should be strings or numbers. Values can be anything you would like statistically even distribution across.

Use Cases

Sharding is the most common application I have for this module. Because I often use this to shard tasks across possible asyncronous runners without collision, this module includes an implementation of a work queue that will prevent parallel execution of asynchronous tasks against the same id.

Use

Hashring

const ring = require('haberdasher').ring;

// create a hash ring with 100 vnodes per key
const hash = ring.create();

// create a hash ring with custom number of vnodes per key
const hash2 = ring.create(1000);

Hashqueue

const queue = require('haberdasher').queue;
const tasks = queue.create(2); // number of queues

// no more than 2 tasks will run at any one time
// tasks will be mapped to worker queue based on the hash of their id
Promise.all([
  tasks.add(1, function() { console.log( 1 ); }),
  tasks.add(2, function() { console.log( 2 ); }),
  tasks.add(3, function() { console.log( 3 ); }),
  tasks.add(4, function() { console.log( 4 ); })
])
.then(results => {
  // results will be empty since the tasks aren't returning anything
});

Hashring API

The API is very simple. Add and get operations return synchronously thanks to farmhash's simple API.

add(key, value)

Add a key/value pair to the hash.


hash.add('key1', { name: 'one' });
hash.add('key2', { name: 'one' });
hash.add('key3', { name: 'one' });

get(lookup)

Returns a value based on how the hashed key maps to the hash ring.

const value = hash.get('bacon');

remove(key)

Removes the key/value pair from the hash ring.

hash.remove('key3');

Hashqueue API

create(concurrencyLimit)

Creates a new hashqueue with the specified concurrency limit. The default limit is 4. Higher limits are better as the set of ids increase. Long-running tasks can slow things down, especially with lower limits.

const queue = hashqueue.create(2);

add(id, task)

Add a task to the queue. Returns a promise that will resolve to the value returned from the task when the task has completed.

IMPORTANT: A promise must be returned from tasks that are asynchronous.

// this example will print 'tada' after roughly 100 ms.
const promise = queue.add(100, () => {
  return new Promise(resolve => {
    setTimeout(() => {
      resolve('tada');
    }, 100);
  });
});
promise.then(console.log);

stop()

Stops all task runners. Ideally only used before disposal of the queue; a use case that will generally come up in testing.

Distribution

The default number of vnodes (100) seems adequate to provide a standard deviation of 10% across key/value pairs. This may vary depending on the number of reads and keys used during get operations.

Rough performance numbers

Of course, YMMV dependening largely on the processor. On a 2.6 GHz core i7, it takes 18 microseconds (on average) to lookup a key in a hashring with 10,000 virtual nodes. (in this case a hash with 10 key value/pairs and 1000 vnodes each) This performance seems to hold steady under constant load.

Why Not Modulo?

Modulo can work well as a simple way to map a numeric key to an array of (databases|servers|work queues|?) but what happens when the list of available things to map fluctuates. If you're not very careful about how this is implemented you will end up with unexpected results - not to mention that you will need to ensure that every instance/node maps identically despite variance in environment or input order. In other words - there is a lot of risk to how the modulo index maps to available items in the array.

To support non-integer keys, you'll have to do a fair amount of massaging to get other types of keys to work with this approach.

Once you've put in the additional key processing, and a great deal of safeguards to ensure all nodes will create a consistent map, you've basically reinvented an arguably more complex, and perhaps less efficient, means of sharding.

Note About The Name

Yes, a haberdasher has nothing to do with hashing or computers. My grandfather was a haberdasher and it's just a fun word to say. Given it's rare use in our language, I wasn't worried that anyone would think that I'd created a module that tailored/sold upscale men's clothing.