npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

hyperlolo

v0.4.0

Published

Efficiently estimate the cardinality of a set in Typescript and Javascript

Downloads

344

Readme

hyperlolo

Build npm version License

Description

A basic implementation of HyperLogLog, a probabalistic datastructure useful for approximating the number of distinct elements in a multiset. This approach allows for O(1) time complexity and O(log log n) space complexity w.r.t set size n.

Features

  • Custom Backends: Implement the Hasher interface to use the hash function of your choice. A 32-bit Jenkins hash backend is provided by default.
  • Serialization
  • Type Definitions
  • Lower-Bound Correction: 'Linear Counting' is used when the estimated cardinality is low relative to the number of configured registers.
  • Upper-Bound Correction: Estimated cardinality is adjusted when found to be high relative to the size of the hash space.
  • An additional correction is made to compensate for 'systematic multiplicative bias' resulting from hash collisions.

Usage

Basic

import { HyperLogLog } from 'hyperlolo'

// Initialize counter
const counter = new HyperLogLog({ hasherId: 'jenkins32', precision: 12 }) // 12-bit register index = 4096 registers

// Perform 50 million insertions of 15 million distinct values
const inserts = 50000000
const distincts = 15000000

// Count values
for (var i=0; i<inserts; i++)
  counter.add(i % distincts)

// Approximate count
const count = counter.count()
console.log(`[count] estimate = ${count}, expected = ${distincts}, error = ${count - distincts}`)

Merging

Counters can be merged, allowing for tracking cardinality in distributed scenarios. Here we merge the counter created above with another whose set has a 50% overlap with the first. The estimated cardinality of the merged counter will be 1.5x the original.

// Another counter, somewhere else
const another = new HyperLogLog({ hasherId: 'jenkins32', precision: 12 })

// Count values with only half the range overlapping the the original set
const offset = Math.round(distincts / 2)
for (var i=0; i<inserts; i++)
  another.add(i % distincts + offset)

// Merge first counter with the second
const merged = counter.merge(another)
const mergedCount = merged.count()
console.log(`[merged] estimate = ${mergedCount}, expected = ${count + offset}, error = ${mergedCount - count - offset}`)

Serialization

Counters can be converted to a Buffer for persistance or distribution:

const buffer = counter.serialize()

And deserialized somewhere else:

const counter = HyperLogLog.deserialize(buffer)

The deserialized counter will inherit the options of the original.

If using a custom backend, the Hasher implementation of the counter being deserialized must be registered with the local HasherFactory.

References

  • "HyperLogLog" Wikipedia. Wikimedia Foundation, n.d. Mon. 31 Jul. 2023.

License

Everything in this repo is ISC License unless otherwise specified