wilson-score-rank

v2.0.2

A simple and light Wilson score module

Wilson Score Interval

Simple, dependency-free JavaScript implementation of Wilson score. Useful wherever you want to make a confident estimate about the actions or preferences of a general population, given a sample of data (e.g. assigning scores for ranking comments by upvotes, products by popularity, and more).

Installation

$ npm i wilson-score-rank

Alternatively, you may copy the wilsonscore.js file into your project.

How To Use

Binary Ratings

const wilsonScore = require('wilson-score-rank');
// use `const wilsonScore = require('./wilsonscore');` if cloning the file

// 100 positive ratings out of 140 with default confidence level at 95%
wilsonScore.interval(100, 140); // { left: 0.6307737294693031, right: 0.7858148706178667 }

// Continuity correction is applied by default; to disable it, use `correction: false`. You may also customize the confidence level to your liking.
wilsonScore.interval(100, 140, { confidence: 0.90, correction: true }); // { left: 0.6441581643644423, right: 0.775831292147526 }

// To get just the lower limit, use:
wilsonScore.lowerBound(100, 140);   // 0.6307737294693031
wilsonScore.lowerBound(100, 140, { confidence: 0.90, correction: true });   // 0.6441581643644423

Star Ratings

// You have a rating system where users can rate products from 1 to 5 stars. A product has two ratings - one 2 star and one 3 star.

const averageRating = 2.5;
const totalRatings = 2;
const ratingMin = 1;
const ratingMax = 5;

// Just like binary ratings, you may customize the correction and confidence level.
wilsonScore.ratingInterval(averageRating, totalRatings, ratingMin, ratingMax); // { left: 1.0290765537920474, right: 4.7756183859980705 }
wilsonScore.ratingInterval(2.5, 2, 1, 5, { confidence: 0.95, correction: false }); // { left: 1.2243816140019295, right: 4.4332381555147755 }

// To get just the lower limit, use:
wilsonScore.ratingLowerBound(2.5, 2, 1, 5);   // 1.0290765537920474
wilsonScore.ratingLowerBound(2.5, 2, 1, 5, { confidence: 0.95, correction: false });   // 1.2243816140019295
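
The star-rating numbers above are consistent with mapping the average rating onto a 0 to 1 proportion, computing a Wilson interval there, and mapping the bounds back to the rating scale. The sketch below illustrates that idea without continuity correction. It is not taken from the library's source: `wilsonInterval` and `ratingIntervalSketch` are hypothetical names, and z = 1.96 is an assumed approximation of the 95% confidence level.

```javascript
// Sketch only: uncorrected Wilson interval for a proportion p observed over
// n trials. z = 1.96 approximates a 95% confidence level (assumption).
function wilsonInterval(p, n, z = 1.96) {
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return { left: center - half, right: center + half };
}

// Hypothetical helper: rescale the average rating to [0, 1], compute the
// interval there, then map both bounds back onto the rating scale.
function ratingIntervalSketch(average, total, min, max, z = 1.96) {
  const range = max - min;
  const p = (average - min) / range;
  const { left, right } = wilsonInterval(p, total, z);
  return { left: min + left * range, right: min + right * range };
}

const sketch = ratingIntervalSketch(2.5, 2, 1, 5);
console.log(sketch); // roughly { left: 1.224, right: 4.433 }
```

With these assumptions the sketch lands close to the `correction: false` result shown above; small differences in the last digits are expected because the z value is rounded.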

Explanation

Less technical:

If you know what a sample population thinks, you can use this tool to estimate the preferences of the population at large.

Suppose your site has a population of 10,000 users. One product has ratings from 140 users (your sample size): 100 upvotes and 40 downvotes. You want to understand how popular the product would be across the whole population. So you run wilsonScore.interval(100, 140), which returns { left: 0.6307737294693031, right: 0.7858148706178667 }. Now you can estimate with 95% confidence that between 63.1% and 78.6% of all users would upvote this product.

It is common to use the lower bound of this interval (here, 63.1%) as the result, as it is the most conservative estimate of the "real" score.

For a beginner-friendly introduction to confidence intervals for population proportions, see this YouTube video.

Continuity correction can improve the accuracy of the estimate, especially for a small number of samples (n < 30).
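
For readers curious what the correction looks like in code, here is a minimal sketch of the commonly published continuity-corrected Wilson interval. It is an illustration under assumptions, not the library's implementation: `wilsonCorrected` is a hypothetical name and z = 1.96 stands in for the 95% confidence level.

```javascript
// Sketch only: continuity-corrected Wilson score interval.
// positive = number of successes, total = number of trials,
// z = standard normal quantile (1.96 is an approximation for 95%).
function wilsonCorrected(positive, total, z = 1.96) {
  if (total <= 0) throw new RangeError('total must be positive');
  const n = total;
  const p = positive / n;
  const z2 = z * z;
  const denom = 2 * (n + z2);
  const base = 2 * n * p + z2;
  const spread = z2 - 1 / n + 4 * n * p * (1 - p);

  // At the extremes the bounds are pinned to 0 and 1 respectively.
  const lower = positive === 0
    ? 0
    : (base - (z * Math.sqrt(spread + (4 * p - 2)) + 1)) / denom;
  const upper = positive === total
    ? 1
    : (base + (z * Math.sqrt(spread - (4 * p - 2)) + 1)) / denom;

  return { left: Math.max(0, lower), right: Math.min(1, upper) };
}

console.log(wilsonCorrected(100, 140)); // roughly { left: 0.6308, right: 0.7858 }
console.log(wilsonCorrected(5, 7));     // a small sample gives a much wider interval
```

For 100 positive ratings out of 140, the sketch agrees with the default result quoted earlier to about five decimal places; the remaining difference comes from the rounded z value.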

More technical:

The Wilson score interval, developed by American mathematician Edwin Bidwell Wilson in 1927, is a confidence interval for a proportion in a statistical population. It assumes that the statistical sample used for the estimation has a binomial distribution. A binomial distribution indicates, in general, that:

  1. the experiment is repeated a fixed number of times;
  2. the experiment has two possible outcomes ('success' and 'failure');
  3. the probability of success is equal for each experiment;
  4. the trials are statistically independent.
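
Under the binomial assumptions listed above, the interval without continuity correction can be written in a few lines. This is a textbook-style sketch, not the library's code; `wilsonScoreInterval` and its parameter names are hypothetical, and z = 1.96 is an assumed stand-in for 95% confidence.

```javascript
// Sketch only: Wilson score interval without continuity correction.
//
//   center = (p + z^2 / 2n) / (1 + z^2 / n)
//   half   = z * sqrt(p(1 - p)/n + z^2 / 4n^2) / (1 + z^2 / n)
//
// where p = successes / trials, n = trials, and z is the standard normal
// quantile for the chosen confidence level (about 1.96 for 95%).
function wilsonScoreInterval(successes, trials, z = 1.96) {
  if (trials <= 0) throw new RangeError('trials must be positive');
  const p = successes / trials;
  const z2 = z * z;
  const denom = 1 + z2 / trials;
  const center = (p + z2 / (2 * trials)) / denom;
  const half =
    (z * Math.sqrt((p * (1 - p)) / trials + z2 / (4 * trials * trials))) / denom;
  return { left: center - half, right: center + half };
}

// Same observed proportion (5/7) with a growing number of trials:
// the printed interval width shrinks as the sample grows.
for (const [successes, trials] of [[5, 7], [50, 70], [500, 700]]) {
  const { left, right } = wilsonScoreInterval(successes, trials);
  console.log(trials, (right - left).toFixed(3));
}
```

Note that the center is pulled from the observed proportion toward 0.5, and more strongly so for small samples, which is what keeps the bounds sensible when there is little data.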

For more, please see the Wikipedia page on the Wilson score interval and this blog post.

Comparison with other scoring methods

Using a simple calculation of score = (positive ratings) - (negative ratings) or score = average rating = (positive ratings) / (total ratings) proves to be problematic when working with smaller sample sizes, or differences in sample sizes across populations. See this blog post comparing scoring methods for details and examples.
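
To make the difference concrete, the sketch below ranks three made-up items by net score, by average rating, and by a Wilson lower bound. The data and the `wilsonLowerBound` helper are hypothetical (uncorrected formula, z = 1.96 assumed for 95% confidence); it is an illustration, not the library's implementation.

```javascript
// Sketch only: uncorrected Wilson lower bound for `up` positives out of `total`.
function wilsonLowerBound(up, total, z = 1.96) {
  if (total === 0) return 0; // no ratings: most conservative score
  const p = up / total;
  const z2 = z * z;
  const denom = 1 + z2 / total;
  const center = (p + z2 / (2 * total)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total))) / denom;
  return center - half;
}

// Hypothetical items with very different sample sizes.
const items = [
  { name: 'A', up: 1, down: 0 },     // a single upvote
  { name: 'B', up: 90, down: 10 },   // 90% positive, 100 ratings
  { name: 'C', up: 600, down: 400 }, // 60% positive, 1000 ratings
];

// Sort a copy in descending score order and return the names.
const rankBy = (scoreFn) =>
  [...items].sort((a, b) => scoreFn(b) - scoreFn(a)).map((item) => item.name);

const byNet = rankBy((i) => i.up - i.down);
const byAverage = rankBy((i) => i.up / (i.up + i.down));
const byWilson = rankBy((i) => wilsonLowerBound(i.up, i.up + i.down));

console.log(byNet);     // [ 'C', 'B', 'A' ]
console.log(byAverage); // [ 'A', 'B', 'C' ]
console.log(byWilson);  // [ 'B', 'C', 'A' ]
```

Net score favors the item with the most votes and the plain average favors the item with a single vote, while the lower bound puts the well-rated, reasonably sampled item first.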

The Wilson score interval is known to perform better than the normal approximation interval for small sample sizes and for proportions close to 0 or 1, because its formula accounts for the added uncertainty in those scenarios.
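
A small sketch can show where the normal approximation breaks down. Both functions below are hypothetical helpers written for this comparison (uncorrected formulas, z = 1.96 assumed for 95% confidence); neither is part of the library's API.

```javascript
// Sketch only: normal approximation ("Wald") interval, p ± z * sqrt(p(1 - p)/n).
function waldInterval(successes, n, z = 1.96) {
  const p = successes / n;
  const half = z * Math.sqrt((p * (1 - p)) / n);
  return { left: p - half, right: p + half };
}

// Sketch only: Wilson score interval without continuity correction.
function wilsonInterval(successes, n, z = 1.96) {
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return { left: center - half, right: center + half };
}

// 0 successes out of 10: the Wald interval collapses to a single point,
// which claims certainty that the true proportion is exactly 0.
console.log(waldInterval(0, 10));
console.log(wilsonInterval(0, 10));

// 9 successes out of 10: the Wald upper bound exceeds 1, which is not a
// valid proportion, while the Wilson upper bound stays below 1.
console.log(waldInterval(9, 10));
console.log(wilsonInterval(9, 10));
```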

This paper offers a more technical comparison of the Wilson interval with other statistical approaches.

Use cases

Apart from sorting by rating, the Wilson score interval has many potential applications. You can use it anywhere you need a confident estimate of what percentage of people took or would take a specific action. I originally ran into this at bop.fm, when our music platform needed to downrank track sources that were flagged "incorrect" or "bad quality".

You can even use it in cases where the data doesn't break cleanly into two specific outcomes (e.g. 1-5 star ratings), as long as you are able to creatively abstract the outcomes into two buckets (e.g. % of users who voted 4 stars and above vs % of users who didn't).
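
One way to do that bucketing is sketched below: count ratings at or above a threshold as positive, then score the resulting proportion. The star counts, the threshold of 4, and the helper names are all hypothetical, and the lower bound uses the uncorrected formula with z = 1.96 assumed for 95% confidence.

```javascript
// Hypothetical data: number of ratings received per star value.
const starCounts = { 1: 3, 2: 2, 3: 5, 4: 20, 5: 30 };

// Collapse star counts into two buckets: ratings >= threshold are "positive".
function toBinary(counts, threshold = 4) {
  let positive = 0;
  let total = 0;
  for (const [star, count] of Object.entries(counts)) {
    total += count;
    if (Number(star) >= threshold) positive += count;
  }
  return { positive, total };
}

// Sketch only: uncorrected Wilson lower bound.
function lowerBoundSketch(positive, total, z = 1.96) {
  if (total === 0) return 0;
  const p = positive / total;
  const z2 = z * z;
  const denom = 1 + z2 / total;
  const center = (p + z2 / (2 * total)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total))) / denom;
  return center - half;
}

const { positive, total } = toBinary(starCounts);
console.log(positive, total); // 50 60
console.log(lowerBoundSketch(positive, total)); // conservative share of users rating 4+ stars
```

With the library installed, the same two counts could presumably be passed to `wilsonScore.lowerBound` as in the binary ratings example.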

Credits to @csjiang for the explanation provided in their PR

Additional Resources

  • http://www.vassarstats.net/prop1.html
  • http://www.goproblems.com/test/wilson/wilson.php

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help: