npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

word-salience

v0.0.12

Published

measure the salience / importance of words in a text document -- based on the frequency of the words in the document, versus their frequency in English

Downloads

21

Readme

word-salience

  • return the most salient / important words from a piece of text
  • or directly from a wordcounts object similar to what is returned by word-count-frequency
  • salience = word frequency / frequency in english * # word appearances - similar to tf-idf
  • based on frequencies from norvig-frequencies-stemmed

Installation

npm i word-salience

Usage


var ws = require('word-salience');

//excerpt from "The Complete Book of Cheese, by Robert Carlton Brown"
// https://www.gutenberg.org/cache/epub/14293/pg14293.txt

var theText = "Cheese market day in a town in the north of Holland. All the\n" +
    "cheese-fanciers are out, thumping the cannon-ball Edams and the\n" +
    "millstone Goudas with their bare red knuckles, plugging in with a\n" +
    "hollow steel tool for samples. In Holland the business of judging a\n" +
    "crumb of cheese has been taken with great seriousness for centuries.\n" +
    "The abracadabra is comparable to that of the wine-taster or\n" +
    "tea-taster. These Edamers have the trained ear of music-masters and,\n" +
    "merely by knuckle-rapping, can tell down to an air pocket left by a\n" +
    "gas bubble just how mature the interior is.\n" +
    "\n" +
    "The connoisseurs use gingerbread as a mouth-freshener; and I, too,\n" +
    "that sunny day among the Edams, kept my gingerbread handy and made my\n" +
    "way from one fine cheese to another, trying out generous plugs from\n" +
    "the heaped cannon balls that looked like the ammunition dump at\n" +
    "Antietam.\n" +
    "\n" +
    "I remember another market day, this time in Lucerne. All morning I\n" +
    "stocked up on good Schweizerkäse and better Gruyère. For lunch I had\n" +
    "cheese salad. All around me the farmers were rolling two-hundred-pound\n" +
    "Emmentalers, bigger than oxcart wheels. I sat in a little café,\n" +
    "absorbing cheese and cheese lore in equal quantities. I learned that a\n" +
    "prize cheese must be chock-full of equal-sized eyes, the gas holes\n" +
    "produced during fermentation. They must glisten like polished bar\n" +
    "glass. The cheese itself must be of a light, lemonish yellow. Its\n" +
    "flavor must be nutlike. (Nuts and Swiss cheese complement each other\n" +
    "as subtly as Gorgonzola and a ripe banana.) There are, I learned,\n" +
    "\"blind\" Swiss cheeses as well, but the million-eyed ones are better."



//full params:
var returnUnstemmedWords = true; //default true - return the words in their unstemmed format [the first unstemmed version encountered is the one returned]. note this is a "superficial" parameter -- all scoring is identical, only the labels are changed.
var doPostMultiply = true; //default true - multiple word scores by number of word appearances

console.log(ws.getSalientWords(theText, returnUnstemmedWords, doPostMultiply).slice(0,5));

/*
//returns list of words sorted by salience 

//unstemmed
[
  [ 'Edams', 0.27351362035510285 ],
  [ 'taster', 0.07597600565419523 ],
  [ 'Cheese', 0.06048089923787909 ],
  [ 'gingerbread', 0.0467544650179663 ],
  [ 'fanciers', 0.03039040226167809 ]
]


//stemmed [note that scores are identical]
[
  [ 'edam', 0.27357297182063456 ],
  [ 'taster', 0.07599249217239848 ],
  [ 'chees', 0.06049402337408037 ],
  [ 'gingerbread', 0.04676461056762983 ],
  [ 'fancier', 0.030396996868959397 ]
]


//with post-multiply = false [post-multiply multiples score by number of word appearances, similar to TF-IDF]
[
  [ 'edam', 0.13407875855107848 ],
  [ 'taster', 0.0558661493962827 ],
  [ 'fancier', 0.04469291951702616 ],
  [ 'millston', 0.04469291951702616 ],
  [ 'gouda', 0.04469291951702616 ]
]

 */


//get the data as an object instead of array:
console.log(ws.getSalientWordsObject(theText));

/*
{
  Cheese: 0.06048089923787909,
  market: 0.00011119795924507169,
  day: 0.0001068245666126788,
  town: 0.00008512717720358008,
  north: 0.00004101268861225114,
  Holland: 0.0027134287733641153,
  ...
}
 */

//get salience from result of [npm word-count-frequency]
//you must give it the unstemmed text if you want it to return unstemmed words. otherwise it returns the stemmed words.
getSalientWordsObjectFromWordCount(wordCountResult, doReturnUnstemmed, unstemmedText, doPostMultiply);

See Also

stonks