retrieval-better

v1.0.0

Full-text search engine in JavaScript that features a tunable BM25 ranking function.

Table of Contents

  1. Basic Idea and Key Benefits
  2. Deploy Full-Text Search in an App
  3. Install
  4. User Guide

1. Basic Idea and Key Benefits:

This project is a full-text search engine written in JavaScript, comparable in purpose to Elasticsearch, that leverages Natural Language Processing. The BM25 ranking function at its core can be tuned to different types of text (e.g. tweets, scientific journals, legal writing). Key features are:

  • The JavaScript source code runs natively on the server side in Node.js as well as on the client side in browser extensions and single-page apps, and it can also be deployed to serverless functions, React Native, edge computing, and many other environments.
  • The accuracy and versatility of BM25 come from being able to tune its parameters to specific types of documents.
  • Separates offline indexing from the time-sensitive online search.
  • Each individual NLP component, like the stemmer or the stopword list, is pluggable and carefully researched to stay current. (For example, the stopword list combines words from three authoritative stopword lists: Stanford CoreNLP, Journal of Machine Learning Research, and NLTK.)
  • A Dockerfile and a Docker image are available, which makes it convenient to try out the module.
  • Reasonable unit test coverage, continuous integration, and separation of concerns for each functionality.
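
To make the tuning parameters concrete, here is a minimal sketch of the textbook Okapi BM25 weight for a single term in a single document. The function name bm25TermScore and the field names in its input object are hypothetical, and the formula is the standard one rather than a copy of this module's internals; k and b play the role of the K and B arguments shown in the User Guide.

```javascript
// Textbook Okapi BM25 weight for one term in one document.
// ASSUMPTION: this is the standard formula, not necessarily the exact code
// inside this module. bm25TermScore and its field names are illustrative.
function bm25TermScore({ tf, docLen, avgDocLen, numDocs, docFreq }, k = 1.6, b = 0.75) {
  // Inverse document frequency (the "1 +" variant keeps the value non-negative).
  const idf = Math.log(1 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
  // Length normalization: b = 0 ignores document length, b = 1 normalizes fully.
  const lengthNorm = 1 - b + b * (docLen / avgDocLen);
  // Term-frequency saturation: a larger k lets repeated terms keep adding to the score.
  return (idf * (tf * (k + 1))) / (tf + k * lengthNorm);
}

const stats = { tf: 3, docLen: 120, avgDocLen: 100, numDocs: 1000, docFreq: 50 };
console.log(bm25TermScore(stats));         // K = 1.6, B = 0.75
console.log(bm25TermScore(stats, 1.2, 0)); // B = 0 switches off length normalization
```

Tuning then amounts to choosing k and b for the kind of text being indexed: b controls how strongly long documents are penalized, and k controls how quickly repeated occurrences of a term stop adding to the score.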

2. Deploy Full-Text Search in an App:

The demo is an Express app (see MEAN stack) enhanced with full-text search capability. The easiest way to try it is to run its Docker image as shown below, then point a browser to localhost:3000.

docker run --rm -d -p 3000:8080 jj232/retrieval

Alternatively, after installing the module, run the command below:

npm run demo2

Then point a browser to localhost:8080.

Suggestions on deploying: To integrate the module into a simple JavaScript app, the demo shows that only a few lines of code are needed (see the source code at "./demo/demo2/server.js"). For a more complex software solution, or one that relies on other languages or runtime environments, the recommended way is to Dockerize this module and expose it as a microservice.
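
As a rough illustration of the microservice route, the sketch below wraps a search function in an HTTP handler using only Node's built-in http module. Everything named here is an assumption made for illustration: createSearchHandler, the /search route, the q and n query parameters, and the SEARCH_PORT environment variable are hypothetical, and naiveSearch is a stand-in so the sketch runs without the module installed. In a real service the handler would be given something like (q, n) => rt.search(q, n).

```javascript
const http = require("http");

// Hypothetical helper: wraps any search function, for example
// (q, n) => rt.search(q, n), in an HTTP request handler.
function createSearchHandler(searchFn) {
  return (req, res) => {
    const url = new URL(req.url, "http://localhost");
    if (url.pathname !== "/search") {
      res.writeHead(404, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "not found" }));
      return;
    }
    const query = url.searchParams.get("q") || "";
    // Fall back to 5 results when n is missing or not a number.
    const topN = Number.parseInt(url.searchParams.get("n"), 10) || 5;
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ query, results: searchFn(query, topN) }));
  };
}

// Stand-in search so the sketch is self-contained:
// a naive substring match, NOT BM25 ranking.
const sampleTexts = ["Theme & Variations In G Minor", "Moonlight Sonata"];
const naiveSearch = (q, n) =>
  sampleTexts.filter(t => t.toLowerCase().includes(q.toLowerCase())).slice(0, n);

const handler = createSearchHandler(naiveSearch);
const server = http.createServer(handler);

// Listen only when a port is configured (SEARCH_PORT is a hypothetical variable).
if (process.env.SEARCH_PORT) {
  server.listen(Number(process.env.SEARCH_PORT));
}
```

Keeping the handler separate from the search function means the same wrapper can sit inside a Docker container in front of whatever index the service loads at startup, and clients written in other languages only need to speak HTTP and JSON.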

3. Install:

For the latest release:

npm install retrieval

For continuous build:

git clone https://github.com/zjohn77/retrieval.git
cd retrieval
npm install

4. User Guide:

const path = require("path");
const Retrieval = require(path.join(__dirname, "..", "..", "src", "Retrieval.js"));
const texts = require("./data/music-collection"); // Load some sample texts to search.

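// 1st step: instantiate Retrieval with the BM25 tuning parameters K and B, which attenuate term frequency.
// The arguments are positional; writing K=1.6 in the call would create an implicit global and fail in strict mode.
let rt = new Retrieval(1.6, 0.75);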

// 2nd step: index the array of texts (strings); store the resulting document-term matrix.
rt.index(texts);

// 3rd step: search. In other words, multiply the document-term matrix and the indicator vector representing the query.
rt.search("theme and variations", 5)   // Top 5 search results for the query 'theme and variations'
  .forEach(item => console.log(item));
// 04 - Theme & Variations In G Minor.flac
// 17 - Rhapsody On A Theme of Paganini - Variation 18.flac
// 01 - Diabelli Variations - Theme Vivace & Variation 1 Alla Marcia Maestoso.flac
// 07 - Rhapsody On A Theme of Paganini (Introduction and 24 Variations).flac
// 10 - Diabelli Variations - Variation 10 Presto.flac

The example above is from "./demo/demo1/scenarios.js". To run the full example, do:

npm run demo1
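
The 3rd step of the guide describes search as multiplying the document-term matrix by an indicator vector for the query. The toy sketch below shows that idea in isolation. The vocabulary, the weights, and the helper names queryVector and topDocs are all made up for illustration; a real index would hold BM25 weights computed at indexing time, and the query would first pass through tokenization, stopword removal, and stemming, which are omitted here.

```javascript
// Toy vocabulary and document-term matrix (rows = documents, columns = terms).
// ASSUMPTION: the weights are invented numbers standing in for BM25 scores.
const vocabulary = ["theme", "variation", "sonata"];
const docTermMatrix = [
  [1.2, 0.9, 0.0], // doc 0
  [0.0, 0.0, 1.5], // doc 1
  [0.7, 0.0, 0.4], // doc 2
];

// Indicator vector: 1 where a vocabulary term appears in the query, else 0.
function queryVector(queryTerms) {
  return vocabulary.map(term => (queryTerms.includes(term) ? 1 : 0));
}

// The matrix-vector product gives one relevance score per document;
// dropping zero scores, sorting, and slicing yields the top-n documents.
function topDocs(queryTerms, n) {
  const q = queryVector(queryTerms);
  return docTermMatrix
    .map((row, doc) => ({ doc, score: row.reduce((sum, w, j) => sum + w * q[j], 0) }))
    .filter(hit => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, n);
}

console.log(topDocs(["theme", "variation"], 2)); // doc 0 ranks first, then doc 2
```

Because the matrix is built once at indexing time, each query only costs a vector build and a multiply, which is what keeps the online search step fast.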

To run unit tests, do:

npm test