content-based-recommender

v1.5.0

A simple content-based recommender implemented in javascript

Content Based Recommender

This is a simple content-based recommender implemented in JavaScript to illustrate the concept of content-based recommendation. Content-based recommendation is a popular technique for showing users similar items, and is especially useful for e-commerce sites, news sites, and similar websites.

After the recommender is trained on an array of documents, it can return the documents most similar to a given input document.

The training process involves 3 main steps:

Special thanks to the natural library, which helps a lot by providing much of the NLP functionality, such as tf-idf and word stemming.
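To make the idea of tf-idf based similarity concrete, here is a minimal, dependency-free sketch. It is not the library's implementation: the function names (tokenize, buildTfIdfVectors, cosineSimilarity) are hypothetical, stemming and stopword removal are left out, and cosine similarity is assumed as the comparison measure for illustration only.

```javascript
// Illustrative sketch only: NOT the library's implementation.
// Tokenizes text, builds tf-idf vectors, and compares them with cosine similarity.

// Lowercase and split on non-alphanumeric characters (no stemming or stopword removal).
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Build one sparse tf-idf vector (a Map of term -> weight) per document.
function buildTfIdfVectors(docs) {
  const tokenized = docs.map(doc => tokenize(doc.content));

  // Document frequency: in how many documents does each term appear?
  const docFreq = new Map();
  for (const tokens of tokenized) {
    for (const term of new Set(tokens)) {
      docFreq.set(term, (docFreq.get(term) || 0) + 1);
    }
  }

  return tokenized.map(tokens => {
    const vector = new Map();
    for (const term of tokens) {
      vector.set(term, (vector.get(term) || 0) + 1); // raw term frequency
    }
    for (const [term, tf] of vector) {
      // Smoothed idf, so terms present in every document keep a small positive weight.
      const idf = Math.log((1 + docs.length) / (1 + docFreq.get(term))) + 1;
      vector.set(term, tf * idf);
    }
    return vector;
  });
}

// Cosine similarity between two sparse vectors; 0 when either vector is empty.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [term, weight] of a) {
    normA += weight * weight;
    if (b.has(term)) dot += weight * b.get(term);
  }
  for (const weight of b.values()) normB += weight * weight;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

const sampleDocs = [
  { id: 'a', content: 'javascript machine learning' },
  { id: 'b', content: 'machine learning basics' },
  { id: 'c', content: 'bitcoin technology' }
];
const vectors = buildTfIdfVectors(sampleDocs);
const simAB = cosineSimilarity(vectors[0], vectors[1]); // shares "machine" and "learning"
const simAC = cosineSimilarity(vectors[0], vectors[2]); // shares no terms
console.log(simAB > simAC); // prints true
```

Documents that share weighted terms score above 0, and documents with no terms in common score exactly 0, which is why a score threshold can cheaply discard unrelated pairs.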

⚠️ Note:

I haven't tested how this recommender performs on a large dataset. I will share more results after further testing.

Installation

npm install content-based-recommender

Then import the ContentBasedRecommender class:

const ContentBasedRecommender = require('content-based-recommender')

What's New

1.5.0

  • Added trainBidirectional(collectionA, collectionB) to allow recommendations between two different datasets

1.4.0

  • Upgraded dependencies to fix security alerts

1.3.0

  • Introduced the use of unigrams, bigrams and trigrams when constructing the word vector

1.2.0

  • Simplified the implementation by no longer using a sorted set data structure to store the similar documents data
  • Added support for the maxSimilarDocuments and minScore options to reduce the memory used by the recommender

1.1.0

  • Updated to a newer version of vector-object

Usage

Single collection

const ContentBasedRecommender = require('content-based-recommender')
const recommender = new ContentBasedRecommender({
  minScore: 0.1,
  maxSimilarDocuments: 100
});

// prepare documents data
const documents = [
  { id: '1000001', content: 'Why studying javascript is fun?' },
  { id: '1000002', content: 'The trend for javascript in machine learning' },
  { id: '1000003', content: 'The most insightful stories about JavaScript' },
  { id: '1000004', content: 'Introduction to Machine Learning' },
  { id: '1000005', content: 'Machine learning and its application' },
  { id: '1000006', content: 'Python vs Javascript, which is better?' },
  { id: '1000007', content: 'How Python saved my life?' },
  { id: '1000008', content: 'The future of Bitcoin technology' },
  { id: '1000009', content: 'Is it possible to use javascript for machine learning?' }
];

// start training
recommender.train(documents);

// get the top 10 items most similar to document 1000002
const similarDocuments = recommender.getSimilarDocuments('1000002', 0, 10);

console.log(similarDocuments);
/*
  the higher the score, the more similar the item is
  documents with score < 0.1 are filtered because options minScore is set to 0.1
  [
    { id: '1000004', score: 0.5114304586412038 },
    { id: '1000009', score: 0.45056313558918837 },
    { id: '1000005', score: 0.37039308109283564 },
    { id: '1000003', score: 0.10896767690747626 }
  ]
*/

Multi collection

This example shows how to automatically match posts with related tags

const ContentBasedRecommender = require('content-based-recommender');

const posts = [
  { id: '1000001', content: 'Why studying javascript is fun?' },
  { id: '1000002', content: 'The trend for javascript in machine learning' },
  { id: '1000003', content: 'The most insightful stories about JavaScript' },
  { id: '1000004', content: 'Introduction to Machine Learning' },
  { id: '1000005', content: 'Machine learning and its application' },
  { id: '1000006', content: 'Python vs Javascript, which is better?' },
  { id: '1000007', content: 'How Python saved my life?' },
  { id: '1000008', content: 'The future of Bitcoin technology' },
  { id: '1000009', content: 'Is it possible to use javascript for machine learning?' }
];

const tags = [
  { id: '1', content: 'Javascript' },
  { id: '2', content: 'machine learning' },
  { id: '3', content: 'application' },
  { id: '4', content: 'introduction' },
  { id: '5', content: 'future' },
  { id: '6', content: 'Python' },
  { id: '7', content: 'Bitcoin' }
];

const tagMap = tags.reduce((acc, tag) => {
  acc[tag.id] = tag;
  return acc;
}, {});

const recommender = new ContentBasedRecommender();

recommender.trainBidirectional(posts, tags);

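for (const post of posts) {
  const relatedTags = recommender.getSimilarDocuments(post.id);
  // look up each matched tag by id; named tagNames to avoid shadowing the outer tags array
  const tagNames = relatedTags.map(t => tagMap[t.id].content);
  console.log(post.content, 'related tags:', tagNames);
}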


/*
Why studying javascript is fun? related tags: [ 'Javascript' ]
The trend for javascript in machine learning related tags: [ 'machine learning', 'Javascript' ]
The most insightful stories about JavaScript related tags: [ 'Javascript' ]
Introduction to Machine Learning related tags: [ 'machine learning', 'introduction' ]
Machine learning and its application related tags: [ 'machine learning', 'application' ]
Python vs Javascript, which is better? related tags: [ 'Python', 'Javascript' ]
How Python saved my life? related tags: [ 'Python' ]
The future of Bitcoin technology related tags: [ 'future', 'Bitcoin' ]
Is it possible to use javascript for machine learning? related tags: [ 'machine learning', 'Javascript' ]
*/

API

constructor([options])

To create the recommender instance

  • options (optional): an object to configure the recommender

Supported options:

  • maxVectorSize - the maximum size of the word vector after tf-idf processing. A smaller vector size improves training performance while not affecting recommendation quality. Defaults to 100.
  • minScore - the minimum score a document must reach to be considered similar. Filtering out low-scoring documents saves memory. Allowed values range from 0 to 1. Defaults to 0.
  • maxSimilarDocuments - the maximum number of similar documents to keep for each document. Defaults to the maximum safe integer in JavaScript (Number.MAX_SAFE_INTEGER).
  • debug - show progress messages so that the training progress can be monitored
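The effect of minScore and maxSimilarDocuments described above can be pictured as a pruning step over a list of scored candidates. The sketch below is an assumption about how such pruning could work, not the library's code; pruneSimilar is a hypothetical helper, and keeping scores equal to minScore follows the usage example's note that documents with score < 0.1 are filtered.

```javascript
// Hypothetical helper illustrating the two memory-saving options; not part of the library.
function pruneSimilar(
  scored,
  { minScore = 0, maxSimilarDocuments = Number.MAX_SAFE_INTEGER } = {}
) {
  return scored
    .filter(item => item.score >= minScore) // drop weak matches first
    .sort((x, y) => y.score - x.score)      // highest score first
    .slice(0, maxSimilarDocuments);         // keep only the top N
}

const candidates = [
  { id: 'x', score: 0.05 },
  { id: 'y', score: 0.4 },
  { id: 'z', score: 0.9 }
];
const kept = pruneSimilar(candidates, { minScore: 0.1, maxSimilarDocuments: 1 });
console.log(kept); // prints [ { id: 'z', score: 0.9 } ]
```

Filtering before sorting keeps the sort small, which matters when most candidate pairs score near zero.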

train(documents)

Tells the recommender about your documents and starts the training.

  • documents - an array of objects, each with the fields id and content
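Because train expects every item to carry an id and a content field, it can help to check the input before training. The helper below is a hypothetical pre-flight check, not part of the library; treating duplicate ids and non-string content as problems is an assumption made for this sketch.

```javascript
// Hypothetical pre-flight check; the library may or may not validate its input itself.
function findDocumentProblems(documents) {
  if (!Array.isArray(documents)) return ['documents must be an array'];

  const problems = [];
  const seenIds = new Set();
  documents.forEach((doc, index) => {
    if (doc === null || typeof doc !== 'object') {
      problems.push(`item ${index} is not an object`);
      return;
    }
    if (doc.id === undefined || doc.id === null) {
      problems.push(`item ${index} has no id`);
    } else if (seenIds.has(doc.id)) {
      problems.push(`item ${index} repeats id ${doc.id}`); // assumption: ids should be unique
    } else {
      seenIds.add(doc.id);
    }
    if (typeof doc.content !== 'string') {
      problems.push(`item ${index} has no string content`);
    }
  });
  return problems;
}

const problems = findDocumentProblems([
  { id: '1', content: 'ok' },
  { id: '1', content: 'duplicate id' },
  { id: '2' }
]);
console.log(problems);
// prints [ 'item 1 repeats id 1', 'item 2 has no string content' ]
```

An empty result means the array has the shape described above and can be passed to train.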

trainBidirectional(collectionA, collectionB)

Works like the normal train function, but it creates recommendations between two different collections instead of within one collection.
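The difference from single-collection training can be sketched as follows: items are only ever compared across the two collections, never within one. This is an illustration under stated assumptions, not the library's algorithm. It uses a simple token-overlap (Jaccard) score instead of tf-idf, the names tokenSet, jaccard, rankAgainst and matchBidirectional are hypothetical, and ids are assumed to be unique across both collections.

```javascript
// Illustrative only: cross-collection matching with a simple token-overlap (Jaccard) score.
function tokenSet(text) {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

function jaccard(a, b) {
  let shared = 0;
  for (const token of a) {
    if (b.has(token)) shared += 1;
  }
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// For every item in `from`, rank the items of `to` by score (highest first),
// dropping items that share no tokens at all.
function rankAgainst(from, to) {
  const result = {};
  for (const source of from) {
    const sourceTokens = tokenSet(source.content);
    result[source.id] = to
      .map(target => ({
        id: target.id,
        score: jaccard(sourceTokens, tokenSet(target.content))
      }))
      .filter(match => match.score > 0)
      .sort((x, y) => y.score - x.score);
  }
  return result;
}

// Matches are computed A -> B and B -> A, never within a single collection.
function matchBidirectional(collectionA, collectionB) {
  return {
    ...rankAgainst(collectionA, collectionB),
    ...rankAgainst(collectionB, collectionA)
  };
}

const demo = matchBidirectional(
  [{ id: 'post-1', content: 'Introduction to Machine Learning' }],
  [
    { id: 'tag-1', content: 'machine learning' },
    { id: 'tag-2', content: 'Python' }
  ]
);
console.log(demo['post-1']); // prints [ { id: 'tag-1', score: 0.5 } ]
console.log(demo['tag-2']);  // prints []
```

Looking up a post id yields tags and looking up a tag id yields posts, which mirrors how the multi-collection example queries posts and receives tags.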

getSimilarDocuments(id, [start], [size])

Returns an array of items similar to the document with the given id.

  • id - the id of the document
  • start - the start index, inclusive. Defaults to 0
  • size - the maximum number of similar documents to return. If omitted, the whole list from the start index onward is returned

It returns an array of objects, each with the fields id and score (ranging from 0 to 1).
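The start and size parameters behave like a window over the list of matches. The sketch below shows that reading with a hypothetical helper, pageSimilar; it assumes the matches are already sorted by descending score, as in the usage example's output, and it is not the library's code.

```javascript
// Hypothetical helper showing start/size as a window over an already-sorted list.
function pageSimilar(sortedMatches, start = 0, size) {
  // When size is omitted, slice() runs to the end of the list.
  const end = size === undefined ? undefined : start + size;
  return sortedMatches.slice(start, end);
}

const matches = [
  { id: '1000004', score: 0.51 },
  { id: '1000009', score: 0.45 },
  { id: '1000005', score: 0.37 },
  { id: '1000003', score: 0.11 }
];

console.log(pageSimilar(matches, 0, 2).map(m => m.id)); // prints [ '1000004', '1000009' ]
console.log(pageSimilar(matches, 2).map(m => m.id));    // prints [ '1000005', '1000003' ]
```

Requesting a size larger than the remaining list simply returns what is left, so callers do not need to know the list length in advance.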

export()

Exports the recommender as a JSON object.

const recommender = new ContentBasedRecommender();
recommender.train(documents);

const object = recommender.export();
//can save the object to disk, database or otherwise

import(object)

Updates the recommender by importing a JSON object previously produced by the export() method.

const recommender = new ContentBasedRecommender();
recommender.import(object); // object can be loaded from disk, database or otherwise
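One way to save the exported object to disk and load it back is sketched below. It assumes the exported object survives a JSON.stringify / JSON.parse round trip, which the "JSON object" wording suggests but this README does not spell out. saveModel, loadModel and the file name are hypothetical, and a stand-in object is used in place of a real export.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helpers: write the exported object as JSON and read it back.
function saveModel(exported, filePath) {
  fs.writeFileSync(filePath, JSON.stringify(exported), 'utf8');
}

function loadModel(filePath) {
  return JSON.parse(fs.readFileSync(filePath, 'utf8'));
}

// Stand-in for the result of recommender.export(); its real shape is not documented here.
const exportedStandIn = { placeholder: true, items: [1, 2, 3] };

const modelPath = path.join(os.tmpdir(), 'recommender-model.json');
saveModel(exportedStandIn, modelPath);
const restored = loadModel(modelPath);

// With the real library, the restored object would then be passed to import():
// recommender.import(restored);
console.log(Object.keys(restored)); // prints [ 'placeholder', 'items' ]
```

Persisting the trained state this way avoids retraining on every process start, at the cost of keeping the saved file in sync with the documents.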

Test

npm install
npm run test

Authors

License

MIT