vector-space-model-similarity

v1.0.161

This module calculates cosine similarity for a vector space model using TF-IDF.

Downloads

9

Readme

A library for calculating a vector space model using cosine similarity. So far I have tested this library only with Indonesian-language data, not with English data. I have added English lemmatization for documents, but I may change it to the Porter algorithm.

How to install

npm install vector-space-model-similarity --save

How to use

First, import the class and the function:

import { VSM, Cosine } from 'vector-space-model-similarity';

Next, define the documents as an array of strings:

const documents = ["rumah saya penuh makanan", "saya suka makan nasi", "nasi berawal dari beras"]; // the documents to index

Then create a VSM object for the documents and a second one for the query:

const document = new VSM(documents); // VSM object for the documents
const idf = document.getIdfVectorized(); // returns an array of objects; each key is a token from the documents and each value is its IDF value

const query = new VSM(["sistem cerdas"], idf); // VSM object for the query, built with the idf taken from the documents

const cosine = Cosine(query.getPowWeightVectorized()[0], document.getPowWeightVectorized()); // calculate the cosine similarity
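Since Cosine returns one score per document, a natural next step is to sort the documents by score. The following is a minimal sketch of that step; `rankByScore` and `RankedDocument` are hypothetical names that are not part of this package, and the scores are made up for illustration.

```typescript
// Hypothetical helper, not part of vector-space-model-similarity.
interface RankedDocument {
  index: number; // position of the document in the original array
  text: string;
  score: number;
}

// Pair each document with its score and sort from highest to lowest score.
function rankByScore(documents: string[], scores: number[]): RankedDocument[] {
  return documents
    .map((text, index) => ({ index, text, score: scores[index] ?? 0 }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankByScore(
  ["rumah saya penuh makanan", "saya suka makan nasi", "nasi berawal dari beras"],
  [0.2, 0.9, 0] // made-up scores for illustration only
);
console.log(ranked.map((r) => r.text));
```

The original index is kept on each result so the caller can map a ranked entry back to its source document.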

Descriptions

[Image: example of a VSM calculation in Excel]

VSM

import { VSM } from 'vector-space-model-similarity';

VSM is a class that extends the Tfidf class. Its constructor takes two parameters: the first is required and the second is optional.

documents: string[], idfVector: any[] = []

documents is the array of documents to index, and idfVector is the array of IDF values. idfVector is required when you want to search the documents with a query: pass the idfVector obtained from the documents you indexed earlier. To get idfVector, call getIdfVectorized().

getIdfVectorized() returns an array of objects rather than an array of numbers. In each object, the key is the word and the value is its IDF value.

getIdfVectorized() // <-- a method of the Tfidf class
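To make the shape of that return value concrete, here is a self-contained sketch that builds an array of `{ word: idf }` objects. It is not the package's implementation: `computeIdf` is a hypothetical helper, the whitespace tokenizer is a simplification, and the formula idf(t) = log10(N / df(t)) is the common textbook one, which may differ from what the package uses.

```typescript
// Hypothetical helper, not part of vector-space-model-similarity.
// Assumes the textbook formula idf(t) = log10(N / df(t)).
function computeIdf(documents: string[]): Record<string, number>[] {
  // Naive tokenizer: lowercase and split on whitespace.
  const tokenized = documents.map((doc) =>
    doc.toLowerCase().split(/\s+/).filter((w) => w.length > 0)
  );

  // Document frequency: in how many documents does each word appear?
  const df = new Map<string, number>();
  tokenized.forEach((words) => {
    new Set(words).forEach((word) => {
      df.set(word, (df.get(word) ?? 0) + 1);
    });
  });

  // One object per word, with the word as key and its IDF as value.
  const n = documents.length;
  const result: Record<string, number>[] = [];
  df.forEach((count, word) => {
    result.push({ [word]: Math.log10(n / count) });
  });
  return result;
}

const idfSketch = computeIdf([
  "rumah saya penuh makanan",
  "saya suka makan nasi",
  "nasi berawal dari beras",
]);
console.log(idfSketch);
```

Words that occur in more documents get a lower value, so "saya" and "nasi" (two documents each) score lower here than words that occur in only one document.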

getWeightVectorized() returns the IDF-based weights of the documents as a multidimensional array.

getWeightVectorized() // <-- returns the weights of the documents
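As an illustration of what a multidimensional weight array can look like, the sketch below builds one row per document and one column per term. It assumes weight(t, d) = tf(t, d) * idf(t) with tf as the raw term count; the package may weight terms differently. `weightMatrix` is a hypothetical helper and the IDF values are made up.

```typescript
// Hypothetical helper, not part of vector-space-model-similarity.
// Assumes weight(t, d) = tf(t, d) * idf(t), where tf is the raw count of t in d.
function weightMatrix(documents: string[], idf: Record<string, number>[]): number[][] {
  // Flatten the array of { word: idf } objects into parallel arrays.
  const terms: string[] = [];
  const values: number[] = [];
  idf.forEach((entry) => {
    Object.keys(entry).forEach((word) => {
      terms.push(word);
      values.push(entry[word]);
    });
  });

  return documents.map((doc) => {
    const words = doc.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
    // One column per term, in the same order as the IDF array.
    return terms.map((term, i) => {
      const tf = words.filter((w) => w === term).length;
      return tf * values[i];
    });
  });
}

// Made-up IDF values, chosen only to keep the arithmetic easy to follow.
const sampleIdf: Record<string, number>[] = [{ saya: 0.25 }, { nasi: 0.25 }, { rumah: 0.5 }];
const weights = weightMatrix(["rumah saya", "saya suka nasi nasi"], sampleIdf);
console.log(weights); // [ [ 0.25, 0, 0.5 ], [ 0.25, 0.5, 0 ] ]
```

Words missing from the IDF array, such as "suka" here, simply do not get a column.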

getPowWeightVectorized() returns the exponent of the IDF weights of the documents, also as a multidimensional array.

getPowWeightVectorized() // <-- returns the exponent of the weights of the documents

Cosine

Once you have the IDF vectors for the documents and the query, you can use this function.

import { Cosine } from 'vector-space-model-similarity';

Cosine takes two parameters: the first is the query and the second is the documents.

queries: any[], documents: any[][] // <=== the parameters

number[] // <=== the return value

Call this function after you get the exponent of the document IDF from getPowWeightVectorized().

The query is a one-dimensional array and the documents are a multidimensional array. Because getPowWeightVectorized() returns a multidimensional array and the query parameter requires a one-dimensional array, you must pass the first element of the query's array, e.g.:


import { VSM, Cosine } from 'vector-space-model-similarity';

const document = new VSM([
    "sistem cerdas adalah kumpulan elemen",
    "adalah kumpulan elemen yang saling berinteraksi",
    "Sistem berinteraksi untuk mencapai tujuan"
]);

const idf = document.getIdfVectorized();

const query = new VSM(["sistem cerdas"], idf);

const cosine = Cosine(query.getPowWeightVectorized()[0], document.getPowWeightVectorized()); // output : [ 4.457087767265072, 0, 0.4853443577859814 ]
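For readers who want to see the idea behind a query-versus-documents comparison, here is a self-contained sketch of the textbook cosine similarity, cos(q, d) = (q · d) / (|q| |d|), applied to plain weight vectors. It is not the package's Cosine function: `cosineSketch` is a hypothetical name, and because the package works on the output of getPowWeightVectorized(), its numbers (such as the output shown above) will not necessarily match this definition.

```typescript
// Hypothetical helper, not part of vector-space-model-similarity.
// Textbook cosine similarity between one query vector and several document vectors.
function cosineSketch(query: number[], documents: number[][]): number[] {
  // Euclidean length of a vector.
  const norm = (v: number[]): number =>
    Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));

  const queryNorm = norm(query);
  return documents.map((doc) => {
    // Dot product; missing query components count as 0.
    const dot = doc.reduce((sum, x, i) => sum + x * (query[i] ?? 0), 0);
    const denominator = queryNorm * norm(doc);
    // A zero vector has no direction, so report a similarity of 0.
    return denominator === 0 ? 0 : dot / denominator;
  });
}

const scores = cosineSketch(
  [1, 1, 0],
  [
    [1, 1, 0], // same direction as the query
    [0, 0, 2], // shares no terms with the query
    [3, 0, 4], // partial overlap
  ]
);
console.log(scores);
```

With this definition and non-negative weights, every score falls between 0 and 1, and a document that shares no terms with the query scores 0.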

Props

All functions you can import from this package.