vectorstore

v0.0.4

Local, cost-free vector store for text embeddings and similarity search, in Node.js and (soon) in the browser.

Downloads

619

Readme

Pure JavaScript implementation of a vector store with similarity search. Runs locally in Node/Bun/Deno, or even in your browser. Supports various embedding models. Open source and free of charge.

  • ✅ Search for text similarities, locally, without API key, free of charge
  • ✅ Best-in-class performance; the embedding model is reported to outperform OpenAI Ada-002 and text-embedding-3-small (see "The Science" section)
  • ✅ Downloads the model automatically, caches it, executes offline afterwards
  • ✅ Runs in Node.js and (soon) in the browser (large model download though, ~500 MB)
  • ✅ Uses the open-source nomic-embed-text-v1 text embedding model, 8192 token context window
  • ✅ Benchmarked: ~1 GiB memory usage at runtime
  • ✅ Fast! Inference < 0.05 sec. on average (per document)
  • ✅ Available as a simple API
  • ✅ Tree-shakable and side-effect free
  • ✅ Runs on Windows, Mac, Linux, CI tested
  • ✅ First class TypeScript support
  • ✅ Well tested (soon to be... ;-)

To try the demo from a checkout of this repository:

npm install
npm run demo

If you came here to understand the math behind the scenes, head over to https://towardsdatascience.com/text-embeddings-comprehensive-guide-afd97fce8fb5 where Mariya Mansurova has written an excellent article on text embeddings.

Now let's dive deeper into metrics and open-source models: https://towardsdatascience.com/openai-vs-open-source-multilingual-embedding-models-e5ccb7c90f05

This is why I decided to use nomic-embed-text-v1 (Nomic-Embed). The model was designed by Nomic, which claims better performance than OpenAI Ada-002 and text-embedding-3-small while being only 0.55 GB in size. Interestingly, the model is the first to be fully reproducible and auditable (open data and open-source training code).

https://huggingface.co/nomic-ai/nomic-embed-text-v1
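As a rough illustration of what "similarity" between two embeddings means, here is a minimal sketch of cosine similarity, a measure commonly used for comparing text embeddings. This README does not state which measure the library uses internally, so treat that as an assumption; `cosineSimilarity` and the toy vectors are hypothetical and not part of the vectorstore API.

```typescript
// Hypothetical helper for illustration only; not part of the vectorstore API.
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report "no similarity" instead of NaN.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions.
const baseVector = [1, 2, 3];
const sameDirection = [2, 4, 6]; // same direction, different magnitude
const orthogonal = [3, 0, -1]; // dot product with baseVector is 0

console.log(cosineSimilarity(baseVector, sameDirection)); // very close to 1
console.log(cosineSimilarity(baseVector, orthogonal)); // 0
```

Vectors pointing in the same direction score close to 1 regardless of their length, which is why this kind of measure suits embeddings: it compares meaning (direction) rather than magnitude.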

  • yarn: yarn add vectorstore
  • npm: npm install vectorstore
import { createDocument, search, type Document } from "vectorstore";

// your text haystack to search for similarities ("database", "store")
const myDocuments = [
  {
    text: "foo",
    metaData: {
      id: 1,
    },
  },
  {
    text: "bar",
    metaData: {
      id: 2,
    },
  },
];

// vectorized documents to search in
const haystack: Array<Document> = [];

// first we need to turn the document text into vector embeddings
for (const doc of myDocuments) {
  haystack.push(await createDocument(doc.text, doc.metaData));
}

// put the search string here
const needle = await createDocument("bar");

// now we can search for similarities between the needle and the haystack
const searchResults = await search(haystack, needle);

// search results come sorted, with a .doc (Document) and a .score
// if you want to keep track of the original text,
// just add the original text to the metaData
console.log(
  searchResults.map((result) => ({
    score: result.score,
    id: result.doc.metadata.id,
  })),
);

/** Prints:
 * [
 *   { score: 0.9999999999999999, id: 2 }, // "bar"
 *   { score: 0.3897944998952487, id: 1 }  // "foo"
 * ]
 */

You can run this exact code as a demo: check out this repository using git clone, then run npm i followed by npm run demo.
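Because results arrive as a list of scored documents, it is often handy to keep only the strong matches. The following is a minimal sketch of such post-processing. `ScoredResult` and `topMatches` are hypothetical names that only mirror the `{ score, doc }` shape printed above; they are not imported from vectorstore, and the 0.5 threshold is an arbitrary value chosen for illustration.

```typescript
// Hypothetical shape mirroring the { score, doc } results shown above;
// this type is declared locally and is not imported from vectorstore.
interface ScoredResult<T> {
  score: number;
  doc: T;
}

// Keep at most `limit` results whose score is at least `minScore`,
// ordered from most to least similar. The input array is not mutated.
function topMatches<T>(
  results: ScoredResult<T>[],
  minScore: number,
  limit: number,
): ScoredResult<T>[] {
  return [...results]
    .sort((a, b) => b.score - a.score)
    .filter((result) => result.score >= minScore)
    .slice(0, limit);
}

// Sample data copied from the printed demo output above.
const sampleResults: ScoredResult<{ id: number }>[] = [
  { score: 0.3897944998952487, doc: { id: 1 } }, // "foo"
  { score: 0.9999999999999999, doc: { id: 2 } }, // "bar"
];

const strongMatches = topMatches(sampleResults, 0.5, 5);
console.log(strongMatches.map((match) => match.doc.id)); // logs [ 2 ]
```

A sensible threshold depends on the embedding model and your data, so it is worth inspecting real scores before settling on one.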

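const { createDocument, search } = require("vectorstore");

// same API as the ESM variant; CommonJS has no top-level await,
// so call createDocument and search from inside an async function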