npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

@quantleaf/probly-search

v1.2.4

Published

A lightweight full-text search engine with a fully customizable scoring function

Downloads

2

Readme

probly-search · GitHub license Coverage Status Latest Version PRs Welcome

A full-text search library, optimized for insertion speed, that provides full control over the scoring calculations.

This start initially as a port of the Node library NDX.

Demo

Recipe (title) search with 50k documents.

https://quantleaf.github.io/probly-search-demo/

Features

  • Three ways to do scoring

    • BM25 ranking function to rank matching documents. The same ranking function that is used by default in Lucene >= 6.0.0.
    • zero-to-one, a library unique scoring function that provides a normalized score that is bounded by 0 and 1. Perfect for matching titles/labels with queries.
    • Ability to fully customize your own scoring function by implenting the ScoreCalculator trait.
  • Trie based dynamic Inverted Index.

  • Multiple fields full-text indexing and searching.

  • Per-field score boosting.

  • Configurable tokenizer and term filter.

  • Free text queries with query expansion.

  • Fast allocation, but latent deletion.

Documentation

Documentation is under development. For now read the source tests.

Example

Creating an index with a document that has 2 fields. Query documents, and remove a document.

use std::collections::HashSet;
use probly_search::{
    index::{add_document_to_index, create_index, remove_document_from_index, Index},
    query::{
        query,
        score::default::{bm25, zero_to_one},
        QueryResult,
    },
};


// Create index with 2 fields
let mut index = create_index::<usize>(2);

// Create docs from a custom Doc struct
let doc_1 = Doc {
    id: 0,
    title: "abc".to_string(),
    description: "dfg".to_string(),
};

let doc_2 = Doc {
    id: 1,
    title: "dfgh".to_string(),
    description: "abcd".to_string(),
};

// Add documents to index
add_document_to_index(
    &mut index,
    &[title_extract, description_extract],
    tokenizer,
    filter,
    doc_1.id,
    doc_1.clone(),
);

add_document_to_index(
    &mut index,
    &[title_extract, description_extract],
    tokenizer,
    filter,
    doc_2.id,
    doc_2,
);

// Search, expected 2 results
let mut result = query(
    &mut index,
    &"abc",
    &mut bm25::new(),
    tokenizer,
    filter,
    &[1., 1.],
    None,
);
assert_eq!(result.len(), 2);
assert_eq!(
    result[0],
    QueryResult {
        key: 0,
        score: 0.6931471805599453
    }
);
assert_eq!(
    result[1],
    QueryResult {
        key: 1,
        score: 0.28104699650060755
    }
);

// Remove documents from index
let mut removed_docs = HashSet::new();
remove_document_from_index(&mut index, &mut removed_docs, doc_1.id);

// Vacuum to remove completely
vacuum_index(&mut index, &mut removed_docs);

// Search, expect 1 result
result = query(
    &mut index,
    &"abc",
    &mut bm25::new(),
    tokenizer,
    filter,
    &[1., 1.],
    Some(&removed_docs),
);
assert_eq!(result.len(), 1);
assert_eq!(
    result[0],
    QueryResult {
        key: 1,
        score: 0.1166450426074421
    }
);

Go through source tests in for the BM25 implementation and zero-to-one implementation for more query examples.

License

MIT