custom-search-lib

v1.1.5

A library for advanced search functionalities including fuzzy, wildcard, prefix, suffix search, and multi-language support with special character normalization.

Downloads

1,044

Readme

Custom Search Library

Custom Search Library is a versatile JavaScript/TypeScript library for implementing advanced search functionality. It supports fuzzy search, ranked results, prefix/suffix matching, wildcard queries, advanced filtering, sorting, and faceted search, and it normalizes special characters across multiple languages. All of these features are configurable to suit different use cases.

Features

  • Fuzzy Search: Finds results based on Levenshtein distance with configurable thresholds.
  • Ranked Fuzzy Search: Sorts results by relevance using a scoring mechanism.
  • Wildcard Search: Allows pattern matching using wildcards (*).
  • Prefix and Suffix Search: Matches strings that start or end with the query.
  • Advanced Filtering: Apply exact, range, or multi-value filters with support for AND/OR logic.
  • Sorting: Multi-level sorting with locale-aware string comparison.
  • Faceted Search: Generates facets (categories) with counts for better insights into datasets.
  • Highly Configurable: Supports case sensitivity, custom thresholds, and multi-field operations.
  • Language-Specific Normalization: Handles special characters for languages like Swedish, Danish, Norwegian, Turkish, French, Spanish, Polish, Czech, Slovak, Hungarian, Greek, and more.
  • Fuzzy Search for Persian: A dedicated fuzzy search function tailored for Persian text.
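To make the first two features concrete, here is a minimal sketch of threshold-based fuzzy matching with ranking by edit distance. It is not the library's source: `levenshtein` and `rankedFuzzyFilter` are hypothetical names, and lowercasing before comparison is an assumption about how case-insensitive matching might be done.

```typescript
// Illustrative only: these helpers are hypothetical and are not library exports.

// Dynamic-programming Levenshtein distance, keeping only two rows in memory.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution
      ));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Keep items within `threshold` edits of the query, closest match first.
function rankedFuzzyFilter(query: string, items: string[], threshold: number): string[] {
  const q = query.toLowerCase();
  return items
    .map(item => ({ item, distance: levenshtein(q, item.toLowerCase()) }))
    .filter(entry => entry.distance <= threshold)
    .sort((x, y) => x.distance - y.distance)
    .map(entry => entry.item);
}

const ranked = rankedFuzzyFilter(
  'bicycle',
  ['Bicycle', 'Bike', 'Bicycles', 'Tricycle', 'Motorcycle'],
  2
);
console.log(ranked); // ['Bicycle', 'Bicycles', 'Tricycle']
```

With a threshold of 2, "Bike" and "Motorcycle" are dropped because each is more than two edits away from the query.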

Installation

Install the library via npm:

npm install custom-search-lib

New Language Support Feature

The library now includes support for special characters from multiple languages, including:

  • German: Handles characters like ß, ä, ö, and ü.
  • Swedish, Danish, Norwegian: Normalizes ä, å, ö, ø, and æ.
  • Turkish: Converts ç, ğ, ı, ş, and ü.
  • French: Handles œ, é, è, ê, ë, à, â, ù, û, î, ï, and ç.
  • Spanish: Processes ñ, á, í, ó, and ú.
  • Polish: Normalizes ą, ć, ę, ł, ń, ó, ś, ź, and ż.
  • Czech and Slovak: Handles č, ď, ě, ň, ř, š, ť, ů, and ž.
  • Hungarian: Converts á, é, í, ó, ö, ő, ú, ü, and ű.
  • Greek: Provides transliterations for Greek characters, including α to ω.
  • Arabic and Persian:
    • Arabic: Normalizes أ, إ, آ, ؤ, ئ, ة, and ى to their standard forms.
    • Persian: Converts ي, ك, ۀ, پ, چ, ژ, and گ to their standardized forms.
  • Others: Includes mappings for characters like ý, đ, and ħ.

Example Usage of Language Normalization:

import { normalizeText } from 'custom-search-lib';

const normalized = normalizeText('Göteborg, Ærø, Crème brûlée, Ελληνικά');
console.log(normalized); // Output: 'goteborg, aero, creme brulee, ellinika'
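
For readers curious how this kind of character folding can work, the following is a rough sketch that uses Unicode decomposition plus a small lookup table. `foldText` and `EXTRA_MAP` are hypothetical names, the table is deliberately incomplete, and the library's own `normalizeText` may use different mappings (for example for å or ü). The sketch does not cover Greek transliteration.

```typescript
// Hypothetical sketch; `foldText` and `EXTRA_MAP` are illustrative names, not library exports.

// Characters that Unicode decomposition does not reduce to a plain base letter.
const EXTRA_MAP: Record<string, string> = {
  'ß': 'ss', 'æ': 'ae', 'ø': 'o', 'œ': 'oe', 'ł': 'l', 'đ': 'd', 'ħ': 'h', 'ı': 'i',
};

function foldText(input: string): string {
  return input
    .toLowerCase()
    .normalize('NFD')                  // split letters from their combining marks
    .replace(/[\u0300-\u036f]/g, '')   // drop the combining marks
    .replace(/[ßæøœłđħı]/g, ch => EXTRA_MAP[ch] || ch);
}

console.log(foldText('Göteborg, Ærø, Crème brûlée')); // 'goteborg, aero, creme brulee'
console.log(foldText('Straße'));                      // 'strasse'
```

Decomposition handles most accented Latin letters, while the lookup table covers letters such as ø and ł that have no decomposed form.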

React TypeScript Example Implementation

  1. SearchDemo Component

The main component demonstrates the use of various search functionalities, including filtering, sorting, and faceted search.

import { useState } from 'react';
import { 
  fuzzySearch, 
  rankedFuzzySearch, 
  fuzzySearchPersian, 
  rankedFuzzySearchPersian, 
  prefixSearch, 
  suffixSearch, 
  wildcardSearch,
  applyFilters,
  sortData,
  generateFacets,
} from 'custom-search-lib'; 
import { SearchResults } from './types/SearchResults'; 
import { mockData } from './mockData/mockData'; 

const SearchDemo = () => {
  const [query, setQuery] = useState(''); 
  const [filters] = useState({ category: 'Books' }); // Example filter
  const [sortConfig] = useState<{ field: string; order: 'asc' | 'desc' }[]>([{ field: 'price', order: 'asc' }]);
  const [results, setResults] = useState<SearchResults>({
    fuzzy: [],
    ranked: [],
    persianFuzzy: [],
    persianRanked: [],
    prefix: [],
    suffix: [],
    wildcard: [],
    filtered: [],
    sorted: [],
    facets: {},
  });

  const handleSearch = () => {
    // Extract the names once instead of re-mapping the dataset for every call.
    const names = mockData.map(item => item.name);

    // String-based searches operate on the list of names.
    const fuzzyResults = fuzzySearch(query, names);
    const rankedResults = rankedFuzzySearch(query, names);
    // The Persian variants run over the same names here; use names from
    // persianMockData instead to try them against Persian text.
    const persianFuzzyResults = fuzzySearchPersian(query, names, { threshold: 2 });
    const persianRankedResults = rankedFuzzySearchPersian(query, names, { threshold: 2 });
    const prefixResults = prefixSearch(query, names);
    const suffixResults = suffixSearch(query, names);
    const wildcardResults = wildcardSearch(query, names);

    // Filtering, sorting, and faceting operate on the full records.
    const filteredResults = applyFilters(mockData, filters);
    const sortedResults = sortData(mockData, sortConfig);
    const facets = generateFacets(mockData, ['category']);

    setResults({
      fuzzy: fuzzyResults,
      ranked: rankedResults,
      persianFuzzy: persianFuzzyResults,
      persianRanked: persianRankedResults,
      prefix: prefixResults,
      suffix: suffixResults,
      wildcard: wildcardResults,
      filtered: filteredResults,
      sorted: sortedResults,
      facets: facets,
    });
  };

  return (
    <div style={{ padding: '20px' }}>
      <h1>Search Demo</h1>
      <input
        type="text"
        value={query}
        onChange={(e) => setQuery(e.target.value)}
        placeholder="Enter search query"
      />
      <button onClick={handleSearch}>Search</button>

      {/* Filtering Example */}
      <div>
        <h3>Filtered Results:</h3>
        <ul>
          {results.filtered.map((item, index) => (
            <li key={index}>{item.name} - ${item.price}</li>
          ))}
        </ul>
      </div>

      {/* Sorting Example */}
      <div>
        <h3>Sorted Results:</h3>
        <ul>
          {results.sorted.map((item, index) => (
            <li key={index}>{item.name} - ${item.price}</li>
          ))}
        </ul>
      </div>

      {/* Faceted Search Example */}
      <div>
        <h3>Facets:</h3>
        <ul>
          {Object.entries(results.facets.category || {}).map(([key, count]) => (
            <li key={key}>{key}: {count}</li>
          ))}
        </ul>
      </div>

      {/* Other Search Results */}
      <div>
        <h3>Fuzzy Search:</h3>
        <ul>
          {results.fuzzy.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Ranked Fuzzy Search:</h3>
        <ul>
          {results.ranked.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Persian Fuzzy Search:</h3>
        <ul>
          {results.persianFuzzy.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Persian Ranked Fuzzy Search:</h3>
        <ul>
          {results.persianRanked.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Prefix Search:</h3>
        <ul>
          {results.prefix.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Suffix Search:</h3>
        <ul>
          {results.suffix.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Wildcard Search:</h3>
        <ul>
          {results.wildcard.map((item, index) => <li key={index}>{item}</li>)}
        </ul>
      </div>
    </div>
  );
};

export default SearchDemo;

  2. SearchResults Interface

Defines the structure for managing search results.

export interface SearchResults {
  fuzzy: string[];
  ranked: string[];
  persianFuzzy: string[];
  persianRanked: string[];
  prefix: string[];
  suffix: string[];
  wildcard: string[];
  filtered: any[];
  sorted: any[];
  facets: { [key: string]: { [value: string]: number } };
}
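
The `facets` field above maps each field name to a count per distinct value. As an illustration of how such counts could be produced, here is a small sketch. `countFacets` is a hypothetical helper, and the library's `generateFacets` may be implemented differently.

```typescript
// Hypothetical helper; not the library's generateFacets implementation.
type Facets = { [field: string]: { [value: string]: number } };

function countFacets(rows: Array<Record<string, unknown>>, fields: string[]): Facets {
  const facets: Facets = {};
  for (const field of fields) {
    const counts: { [value: string]: number } = {};
    for (const row of rows) {
      const value = String(row[field]); // use the stringified value as the facet key
      counts[value] = (counts[value] || 0) + 1;
    }
    facets[field] = counts;
  }
  return facets;
}

const sample = [
  { name: 'Bicycle', category: 'Sports', price: 150 },
  { name: 'Bike', category: 'Sports', price: 200 },
  { name: 'Laptop', category: 'Electronics', price: 1000 },
];
const sampleFacets = countFacets(sample, ['category']);
console.log(sampleFacets.category); // { Sports: 2, Electronics: 1 }
```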

  3. Mock Data

Provides a large dataset for testing the search functionalities.

const generateRandomString = (length: number): string => {
  const characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  let result = '';
  for (let i = 0; i < length; i++) {
    result += characters.charAt(Math.floor(Math.random() * characters.length));
  }
  return result;
};

const generateLargeDataset = (size: number): Array<{ name: string; category: string; price: number }> => {
  const categories = ['Books', 'Electronics', 'Clothing', 'Home Appliances', 'Toys', 'Miscellaneous'];
  const dataset: Array<{ name: string; category: string; price: number }> = [];
  for (let i = 0; i < size; i++) {
    dataset.push({
      name: generateRandomString(10),
      category: categories[Math.floor(Math.random() * categories.length)],
      price: Math.floor(Math.random() * 500),
    });
  }
  return dataset;
};

const predefinedDataset = [
  { name: 'Bicycle', category: 'Sports', price: 150 },
  { name: 'Bike', category: 'Sports', price: 200 },
  { name: 'Bicycles', category: 'Sports', price: 180 },
  { name: 'Tricycle', category: 'Sports', price: 120 },
  { name: 'Motorcycle', category: 'Vehicles', price: 1500 },
  { name: 'Hello@World', category: 'Miscellaneous', price: 50 },
  { name: 'Laptop', category: 'Electronics', price: 1000 },
  { name: 'Smartphone', category: 'Electronics', price: 800 },
  { name: 'Headphones', category: 'Electronics', price: 150 },
];

export const mockData = [...predefinedDataset, ...generateLargeDataset(5000)];
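
The predefined entries share fragments ("Bicycle", "Tricycle", "Motorcycle"), which makes them convenient for trying pattern-based searches. As a sketch of how `*` wildcards can be matched, the pattern can be translated into an anchored regular expression. `wildcardToRegExp` and `wildcardFilter` are hypothetical names, and case-insensitive matching is an assumption, not a statement about the library's `wildcardSearch`.

```typescript
// Hypothetical sketch; not the library's wildcardSearch implementation.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts between wildcards.
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*'); // each * matches any run of characters
  return new RegExp('^' + escaped + '$', 'i');
}

function wildcardFilter(pattern: string, items: string[]): string[] {
  const re = wildcardToRegExp(pattern);
  return items.filter(item => re.test(item));
}

const productNames = ['Bicycle', 'Bike', 'Bicycles', 'Tricycle', 'Motorcycle', 'Hello@World'];
console.log(wildcardFilter('*cycle', productNames)); // ['Bicycle', 'Tricycle', 'Motorcycle']
console.log(wildcardFilter('Bi*', productNames));    // ['Bicycle', 'Bike', 'Bicycles']
```

A trailing `*` behaves like a prefix search and a leading `*` like a suffix search, so the three modes can share one matcher in this sketch.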

  4. Persian Mock Data

Provides a Persian dataset for use with fuzzySearchPersian and rankedFuzzySearchPersian in a demo app.

const generateRandomPersianString = (length: number): string => {
  const characters = 'ابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی';
  let result = '';
  for (let i = 0; i < length; i++) {
    result += characters.charAt(Math.floor(Math.random() * characters.length));
  }
  return result;
};

const generateLargePersianDataset = (size: number): Array<{ name: string; category: string; price: number }> => {
  const categories = ['کتاب‌ها', 'الکترونیک', 'پوشاک', 'لوازم خانگی', 'اسباب‌بازی', 'متفرقه'];
  const dataset: Array<{ name: string; category: string; price: number }> = [];
  for (let i = 0; i < size; i++) {
    dataset.push({
      name: generateRandomPersianString(10), // Generate a random Persian name
      category: categories[Math.floor(Math.random() * categories.length)],
      price: Math.floor(Math.random() * 500),
    });
  }
  return dataset;
};

const predefinedPersianDataset = [
  { name: 'دوچرخه', category: 'ورزش', price: 150 },
  { name: 'موتورسیکلت', category: 'وسایل نقلیه', price: 1500 },
  { name: 'کتاب ریاضی', category: 'کتاب‌ها', price: 200 },
  { name: 'لپ‌تاپ', category: 'الکترونیک', price: 1000 },
  { name: 'هدفون', category: 'الکترونیک', price: 150 },
  { name: 'اسباب‌بازی چوبی', category: 'اسباب‌بازی', price: 300 },
  { name: 'یخچال', category: 'لوازم خانگی', price: 500 },
];

export const persianMockData = [...predefinedPersianDataset, ...generateLargePersianDataset(5000)];
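
Persian text typed on different keyboards often mixes Arabic and Persian code points for letters that look alike, so one common preprocessing step before comparing strings is to unify them. The sketch below covers only two letters and is not taken from the library: `unifyPersianLetters` is a hypothetical helper, and stripping the zero-width non-joiner is an assumption that some applications would not want.

```typescript
// Hypothetical helper, not a library export.
const ARABIC_TO_PERSIAN: Record<string, string> = {
  '\u064A': '\u06CC', // Arabic yeh (ي) -> Persian yeh (ی)
  '\u0643': '\u06A9', // Arabic kaf (ك) -> Persian kaf (ک)
};

function unifyPersianLetters(input: string): string {
  return input
    .replace(/[\u064A\u0643]/g, ch => ARABIC_TO_PERSIAN[ch] || ch)
    // Assumption: drop zero-width non-joiners so spellings with and
    // without the joiner compare as equal.
    .replace(/\u200C/g, '');
}

// Arabic kaf in "كتاب" becomes Persian kaf, giving "کتاب".
console.log(unifyPersianLetters('\u0643\u062A\u0627\u0628') === '\u06A9\u062A\u0627\u0628'); // true
```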

Performance

The library is optimized for performance and handles large datasets efficiently:

  • Uses an optimized Levenshtein algorithm.
  • Benchmarked for datasets of up to 100,000 entries.
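
To check how a search behaves on your own data at a similar scale, a simple timing harness is usually enough. The sketch below is an assumption-laden illustration, not the benchmark the author ran: `SearchFn`, `makeDataset`, and `timeSearch` are hypothetical names, and the stand-in `startsWith` search should be replaced with whichever function you want to measure.

```typescript
// Hypothetical harness; all names here are illustrative.
type SearchFn = (query: string, items: string[]) => string[];

// Build a deterministic dataset of the requested size.
function makeDataset(size: number): string[] {
  const items: string[] = [];
  for (let i = 0; i < size; i++) {
    items.push('item-' + i.toString(36));
  }
  return items;
}

// Run one search and report the match count and wall-clock time.
function timeSearch(
  search: SearchFn,
  query: string,
  items: string[]
): { matches: number; elapsedMs: number } {
  const start = Date.now();
  const results = search(query, items);
  return { matches: results.length, elapsedMs: Date.now() - start };
}

// Stand-in search; swap in the function you actually want to measure.
const startsWith: SearchFn = (query, items) => items.filter(item => item.startsWith(query));

const report = timeSearch(startsWith, 'item-a', makeDataset(100000));
console.log(report.matches + ' matches in ' + report.elapsedMs + ' ms');
```

A single run gives only a rough number. Repeating the measurement several times and comparing medians gives a steadier picture.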

License

This project is licensed under the MIT License.