azure-search-emulator-js

v0.6.1

An unofficial javascript emulator for the Microsoft Azure Cognitive Search API.

This emulator aims to provide a fast search engine, completely written in javascript, that can run in your browser or on the server, while matching the Microsoft Azure Cognitive Search API as closely as possible.

Installation

NPM: npm i azure-search-emulator-js

Usage

Import the emulator, create an index, and start making queries. See the following example, and refer to the tests and the official API documentation for more.

// Import what you need from the package.
import { Emulator, Schema, Suggester, type ScoringProfile, type GeoJSONPoint } from 'azure-search-emulator-js';

// You can use types to improve typings.
type People = {
  id: string,
  fullName: string,
  phones: string[],
  addresses: {
    parts: string,
    city: string,
    state: string,
    country: string,
    kind: 'home' | 'work',
    geo: GeoJSONPoint,
    order: number,
  }[],
};

// Define a schema that matches the documents you want to store.
const peopleSchema: Schema = [
  { type: 'Edm.String', key: true, name: 'id', facetable: false },
  { type: 'Edm.String', name: 'fullName', facetable: false },
  { type: 'Collection(Edm.String)', name: 'phones', facetable: false },
  {
    type: 'Collection(Edm.ComplexType)', name: 'addresses', fields: [
      { type: 'Edm.String', name: 'parts', facetable: false },
      { type: 'Edm.String', name: 'city' },
      { type: 'Edm.String', name: 'state' },
      { type: 'Edm.String', name: 'country' },
      { type: 'Edm.String', name: 'kind' },
      { type: 'Edm.GeographyPoint', name: 'geo' },
      { type: 'Edm.Int32', name: 'order', facetable: false },
    ]
  },
];

// Define suggesters configurations. (optional)
const peopleSuggesters: Suggester[] = [
  {
    name: 'sg',
    searchMode: 'analyzingInfixMatching',
    fields: 'addresses/city, addresses/state, addresses/country',
  },
];

// Define scoring profiles. (optional)
const peopleScoringProfiles: ScoringProfile<People>[] = [
  {
    name: 'plain',
    text: {
      weights: {
        id: 15,
        fullName: 10,
      },
    },
  },
  {
    name: 'nearby',
    functions: [
      {
        type: 'magnitude',
        fieldName: 'addresses/order',
        boost: 2,
        magnitude: {
          boostingRangeStart: 3,
          boostingRangeEnd: 0,
        },
        interpolation: 'quadratic',
      },
      {
        type: 'tag',
        fieldName: 'addresses/kind',
        boost: 3,
        tag: {
          tagsParameter: 'kind',
        },
        interpolation: 'constant',
      },
      {
        type: 'distance',
        fieldName: 'addresses/geo',
        boost: 2,
        distance: {
          referencePointParameter: 'me',
          boostingDistance: 75,
        },
      },
      {
        type: 'distance',
        fieldName: 'addresses/geo',
        boost: 5,
        distance: {
          referencePointParameter: 'me',
          boostingDistance: 5,
        },
        interpolation: 'logarithmic'
      },
    ],
  },
];

// Create an instance of the Azure Cognitive Search Emulator.
const emulator = new Emulator();

// Create your index using your configurations.
const peopleIndex = emulator.createIndex<People>({
  name: 'people', 
  schema: peopleSchema, 
  suggesters: peopleSuggesters, 
  scoringProfiles: peopleScoringProfiles,
  defaultScoringProfile: 'plain',
});

// Populate your index with documents.
peopleIndex.postDocuments({
  value: [
    {
      '@search.action': 'upload',
      id: '1',
      fullName: 'Bob Mutton',
      phones: ['555-5550'],
      addresses: [
        {
          parts: '42 somewhere rd',
          city: 'TownsVille',
          country: 'United States',
          kind: 'work',
          // GeoJSON positions are ordered [longitude, latitude].
          geo: { type: 'Point', coordinates: [-78.4323851, 36.4945867] },
        },
        {
          parts: '1 righthere drv',
          city: 'Metropolis',
          country: 'Japan',
          // GeoJSON positions are ordered [longitude, latitude].
          geo: { type: 'Point', coordinates: [137.4239011, 35.669496] },
        }
      ]
    },
    // ...
  ]
});

// Query your index.
const results = peopleIndex.search({
  search: 'bo*',
  select: ['id'],
  filter: "addresses/kind eq 'work'",
  count: true,
  facets: ['addresses/city', 'addresses/country'],
});
// {
//   '@search.count': 32,
//   '@search.facets': {
//     'addresses/city': [
//       { value: 'TownsVille', count: 3 },
//       { value: 'Metropolis', count: 12 },
//       ...
//     ],
//     'addresses/country': [
//       { value: 'United States', count: 18 },
//       { value: 'Japan', count: 6 },
//       ...
//     ],
//   },
//   value: [
//     {
//       '@search.score': 30,
//       id: '1',
//     },
//     ...
//   ]
// }

What works

Currently, the emulator supports the following stable API endpoints:

All documented OData features are supported:

Most advanced search capabilities are supported but might behave differently from the real service as they depend on full text search statistics that aren't calculated in the same manner. They are:

The emulator does schema validation through the use of strong typings and runtime validations. It is expected that the schema you use to create an index in the emulator should work without any modification when creating a real index in Azure.

Keep in mind that this is still an early prototype and is not designed to be used in production.

What does not work, yet

The emulator does not use a full text search engine as its backend, yet. This means that while it does support the Simple Lucene Query Syntax, it does not support the Full Lucene Query Syntax.
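To make the distinction concrete: the simple syntax covers operators such as a trailing `*` for prefix search, while fuzzy, proximity, regular expression, and field-scoped queries belong to the full syntax. The sketch below shows one naive way a prefix term like `bo*` could be matched without a real index. `matchesSimpleQuery` is a hypothetical helper written for this illustration, not part of the emulator, and it only handles plain terms and prefix terms combined with "any" semantics.

```typescript
// Hypothetical helper for illustration only; not the emulator's implementation.
// Whitespace-separated terms are combined with "any" (OR) semantics, and a
// trailing `*` turns a term into a prefix match.
function matchesSimpleQuery(query: string, text: string): boolean {
  // Naive tokenization: lowercase, then split on non-word characters.
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);

  return terms.some((term) => {
    if (term.endsWith('*')) {
      const prefix = term.slice(0, -1);
      return words.some((word) => word.startsWith(prefix));
    }
    // Without the `*` operator, the whole word has to match.
    return words.includes(term);
  });
}

// Example usage against the sample document's fullName.
const prefixHit = matchesSimpleQuery('bo*', 'Bob Mutton');
const exactMiss = matchesSimpleQuery('bo', 'Bob Mutton');
```

A real full text engine would analyze and index the text up front instead of scanning every document per query, which is the gap the paragraph above describes.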

It does not support custom analyzers.
It does not support synonyms.
It does not support sharding, meaning that index coverage will always be 100%.
It does not support encryption.

Do not expect search results from this emulator to match actual search results. They should make sense, but they are not, and probably never will be, the same.

What is desperately needed

While there are a fair number of tests, coverage is pretty poor, and there is no visual demo.

This emulator should become a drop-in replacement for the real service. Notably, the official javascript search client should be able to use any of the supported features of the emulator without crashing. Ideally, there should be an integration with HTTP mocking libraries like MirageJs to use an in-browser emulator with the official client.
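As a rough idea of what such an integration could look like, the sketch below routes a mocked HTTP request to a per-index search handler. Everything in it is an assumption made for illustration: `routeSearchRequest`, `SearchRequest`, and `SearchHandler` are hypothetical names, the path follows the documented REST form `/indexes/[index name]/docs/search` (the official client may build its URLs differently), and the handler stands in for a call into an emulator index.

```typescript
// Hypothetical shapes for illustration; not the emulator's or MirageJs's API.
type SearchRequest = { search?: string; filter?: string; top?: number };
type SearchHandler = (request: SearchRequest) => unknown;

// Matches paths such as /indexes/people/docs/search
const SEARCH_ROUTE = /^\/indexes\/([^/]+)\/docs\/search$/;

function routeSearchRequest(
  handlers: Map<string, SearchHandler>,
  method: string,
  url: string,
  body: string,
): { status: number; body: unknown } {
  // Resolve against a dummy base so relative URLs and query strings both parse.
  const path = new URL(url, 'http://localhost').pathname;
  const match = method === 'POST' ? SEARCH_ROUTE.exec(path) : null;
  if (!match) {
    return { status: 404, body: { error: 'Route not handled' } };
  }

  const handler = handlers.get(decodeURIComponent(match[1]));
  if (!handler) {
    return { status: 404, body: { error: 'Index not found' } };
  }

  // The handler would delegate to something like peopleIndex.search(request).
  return { status: 200, body: handler(JSON.parse(body) as SearchRequest) };
}
```

A mocking library would call a function like this from its route definition and serialize the returned body as the response.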

Documentation. Currently, anyone who wants to use this library is expected to already know how to use the Microsoft Azure Cognitive Search API. The official documentation is at: https://learn.microsoft.com/en-us/rest/api/searchservice/

Why?!

Azure Cognitive Search is crazy expensive for single developer projects, especially in their early stages, and provides a unique API with OData features that you cannot really find with competitors. It would be very difficult to start a project with a competitor and switch to Azure Cognitive Search later down the line. With an emulator like this one, you can build your whole application, all the way to RC, and switch to the official service when you feel ready to take the next step and grow your user base.

For larger projects, you might find it useful to run your entire front-end without any dependencies in some circumstances, such as public demos. This project has been designed with MirageJs in mind and will eventually offer a full API interceptor that is compatible with the official Azure Search javascript client library. That means one less service to spin up on your dev machine when testing, or demoing, your application.

What's next

  • Validation against queries made by the official client.
  • Support storing/loading the index to/from disk when running on the server.
  • Replace the naive search algorithm with a more in-depth analysis to build an actual index.
  • Shard data across multiple workers (multi-threaded queries).
  • Maybe more features?
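
The sharding item above could work roughly as follows: split the documents into shards, query each shard independently, then merge the partial results and keep the best matches. The sketch below runs the shards sequentially on one thread to stay short. `shard`, `searchShards`, and `Scored` are hypothetical names, and the scoring callback stands in for whatever the real search engine computes; none of this reflects the emulator's planned design.

```typescript
// Hypothetical sketch; names and scoring callback are assumptions.
type Scored<T> = { score: number; doc: T };

// Distribute documents round-robin across `shardCount` shards.
function shard<T>(docs: T[], shardCount: number): T[][] {
  const shards: T[][] = Array.from({ length: shardCount }, () => []);
  docs.forEach((doc, i) => shards[i % shardCount].push(doc));
  return shards;
}

// Query each shard on its own, then merge the partial top lists.
function searchShards<T>(
  shards: T[][],
  score: (doc: T) => number,
  top: number,
): Scored<T>[] {
  const byScoreDesc = (a: Scored<T>, b: Scored<T>) => b.score - a.score;

  // Each shard only needs to return its own best `top` matches.
  const partials = shards.map((docs) =>
    docs
      .map((doc) => ({ score: score(doc), doc }))
      .filter((result) => result.score > 0)
      .sort(byScoreDesc)
      .slice(0, top),
  );

  // Merging the per-shard lists and re-sorting yields the global top results.
  return ([] as Scored<T>[]).concat(...partials).sort(byScoreDesc).slice(0, top);
}
```

With workers, the per-shard step would run in parallel and only the small partial lists would cross the thread boundary.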

Benchmarks

describe SearchEngine
  bench large query
  => total: 2484.45ms | samples/runs: 89/100 | ops/sec: 40.25 ±3.53ms @ 3σ
     mean: 24.84ms | mode: 24.1ms | min: 21.03ms/21.49ms | max: 28.55ms/37.54ms
describe SuggestEngine
  bench large query
  => total: 1587.72ms | samples/runs: 95/100 | ops/sec: 62.98 ±4.65ms @ 3σ
     mean: 15.88ms | mode: 15ms | min: 12.29ms/12.29ms | max: 21.59ms/37.56ms
describe AutocompleteEngine
  bench large query
  => total: 1284.09ms | samples/runs: 97/100 | ops/sec: 77.88 ±2.63ms @ 3σ
     mean: 12.84ms | mode: 11ms | min: 10.13ms/10.13ms | max: 15.4ms/23.59ms
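
The reported means and ops/sec figures are consistent with dividing the total time by all 100 runs. The small sketch below reproduces that arithmetic; `summarize` and `round2` are hypothetical helpers, and the assumption that the benchmark tool derives its numbers this way is inferred from the output above, not taken from the tool's documentation.

```typescript
// Hypothetical helpers reproducing the arithmetic behind the figures above.
function summarize(totalMs: number, runs: number): { meanMs: number; opsPerSec: number } {
  return {
    meanMs: totalMs / runs, // average time per run
    opsPerSec: runs / (totalMs / 1000), // runs completed per second
  };
}

// Round to two decimals, matching the precision of the benchmark output.
const round2 = (n: number): number => Math.round(n * 100) / 100;

const searchStats = summarize(2484.45, 100);
const suggestStats = summarize(1587.72, 100);
const autocompleteStats = summarize(1284.09, 100);

console.log(round2(searchStats.meanMs), round2(searchStats.opsPerSec)); // 24.84 40.25
console.log(round2(suggestStats.meanMs), round2(suggestStats.opsPerSec)); // 15.88 62.98
console.log(round2(autocompleteStats.meanMs), round2(autocompleteStats.opsPerSec)); // 12.84 77.88
```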

Hardware

Garuda Linux
Kernel Version: 6.0.6-zen1-1-zen (64-bit)
Processors: 32 x Intel Xeon CPU E5-2687W v2 @3.4GHz (Single Threaded: @4.0 GHz) (Released in 2013)
Memory: 64 GB of DDR3-1866 ECC

While the code does not yet make use of workers, it is planned to shard the data across multiple workers to improve performance in the future, so the exact hardware configuration might be valuable.