hunch

v0.15.0

Published

6 months ago

Compiled search for your static Markdown files.

saibotsivad

hunch search 11ty lambda

🔎 Hunch

Quick links to the docs: all docs, configuration, query params, results types, indexing examples, using examples.

Hunch supports these search features:

Full text lookup
Exact phrase matching
Fuzzy search
Include matched words for highlighting
Return only partial snippet
Search specific fields
Return specific fields
Prefix search
Search suggestions
Boosting metadata properties
Ranking
Facet limiting
Facet matching
Pagination
Stop-words
Sort by alternate strategy

Hunch compiles a search index to store as a JSON file, which you load and use wherever you want to perform search: AWS Lambda, Cloudflare Worker, even directly in the browser!

Install

The usual way:

npm install hunch

Generate the index

Use it as a CLI tool:

hunch
# shorthand for
hunch --config hunch.config.js

Or use it in code:

import { generate } from 'hunch'
const index = await generate({
  input: './site',
  // other options
})

Query the index

You'll need to load the index from the generated JSON file. In environments with disk access, that could be as simple as:

import { readFile } from 'node:fs/promises'
const index = JSON.parse(await readFile('./dist/hunch.json', 'utf8'))
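// or, in runtimes that support JSON modules with import attributes, use this
// instead of the two lines above (older runtimes used `assert` in place of `with`):
// import index from './dist/hunch.json' with { type: 'json' }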

Then you create a search instance using Hunch, and query it:

import { hunch } from 'hunch'
const search = hunch({ index })
const results = search({ q: 'we get signal' })
/*
results = {
  items: [ ... ],
  page: { ... },
  facets: { ... },
}
*/

Overview

Many modern websites are backed by static Markdown files with some YAML-like metadata at the top, e.g. this file 2022-12-29/cats-and-dogs.md:

---
title: About Cats & Dogs
summary: Where I talk about pets.
published: 2022-12-29
tags: [ cats, dogs ]
series: Animals
---

Fancy words about cats and dogs.
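To make the split between metadata and content concrete, here is a minimal sketch of separating the two. The `splitFrontMatter` helper is hypothetical and is not Hunch's parser: it assumes only flat `key: value` lines and inline `[ a, b ]` lists, which is all this example file needs.

```javascript
const file = [
  '---',
  'title: About Cats & Dogs',
  'summary: Where I talk about pets.',
  'published: 2022-12-29',
  'tags: [ cats, dogs ]',
  'series: Animals',
  '---',
  '',
  'Fancy words about cats and dogs.',
].join('\n')

// Hypothetical helper, not Hunch's parser: handles only flat `key: value`
// lines and inline `[ a, b ]` lists. Every value is kept as a string.
function splitFrontMatter (text) {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text)
  if (!match) return { metadata: {}, content: text }
  const metadata = {}
  for (const line of match[1].split('\n')) {
    const colon = line.indexOf(':')
    if (colon === -1) continue
    const key = line.slice(0, colon).trim()
    const raw = line.slice(colon + 1).trim()
    metadata[key] = raw.startsWith('[') && raw.endsWith(']')
      ? raw.slice(1, -1).split(',').map(item => item.trim()).filter(Boolean)
      : raw
  }
  return { metadata, content: match[2].trim() }
}

const { metadata, content } = splitFrontMatter(file)
console.log(metadata, content)
```

Real front matter can be much richer than this (nested objects, multi-line strings), so a proper YAML parser is the safer choice outside of a sketch.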

As part of your deployment step, you would use Hunch to generate a pre-computed search index as a JSON file:

hunch --config hunch.config.js

A simple configuration file would specify the content folder (where the Markdown files are), the output filepath to write the JSON file, and other configuration details:

// hunch.config.js
export default {
  // Define the folder to scan.
  input: './site',
  // Define where to write the index file.
  output: './dist/hunch.json',
  // Property names of metadata to treat as "collections", like "tags" or "authors".
  facets: {
    // If it's just a flat string there's nothing to configure.
    series: true,
    // If it's more, like an array, you'll need to specify how Hunch
    // should treat the values. (See documentation for more details.)
    tags: {
      type: 'array',
    }
  },
  // All the facet fields are searchable by default, but you need
  // to specify additional searchable fields.
  searchableFields: [
    'title',
    'summary',
  ],
  // Fields that are not searchable that you want available for access
  // need to be specified. These fields are stored in the index JSON, but
  // not used by Hunch.
  storedFields: [
    'published',
  ],
}

To make a search using this index, you would create a Hunch instance with the index, and then query it:

// Load the generated JSON file in one way or another:
import { readFile } from 'node:fs/promises'
const index = JSON.parse(await readFile('./dist/hunch.json', 'utf8'))

// Create an instance of Hunch using that data:
import { hunch } from 'hunch'
const search = hunch({ index })

// Then query it:
const results = search({
  q: 'fancy words',
  facetMustMatch: { tags: [ 'cats' ] },
  facetMustNotMatch: { tags: [ 'rabbits' ] },
})
/*
results = {
  items: [
    {
      title: 'About Cats & Dogs',
      tags: [ 'cats', 'dogs' ],
      summary: 'Where I talk about pets.',
      published: '2022-12-29',
      series: 'Animals',
      _id: '2022-12-29/cats-and-dogs.md',
      _content: 'Fancy words about cats and dogs.',
    }
  ],
  page: {
    number: 0,
    size: 1,
    total: 1,
  },
  facets: {
    series: {
      Animals: {
        all: 3,
        search: 1,
      },
    },
    tags: {
      cats: {
        all: 5,
        search: 1,
      },
      dogs: {
        all: 3,
        search: 1,
      },
    },
  },
}
*/
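As a rough illustration of consuming that result shape, the sketch below turns one facet into display rows for a filter sidebar. The `facetRows` helper is hypothetical, and it assumes that `all` is the count across the whole index and `search` is the count within the current results, which the example output suggests but this section does not state.

```javascript
// Sample data copied from the example output above (items trimmed).
const results = {
  items: [ { _id: '2022-12-29/cats-and-dogs.md', title: 'About Cats & Dogs' } ],
  page: { number: 0, size: 1, total: 1 },
  facets: {
    series: { Animals: { all: 3, search: 1 } },
    tags: { cats: { all: 5, search: 1 }, dogs: { all: 3, search: 1 } },
  },
}

// Hypothetical helper: flatten one facet into rows sorted by value.
// Returns an empty list when the facet name is not present.
const facetRows = (facets, name) =>
  Object.entries(facets[name] || {})
    .map(([ value, counts ]) => ({
      value,
      label: `${value} (${counts.search} of ${counts.all})`,
    }))
    .sort((a, b) => a.value.localeCompare(b.value))

const tagRows = facetRows(results.facets, 'tags')
console.log(tagRows.map(row => row.label))
```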

URL Query

If you are using Hunch as an API with a URL query parameter interface, such as AWS Lambda, Cloudflare Worker, or even the browser, you can easily transform those query parameters into a Hunch query object:

// from the main export
import { fromQuery } from 'hunch'
// or, alternatively, from the named export:
// import { fromQuery } from 'hunch/from-query'
const query = fromQuery({
  q: 'fancy words',
  'facet[tags]': 'cats,-rabbits',
})
/*
query = {
  q: 'fancy words',
  facetMustMatch: { tags: [ 'cats' ] },
  facetMustNotMatch: { tags: [ 'rabbits' ] },
}
*/
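For an API-style deployment, the flat parameter object usually comes from a request URL. The sketch below shows one way that wiring might look. `createHandler` is a hypothetical helper, the response shape is a generic assumption rather than any platform's required format, and `fromQuery` and `search` are injected so the example runs with stand-ins instead of a real index.

```javascript
// Hypothetical factory: `fromQuery` and `search` are passed in, so this
// sketch does not depend on how the index is loaded.
function createHandler ({ fromQuery, search }) {
  return function handle (requestUrl) {
    // URLSearchParams decodes percent-encoding and treats "+" as a space.
    const params = Object.fromEntries(new URL(requestUrl).searchParams)
    const results = search(fromQuery(params))
    return {
      statusCode: 200,
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(results),
    }
  }
}

// Stand-ins for illustration only. In real use these would be Hunch's
// `fromQuery` and the function returned by `hunch({ index })`.
const seen = []
const handle = createHandler({
  fromQuery: params => { seen.push(params); return params },
  search: () => ({ items: [], page: {}, facets: {} }),
})

const response = handle('https://example.com/search?q=fancy+words&facet%5Btags%5D=cats,-rabbits')
console.log(seen[0], response.statusCode)
```

Note that `Object.fromEntries` keeps only the last value when a parameter name repeats, so a URL with duplicate keys would need different handling.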

Additional Notes

Behind the scenes this library uses MiniSearch for text searching, so look at that documentation if you need anything more esoteric.

⚠️ The output JSON file is an amalgamation of a MiniSearch index and other settings, optimized to save space. There is no guarantee as to the output structure or contents between Hunch versions: you must compile with the same version that you search with!

Some things left to do:

[ ] Stemming (undecided if I'll support this...)

License

Published and released under the Very Open License.
