@chcaa/text-search

v1.7.1

An API for using elastic search

Text Search

Text Search makes it easier to use Elasticsearch for the most common text search use cases.

Installation

  • npm install @chcaa/text-search
  • Install Elasticsearch >= 7.10 with the ICU plugin and the Ingest Attachment Processor plugin.
  • Create a search-settings.js file in the root folder of the project (where package.json is located). Use /settings/search-settings-template.js as a starting point: copy it, rename it, and fill in the required fields.
  • To view a demo, clone the git repository, start Elasticsearch, run /test/web/bin/www, and open a browser on localhost:3000.

config (search-settings.js)

  • dataDir: the path to the Elasticsearch data directory.
    • On Linux, create two subdirectories, "textdb-index-files" and "textdb-resources", and give both the app and Elasticsearch read and write access to them, including files and directories created later. One way to do this is to make the elasticsearch Linux user the owner and assign a group that the app belongs to. Remember to set the setgid bit on the directories so that files and directories created later inherit the group. (On Windows the directories are created automatically.)
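
The Linux directory setup described above could be scripted in Node. The following is a minimal sketch and not part of the package: prepareDataDir is a hypothetical helper, and the mode 0o2770 (read, write and execute for owner and group, plus setgid) is an assumption about what the access requirements translate to. Changing the owner to the elasticsearch user and assigning the shared group usually requires root and is left out.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// The two sub-directories named in the config notes above.
const SUB_DIRS = ['textdb-index-files', 'textdb-resources'];

// Hypothetical helper (not part of @chcaa/text-search): creates the
// sub-directories under dataDir and requests rwx for owner and group
// plus the setgid bit, so new entries inherit the directory's group.
function prepareDataDir(dataDir) {
    return SUB_DIRS.map((name) => {
        const dir = path.join(dataDir, name);
        fs.mkdirSync(dir, { recursive: true });
        fs.chmodSync(dir, 0o2770); // only has a limited effect on Windows
        return dir;
    });
}
```

Calling prepareDataDir with the same path that dataDir points to in search-settings.js returns the two directory paths it created or found.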

Usage

These are just a few simple examples to get started; see the code documentation for more examples and configuration options.

Search

const { Search } = require('@chcaa/text-search');

Search.initEsConnection(); // initializes the ES connection using the search-settings.js file (see above). This should only be done once per application.
  
let simpleSchema = await Search.createIndexIfNotExists('simple-test', {
    language: Search.language.ENGLISH,
    fields: [
        { name: 'title', type: 'text', sortable: true, boost: 5, indexed: true },
        { name: 'desc', type: 'text', sortable: true, boost: 1, indexed: true },
        { name: 'date', type: 'date', sortable: true, boost: 1, indexed: true },
        { name: 'author.title', type: 'text', sortable: true, boost: 1, indexed: true },
        { name: 'author.name', type: 'text', sortable: true, boost: 1, indexed: true },
        { name: 'author.address.zip', type: 'text', sortable: true, boost: 1, indexed: true },
        { name: 'author.address.street', type: 'text', sortable: true, boost: 1, indexed: true }
    ]
});

//await simpleSchema.dropIndex(); // drops the index but keeps resource files
//await Search.deleteIndex('simple-test'); // drops index and deletes resource files

await simpleSchema.setSynonyms([
    'aristocats, cartoons',
    'face hugger, facehugger, alien',
    'easter egg, groundhog day'
]);

await simpleSchema.index({ id: 1, body:{ title: 'Aliens', desc: '🙂', author: { title: 'Writer', name: 'john', address: { zip: '2000', street: 'Astreet' } } } });
await simpleSchema.index({ id: 2, body:{ title: 'Aristocats <3 😀', desc: '🙂', author: { title: 'Hero', name: 'Eric', address:{ zip: '3000', street: 'Bstreet'} } } });
await simpleSchema.index({ id: 3, permissions: { public: false, users: 'peter', groups: ['group1'] }, body: { title: 'Iill', desc: '🙂', author: { title: 'Bus driver', name: 'Joe', address: { zip: '4000', street: 'Cstreet' } } } });

let searchRes = await simpleSchema.find('alien', {
    pagination: { page: 1, maxResults: 3 },
    includeSource: false,
    authorization: {
        user: 'peter',
        groups: ['group1']
    },
    sorting: { field: 'date', order: 'desc' },
});

console.log(searchRes);
  • Elasticsearch can take a long time to start, which can cause problems after a reboot. To wait for Elasticsearch, use one of the following:
    await Search.waitForElasticSearchToComeOnline(timeoutMillis)
    await Search.waitForElasticSearchToComeOnlineAndBeReadyForSearch(timeoutMillis)
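
As a rough illustration of where the wait could sit in an application's startup, here is a sketch that connects, waits, and only then opens the index. openIndexWhenReady is a hypothetical helper. The call order (connect before waiting) and the behaviour when the timeout passes are assumptions, since neither is described here.

```javascript
// Startup sketch. `Search` is passed in as a parameter so the function can
// be tried with a stub; in an application it would come from
// require('@chcaa/text-search').
async function openIndexWhenReady(Search, indexName, schema, timeoutMillis) {
    Search.initEsConnection(); // once per application
    // Assumption: the returned promise settles once Elasticsearch is ready
    // for search or the timeout has passed.
    await Search.waitForElasticSearchToComeOnlineAndBeReadyForSearch(timeoutMillis);
    return Search.createIndexIfNotExists(indexName, schema);
}
```

Passing Search in as an argument keeps the sketch independent of a running Elasticsearch instance while trying it out.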

FileToText

Extract text from various file types using Tika through Elasticsearch.

const { FileToText } = require('@chcaa/text-search');

FileToText.initEsConnection(); // initializes the ES connection using the search-settings.js file (see installation and config notes). This should only be done once per application.

let ftt = new FileToText();

await ftt.init();
let response = await ftt.extractText([
    //'D:/temp/reddit/no-new-normal-2020/submissions/NoNewNormal.ndjson',
    'D:/temp/reddit/new_thread_gts6iv.zip',
    'D:/desktop/Vejledningsplan_MortenKüseler_Udfyldt.pdf',
    'D:/desktop/ETA.html',
    'D:/desktop/user-tokens.json',
]);
for (let text of response) {
    console.log(text.file, text.data);
}
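
One possible way to combine the two parts of the package is to index the extracted text. The sketch below is hypothetical glue code, not something the package provides: indexExtractedText, the field names 'file' and 'content', and the sequential ids are all illustrative, and it assumes that each result's data property holds the extracted text.

```javascript
// Hypothetical glue code: extracts text from files and indexes each result.
// `ftt` is an initialised FileToText instance and `schema` is the object
// returned by Search.createIndexIfNotExists(); both are passed in so the
// function can be tried with stubs.
// Assumptions: each result has `file` and `data` properties (as in the loop
// above) and the index defines the illustrative fields 'file' and 'content'.
async function indexExtractedText(ftt, schema, files) {
    const results = await ftt.extractText(files);
    let id = 1; // sequential ids for illustration; real ids should be stable
    for (const result of results) {
        await schema.index({ id, body: { file: result.file, content: result.data } });
        id += 1;
    }
    return id - 1; // number of documents indexed
}
```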

Maintenance

From time to time the index needs to be updated, for example when the mapping has to change or when Search has been updated to a version that requires changes to the index.

Search#reindex()

Reindexes the currently indexed data with a new schema definition. Use this when changes to "index-time" mappings (see the documentation for Search.createIndex) need to be made.

let initialSchema = {
    language: Search.language.ENGLISH,
    fields: [
        { name: 'title', type: 'text', sortable: true, boost: 5, indexed: true }
    ]
};
let search = await Search.createIndexIfNotExists('test-index', initialSchema); // we have done this in the past

let newSchema = { // we want the language to be Danish and to enable content assist, so we add both to the schema; a language change and content assist both require reindexing
    language: Search.language.DANISH, // this has changed
    fields: [
        { name: 'title', type: 'text', sortable: true, boost: 5, indexed: true, contentAssist: true } // contentAssist is added
    ]
};

await search.reindex(newSchema);

Changes to "query-time" mappings are applied automatically if changes are detected in Search.createIndexIfNotExists(), or can be applied manually by calling Search#updateQueryTimeFieldSettings().

Search.upgradeIndex()

After upgrading Search to a new version from npm, an index upgrade may be required to enable the latest features. Loading an index that needs upgrading throws an error saying that an upgrade is required, so an out-of-date index cannot be opened. Use Search.isUpgradeIndexRequired() to check whether an upgrade is required before loading the index.

To upgrade, run the code below and wait (this can take a long time if a re-index is required).

await Search.upgradeIndex('test-index');
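
The check and the upgrade could be combined into one guarded step. This is a sketch under assumptions: upgradeIfRequired is a hypothetical helper, and it assumes that Search.isUpgradeIndexRequired() accepts the index name and returns (or resolves to) a boolean, which is not stated here.

```javascript
// Hypothetical helper: upgrades the index only when the check says so.
// `Search` is passed in so the function can be tried with a stub; in an
// application it would come from require('@chcaa/text-search').
// Assumption: isUpgradeIndexRequired(indexName) yields a boolean, either
// directly or through a promise (await handles both cases).
async function upgradeIfRequired(Search, indexName) {
    if (await Search.isUpgradeIndexRequired(indexName)) {
        await Search.upgradeIndex(indexName); // may take a long time
        return true;
    }
    return false;
}
```

Running such a step before Search.createIndexIfNotExists() at startup would avoid the "upgrade required" error when opening the index.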