@chcaa/strapi-text-search
v0.12.0
Integrate Strapi with Elastic Search
Strapi plugin text-search
Elastic Search for Strapi using the text-search API. Index and search collection-types with features such as highlighting, faceting, filters and a query language.
Installation
Requires strapi >= 4.1.12
In the root of the strapi project run:
npm install @chcaa/strapi-text-search
Configuration
In the root of the project add the files search-settings.js and search-settings-strapi.js with the following content (everything can also be included inside search-settings.js if preferred).
search-settings.js (see the text-search API for more details)
module.exports = {
  elasticSearch: {
    connection: {
      url: new URL('http://localhost:9200'), // required
    },
    maxRetries: 3,
    requestTimeout: 60000,
    paths: {
      dataDir: "PATH-TO-ES-DATA-DIR" // required
    },
    actions: {
      copyResourcesToDataDir: true // if set to false you must do this manually
    },
  },
  strapi: require('./search-settings-strapi')
};
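Connection details often differ between environments. The following is a minimal sketch of deriving the elasticSearch block from environment variables. The helper buildElasticSearchSettings and the variable names ES_URL and ES_DATA_DIR are assumptions made for illustration and are not defined by the plugin.

```javascript
// Hypothetical helper: builds the `elasticSearch` settings block from an env object.
// ES_URL and ES_DATA_DIR are illustrative names, not something the plugin reads itself.
function buildElasticSearchSettings(env) {
  if (!env.ES_DATA_DIR) {
    throw new Error('ES_DATA_DIR must be set (paths.dataDir is required)');
  }
  return {
    connection: {
      url: new URL(env.ES_URL ?? 'http://localhost:9200'), // required
    },
    maxRetries: 3,
    requestTimeout: 60000,
    paths: { dataDir: env.ES_DATA_DIR }, // required
    actions: { copyResourcesToDataDir: true },
  };
}

// In search-settings.js this could be wired in as:
//   module.exports = {
//     elasticSearch: buildElasticSearchSettings(process.env),
//     strapi: require('./search-settings-strapi')
//   };
```

Failing early on a missing data directory keeps a misconfigured deployment from starting with a half-valid configuration.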
search-settings-strapi.js
const { Search } = require('@chcaa/text-search');
module.exports = {
  projectName: process.env.PROJECT_NAME ?? 'strapi-text-search-dev', // required, should be unique as this is used as prefix for the indexes in ES
  deleteCollectionTypeIndexes: [],
  collectionTypeIndexes: [ // required
    {
      collectionTypeName: '', // required
      fileToTextFields: [],
      includeExtraFields: [],
      previewAuthorizedUserRoles: [],
      optimizeRelationsInIndex: true,
      forceDropAndReindex: false, // force a reindex of everything, remember to disable again after use
      defaultQueryOptions: {
        find: {},
        findOne: {}
      },
      beforeIndex: (entry) => {
        // make changes to the entry here
      },
      schema: { // define the index schema for this collectionType. See https://www.npmjs.com/package/@chcaa/text-search for details
        language: Search.language.ENGLISH,
        fields: [ // the fields (attributes and relations) to index
          // {name: 'title', type: Search.fieldType.TEXT, sortable: true, highlightFragmentCount: 0, boost: 5, similarity: Search.fieldSimilarity.LENGTH_SMALL_IMPACT},
        ]
      }
    }
  ]
};
Configuration Details
projectName: string [required] - the name of this project. All indexes declared in collectionTypeIndexes will be prefixed with this name. When several projects share the same ES installation, the name should be unique for the project.
deleteCollectionTypeIndexes: string[] - an array of indexes to delete. Use this to remove unused indexes.
collectionTypeIndexes: object[] [required] - mapping of each Strapi collection-type which should be indexed for search.
  collectionTypeName: string [required] - the full name of the Strapi collection-type, e.g. api::movie.movie.
  fileToTextFields: string[] - an array of Strapi field names of the type media which should have the content of the file indexed. Each field name must have a corresponding field in schema.fields named [FIELD_NAME].content. So if the field manuscriptFile is added to this array, there should be a manuscriptFile.content field mapping in schema.fields.
  includeExtraFields: string[] - an array of Strapi field names which should be included in the source object added to the index but which should not be searchable (not mapped in schema.fields).
  previewAuthorizedUserRoles: string[] - an array of Strapi user roles which should be able to search and see entries which are in preview mode. For everyone to have access, add the Public role.
  optimizeRelationsInIndex: boolean - to improve performance during editing of relational data, the ids of the related data are added to the index by default. Set this to false to disable this feature (not recommended).
  forceDropAndReindex: boolean - force reindexing of all data for this index. Remember to disable again after use.
  defaultQueryOptions: object - default options to merge with the options from the client API. Options set in the client API will overwrite options defined here.
    find: object - default options for find (used by the end-point /../entries/query)
    findOne: object - default options for findOne (used by the end-point /../entries/query/:id)
  beforeIndex: function(entry) - a function which is called before the entry is indexed in Elastic Search. Changes to the entry can be made here. Return null to delete the entry from the index (in all cases the database entry remains unchanged).
  schema: object - the schema definition for the fields to index. See text-search for details.
    language: string - the language to use when indexing entries. This is used for stemming etc.
    fields: object[] - the fields to index. The field names should be present in the collection-type being indexed, either as an attribute of the collection-type or as a relation (except for fileToTextFields, where only the prefix should be the name of a media attribute; see fileToTextFields above).
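The beforeIndex hook described above can be sketched as follows. The attributes title and internalOnly are hypothetical, and the sketch assumes that changes are made directly on the entry object that is passed in, which is how the configuration template suggests the hook is meant to be used.

```javascript
// Sketch of a beforeIndex hook. `title` and `internalOnly` are hypothetical
// attributes of the collection-type, used only for illustration.
function beforeIndex(entry) {
  // Returning null removes the entry from the index; the database entry is untouched.
  if (entry.internalOnly === true) {
    return null;
  }
  // Assumption: changes are applied in place on the entry that is about to be indexed.
  if (typeof entry.title === 'string') {
    entry.title = entry.title.trim();
  }
}
```

The function can then be referenced from the collection-type mapping as beforeIndex: beforeIndex.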
A full example of a very simple movie index could look like this:
const { Search } = require('@chcaa/text-search');
module.exports = {
  projectName: process.env.PROJECT_NAME ?? 'strapi-text-search-dev',
  deleteCollectionTypeIndexes: [],
  collectionTypeIndexes: [
    {
      collectionTypeName: 'api::movie.movie',
      fileToTextFields: ['manuscriptFile'],
      includeExtraFields: ['notes'],
      previewAuthorizedUserRoles: ['Author'],
      optimizeRelationsInIndex: true,
      forceDropAndReindex: false,
      defaultQueryOptions: {
        find: {},
        findOne: {}
      },
      beforeIndex: (entry) => undefined,
      schema: {
        language: Search.language.ENGLISH,
        fields: [
          { name: 'title', type: Search.fieldType.TEXT, sortable: true, highlightFragmentCount: 0, boost: 5, similarity: Search.fieldSimilarity.LENGTH_SMALL_IMPACT },
          { name: 'budget', type: Search.fieldType.INTEGER, sortable: true, highlightFragmentCount: 0, generateFacets: { min: 10_000_000, max: 90_000_000, bucketCount: 8, includeOutOfRange: true } },
          { name: 'runtime', type: Search.fieldType.INTEGER, sortable: true, highlightFragmentCount: 0 },
          { name: 'manuscriptFile.content', type: Search.fieldType.TEXT, highlightFragmentCount: 3 },
        ]
      }
    }
  ]
};
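The rule that every entry in fileToTextFields needs a matching [FIELD_NAME].content mapping in schema.fields is easy to get wrong when a configuration grows. A small check like the one below could be run against a mapping before starting Strapi. The helper findMissingContentFields and the field posterFile are hypothetical, and the field objects omit type to keep the sketch free of dependencies.

```javascript
// Returns the fileToTextFields entries that have no "[FIELD_NAME].content"
// mapping in schema.fields of the given collection-type mapping.
function findMissingContentFields(collectionTypeIndex) {
  const fields = collectionTypeIndex.schema?.fields ?? [];
  const mappedNames = new Set(fields.map((field) => field.name));
  return (collectionTypeIndex.fileToTextFields ?? [])
    .filter((fieldName) => !mappedNames.has(`${fieldName}.content`));
}

const movieIndex = {
  collectionTypeName: 'api::movie.movie',
  fileToTextFields: ['manuscriptFile', 'posterFile'], // posterFile is hypothetical
  schema: {
    fields: [{ name: 'title' }, { name: 'manuscriptFile.content' }]
  }
};

// Only posterFile lacks a ".content" mapping.
console.log(findMissingContentFields(movieIndex)); // [ 'posterFile' ]
```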
Indexing Data
When configured, strapi-text-search automatically indexes on create, update and delete actions performed on entries of the mapped content-types as well as mapped relations. If data exists from before strapi-text-search was configured, or if it has been disabled, use the forceDropAndReindex flag in the settings to get everything in sync (remember to disable forceDropAndReindex afterwards).
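Because the flag has to be switched off again after use, one option is to drive it from an environment variable that is only set for the run that should reindex. This is a sketch of that idea; envFlag and the variable name FORCE_REINDEX are assumptions and not part of the plugin.

```javascript
// Hypothetical helper: interprets an environment variable value as a boolean flag.
function envFlag(value) {
  return ['1', 'true', 'yes'].includes(String(value ?? '').trim().toLowerCase());
}

// In search-settings-strapi.js this could be used as:
//   forceDropAndReindex: envFlag(process.env.FORCE_REINDEX),
```

With this in place, the flag falls back to false whenever the variable is not set.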
Endpoints
Each collection-type will have the following end-points available using the singular version of the collection-type name (remember to set permissions for each end-point in the Strapi admin panel to make them available for the intended client users: settings -> roles -> [ROLE] -> Text-search).
GET /api/text-search/[NAME]/entries/:id
Fetch the source object of the entry with the given id.
Example Request
GET /api/text-search/movie/entries/100004
Example Response
{
"data": {
"id": "100004",
"title": "War and Peace",
"budget": 72000000,
"runtime": 98,
"manuscriptFile": {
"content": "..."
}
}
}
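A client call to this end-point could look roughly like the sketch below. The helper getEntry, the baseUrl parameter and the injectable fetchImpl are assumptions for illustration. The fetch implementation is passed in so the function can be tried without a running Strapi instance.

```javascript
// Sketch of a client helper for GET /api/text-search/[NAME]/entries/:id.
// `baseUrl` and `fetchImpl` are illustrative parameters, not part of the plugin.
async function getEntry(baseUrl, name, id, fetchImpl = fetch) {
  const url = `${baseUrl}/api/text-search/${encodeURIComponent(name)}/entries/${encodeURIComponent(id)}`;
  const response = await fetchImpl(url, { method: 'GET' });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const body = await response.json();
  return body.data; // the source object of the entry
}
```

If the end-point is not enabled for the role making the request, the call is expected to fail, so the permissions mentioned above need to be set first.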
POST /api/text-search/[NAME]/entries/query
Search the index. For all possible options and query language see text-search.
Warning: The options authorization and queryMode cannot be set using the client API due to security restrictions.
Example Request
POST /api/text-search/movie/entries/query
{
"query": "war and peace",
"options": {
"pagination": {
"page": 1,
"maxResults": 10
},
"highlight": {
"source": true
},
"filters": [{
"fieldName": "budget",
"operator": "should",
"range": [{ "from": "100000", "to": "" }]
}],
"facets": {
"filters": [{
"fieldName": "language.name",
"value": ["en"]
}]
}
}
}
Example Response
{
"data": {
"status": "success",
"query": {
"queryString": "war and peace",
"queryMode": "standardWithFields",
"tieBreaker": 0,
"warnings": []
},
"pagination": {
"page": 1,
"maxResults": 10,
"total": {
"value": 661,
"relation": "eq"
},
"nextSearchAfter": [
175.80144,
"100595"
]
},
"results": [
{
"id": "100004",
"score": 175.80144,
"highlight": {
"title": {
"value": [
"<mark class=\"doc-significant-term\">War and Peace</mark>"
],
"hasHighlight": true,
"isFullValue": true,
"isOriginArray": false
},
"_source": {
"id": "100004",
"title": "<mark class=\"doc-significant-term\">War and Peace</mark>",
"budget": 72000000,
"runtime": 98,
"manuscriptFile": {
"content": "..."
}
}
}
}
]
}
}
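To illustrate the request and response shapes above, here is a small sketch that builds a request body and reduces a response to the values a result list typically needs. The helpers buildQueryBody and summarizeResults are hypothetical and only rely on the properties shown in the examples.

```javascript
// Builds a request body for POST /api/text-search/[NAME]/entries/query
// using only options that appear in the example request.
function buildQueryBody(query, page = 1, maxResults = 10) {
  return {
    query,
    options: {
      pagination: { page, maxResults },
      highlight: { source: true }
    }
  };
}

// Picks id, score and the first highlighted title fragment from a response
// shaped like the example response. Missing highlights become null.
function summarizeResults(responseBody) {
  return responseBody.data.results.map((result) => ({
    id: result.id,
    score: result.score,
    title: result.highlight?.title?.value?.[0] ?? null
  }));
}
```

The highlighted values contain mark elements, so they need to be rendered as HTML rather than plain text if the markup should be visible as highlighting.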
POST /api/text-search/[NAME]/entries/query/:id
Fetch the entry with the specific id. For all possible options and query language see text-search.
Warning: The options authorization and queryMode cannot be set using the client API due to security restrictions.
Example Request
POST /api/text-search/movie/entries/query/100004
{
"query": "war and peace",
"options": {
"highlight": {
"source": true
}
}
}
Example Response
{
"data": {
"status": "success",
"query": {
"queryString": "war and peace",
"queryMode": "standardWithFields",
"tieBreaker": 0,
"warnings": []
},
"result": {
"id": "100004",
"score": 175.80144,
"highlight": {
"title": {
"value": ["<mark class=\"doc-significant-term\">War and Peace</mark>"],
"hasHighlight": true,
"isFullValue": true,
"isOriginArray": false
},
"_source": {
"id": "100004",
"title": "<mark class=\"doc-significant-term\">War and Peace</mark>",
"budget": 72000000,
"runtime": 98,
"manuscriptFile": {
"content": "..."
}
}
}
}
}
}
GET /api/text-search/[NAME]/entries/query/validate
Validate the query string. The response will contain warnings about incorrect syntax which will be escaped during search.
Example Request
GET /api/text-search/movie/entries/query/validate?query=war%20and%20peace~300
Example Response
{
"data": {
"status": "warning",
"queryString": "war and peace~300",
"warnings": [{
"level": "warning",
"type": "operatorTildeIncorrectPositionOrSyntax",
"index": 9,
"endIndex": 11,
"span": 2
}]
}
}
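Since the query string is passed as a URL parameter, it has to be encoded. The sketch below builds the request path and turns the warnings of a response into short messages. Both helper names are hypothetical, and the message format is only an illustration.

```javascript
// Builds the path for GET /api/text-search/[NAME]/entries/query/validate.
function buildValidateUrl(name, query) {
  return `/api/text-search/${encodeURIComponent(name)}/entries/query/validate?query=${encodeURIComponent(query)}`;
}

// Turns the warnings of a validate response into one message per warning.
function describeWarnings(responseBody) {
  return responseBody.data.warnings.map(
    (warning) => `${warning.level}: ${warning.type} (index ${warning.index} to ${warning.endIndex})`
  );
}

console.log(buildValidateUrl('movie', 'war and peace~300'));
// /api/text-search/movie/entries/query/validate?query=war%20and%20peace~300
```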
POST /api/text-search/[NAME]/entries/query/explain
Explain the query. Gives insight into how the different parts of the query are scored and can be used for e.g. fine-tuning boosts for each field. The same query as used in /../entries/query can be passed in.
Warning: This end-point exposes implementation detail about authorization filtering and the full source-object (if requested), without filtering out internal fields. This information is in itself not harmful, but is probably not suited for the public.
Example Request
POST /api/text-search/movie/entries/query/explain
{
"maxDepth": 5, // the maximum depth of the explain result, how many details should be included
"query": "war and peace",
"options": {
"pagination": { "page": 1, "maxResults": 10 },
"filters": [{
"fieldName": "budget",
"operator": "should",
"range": [{ "from": "100000", "to": "" }]
}],
"facets": {
"filters": [{
"fieldName": "language.name",
"value": ["en"]
}]
}
}
}
Example Response
{
"data": {
"results": [{
"_shard": "[movie-docs-3][0]",
"_node": "6ZsvwLi8SsigtNS0HmtwZw",
"_index": "movie-docs-3",
"_type": "_doc",
"_id": "100004",
"_version": 1,
"_score": 266.90988,
"_source": {
"id": "100004",
"title": "War and Peace",
"budget": 72000000,
"runtime": 98,
"manuscriptFile": {
"content": "..."
},
"_permissions": {
"public": true,
"users": [],
"groups": []
}
},
"sort": [197.6724, "100004"],
"_explanation": "197.6724, sum of:\n 89.26548, sum of:\n 89.26548, weight(title.exact:war in 592)..."
}],
"query": {
"queryString": "war and peace",
"queryStringExpanded": "+(((title.folded:war | title:war | (title.exact:war)^5.0 ..."
}
}
}
The _explanation property can easily be presented inside a <pre> tag, which preserves its line breaks and indentation.
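A minimal sketch of preparing the _explanation string for a <pre> element is shown below. The helpers escapeHtml and explanationToHtml are hypothetical. Escaping is included because the string is inserted into HTML.

```javascript
// Escapes the characters that are significant in HTML text content.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wraps an _explanation string in a <pre> element so that "\n" and
// leading spaces are rendered as line breaks and indentation.
function explanationToHtml(explanation) {
  return `<pre>${escapeHtml(explanation)}</pre>`;
}
```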
Strapi Service Direct Access
The service for each text-search collection-type can be accessed directly, bypassing the router and controller. This can be useful for internal use in Strapi where going through an end-point is not desired. To get the service in e.g. bootstrap.js (or any other place where you have access to the strapi object), do the following.
let movieTextSearchService = strapi.plugin('text-search').service('movie'); // the name of the service is the singular name of the collection-type
let searchResult = await movieTextSearchService.find('War and Peace', { highlight: { source: true }});
Methods
async find(queryString, options) - search the index
async findOne(id, highlightQueryString, options)
async get(id, authorization)
validateQuery(queryString)
async explainQuery(queryString, options, [maxDepth=5])
getFieldsMetaData()
async listSynonyms():string[] - get the currently saved synonyms array
async updateSynonyms(synonyms) - synonyms should be an array of strings where each string is a comma-separated list of synonyms, e.g. ["tall, high", "fast, speedy"]
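The synonym methods can be combined as in the sketch below, which adds new groups to the ones already saved. It assumes that updateSynonyms replaces the saved array as a whole, which is why the existing entries are read first. That behaviour is an assumption, and the helpers toSynonymStrings and addSynonyms are hypothetical.

```javascript
// Converts groups of synonyms into the comma-separated strings described above,
// e.g. [['tall', 'high']] becomes ['tall, high'].
function toSynonymStrings(groups) {
  return groups.map((group) => group.join(', '));
}

// Adds new synonym groups to the saved ones, skipping exact duplicates.
// `service` is a text-search service, e.g. strapi.plugin('text-search').service('movie').
// Assumption: updateSynonyms replaces the saved array as a whole.
async function addSynonyms(service, groups) {
  const existing = await service.listSynonyms();
  const merged = [...new Set([...existing, ...toSynonymStrings(groups)])];
  await service.updateSynonyms(merged);
  return merged;
}
```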