js-solr-highlighter
v0.8.8
A JavaScript library for highlighting HTML text based on the query in the lucene/solr query syntax
The library runs in the browser or in Node.js and is built on lucene and text-annotator.
The general highlighting process is:
- Derive which text to highlight from a query in the lucene syntax
- Highlight the derived text in the HTML
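The two steps above can be sketched as follows. This is a simplified illustration, not the library's implementation: `deriveTerms`, `escapeRegExp`, and `highlightTerms` are hypothetical names, the query is assumed to contain only bare words joined by AND/OR, and the content is assumed to be plain text without markup.

```javascript
// Hypothetical sketch of the two-step process; not the library's own code.

// Step 1: derive the words to highlight from the query.
// Assumption: the query holds only bare words and the AND/OR operators.
function deriveTerms(query) {
  return query.split(/\s+/).filter(function (token) {
    return token !== '' && token !== 'AND' && token !== 'OR'
  })
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Step 2: wrap every whole-word, case-insensitive match in a numbered span.
// Assumption: content is plain text, so no care is taken to skip HTML tags.
function highlightTerms(terms, content) {
  if (terms.length === 0) return content
  const pattern = new RegExp('\\b(' + terms.map(escapeRegExp).join('|') + ')\\b', 'gi')
  let index = 0
  return content.replace(pattern, function (match) {
    return '<span id="highlight-' + index++ + '" class="highlight">' + match + '</span>'
  })
}

const terms = deriveTerms('cancer AND blood')
const html = highlightTerms(terms, 'Breast Cancer: Blood Profiles')
console.log(html)
// 'Breast <span id="highlight-0" class="highlight">Cancer</span>: <span id="highlight-1" class="highlight">Blood</span> Profiles'
```

The word-boundary anchors in the pattern are what keep a query such as bloo from matching inside blood.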
An example from Europe PMC
js-solr-highlighter has been used to highlight the article titles in the search results of Europe PMC, an open science platform that enables access to a worldwide collection of life science publications. An example is https://europepmc.org/search?query=blood%20AND%20TITLE%3Acancer
Basic usage
No options
var query = 'cancer AND blood'
var content = 'Platelet Volume Is Reduced In Metastasing Breast Cancer: Blood Profiles Reveal Significant Shifts.'
var highlightedContent = highlightByQuery(query, content)
// 'Platelet Volume Is Reduced In Metastasing Breast <span id="highlight-0" class="highlight">Cancer</span>: <span id="highlight-1" class="highlight">Blood</span> Profiles Reveal Significant Shifts.'
With the validFields option, which specifies the fields that are valid in the query syntax. If it is not specified, every term of the form x:x is treated as a field
var query = 'TITLE:blood AND CONTENT:cell'
var content = 'A molecular map of lymph node blood vascular endothelium at single cell resolution'
var options = { validFields: ['TITLE'] }
var highlightedContent = highlightByQuery(query, content, options)
// 'A molecular map of lymph node <span id="highlight-0" class="highlight">blood</span> vascular endothelium at single cell resolution'
// "cell" will not be highlighted
With the highlightedFields option, which specifies the valid fields whose values will be highlighted. If it is not specified, the values of all valid fields will be highlighted
var query = 'TITLE:blood OR CONTENT:cell'
var content = 'A molecular map of lymph node blood vascular endothelium at single cell resolution'
var options = { validFields: ['TITLE', 'CONTENT'], highlightedFields: ['CONTENT'] }
var highlightedContent = highlightByQuery(query, content, options)
// 'A molecular map of lymph node blood vascular endothelium at single <span id="highlight-0" class="highlight">cell</span> resolution'
// "blood" will not be highlighted
Options
| Field | Type | Description |
| ---- | ---- | ---- |
| validFields | array | The fields that are parsed as fields. If undefined, every term of the form x:x is parsed as a field. |
| highlightedFields | array | The fields among validFields whose values will be highlighted. If undefined, the values of all valid fields will be highlighted. |
| highlightAll | boolean | Whether to highlight all occurrences of the text or only the first occurrence found. If undefined, it is true. |
| highlightIdPattern | string | The prefix shared by the IDs of the highlights in the HTML. A highlight ID consists of highlightIdPattern followed by the index of the highlight, such as "highlight-0", "highlight-1", and so on. If undefined, it is "highlight-". |
| highlightClass | string | The class name of every highlight in the HTML. If undefined, it is "highlight". |
| caseSensitive | boolean | Whether matching is case sensitive. If undefined, it is false (case is ignored). |
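For reference, the documented defaults can be written out as an explicit options object. This is only a sketch based on the table above: `defaultOptions` is a hypothetical name, and the class name `search-hit` is an illustrative value. The resulting object would be passed as the third argument, as in the usage examples.

```javascript
// Defaults taken from the options table, spelled out explicitly.
// validFields and highlightedFields are omitted, so they stay undefined.
const defaultOptions = {
  highlightAll: true,
  highlightIdPattern: 'highlight-',
  highlightClass: 'highlight',
  caseSensitive: false
}

// Override one option while keeping the other defaults.
// 'search-hit' is a made-up class name used for illustration.
const options = Object.assign({}, defaultOptions, { highlightClass: 'search-hit' })
```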
Highlighting rules
| Rule | Examples |
| ---- | ---- |
| If the query has only text and no fields, highlight each word in it. | If the query is `methylation test`, `methylation` and `test` will be highlighted if they appear in the content. |
| If the field is valid, highlight its value. | If the query is `TITLE:blood` and `TITLE` is a valid field, highlight `blood` if it appears in the content. |
| Do not highlight part of a word in the content. | If the query is `bloo` and the content has no such word but has the word `blood`, do not highlight `bloo` in `blood`. |
| Highlight both the text and the field values that the `AND` or `OR` operator takes. | If the query is `blood AND TITLE:cancer` and `TITLE` is a valid field, highlight both `blood` and `cancer` in the content if they exist. |
| Do not highlight the text or field value that the `NOT` operator takes. | If the query is `NOT blood AND cancer`, highlight `cancer` but not `blood`. |
| Highlight the text or field values within parentheses. | If the query is `(blood) AND (TITLE:cancer)` and `TITLE` is a valid field, both `blood` and `cancer` will be highlighted if they appear in the content. |
| Do not highlight Solr stop words. | If the query is `a theory-based study`, highlight the other words but not `a`. |
| If the text or the value of a valid field is within quotes, highlight the EXACT text/value (including stop words). | If the query is `"breast cancer"`, do not highlight `breast` or `cancer` when either appears on its own rather than as part of the exact phrase `breast cancer`. |
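The query-side rules in the table can be illustrated with a small term-derivation sketch. It is an approximation under stated assumptions, not the library's parser: `deriveTerms` and `STOP_WORDS` are hypothetical names, the stop-word list is a short sample rather than Solr's full list, `NOT` is assumed to apply only to the single term that follows it, and a term with an invalid field is simply skipped. The whole-word rule belongs to the matching step and is not covered here.

```javascript
// Assumption: a short sample of stop words; Solr's real list is longer.
const STOP_WORDS = ['a', 'an', 'and', 'in', 'of', 'the', 'to']

// Hypothetical helper: list the text to highlight for a query.
function deriveTerms(query, validFields) {
  // Tokens: quoted phrases (optionally field-prefixed), parentheses, other words.
  const tokens = query.match(/(?:\w+:)?"[^"]*"|[()]|[^\s()]+/g) || []
  const terms = []
  let negated = false

  tokens.forEach(function (token) {
    // Parentheses and AND/OR only group terms; both sides are highlighted.
    if (token === '(' || token === ')' || token === 'AND' || token === 'OR') return
    if (token === 'NOT') {
      negated = true
      return
    }
    // Skip the single term that follows NOT.
    const skip = negated
    negated = false
    if (skip) return

    let text = token
    const field = /^(\w+):(.+)$/.exec(token)
    if (field) {
      // Assumption: a term whose field is not valid is dropped.
      if (validFields && validFields.indexOf(field[1]) === -1) return
      text = field[2]
    }

    const phrase = /^"(.*)"$/.exec(text)
    if (phrase) {
      // Quoted text is kept as one exact phrase, stop words included.
      if (phrase[1] !== '') terms.push(phrase[1])
    } else if (STOP_WORDS.indexOf(text.toLowerCase()) === -1) {
      terms.push(text)
    }
  })
  return terms
}

console.log(deriveTerms('NOT blood AND cancer'))                    // [ 'cancer' ]
console.log(deriveTerms('(blood) AND (TITLE:cancer)', ['TITLE']))   // [ 'blood', 'cancer' ]
console.log(deriveTerms('a theory-based study'))                    // [ 'theory-based', 'study' ]
console.log(deriveTerms('"breast cancer"'))                         // [ 'breast cancer' ]
```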