common-substrings
v3.0.1
A method for finding all common substrings, particularly quick for large string samples.
A library written in TypeScript for finding all common substrings across a set of strings, particularly quick for large string samples. It works in both browser and Node.js environments and has no dependencies.
Usage
Quickstart
The easiest way to start is:
import substrings from 'common-substrings';
const result = substrings(stringArray, {
  minOccurrence: 3,
  minLength: 5,
});
The result is an array of objects. Each element includes:
source: the indices of the input strings that contain this fragment
name: the fragment itself
weight: the product of the fragment length and the number of occurrences
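The fields above can be written down as a type. This is a sketch, not the package's published typings: the interface name `SubstringResult` and the helper `expectedWeight` are hypothetical, and the helper assumes that the "occurrence" in the weight is the number of input strings listed in `source`.

```typescript
// Hypothetical type name; the field names are the ones described in this README.
interface SubstringResult {
  name: string;     // the common fragment
  source: number[]; // indices of the input strings that contain the fragment
  weight: number;   // fragment length multiplied by the number of occurrences
}

// Assumption: the occurrence count equals the number of entries in `source`.
function expectedWeight(name: string, source: number[]): number {
  return name.length * source.length;
}

const javaFragment: SubstringResult = {
  name: 'java',
  source: [0, 1],
  weight: expectedWeight('java', [0, 1]),
};
```

Under that assumption, a four-character fragment found in two input strings gets a weight of 4 * 2 = 8.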
Example Result
Given the array ['java', 'javascript', 'pythonscript'] and the default options, the result array is:
[
  {name : 'java', source : [0,1], weight : 8},
  {name : 'script', source : [1,2], weight : 12}
]
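To make the meaning of this output concrete, here is a brute-force sketch that produces the same names and sources for the example input. It is not the library's trie-based algorithm and will be far slower on large inputs. The function name `naiveCommonSubstrings` is hypothetical, and the last step (dropping fragments that sit inside a longer fragment with the same sources) is an assumption inferred from the example output, not documented behaviour.

```typescript
interface Fragment {
  name: string;
  source: number[];
}

// Brute-force illustration only; NOT the library's implementation.
function naiveCommonSubstrings(
  strings: string[],
  minLength = 3,
  minOccurrence = 2,
): Fragment[] {
  // Map each substring of at least minLength to the input indices containing it.
  const seen: Record<string, number[]> = Object.create(null);
  strings.forEach((s, index) => {
    for (let start = 0; start + minLength <= s.length; start++) {
      for (let end = start + minLength; end <= s.length; end++) {
        const sub = s.slice(start, end);
        const indices = seen[sub] || (seen[sub] = []);
        // Inputs are visited in order, so a repeated index is always the last one.
        if (indices[indices.length - 1] !== index) indices.push(index);
      }
    }
  });

  // Keep substrings shared by at least minOccurrence inputs.
  const candidates: Fragment[] = Object.keys(seen)
    .filter((name) => seen[name].length >= minOccurrence)
    .map((name) => ({ name, source: seen[name] }));

  // Assumed rule: drop a fragment when a longer candidate with the same
  // sources already contains it (e.g. 'jav' inside 'java').
  return candidates.filter(
    (c) =>
      !candidates.some(
        (other) =>
          other.name.length > c.name.length &&
          other.name.indexOf(c.name) !== -1 &&
          other.source.join(',') === c.source.join(','),
      ),
  );
}

const fragments = naiveCommonSubstrings(['java', 'javascript', 'pythonscript']);
```

For the three example strings, `fragments` contains 'java' with source [0, 1] and 'script' with source [1, 2]. The quadratic number of substrings per input is what a trie-based approach avoids.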
The default options are:
minLength: 3
minOccurrence: 2
Results are collected while traversing the trie starting from its leaves, so the array is not sorted. Sorting it is straightforward with the lodash sortBy function, for example:
import _ from 'lodash';

// sortBy returns a new array in ascending order
const resultSortByWeight = _.sortBy(result, ['weight']);
const resultSortByLength = _.sortBy(result, substring => substring.name.length);
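If adding lodash is not desirable, the built-in `Array.prototype.sort` can do the same job. This is a minimal sketch, assuming each result element has the `name`, `source`, and `weight` fields described earlier. The type name `SortableFragment` and the sample data are made up for illustration and are not library output. lodash's `sortBy` sorts ascending, whereas the comparators below sort descending so the heaviest or longest fragments come first.

```typescript
// Hypothetical type mirroring the result fields described in this README.
type SortableFragment = { name: string; source: number[]; weight: number };

// Made-up sample data for illustration only.
const sample: SortableFragment[] = [
  { name: 'java', source: [0, 1], weight: 8 },
  { name: 'abc', source: [0, 1, 2], weight: 9 },
  { name: 'substring', source: [3, 4], weight: 18 },
];

// slice() copies the array first, because sort() works in place.
const byWeightDesc = sample.slice().sort((a, b) => b.weight - a.weight);
const byLengthDesc = sample.slice().sort((a, b) => b.name.length - a.name.length);
```

Copying before sorting leaves the original result order intact, which mirrors how `sortBy` returns a new array.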
Algorithm
Explanation here
Implementation in Other Languages
License
The algorithm code is released under the MIT License.