sift-distance
v4.0.0
SIFT distance algorithm
SIFT 4
Install via npm
$ npm install sift-distance
NOTE: The major version of this module tracks the algorithm's version.
So, if you want to use SIFT 3, for example, you'd install [email protected], for version 3B of the SIFT algorithm [email protected], for version 4 [email protected], and so on.
About
This implements the SIFT4 extended version.
API
SIFT( a, b, [options] ) -> Number
- String|Buffer|Array a
- String|Buffer|Array b
- Object options
  - Number maxOffset
  - Number maxDistance
  - Function tokenizer
  - Function tokenMatcher
  - Function matchEvaluator
  - Function lengthEvaluator
  - Function transpositionEvaluator
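As a rough illustration of the call shape above, the sketch below ranks candidate values by their distance to a query. `rankByDistance` is a hypothetical helper, not part of this module. It takes the distance function as an argument, on the assumption that `require('sift-distance')` exports the `SIFT( a, b, [options] )` function directly.

```javascript
// Hypothetical helper (not part of sift-distance): sorts candidates by
// ascending distance to `query`, using any function that has the
// ( a, b, [options] ) -> Number shape documented above.
function rankByDistance(distance, query, candidates, options) {
  return candidates
    .map((candidate) => ({
      candidate,
      distance: distance(query, candidate, options),
    }))
    .sort((x, y) => x.distance - y.distance);
}

// Usage, assuming the module exports the SIFT function itself:
//   const SIFT = require('sift-distance');
//   rankByDistance(SIFT, 'kitten', ['sitting', 'mitten'], { maxOffset: 5 });
```

Passing the distance function in keeps the helper independent of how the module is loaded, so any compatible distance function can be swapped in.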
Options
Number maxOffset
The maximum offset at which largest common substrings are matched against one another. Defaults to 5.
Number maxDistance
Distance at which the algorithm should stop computing the value and just exit (the values are too different anyway).
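To make the roles of `maxOffset` and `maxDistance` more concrete, here is a heavily simplified sketch in the spirit of SIFT. It is not this module's implementation: it ignores transpositions and all evaluator callbacks, and the name `simpleSift`, its parameter defaults, and the convention that `0` disables `maxDistance` are assumptions made for illustration.

```javascript
// Simplified SIFT-style distance (illustrative only, no transpositions).
// maxOffset: how far ahead to look for a matching token after a mismatch.
// maxDistance: stop early once the partial distance reaches it (0 = off).
function simpleSift(a, b, maxOffset = 5, maxDistance = 0) {
  if (!a.length) return b.length;
  if (!b.length) return a.length;
  let c1 = 0; // cursor in a
  let c2 = 0; // cursor in b
  let lcss = 0; // accumulated length of common runs
  let localCs = 0; // length of the current common run
  while (c1 < a.length && c2 < b.length) {
    if (a[c1] === b[c2]) {
      localCs++;
    } else {
      lcss += localCs;
      localCs = 0;
      if (c1 !== c2) c1 = c2 = Math.max(c1, c2);
      // Search up to maxOffset positions ahead in either input.
      for (let i = 0; i < maxOffset && (c1 + i < a.length || c2 + i < b.length); i++) {
        if (c1 + i < a.length && a[c1 + i] === b[c2]) { c1 += i; localCs++; break; }
        if (c2 + i < b.length && a[c1] === b[c2 + i]) { c2 += i; localCs++; break; }
      }
    }
    c1++;
    c2++;
    if (maxDistance > 0) {
      const partial = Math.max(c1, c2) - lcss - localCs;
      if (partial >= maxDistance) return partial; // too different, give up
    }
  }
  lcss += localCs;
  return Math.max(a.length, b.length) - lcss;
}
```

In this sketch a small `maxOffset` misses matches that are shifted further apart, while `maxDistance` trades an exact result for an early exit on clearly dissimilar inputs.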
Function tokenizer( value ) -> String|Array|Buffer
- Mixed value
Function to transform strings into vectors of tokens.
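Two possible tokenizers are sketched below, one splitting on whitespace and one producing character bigrams. Both names are hypothetical, and the sketch assumes the tokenizer is called once per input value and may return an array of tokens.

```javascript
// Hypothetical tokenizer: lowercased words, split on runs of whitespace.
function wordTokenizer(value) {
  return String(value).toLowerCase().split(/\s+/).filter(Boolean);
}

// Hypothetical tokenizer: overlapping two-character tokens (bigrams).
function bigramTokenizer(value) {
  const text = String(value);
  const tokens = [];
  for (let i = 0; i < text.length - 1; i++) {
    tokens.push(text.slice(i, i + 2));
  }
  return tokens;
}

// Usage, assuming SIFT is the function exported by this module:
//   SIFT('the quick fox', 'the fox quick', { tokenizer: wordTokenizer });
```

With a word tokenizer the distance counts differing words rather than differing characters, which may suit longer texts better.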
Function tokenMatcher( token1, token2 ) -> Boolean
- Mixed token1
- Mixed token2
Function to determine whether two tokens match (are considered equal).
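As one possible matcher, the sketch below treats tokens as equal when they differ only by letter case or by combining diacritics. `looseMatcher` is a hypothetical name, and the sketch assumes tokens are strings or can be converted to strings.

```javascript
// Hypothetical tokenMatcher: ignores letter case and combining diacritics.
function looseMatcher(token1, token2) {
  const normalize = (token) =>
    String(token)
      .normalize('NFD') // split accented letters into base + combining mark
      .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
      .toLowerCase();
  return normalize(token1) === normalize(token2);
}

// Usage, assuming SIFT is the function exported by this module:
//   SIFT('Résumé', 'resume', { tokenMatcher: looseMatcher });
```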
Function matchEvaluator( token1, token2 ) -> Number
- Mixed token1
- Mixed token2
Function to determine the way a token match should be added to the lcs (largest common substring). For example, a fuzzy match could be implemented.
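A fuzzy match of the kind mentioned above might look like the following sketch. It assumes that an exact match contributes 1 to the lcs and that smaller values express weaker matches; the README does not state the expected scale, so treat the numbers as placeholders. `fuzzyMatchEvaluator` is a hypothetical name.

```javascript
// Hypothetical matchEvaluator: full credit for identical tokens,
// half credit for tokens that differ only by letter case.
// Assumption: 1 means "one full token added to the lcs".
function fuzzyMatchEvaluator(token1, token2) {
  if (token1 === token2) return 1;
  if (String(token1).toLowerCase() === String(token2).toLowerCase()) return 0.5;
  return 0;
}
```

For the half-credit branch to be reached, the `tokenMatcher` in use would presumably also have to accept case-insensitive matches.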
Function lengthEvaluator( lcs ) -> Number
- Number lcs: largest common substring length
Function to determine the way the lcs value is added to the lcss. For example, longer continuous substrings could be rewarded.
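One way to favor longer continuous substrings is to add a small bonus once a run exceeds a threshold, as in the sketch below. The function name, the threshold of 2, and the bonus of 0.5 per extra token are all arbitrary choices for illustration, and the sketch assumes that returning `lcs` unchanged would be the neutral behavior.

```javascript
// Hypothetical lengthEvaluator: runs longer than 2 tokens earn a bonus
// of 0.5 per extra token; shorter runs are returned unchanged.
function runBonusLengthEvaluator(lcs) {
  const bonus = Math.max(0, lcs - 2) * 0.5;
  return lcs + bonus;
}
```

Because the bonus inflates the lcss, a large enough bonus could push the resulting distance below what an unweighted comparison would give, so the values should be kept modest.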
Function transpositionEvaluator( transpositions, lcss ) -> Number
- Number transpositions: number of transpositions
- Number lcss: largest common subsequence length
Function to determine the way the number of transpositions affects the final result.
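The sketch below shows a transposition evaluator that penalizes each transposition by half a token. It rests on an unverified assumption: that the returned number is used as the effective common subsequence length, which is then subtracted from the length of the longer input. `halfCostTranspositions` is a hypothetical name; check the module source before relying on this behavior.

```javascript
// Hypothetical transpositionEvaluator: each transposition costs 0.5.
// Assumption: the return value replaces lcss in the final distance.
function halfCostTranspositions(transpositions, lcss) {
  return lcss - 0.5 * transpositions;
}
```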