document-distance
v0.0.4
Compute how similar two documents are, measured by how many words they share.
node-document-distance
Document Distance Problem - d(D1, D2)
The document distance problem arises when finding similar documents, detecting duplicates (Wikipedia mirrors, Google) and plagiarism, and in web search, where D2 is the query.
The idea is to define distance in terms of shared words.
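This README does not say which metric the package uses. As an illustration only, one common way to define distance in terms of shared words is to treat each document as a word-frequency vector and take the angle between the two vectors. The sketch below assumes that formulation; the function names (tokenize, wordFrequencies, dot, angleDistance) are hypothetical and are not part of this package's API.

```javascript
// Sketch: document distance as the angle between word-frequency vectors.
// All names here are hypothetical; this is not the package's source code.

// Split text into lowercase words (runs of ASCII letters and digits).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Count how often each word occurs in the text.
function wordFrequencies(text) {
  const counts = new Map();
  for (const word of tokenize(text)) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Inner product of two sparse frequency vectors.
function dot(a, b) {
  let sum = 0;
  for (const [word, count] of a) {
    sum += count * (b.get(word) || 0);
  }
  return sum;
}

// Angle in radians between the two frequency vectors:
// 0 for identical word distributions, PI / 2 when no words are shared.
function angleDistance(text1, text2) {
  const f1 = wordFrequencies(text1);
  const f2 = wordFrequencies(text2);
  const norm = Math.sqrt(dot(f1, f1) * dot(f2, f2));
  if (norm === 0) return Math.PI / 2; // convention chosen here for empty input
  // Clamp to 1 to guard against floating-point overshoot before acos.
  const cosine = Math.min(1, dot(f1, f2) / norm);
  return Math.acos(cosine);
}
```

Under this definition a smaller value means more similar documents. Whether the value returned by documentDistance follows the same convention is not stated here.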
Install
$ npm install --save document-distance
Usage
var fs = require('fs');
var documentDistance = require('document-distance');

fs.readFile('file1.txt', function(err, dataFile1) {
  if (err) throw err;
  fs.readFile('file2.txt', function(err, dataFile2) {
    if (err) throw err;
    documentDistance(dataFile1.toString(), dataFile2.toString()); //=> 0.8410686705679303
  });
});
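For short scripts, the nested callbacks can be avoided by reading both files synchronously. The helper below is a minimal sketch: distanceBetweenFiles is a hypothetical name, not something this package exports, and it takes the distance function as a parameter so that any function of two strings can be passed in, for example the documentDistance export.

```javascript
const fs = require('fs');

// Hypothetical helper, not part of the package.
// Reads both files as UTF-8 strings and applies distanceFn to them.
// distanceFn is any function that takes two strings.
function distanceBetweenFiles(pathA, pathB, distanceFn) {
  const textA = fs.readFileSync(pathA, 'utf8'); // 'utf8' yields a string, not a Buffer
  const textB = fs.readFileSync(pathB, 'utf8');
  return distanceFn(textA, textB);
}
```

With the package installed, a call might look like distanceBetweenFiles('file1.txt', 'file2.txt', require('document-distance')). Synchronous reads block the event loop, so the callback form is the better fit inside a server.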
API
documentDistance(document1, document2)
document1
Required
Type: string
document2
Required
Type: string
License
MIT © Vinícius do Carmo