vacuumjs
v1.0.1
A low-level Node.js web page content extractor based on `parse5`.
Usage
var parse5 = require('parse5')
var extract = require('vacuumjs')
// the page to extract content from
var targetDOM = parse5.parse('some page content')
// the reference DOM, required (not optional)
var refDOM = parse5.parse('reference page content')
console.log(extract(targetDOM, refDOM))
Principles
- Layout similarity
- Text density
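To make the text density idea concrete, here is a minimal sketch. It is not taken from the vacuumjs source. It assumes one common definition of text density (characters of text per element in a subtree), and the helper names `measure` and `textDensity` are hypothetical. It relies only on the node shape produced by the default parse5 tree: text nodes have `nodeName` `'#text'` and a string `value`, elements have a `tagName`, and children live in `childNodes`.

```javascript
// Hypothetical helpers for illustration only; vacuumjs may compute this differently.

// Walk a parse5-style subtree and count text characters and element tags.
function measure(node) {
  if (node.nodeName === '#text') {
    // Ignore surrounding whitespace so indentation does not count as content.
    return { chars: node.value.trim().length, tags: 0 }
  }
  // Only element nodes carry a tagName; documents and fragments do not.
  var total = { chars: 0, tags: node.tagName ? 1 : 0 }
  var children = node.childNodes || []
  for (var i = 0; i < children.length; i++) {
    var m = measure(children[i])
    total.chars += m.chars
    total.tags += m.tags
  }
  return total
}

// Assumed definition: text characters per element in the subtree.
function textDensity(node) {
  var m = measure(node)
  return m.chars / Math.max(1, m.tags)
}
```

Under this definition a paragraph of prose scores high, while a navigation list made of many short items scores low, which is why density can help separate main content from page chrome.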