tarantula
v0.2.1
node-tarantula
A Node.js crawler/spider that provides a simple interface for crawling the Web. Its API was inspired by crawler4j.
Quick Examples
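    // Assumes the package's main export is the Tarantula constructor.
    var Tarantula = require('tarantula');

    // The brain configures the crawler.
    var brain = {
      legs: 8,
      // Called with a candidate URI; return true to crawl it.
      shouldVisit: function (uri) {
        return true;
      }
    };

    var tarantula = new Tarantula(brain);

    // Log each URI reported through the 'data' event.
    tarantula.on('data', function (uri) {
      console.info('200', uri);
    });

    // Emitted once the crawl has finished.
    tarantula.on('done', function () {
      console.log('done');
    });

    // Start crawling from a list of seed URLs.
    tarantula.start(["http://stackoverflow.com"]);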
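A brain that visits every URI will wander off the seed site quickly. As a minimal sketch of a more selective shouldVisit, the snippet below restricts the crawl to a single host. It assumes shouldVisit receives the URI as a string (or something that converts to one), which the example above suggests but does not confirm. The makeBrain helper is hypothetical and not part of tarantula.

```javascript
// Hypothetical helper, not part of tarantula: builds a brain whose
// shouldVisit accepts only http(s) URLs on one allowed host.
function makeBrain(allowedHost, legs) {
  return {
    legs: legs,
    // Assumption: uri is a string (or converts to one via String()).
    shouldVisit: function (uri) {
      var parsed;
      try {
        parsed = new URL(String(uri)); // WHATWG URL, global in Node.js 10+
      } catch (err) {
        return false; // relative or malformed URL
      }
      var isHttp = parsed.protocol === 'http:' || parsed.protocol === 'https:';
      return isHttp && parsed.hostname === allowedHost;
    }
  };
}

var brain = makeBrain('stackoverflow.com', 8);
```

The resulting object has the same shape as the brain in the example above, so it could be passed to new Tarantula(brain) in its place.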
Phantom Usage
If you would like to use the included PhantomJS plugin, you'll need to install the PhantomJS app (it is not an npm module).
- You can download PhantomJS from its website.
- It is also available through popular OS package managers:
    brew install phantomjs
    apt-get install phantomjs
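Because PhantomJS is installed outside npm, it can be useful to confirm that the binary is reachable before enabling the plugin. The following is a rough sketch using only Node's child_process module. It assumes the executable is named phantomjs and is on the PATH. The phantomVersion helper is hypothetical and not part of tarantula.

```javascript
var spawnSync = require('child_process').spawnSync;

// Hypothetical helper, not part of tarantula: returns the installed
// PhantomJS version string, or null if the binary cannot be run.
function phantomVersion() {
  // Assumption: the executable is called "phantomjs" and is on the PATH.
  var result = spawnSync('phantomjs', ['--version'], { encoding: 'utf8' });
  if (result.error || result.status !== 0) {
    return null; // not installed, not on PATH, or exited with an error
  }
  return result.stdout.trim();
}

var version = phantomVersion();
if (version === null) {
  console.warn('PhantomJS not found; install it before using the plugin.');
} else {
  console.info('PhantomJS', version);
}
```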