hyperscrape
v0.4.0
Streaming high performance scraper
Hyperscrape is a stream component that accepts URLs from upstream and pushes parsed pages downstream. If the input is an object that contains a url property, the output is added to that object. Each output object contains url, response, responseHeaders, and the cheerio-parsed content in $.
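The input and output shapes described above can be modelled with a small helper. This is only an illustrative sketch, not hyperscrape's own code: the name toOutput, the page argument, and the placeholder values are assumptions made for the example, and the $ field here is a stub rather than a real cheerio instance.

```javascript
// Hypothetical helper (not part of hyperscrape): models the output shape
// described in the README. A bare URL becomes a new object; an object with
// a url property is extended in place.
function toOutput(input, page) {
  const base = (input !== null && typeof input === 'object')
    ? input                      // reuse the caller's object
    : { url: String(input) };    // wrap a plain URL
  return Object.assign(base, {
    response: page.response,
    responseHeaders: page.responseHeaders,
    $: page.$
  });
}

// Placeholder page data standing in for a fetched and parsed page.
const page = {
  response: '<h1>Hello</h1>',
  responseHeaders: { 'content-type': 'text/html' },
  $: (selector) => selector      // stub; the real field holds cheerio output
};

const job = { url: 'http://example.com/a', id: 7 };
const result = toOutput(job, page);
console.log(result === job, Object.keys(result));
```

Because the result is written onto the incoming object, any extra fields a caller attaches upstream (such as the hypothetical id above) travel with the page downstream.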
Hyperscrape is initialized with two arguments. The first is the maximum number of concurrent requests allowed; the second contains options for the hyperquest stream. If a url is defined in the options object, it is passed on to the stream as the first URL to process.
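To make the two arguments concrete, here is a minimal model of a concurrency cap combined with an optional first URL. It is a sketch under stated assumptions, not hyperscrape's implementation or API: Limiter, createScraper, and the start callback are hypothetical names, and no real HTTP request is made.

```javascript
// Hypothetical model (not hyperscrape's code): at most `max` jobs run at
// once; further URLs wait in a queue until a running job calls done().
class Limiter {
  constructor(max, start) {
    this.max = max;       // maximum number of concurrent requests
    this.start = start;   // start(url, done): begins one request
    this.active = 0;
    this.queue = [];
  }
  submit(url) {
    this.queue.push(url);
    this._drain();
  }
  _drain() {
    while (this.active < this.max && this.queue.length > 0) {
      const url = this.queue.shift();
      this.active++;
      this.start(url, () => { this.active--; this._drain(); });
    }
  }
}

// Mirrors the described initialization: a concurrency limit plus options.
// If options.url is set, it is queued as the first URL to process.
function createScraper(maxConcurrent, options, start) {
  const limiter = new Limiter(maxConcurrent, start);
  if (options && options.url) limiter.submit(options.url);
  return limiter;
}

// Usage with a stub that records what was started instead of fetching.
const started = [];
const pending = [];
const scraper = createScraper(2, { url: 'http://example.com/first' },
  (url, done) => { started.push(url); pending.push(done); });
scraper.submit('http://example.com/a');
scraper.submit('http://example.com/b'); // waits: two jobs already active
pending[0]();                           // finishing one job starts the next
console.log(started);
```

The queue keeps upstream order, so the URL given in the options is always the first one started.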