puppeteer-walker
v1.2.0
A crawler that walks through a given site in headless Chrome using Puppeteer. Returns an object containing the host, the current path, and the current DOM object.
Usage
var Walker = require('puppeteer-walker')
var walker = Walker()
walker.on('end', () => console.log('finished walking'))
walker.on('error', (err) => console.log('error', err))
walker.on('page', async (page) => {
  var title = await page.title()
  console.log(`title: ${title}`)
})
walker.walk('https://avocado.choo.io')
API
walker = PuppeteerWalker()
Create a new walker instance.
walker.on('page', async cb(Page, push))
Listen to a page event. The callback receives an instance of the Puppeteer Page class and has to be an async function.
Use the push(url) method to add more pages to the internal walker queue. This is useful for busting past login forms and the like.
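As an illustration of push(url), here is a minimal sketch that queues every same-host link found on a page. The names sameHostLinks, onPage and seen are hypothetical and not part of puppeteer-walker; page.$$eval() and page.url() are standard Puppeteer Page methods. Whether the walker de-duplicates queued URLs on its own is not stated here, so the sketch assumes it does not and tracks visited links itself.

```javascript
// Hypothetical helper: resolve hrefs against the current page URL and keep
// only http(s) links on the same host, with fragments stripped and
// duplicates removed.
function sameHostLinks (pageUrl, hrefs) {
  var base = new URL(pageUrl)
  var out = new Set()
  hrefs.forEach(function (href) {
    var url
    try {
      url = new URL(href, base)
    } catch (err) {
      return // skip hrefs that cannot be parsed
    }
    if (!/^https?:$/.test(url.protocol)) return // skip mailto:, javascript:, etc.
    if (url.host !== base.host) return // stay on the same site
    url.hash = ''
    out.add(url.href)
  })
  return Array.from(out)
}

// Assumption: the walker does not de-duplicate, so remember what was queued.
var seen = new Set()

// Hypothetical 'page' handler matching the documented cb(Page, push) shape.
async function onPage (page, push) {
  var hrefs = await page.$$eval('a[href]', function (anchors) {
    return anchors.map(function (a) { return a.getAttribute('href') })
  })
  sameHostLinks(page.url(), hrefs).forEach(function (link) {
    if (seen.has(link)) return
    seen.add(link)
    push(link)
  })
}

// Wiring, assuming puppeteer-walker is installed:
// var walker = require('puppeteer-walker')()
// walker.on('page', onPage)
// walker.walk('https://avocado.choo.io')
```

Keeping the link filtering in a plain function means it can be checked without launching a browser.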
walker.on('error', cb(err))
Listen to error events.
walker.on('end', cb)
Listen to an end event.
walker.walk(url)
Start walking the URL.
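The end and error events can be combined with walk(url) to await a whole crawl. The following is a rough sketch, not part of the package: walkToEnd is a hypothetical helper, and it assumes the walker may emit several error events before end, so errors are collected rather than treated as fatal. It relies only on the on() and walk() methods documented above.

```javascript
// Hypothetical helper: start a walk and resolve with the list of errors
// seen once the walker emits 'end'.
function walkToEnd (walker, url) {
  return new Promise(function (resolve) {
    var errors = []
    walker.on('error', function (err) { errors.push(err) })
    walker.on('end', function () { resolve(errors) })
    walker.walk(url)
  })
}

// Usage, assuming puppeteer-walker is installed:
// var walker = require('puppeteer-walker')()
// walker.on('page', async (page) => console.log(await page.title()))
// walkToEnd(walker, 'https://avocado.choo.io').then(function (errors) {
//   console.log('finished walking with ' + errors.length + ' error(s)')
// })
```

The listeners are attached before walk(url) is called so that no early event is missed.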