@ta11y/extract
v1.3.1
Extracts content from websites for running accessibility audits with ta11y.
Install
npm install --save @ta11y/extract
Usage
The easiest way to use this package is through the CLI. It can also be called programmatically:
const { extract } = require('@ta11y/extract')

extract('https://en.wikipedia.org')
  .then((result) => {
    console.log(result.summary) // overview of results (number of urls visited, success, error)
    console.log(result.results) // detailed results keyed by url
  })

const { extract } = require('@ta11y/extract')

// example passing HTML directly
extract('<!doctype><html><body><h1>I ❤ accessibility</h1></body></html>')
  .then((result) => {
    console.log(result.summary) // overview of results (number of urls visited, success, error)
    console.log(result.results) // detailed results keyed by url
    // note that the result key for an HTML input is 'root' instead of url
  })
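The difference between the two result shapes above can be sketched with a small helper. This is only an illustration: `describeResultKeys` is a hypothetical function that is not part of this package, and the stand-in objects assume nothing beyond what the comments above state (`result.results` is keyed by URL, or by `'root'` for raw HTML input).

```javascript
// Hypothetical helper, not part of @ta11y/extract.
// Assumes only that `result.results` is an object keyed by URL, or by the
// literal key 'root' when raw HTML was passed in.
function describeResultKeys (result) {
  return Object.keys(result.results).map((key) => ({
    key,
    source: key === 'root' ? 'inline HTML' : 'url'
  }))
}

// Stand-in objects shaped like the examples above; the values are placeholders.
const fromUrl = { summary: {}, results: { 'https://en.wikipedia.org': {} } }
const fromHtml = { summary: {}, results: { root: {} } }

console.log(describeResultKeys(fromUrl))
console.log(describeResultKeys(fromHtml))
```

A check like this can be handy when the same reporting code handles both URL and HTML inputs.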
API
extract
Extracts the dynamic HTML content from a website, optionally crawling the site to discover additional pages and extracting those too.
Type: function (urlOrHtml, opts): Promise
- urlOrHtml string URL or raw HTML to process.
- opts object Config options.
  - opts.browser object Required Puppeteer browser instance to use.
  - opts.crawl boolean Whether or not to crawl additional pages. (optional, default false)
  - opts.maxDepth number Maximum crawl depth while crawling. (optional, default 16)
  - opts.maxVisit number? Maximum number of pages to visit while crawling.
  - opts.sameOrigin boolean Whether or not to only consider crawling links with the same origin as the root URL. (optional, default true)
  - opts.blacklist Array<string>? Optional blacklist of URL glob patterns to ignore.
  - opts.whitelist Array<string>? Optional whitelist of URL glob patterns to only include.
  - opts.gotoOptions object? Customize the Page.goto navigation options.
  - opts.viewport object? Set the browser window's viewport dimensions and/or resolution.
  - opts.userAgent string? Set the browser's user-agent.
  - opts.emulateDevice string? Emulate a specific device type.
    - Use the name property from one of the built-in devices.
    - Overrides viewport and userAgent.
  - opts.onNewPage function? Optional async function called every time a new page is initialized before proceeding with extraction.
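As a rough sketch of how these options might be assembled, the helper below fills in the documented defaults (`crawl: false`, `maxDepth: 16`, `sameOrigin: true`) and rejects a missing `browser`, since that option is listed as required. `buildExtractOptions`, `crawlSite`, and `DOCUMENTED_DEFAULTS` are hypothetical names and not part of this package. `extract` is passed in as an argument so the sketch runs on its own, and the Puppeteer wiring in the trailing comment assumes `puppeteer` is installed separately.

```javascript
// Hypothetical helpers, not part of @ta11y/extract.
// Defaults are copied from the option list above.
const DOCUMENTED_DEFAULTS = { crawl: false, maxDepth: 16, sameOrigin: true }

function buildExtractOptions (browser, overrides = {}) {
  if (!browser) {
    throw new Error('opts.browser is documented as required')
  }
  // Overrides win over defaults; the browser instance is always set last.
  return { ...DOCUMENTED_DEFAULTS, ...overrides, browser }
}

// `extract` is injected so this sketch does not depend on how the package is loaded.
async function crawlSite (extract, browser, url) {
  const opts = buildExtractOptions(browser, { crawl: true, maxDepth: 2, maxVisit: 10 })
  return extract(url, opts)
}

// Possible wiring (assumes `puppeteer` is installed alongside this package):
//   const puppeteer = require('puppeteer')
//   const { extract } = require('@ta11y/extract')
//   const browser = await puppeteer.launch()
//   const result = await crawlSite(extract, browser, 'https://en.wikipedia.org')
//   await browser.close()
```

Keeping `maxDepth` and `maxVisit` small while experimenting limits how many pages a crawl touches.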
License
MIT © Saasify