wpdl
v2.0.2
Scrape data from a WordPress instance.
wpdl
Scrape pages, posts, images and other data from a WordPress instance using the WordPress REST API. Use simple command line arguments to clean up the scraped data.
Prerequisites
Node.js v19 or newer (for native fetch support).
Usage examples
The following commands use the latest version of wpdl published on npm. To run the script locally, clone this repo and replace npx wpdl with npx . (the dot refers to the current directory).
Scrape pages and posts
npx wpdl --url https://your-wp-instance.com --pages --posts
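For orientation, here is a minimal sketch of the kind of request a scraper like this makes against the WordPress REST API. It is not wpdl's actual implementation: the helper names buildCollectionUrl and fetchAll are hypothetical, and the sketch assumes the site lives at the domain root and exposes the core /wp-json/wp/v2/ collections. The per_page limit of 100 and the X-WP-TotalPages response header are part of the core REST API.

```javascript
// Hypothetical helper (not part of wpdl): builds the URL for one page of a
// core REST API collection such as "pages", "posts" or "media".
// Assumes WordPress is served from the domain root of baseUrl.
function buildCollectionUrl(baseUrl, type, page = 1, perPage = 100) {
  const url = new URL(`/wp-json/wp/v2/${type}`, baseUrl);
  url.searchParams.set("per_page", String(perPage)); // 100 is the API maximum
  url.searchParams.set("page", String(page));
  return url.toString();
}

// Hypothetical helper: walks every page of a collection using native fetch.
// WordPress reports the number of pages in the X-WP-TotalPages header.
async function fetchAll(baseUrl, type) {
  const items = [];
  let page = 1;
  let totalPages = 1;
  do {
    const res = await fetch(buildCollectionUrl(baseUrl, type, page));
    if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
    totalPages = Number(res.headers.get("X-WP-TotalPages")) || 1;
    items.push(...(await res.json()));
    page += 1;
  } while (page <= totalPages);
  return items;
}
```

Calling fetchAll("https://your-wp-instance.com", "pages") would then resolve to an array of page objects, which is roughly the raw data that the filter options operate on.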
Scrape pages and clean up the HTML by filtering out all img elements and all elements with the class foo. Also remove all elements without text content. From the JSON files, remove all the Jetpack and Yoast SEO data.
npx wpdl --url https://your-wp-instance.com --pages --elementFilter img --classFilter foo --jsonFilter "jetpack_*" --jsonFilter "yoast_*" --removeEmptyElements
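To illustrate what a wildcard pattern such as "jetpack_*" could mean for the JSON output, the following sketch removes matching keys from an object. This is an assumption about the behavior, not wpdl's code: the helper names globToRegExp and filterKeys are hypothetical, and whether wpdl matches nested keys the same way is not stated here. The sample field names are only illustrative of what Jetpack and Yoast SEO typically add.

```javascript
// Hypothetical helper: converts a glob such as "jetpack_*" into an anchored
// regular expression, where "*" matches any run of characters.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Hypothetical helper: returns a deep copy of `value` with every object key
// that matches one of the globs removed. Arrays are walked element by element.
function filterKeys(value, globs) {
  const patterns = globs.map(globToRegExp);
  const walk = (node) => {
    if (Array.isArray(node)) return node.map(walk);
    if (node === null || typeof node !== "object") return node;
    return Object.fromEntries(
      Object.entries(node)
        .filter(([key]) => !patterns.some((re) => re.test(key)))
        .map(([key, child]) => [key, walk(child)])
    );
  };
  return walk(value);
}

// Illustrative data only; not taken from a real instance.
const post = {
  id: 1,
  jetpack_featured_media_url: "",
  yoast_head: "<meta>",
  title: { rendered: "Hello" },
};
console.log(filterKeys(post, ["jetpack_*", "yoast_*"]));
```

In this sketch only the id and title fields survive, and the input object is left unmodified.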
To see full usage, run
npx wpdl -h