postlight2md
v0.5.0
postlight2md is a command-line tool that uses the Postlight Parser to extract content from web pages and convert it to HTML, Markdown, or plain text.
Installation
postlight2md requires Node.js. Once Node.js is installed, install the tool globally with npm:
npm install -g postlight2md
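Because the install depends on Node.js, it can help to confirm that `node` and `npm` are on the PATH first. The following is a minimal sketch; the `status` variable is only an illustrative name, not something the tool uses.

```shell
# Check that Node.js and npm are available before attempting the global install.
if command -v node >/dev/null 2>&1 && command -v npm >/dev/null 2>&1; then
  echo "node $(node --version), npm $(npm --version)"
  status="ready"      # illustrative variable, not used by postlight2md
else
  echo "Node.js or npm not found; install Node.js first" >&2
  status="missing"
fi
```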
Usage
Run postlight2md with the URL you want to parse. The default output format is Markdown.
postlight2md <url> [options]
Options
-f, --format <format>: Set content type (html|markdown|text). Default is markdown.
-H, --header <header>: Include custom headers in the request. Can be used multiple times.
-e, --extend <extend>: Add a custom type to the response. Can be used multiple times.
-E, --extend-list <extend-list>: Add a custom type with multiple matches. Can be used multiple times.
-a, --add-extractor <extractor>: Add a custom extractor at runtime.
-o, --output [filename]: Specify the output file name. If no file name is given, one is generated from the title of the content.
-u, --url-file <file>: Specify a file containing URLs to process, one per line.
-c, --concurrency <number>: Number of concurrent requests. Default is 1.
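Several of these options can be combined in one call, and -H can be repeated. The sketch below builds the argument list step by step using only flags from the list above. The URL, the header values, the second header name, and the output file name are placeholders chosen for illustration, and the header syntax follows the `name=value` form used in this README's own example.

```shell
# Build the argument list one option at a time (all values are placeholders).
set -- "https://example.com"                   # page to parse
set -- "$@" -f text                            # plain-text output instead of Markdown
set -- "$@" -H "Authorization=Bearer token"    # first custom header
set -- "$@" -H "Accept-Language=en"            # second header; -H may be repeated
set -- "$@" -o article.txt                     # explicit output file name

# Only invoke the CLI if it is installed; otherwise show what would run.
if command -v postlight2md >/dev/null 2>&1; then
  postlight2md "$@"
else
  echo "would run: postlight2md $*"
fi
```

Building the list with `set --` keeps each quoted value intact as a single argument, which matters for header values that contain spaces.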
Examples
Convert a web page to Markdown:
postlight2md https://example.com
Convert a web page to HTML:
postlight2md https://example.com -f html
Include custom headers in the request:
postlight2md https://example.com -H "Authorization=Bearer token"
Add a custom type to the response:
postlight2md https://example.com -e "customType=value"
Add a custom extractor at runtime:
postlight2md https://example.com -a "./path/to/extractor.js"
Specify the output file name:
postlight2md https://example.com -o custom-filename.md
Automatically generate the output file name based on the title:
postlight2md https://example.com -o
Process URLs from a file with default concurrency:
postlight2md -u urls.txt
Process URLs from a file with specified concurrency:
postlight2md -u urls.txt -c 5
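For batch runs, the URL file is plain text with one URL per line. A minimal sketch of preparing such a file and running it with a small concurrency value follows; the two URLs and the value 2 are placeholders.

```shell
# Write one URL per line, the layout the -u option expects.
cat > urls.txt <<'EOF'
https://example.com
https://example.org
EOF

# Process the file with up to 2 concurrent requests, if the CLI is installed.
if command -v postlight2md >/dev/null 2>&1; then
  postlight2md -u urls.txt -c 2
else
  echo "postlight2md not found; urls.txt holds $(wc -l < urls.txt | tr -d ' ') URLs"
fi
```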
Note: When using the -u option, the -o option defaults to generating filenames based on the title of each URL's content. To output to the console instead, explicitly set -o to false:
postlight2md -u urls.txt -o false
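Since `-o false` sends the content to the console, ordinary shell redirection can collect a whole batch into one file. This is a sketch under two assumptions that the README does not confirm: that only the extracted content is written to standard output, and that the results for each URL appear one after another. The names `combined.md` and `result` are hypothetical.

```shell
urls_file="urls.txt"      # assumed to exist, one URL per line
out_file="combined.md"    # hypothetical name for the captured output

if command -v postlight2md >/dev/null 2>&1 && [ -f "$urls_file" ]; then
  # Console output is redirected into a single file by the shell.
  postlight2md -u "$urls_file" -o false > "$out_file"
  result="captured"
else
  echo "skipped: postlight2md or $urls_file not available" >&2
  result="skipped"
fi
```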
License
This project is licensed under the MIT License.
Made with ~~❤️~~ impatience and AI 🤖.