content-to-reader
v0.0.2
Published
CLI utility for moving content from the web into EPUBs. Give it a list of URLs and it will extract the content and merge it into an EPUB.
Extract meaningful content from any website and turn it into an EPUB file. Optionally, send it to your device using your Gmail account.
How to install and use
- Install it globally using NPM (or any package manager)
npm i content-to-reader -g
- Create a config file in .yaml format
# config.yaml
output: "./todays_news.epub"
toDevice:
  deviceEmail: [email protected]
  senderEmail: [email protected]
  senderPassword: "your password"
pages:
  - https://clickbaitnews.com/article/some_article_12msad1
  - https://welldone.com/@user/10_easy_steps_to_whatever
- Run this command to create an EPUB and/or send it to your Kindle
content-to-reader create -c ./config.yaml
- Enjoy your articles
If you run into any issues, refer to the FAQ section below.
Use cases
Here are a few use cases and ideas to get you started.
EPUB from a single URL
content-to-reader create https://welldone.com/@user/10_easy_steps_to_whatever
I want to choose what to extract
Sometimes you want to pick elements from a target website yourself, or the default extraction didn't work well for you. In that case, use selectors.
output: "./news.epub"
pages:
  - "https://clickbaitnews.com/article/some_article_12msad1"
  - url: "https://welldone.com/@user/10_easy_steps_to_whatever"
    selectors:
      - name: "Header"
        first: ".page-content header"
      - all:
          ".page-content .contents":
            [
              "h1",
              "h2",
              "h3",
              "h4",
              "h5",
              "p",
              "code",
              { ".custom-tip": ["p", "div", ".some-class": ["a", "p"]] },
            ]
      - first: ".page-content .comment-section"
Selectors let you pick elements from a target website using CSS selectors. Use first to include only the first matching element, or all to include every matching element.
The final EPUB will contain all of the elements found by the selectors.
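As a rough reading aid, the config shape used in the YAML example above can be written down as TypeScript types. This is only a sketch inferred from the examples in this README: the type names (SelectorTree, SelectorRule, PageWithSelectors, Config) are hypothetical, and it assumes that first accepts the same nested form as all, which the examples do not confirm.

```typescript
// Hypothetical type names; the shape is inferred from the YAML examples in this README.
type SelectorTree = string | SelectorTree[] | { [parent: string]: SelectorTree };

interface SelectorRule {
  name?: string; // optional label, e.g. "Header"
  first?: SelectorTree; // include only the first matching element (nested form is an assumption)
  all?: SelectorTree; // include every matching element
}

interface PageWithSelectors {
  url: string;
  selectors: SelectorRule[];
}

interface Config {
  output?: string; // omitted in the "Send to Kindle" example
  toDevice?: { deviceEmail: string; senderEmail: string; senderPassword: string };
  pages: Array<string | PageWithSelectors>;
}

// The selectors example, abbreviated, written as a typed object.
const exampleConfig: Config = {
  output: "./news.epub",
  pages: [
    "https://clickbaitnews.com/article/some_article_12msad1",
    {
      url: "https://welldone.com/@user/10_easy_steps_to_whatever",
      selectors: [
        { name: "Header", first: ".page-content header" },
        { all: { ".page-content .contents": ["h1", "h2", "p"] } },
        { first: ".page-content .comment-section" },
      ],
    },
  ],
};
```

A page is either a plain URL string, which relies on default extraction, or an object with url and selectors, which is why pages is typed as a union.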
You can generate longer CSS selectors without repetition using YAML's dictionaries and arrays. For example:
- all:
    ".page-content .contents": ["h1", "h2", "h3"]
is equivalent to
- all: ".page-content .contents h1, .page-content .contents h2, .page-content .contents h3"
You can nest dictionaries in arrays recursively.
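The expansion of nested dictionaries and arrays into one CSS selector list can be sketched as a small recursive function. This is a minimal illustration, not the tool's actual implementation: SelectorTree, flattenSelectors, and toCssSelector are hypothetical names, and the sketch assumes each dictionary key is joined to its children with a descendant combinator (a space).

```typescript
// Hypothetical type: a string, an array of trees, or a dictionary
// mapping a parent selector to a nested tree.
type SelectorTree = string | SelectorTree[] | { [parent: string]: SelectorTree };

// Flatten a tree into full selectors, prefixing each leaf with its ancestors.
// Assumption: parent and child are joined with a space (descendant combinator).
function flattenSelectors(tree: SelectorTree, prefix: string = ""): string[] {
  const join = (part: string): string => (prefix ? prefix + " " + part : part);
  if (typeof tree === "string") {
    return [join(tree)];
  }
  const result: string[] = [];
  if (Array.isArray(tree)) {
    // Array items share the same prefix.
    for (const child of tree) {
      result.push(...flattenSelectors(child, prefix));
    }
  } else {
    // Each dictionary key extends the prefix for everything nested under it.
    for (const parent of Object.keys(tree)) {
      result.push(...flattenSelectors(tree[parent], join(parent)));
    }
  }
  return result;
}

// Join the flattened selectors into a single CSS selector list.
function toCssSelector(tree: SelectorTree): string {
  return flattenSelectors(tree).join(", ");
}

const flat = toCssSelector({ ".page-content .contents": ["h1", "h2", "h3"] });
// ".page-content .contents h1, .page-content .contents h2, .page-content .contents h3"

const nested = toCssSelector({
  ".page-content .contents": ["p", { ".custom-tip": ["p", { ".some-class": ["a"] }] }],
});
// ".page-content .contents p, .page-content .contents .custom-tip p,
//  .page-content .contents .custom-tip .some-class a"
```

Under these assumptions, each level of nesting only adds to the prefix, so a deeply nested dictionary yields the same result as writing every full selector out by hand.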
Send to Kindle
content-to-reader allows you to use services like Amazon's "Send to Kindle":
toDevice:
  deviceEmail: [email protected]
  senderEmail: [email protected]
  senderPassword: "your password"
pages:
  - https://welldone.com/@user/10_easy_steps_to_whatever
If you've never sent anything to a Kindle by email before, there are a few steps to follow to make this work.
First, whitelist your email address in Amazon, then create an application password for your Gmail account so you can use it in the .yaml config file. That should do it.
Currently only Gmail's SMTP server is supported.
FAQ
Is your email address known by Amazon? If not, whitelist it in Amazon.
Is your file too big? Remember that "Send to Kindle" imposes a 50 MB limit.
Sometimes Amazon just rejects a file for whatever reason. You can use Calibre as a last resort and let it do its magic so Amazon accepts your file. There's a ton of material on this on the Internet.
The default extraction algorithm isn't perfect. Sometimes it fails to extract the exact content you're interested in. You can use selectors to pick the relevant elements yourself. Please see "I want to choose what to extract" in Use cases.
Currently there is no way to change this behaviour.
License
Licensed under The Prosperity Public License 3.0.0.
Contributions
Any contributions are welcome. If you have an idea or you've spotted a bug, feel free to open an issue or a pull request.