
text-hoarder

v1.0.2


Text Hoarder CLI Docs

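The [Text Hoarder browser extension](https://chromewebstore.google.com/u/1/detail/bjknebjiadgjchmhppdfdiddfegmcaao) comes with an optional command line companion that can generate statistics about your saved articles, process them for text-to-speech software, and find likely spam lines.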

Getting started

As a prerequisite, you need to have Node.js installed.

# Replace YOUR_USERNAME with your GitHub username.
# Replace YOUR_TEXT_HOARDER_REPOSITORY with the name of the repository you
# created to store Text Hoarder's saved articles
git clone https://github.com/YOUR_USERNAME/YOUR_TEXT_HOARDER_REPOSITORY
cd YOUR_TEXT_HOARDER_REPOSITORY
# Installs Text Hoarder CLI companion
npm install
# Shows documentation for Text Hoarder CLI companion
npx text-hoarder --help

If you need help cloning the repository from the command line, see GitHub's documentation.

If you are a Windows user, consider running this command in your terminal to allow Git to handle files with long file names.

git config --global core.longpaths true

Without this, "git clone" may fail if your Text Hoarder repository contains saved articles with very long URLs.

Generating Stats

You can create a webpage with comprehensive statistics about the saved articles using the npx text-hoarder stats command.

Example Usage

# Open the repository you created to store Text Hoarder's saved articles
cd YOUR_TEXT_HOARDER_REPOSITORY
# Generate stats based on all saved articles and open results in your browser.
# To see all options, run "npx text-hoarder stats --help"
npx text-hoarder stats

Example output:

Computing statistics...
1%
... trimmed ...
99%
100%
Finalizing output...

Once complete, stats.html will open in your browser:

The stats page displays a chart of saved articles per time period and a button to download the stats as JSON.

It shows metrics for the total number of articles, paragraphs, sentences, unique words, words and characters.

It also includes tables of the most commonly saved websites and the most common words.

Processing Text

The npx text-hoarder process command optimizes saved articles for text-to-speech software: it removes likely spam and advertisement lines, strips characters that text-to-speech software handles poorly, and so on.

This command also converts markdown files to plaintext and splits large articles into smaller files to work around the max length limit in some text-to-speech tools.

By default, it processes all new articles saved since the last time this command was run.

Example Usage

# Open the repository you created to store Text Hoarder's saved articles
cd YOUR_TEXT_HOARDER_REPOSITORY
# Process all articles saved since the last time this command was run.
# To see all options, run "npx text-hoarder process --help"
npx text-hoarder process

By default, process automatically removes lines that are duplicated across saved articles. This is useful because:

  • If you accidentally saved the same article twice, this step will remove the duplicate
  • It will automatically remove commonly repeated lines like Advertisement, or footers from websites (e.g., wired.com has a lot of lines like More Great WIRED Stories at the end of each article)
  • Some websites are not fully accessibility-compliant, leading tools like Text Hoarder to extract some lines twice in a row. This step will remove the duplicates.

If you wish to disable this, pass the --no-exclude-duplicated-lines option when running the command.
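
To make the idea of duplicate-line removal concrete, here is a minimal sketch using awk. It is only an illustration and not Text Hoarder's actual implementation: the helper name dedupe_lines is hypothetical, and this version simply keeps the first occurrence of each non-empty line and drops later repeats.

```shell
# Illustrative only: print the lines of the given files in order, dropping
# every non-empty line that already appeared earlier (in the same file or in
# a previous one). dedupe_lines is a hypothetical helper, not part of the
# text-hoarder CLI.
dedupe_lines() {
  # NF == 0 keeps blank lines as-is; !seen[$0]++ is true only the first
  # time a given line is encountered.
  awk 'NF == 0 || !seen[$0]++' "$@"
}
```

For example, dedupe_lines first.txt second.txt prints both files with any line from first.txt that reappears in second.txt removed.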

Converting Processed Text to Audio

The output of the npx text-hoarder process command can be fed to text-to-speech software. This is a great way to consume the saved articles while doing other things, like walking or household chores.

Here is a small example script for converting the processed text files to audio using macOS's "say" utility:

# Find the directory where process outputted the files
cd processed/ && ls
# Open the directory where the processed text files are located
cd 2024-02-18
# Convert each text file that hasn't yet been converted
for f in *.txt; do
  echo "Generating $f.flac"
  # -r controls speaking rate. Run "man say" to see all options
  # -f reads the text from the file itself; if say fails, skip the rm below
  say -r 100 -o "$f.flac" --progress -f "$f" || continue
  # NOTE: this deletes the processed text file after it's converted to audio
  rm "$f"
done

NOTE: the script above removes each processed text file after converting it to audio. This tracks progress and makes it easy to restart the command if it freezes. If you do not want this, remove the rm "$f" line.
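
The script above requires looking up the date-named directory by hand. As a small convenience, the sketch below picks the newest one automatically. It assumes that the output directories under processed/ are always named in the YYYY-MM-DD style of the 2024-02-18 example; the helper name latest_processed_dir is hypothetical.

```shell
# Print the newest date-named directory (YYYY-MM-DD) inside the given folder.
# Assumption: directory names follow the 2024-02-18 pattern from the example,
# so sorting them alphabetically also sorts them chronologically.
# latest_processed_dir is a hypothetical helper, not part of text-hoarder.
latest_processed_dir() {
  local base="${1:-processed}"
  local latest=""
  local dir
  for dir in "$base"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/; do
    [ -d "$dir" ] || continue  # the pattern matched nothing
    latest="${dir%/}"          # matches arrive sorted, so the last one is newest
  done
  # Fail (non-zero status) when no dated directory exists
  [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

With this in place, the two cd steps can be replaced by cd "$(latest_processed_dir processed)" before running the conversion loop.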

If you are not on macOS, see some of the options for other operating systems.

For best results, download the high-quality Siri voices, as described next.

On macOS, high-quality Siri voices are available for text-to-speech through the say CLI command, as well as through the "Spoken Content" accessibility feature.

To download these, follow Apple's tutorial on adding a new voice. In the list of voices, search for a section titled "English (US) - Siri" (or another language, as long as the name ends with "Siri"). These are the highest quality voices available.

After downloading a voice, make sure to select it as the default voice.

Now, when you use the say CLI command, the high-quality voice will be used.

Finding spam lines

npx text-hoarder find-spam finds commonly repeated lines in your saved articles, which are possible spam/advertisement lines that should be excluded (for example, lines like Advertisement, RECOMMENDED VIDEOS FOR YOU, etc.).
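
To get a feel for what "commonly repeated lines" means, the following sketch counts how often each non-empty line occurs across a set of text files using standard tools. It is not how find-spam is implemented, only an illustration of the idea; the function name top_repeated_lines and its minimum-count argument are hypothetical.

```shell
# Illustrative only: list lines that occur at least MIN times across the
# given files, most frequent first, each prefixed with its count.
# Usage: top_repeated_lines MIN file...
top_repeated_lines() {
  local min_count="$1"
  shift
  cat -- "$@" |
    grep -v '^[[:space:]]*$' |  # ignore blank lines
    sort |                      # group identical lines together
    uniq -c |                   # prefix each distinct line with its count
    sort -rn |                  # most frequent first
    awk -v min="$min_count" '$1 >= min'
}
```

Lines that show up in many unrelated articles, such as Advertisement, rise to the top of this list.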

Example usage

Run the find-spam command, check whether it reported any common undesirable lines of text, and add those to the exclude-list.txt file in the repository Text Hoarder saves articles to.

# Open the repository you created to store Text Hoarder's saved articles
cd YOUR_TEXT_HOARDER_REPOSITORY
# Report possible unwanted lines
# To see all options, run "npx text-hoarder find-spam --help"
npx text-hoarder find-spam
# Add detected spam lines to the exclude-list.txt file

The next time you run npx text-hoarder process or npx text-hoarder find-spam, the unwanted lines will be excluded automatically.
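
The last comment in the example leaves the actual editing of exclude-list.txt to you. One way to do it from the shell is sketched below. It assumes that exclude-list.txt holds one excluded line per row; the file format is not described here, so check the file in your repository first. The helper name add_to_exclude_list is hypothetical.

```shell
# Append each given line to the exclude list unless it is already there.
# Assumption: the list stores one excluded line per row.
# Usage: add_to_exclude_list exclude-list.txt "Advertisement" "Another line"
add_to_exclude_list() {
  local list="$1"
  local line
  shift
  touch "$list"
  # If the file does not end with a newline, add one so entries do not merge
  if [ -n "$(tail -c 1 "$list")" ]; then
    printf '\n' >> "$list"
  fi
  for line in "$@"; do
    # -x matches whole lines, -F treats the text literally, -q stays silent
    if ! grep -qxF -- "$line" "$list"; then
      printf '%s\n' "$line" >> "$list"
    fi
  done
}
```

Running it twice with the same line leaves a single entry, so it is safe to re-run after each find-spam pass.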

By default, text-hoarder's CLI comes with a list of common spam lines built in. See the full list. If you do not wish to use this list, pass the --no-default-exclude option when running the commands.