ltr

v1.2.1

Command-line text segmenter.

A simple command-line text segmenter that uses the Intl.Segmenter API to split text into characters, words and sentences.

It takes cues from standard Unix command-line tools such as wc, uniq, and sort.

Getting started

ltr runs in Node.js and can be installed globally with npm:

npm install -g ltr

You can also run it without installing it first, using npx:

npx ltr --help

Usage

ltr [command] [file1, [file2, …]]

ltr accepts one or more input files, or uses the standard input (stdin) when no files are provided. You can also concatenate stdin to other input files by using the - (dash) operand.

General options:

  • -h, --help.
  • -v, --version.

Available commands:

  • ltr chars — extract graphemes;
  • ltr words — extract words;
  • ltr sentences — extract sentences.

The tool returns one value per line.
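As a rough illustration of how the three commands could map onto Intl.Segmenter granularities, here is a minimal sketch. The `extract` helper is hypothetical and is not taken from ltr's source. In particular, dropping segments that are not word-like (spaces, punctuation) is an assumption about what `ltr words` does.

```javascript
// Hypothetical helper: split text with Intl.Segmenter and collect the segments.
// granularity is "grapheme", "word" or "sentence".
function extract(text, granularity, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity });
  const values = [];
  for (const part of segmenter.segment(text)) {
    // In word mode the API also yields whitespace and punctuation.
    // Skipping them is an assumption, not confirmed ltr behavior.
    if (granularity === "word" && !part.isWordLike) continue;
    values.push(part.segment);
  }
  return values;
}

// One value per line, mirroring the tool's output format.
console.log(extract("Hello, world!", "word", "en").join("\n"));
// Hello
// world
```

Note that sentence segments returned by the API keep their trailing whitespace; whether ltr trims them is not stated here.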

Options

-l, --locale

By default, ltr works with the current locale. Use this option to specify a locale explicitly.

ltr sentences --locale=ro my-doc.txt
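For context, Intl.Segmenter itself accepts a BCP 47 language tag and falls back to the runtime's default locale when none is given. The sketch below shows that behavior with a hypothetical `makeSegmenter` helper. That ltr forwards the `--locale` value to the API unchanged is an assumption.

```javascript
// Hypothetical helper: build a segmenter for an optional locale tag.
// Passing undefined lets the runtime pick its default locale.
function makeSegmenter(granularity, locale) {
  return new Intl.Segmenter(locale, { granularity });
}

const ro = makeSegmenter("sentence", "ro");
console.log(ro.resolvedOptions().locale);

const fallback = makeSegmenter("sentence");
console.log(fallback.resolvedOptions().locale); // depends on the environment

// Structurally invalid tags are rejected with a RangeError.
let rejected = false;
try {
  makeSegmenter("sentence", "not_a_locale");
} catch (err) {
  rejected = err instanceof RangeError;
}
console.log(rejected);
```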

-u, --unique

Return unique values, removing any duplicates.

ltr words --unique my-doc.txt

-i, --ignore-case

Ignore case when performing operations. Causes values to be returned in lowercase.

ltr words --ignore-case my-doc.txt

-I, --ignore-accents

Ignore diacritical marks when performing operations. Causes values to be returned without diacritical marks.

ltr words --ignore-accents my-doc.txt
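The --unique, --ignore-case and --ignore-accents options can be pictured as a folding step followed by deduplication. This is a minimal sketch under stated assumptions: the `fold` helper is hypothetical, and stripping Unicode combining marks after NFD normalization is one common way to remove diacritics, not necessarily the one ltr uses.

```javascript
// Hypothetical helper: normalize a value before comparing it with others.
function fold(value, { ignoreCase = false, ignoreAccents = false } = {}) {
  let result = value;
  if (ignoreAccents) {
    // Decompose characters, drop combining marks, then recompose.
    result = result.normalize("NFD").replace(/\p{M}/gu, "").normalize("NFC");
  }
  if (ignoreCase) {
    result = result.toLowerCase();
  }
  return result;
}

const words = ["Café", "cafe", "CAFE", "naïve"];
const folded = words.map((w) => fold(w, { ignoreCase: true, ignoreAccents: true }));

// A Set keeps the first occurrence of each value, in insertion order.
const uniqueWords = [...new Set(folded)];
console.log(uniqueWords); // [ 'cafe', 'naive' ]
```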

-c, --count

Count occurrences of each unique value.

ltr words --count my-doc.txt

-t, --total

Count total occurrences. The option implies --count.

ltr words --total my-doc.txt
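The counting behind --count and --total can be sketched with a Map, shown below. The `countValues` helper and the tab-separated output layout are illustrative assumptions; ltr's actual output format may differ.

```javascript
// Hypothetical helper: tally how often each value occurs.
// A Map preserves the order in which values were first seen.
function countValues(values) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) ?? 0) + 1);
  }
  return counts;
}

const counts = countValues(["a", "b", "a", "c", "a"]);
for (const [value, n] of counts) {
  console.log(`${n}\t${value}`);
}

// The total is the sum of all per-value counts.
const total = [...counts.values()].reduce((sum, n) => sum + n, 0);
console.log(`${total}\ttotal`);
```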

-s, --sort

Sort the values.

ltr words --sort my-doc.txt

When --count is present, values are sorted by occurrences, from most frequent to least. Otherwise values are sorted alphabetically in ascending order.
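The two sorting modes described above could look like the following sketch. Both helpers are hypothetical. Using Intl.Collator for the alphabetical case is an assumption that fits a locale-aware tool, and ties between equal counts keep their original order because Array.prototype.sort is stable.

```javascript
// Hypothetical helper: alphabetical, ascending, locale-aware.
function sortValues(values, locale) {
  return [...values].sort(new Intl.Collator(locale).compare);
}

// Hypothetical helper: by occurrences, most frequent first.
// Takes a Map of value -> count and returns [value, count] pairs.
function sortCounts(counts) {
  return [...counts].sort((a, b) => b[1] - a[1]);
}

console.log(sortValues(["pear", "apple", "fig"], "en"));
// [ 'apple', 'fig', 'pear' ]

console.log(sortCounts(new Map([["a", 1], ["b", 3], ["c", 2]])));
// [ [ 'b', 3 ], [ 'c', 2 ], [ 'a', 1 ] ]
```

Calling `.reverse()` on either result gives the effect described for --reverse.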

-r, --reverse

Reverse the order of the values. It can be used to reverse the sorting order, but can also be used on its own to list values in the reverse order of occurrence.

ltr words --sort --reverse my-doc.txt

Working with HTML and Markdown

Although you can feed HTML and Markdown to ltr, the list of returned values will include the noise of markup constructs.

You can convert HTML or Markdown to plain text with trimd before calling ltr:

# Using Markdown:
trimd demarkdown my-post.md | ltr words --count --total

# Using HTML:
trimd demarkup my-page.html | ltr words --count --total

Furthermore, when working with HTML documents, you may want to focus on the main part of the content to reduce interference from ancillary page content. You can use hred to extract the content of a single element:

# Using HTML, just the <main> content:
cat my-page.html | trimd demarkup | ltr words --count --total