wring

v1.0.0

Extract content from websites using CSS Selectors and XPath

Installation

You can install wring using npm:

$ npm install --global wring

wring uses PhantomJS for some of its commands. To use them, install PhantomJS with your system package manager, for example brew install phantomjs on OS X or apt-get install phantomjs on Ubuntu. You can verify that it is on your PATH by running phantomjs -v.

Alternatively, you can install a version which automatically downloads PhantomJS binaries for your system:

$ npm install --global wring-with-phantomjs

Usage

Here is a simple example which prints the text content of the matching element (it uses Cheerio under the hood):

$ wring text 'https://www.google.com/finance/converter?a=1&from=EUR&to=USD' '#currency_converter_result'
1 EUR = 1.0940 USD

# You can use the first letter of a command as a shortcut
$ wring t http://randomfunfacts.com/ 'i'
No president of the United States was an only child.

You can also use jQuery-specific selectors such as :contains():

$ wring t 'https://en.wikipedia.org/wiki/List_of_songs_recorded_by_Taylor_Swift' 'tr:contains("The Hunger Games") th:first-child'
"Eyes Open"
"Safe & Sound"

wring html prints the outerHTML of matching elements. Let's try it, this time using an XPath expression:

$ wring html "http://news.ycombinator.com" "//td[@class='title']/a[starts-with(@href,'http')]"
<a href="http://eftimov.net/postgresql-indexes-first-principles">PostgreSQL Indexes: First principles</a>
<a href="http://inference-review.com/article/doing-mathematics-differently">Doing Mathematics Differently</a>
<a href="https://blog.chartmogul.com/api-based-saas/">The rise of the API-based SaaS</a>
<a href="https://github.com/tallesl/Rich-Hickey-fanclub">Rich Hickey Fanclub</a>
...

The first argument of a command specifies its input, which can be a URL, a path to a file, an HTML string, or - to read the page source from stdin:

# read from file
$ curl 'http://www.purescript.org/' > page.html
$ wring t page.html '.intro h2'
PureScript is a small strongly typed programming language that compiles to JavaScript.

# read from string
$ wring text '<div class="foo">Hello</div>' '.foo'
Hello

# read from stdin
$ curl -s 'http://www.merriam-webster.com/word-of-the-day' | wring text - '.word-and-pronunciation h1'
keelhaul
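
One way to picture the four input forms above is as a small classifier over the first argument. The sketch below is only an illustration: `classifyInput` is a hypothetical helper, and this README does not say how wring actually tells the cases apart, so the heuristics (an `http(s)://` prefix for URLs, a leading `<` for HTML strings) are assumptions.

```javascript
// Hypothetical helper, not part of wring: guesses which kind of input
// a first argument represents, following the four cases listed above.
function classifyInput(arg) {
  if (arg === '-') return 'stdin';                // read the page source from stdin
  if (/^https?:\/\//i.test(arg)) return 'url';    // assumption: only http(s) URLs
  if (arg.trim().startsWith('<')) return 'html';  // assumption: markup starts with "<"
  return 'file';                                  // otherwise treat it as a file path
}

console.log(classifyInput('page.html'));
console.log(classifyInput('<div class="foo">Hello</div>'));
```

Under these assumptions, anything that is neither `-`, a URL, nor markup falls through to the file case, which mirrors the `page.html` example above.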

Using with PhantomJS

Prefixing a command with phantomjs or p will run it using jQuery inside a real web browser context. You can use this if you are having compatibility problems with the commands above, but the real utility comes from being able to scrape dynamically generated content:

$ wring p t '<title>Foo</title> <script>document.title = "Bar";</script>' 'title'
Bar

# compare it to the non-phantomjs invocation below
$ wring t '<title>Foo</title> <script>document.title = "Bar";</script>' 'title'
Foo

wring eval lets you evaluate JavaScript inside any page. Calling wring('str') writes the string to the terminal. You can pass any number of .js file paths, URLs, and JS expressions as script arguments, and they will be executed in the given order:

$ wring eval 'http://ipfs.io' 'wring(document.title)'
IPFS is a new peer-to-peer hypermedia protocol.

# you can load and use third party libraries:
$ wring e 'http://ipfs.io' 'http://cdn.jsdelivr.net/lodash/4.5.1/lodash.js' 'wring(_.kebabCase(document.title))'
ipfs-is-a-new-peer-to-peer-hypermedia-protocol

You can also use a trick to make self-contained, executable scripts.

Here is a contrived example which loads the Hacker News homepage, loads lodash, sorts posts by their score, and prints the top 5:

#!/bin/sh
":" //; exec wring eval "https://news.ycombinator.com" "https://cdn.jsdelivr.net/lodash/4.5.1/lodash.js" "$0"

var posts = _.map(
  document.querySelectorAll(".votelinks + .title > a"),
  function(el) {
    return el.textContent + "\n" + el.href;
  })

var scores = _.map(
  document.querySelectorAll(".score"),
  function (el) {
    return parseInt(el.textContent, 10);
  })

_(posts)
  .zipWith(scores, function (text, score) {
    return { text: text, score: score };
  })
  .orderBy("score", "desc")
  .take(5)
  .forEach(function (item) {
    wring(item.text + "\n");
  })

# after saving the source above to `wring_hn.js` you can run it like this
$ chmod +x wring_hn.js
$ ./wring_hn.js
Raspberry Pi 3 Model B confirmed, with onboard BT LE and WiFi
https://apps.fcc.gov/oetcf/eas/reports/...

After fifteen years of downtime, the MetaFilter gopher server is back
http://metatalk.metafilter.com/24019/...
...
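
The self-contained script above works because its second line is valid in both languages. To sh, `":"` is the no-op command `:` given the argument `//`, after which `exec` hands the file itself (`$0`) to wring eval. To a JavaScript engine, the same line is a string literal followed by a `//` comment. The snippet below checks the JavaScript half of that claim. It is a minimal sketch: `isValidJs` is a hypothetical helper, and how wring treats the `#!/bin/sh` line is not described in this README, so that line is left out.

```javascript
// Hypothetical helper: returns true if `source` parses as a JavaScript
// function body. new Function() only compiles the code; it does not run it.
function isValidJs(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    return false;
  }
}

// Second line of the script above.
const polyglotLine =
  '":" //; exec wring eval "https://news.ycombinator.com" "https://cdn.jsdelivr.net/lodash/4.5.1/lodash.js" "$0"';

console.log(isValidJs(polyglotLine));            // string literal + comment parses
console.log(isValidJs('exec wring eval "$0"'));  // the bare shell command does not
```

Without the `":" //` prefix the `exec` line would be a syntax error in JavaScript, which is why the prefix is needed.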

The last command to cover is wring shot, which renders a screenshot of the first matching element and saves it to a file:

$ wring shot 'https://www.google.com/finance?q=GOOG' '#price-panel' goog.png
wring: Saved to goog.png

The resulting goog.png will contain something like this:

[Image: GOOG]

Development

# Install Node.js dependencies:
$ npm install

# Install PureScript dependencies:
$ bower install

# Build `wring.js` and `phantom-main.js`:
$ npm run build

# Run tests:
$ npm test

# Compile & run using Pulp (https://github.com/bodil/pulp):
$ pulp run text '<b>foo</b>' 'b'

License

MIT