npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

node-metainspector

v2.0.0

Published

Npm package for web scraping purposes. You give it an URL, and it lets you easily get its title, links, images, description, keywords, meta tags

Downloads

2,558

Readme

status

Node-Metainspector

MetaInspector is an npm package for web scraping purposes. You give it an URL, and it lets you easily get its title, links, images, description, keywords, meta tags....

Metainspector is inspired by the Metainspector gem by jaimeiniesta

This version requires node v6 or higher, as some dependencies make use of various bits of ES6 functionality. The 1.x.x versions are compatible with v0.x - v4 releases of node, and should be used instead for older applications.

Scraped data

client.url                  # URL of the page
client.scheme               # Scheme of the page (http, https)
client.host                 # Hostname of the page (like, markupvalidator.com, without the scheme)
client.rootUrl              # Root url (scheme + host, i.e http://simple.com/)
client.title                # title of the page, as string
client.links                # array of strings, with every link found on the page as an absolute URL
client.author               # page author, as string
client.keywords             # keywords from meta tag, as array
client.charset              # page charset from meta tag, as string
client.description          # returns the meta description, or the first long paragraph if no meta description is found
client.image                # Most relevant image, if defined with og:image
client.images               # array of strings, with every img found on the page as an absolute URL
client.feeds                # Get rss or atom links in meta data fields as array
client.ogTitle              # opengraph title
client.ogDescription        # opengraph description
client.ogType               # Open Graph Object Type
client.ogUpdatedTime        # Open Graph Updated Time
client.ogLocale             # Open Graph Locale - for languages

Options

timeout - Defines the time Metainspector will wait for the url to respond in ms
maxRedirects - Specifies the number of redirects Metainspector will follow
limit - The limit in the number of bytes Metainspector will download when querying a site

Usage

var MetaInspector = require('node-metainspector');
var client = new MetaInspector("http://www.google.com", { timeout: 5000 });

client.on("fetch", function(){
    console.log("Description: " + client.description);

    console.log("Links: " + client.links.join(","));
});

client.on("error", function(err){
	console.log(err);
});

client.fetch();

TO DO

Finish implementation of the properties below:

Add absolutify url function to return all urls as an absolute url

client.internal_links     	# array of strings, with every internal link found on the page as an absolute URL
client.external_links     	# array of strings, with every external link found on the page as an absolute URL

ZOMG Fork! Thank you!

You're welcome to fork this project and send pull requests. Just remember to include tests.

Copyright (c) 2009-2012 Gabriel Cebrian, released under the MIT license

Bitdeli Badge