npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

markdown_link_checker_sc

v0.0.134

Published

Markdown Link Checker

Downloads

2,190

Readme

markdown link checker sc

ALPHA - Mostly just attempting better handling of internal links. Probably never coming out of alpha. Bit of fun.

Markdown link checker in node. Better handling of internal link checking.

Current version only does internal link checking

Usage: markdown_link_checker_sc [options]

Options:
  -r, --root <path>                   Root directory of your source (i.e. root of github repo). Use -d as well to specify a folder if docs are not in the root, or to just
                                      run on particular subfolder. Defaults to current directory. (default: "D:\\github\\hamishwillee\\markdown_link_checker_sc")
  -d, --directory [directory]         The directory to search for markdown and html files, relative to root - such as: `en` for an English subfolder. Default empty (same
                                      as -r directory) (default: "")
  -i, --imagedir [directory]          The directory to search for all image files for global orphan checking, relative to root - such as: `assets` or `en`. Default empty
                                      if not explicitly set, and global orphan checking will not be done (default: "")
  -c, --headingAnchorSlugify [value]  Slugify approach for turning markdown headings into heading anchors. Currently support vuepress only and always (default: "vuepress")
  -t, --tryMarkdownforHTML [value]    Try a markdown file extension check if a link to HTML fails. (default: true)
  -l, --log <types...>                Types of console logs to display logs for debugging. Types: functions, todo etc.
  -f, --files <path>                  JSON file with array of files to report on (default is all files). Paths are relative relative to -d by default, but -r can be used
                                      to set a different root. (default: "")
  -s, --toc [value]                   full filename of TOC/Summary file in file system. If not specified, inferred from file with most links to other files
  -u, --site_url [value]              Site base url in form dev.example.com (used to catch absolute urls to local files)
  -o, --logtofile [value]             Output logs to file (default: true)
  -p, --interactive [value]           Interactively add errors to the ignore list at _link_checker_sc/ignore_errors.json (default: false)
  -h, --help                          display help for command

What link formats can it match

Currently matches:

  • \[Link text](url#anchor), \[Link text](url#anchor?param1=value...), \[Link text](url?param1=value...#anchor)
  • [Link text](url#anchor "title")
  • ![Image alt](url)
  • <a href="someurl#someanchor?someparams" title="sometitle">some text</a>
  • <img src="someurl" title="sometitle" />
  • <img src="someurl" title="sometitle" />
  • [reference link text][reference name], where the reference is define as [reference name]: reference_url "reference title"
    • Only supports reference name and text format - not "plain reference name" like [reference name]
    • reference must be all on one line, and can have up to three whitespaces before it on line. May not have text after reference title.

Note: It uses simple regexp. If you have a link commented out, or inside a code block that may well be captured.

There are heaps of link formats it does not match:

  • <http://www.whatever.com> - doesn't support autolinks
  • www.fred.com - Doesn't support auto-links external.
  • [![image title](imageurl)](linkurl)- Doesn't properly support a link around an image.
  • Reference links where the reference is defined across lines.

Essentially lots of the other things https://github.github.com/gfm/

The regex that drives this is very simple.

There are many other alternatives, such as: https://github.com/tcort/markdown-link-check You might also use a tokenziker or round trip to HTML using something like https://marked.js.org/using_advanced#inline in future as HTML is eaiser to extract links from.

This does catch a LOT of cases though, and is pretty quick.

Also does

  • Catches markdown files that are orphans - i.e. not linked by any file, or not linked by file which has the most links (normally the TOC file)
  • Catches orphan images
  • Allows you to specify that some errors are OK to ignore. These are stored in a file. See -i options\

TODO

Anchors that are not url escaped can trip it up.

  • You can URL escape them like this: Airframe Reference
  • BUT the comparision of the anchor is NOT url escaped, so you would need to convert back to compare.
  • URL escaping anchors and reversing for comparison is not done.

Anchors defined in id in a or span are caught. Need to check those in video, div are also caught and used in internal link checking.

Get images in/around the source files that are not linked - i.e. orphan images.

How does it work?

The way this works:

  • Specify the directory and it will search below that for all markdown/html files.
  • It loads each file, and:
    • parses for markdown and html style links for both page and image links.
    • parses headings and builds list of anchors in the page (as per vuepress) for those headings (poorly tested code)
    • parses for html elements with ids, that are also link targets.
  • Then it parses the results, comparing internal links to internal anchors to see if it can find matching files, and matching headings or anchors within files.
    • If it can't match, that is an error - which gets added to the list of errors with a type and enough detail to handle.
  • Errors are then filtered based on:
    • pages in a json file defined using -f (and possibly -r).
  • Finally the output is exported in a markdown friendly output format.

Lots more to do, but this already catches lots of internal errors.