remark-parse-no-trim

v8.0.4

remark plugin to parse Markdown

Downloads

67,533

remark-parse

Parser for unified. Parses Markdown to mdast syntax trees. Used in the remark processor but can be used on its own as well. Can be extended to change how Markdown is parsed.

Install

npm:

npm install remark-parse

Use

var unified = require('unified')
var createStream = require('unified-stream')
var markdown = require('remark-parse')
var remark2rehype = require('remark-rehype')
var html = require('rehype-stringify')

var processor = unified()
  .use(markdown, {commonmark: true})
  .use(remark2rehype)
  .use(html)

process.stdin.pipe(createStream(processor)).pipe(process.stdout)

See unified for more examples »

API

See unified for API docs »

processor().use(parse[, options])

Configure the processor to read Markdown as input and process mdast syntax trees.

options

Options can be passed directly, or passed later through processor.data().

options.gfm

GFM mode (boolean, default: true).

hello ~~hi~~ world

Turns on:

  • Fenced code blocks
  • Autolinking of URLs
  • Deletions (strikethrough)
  • Task lists
  • Tables

options.commonmark

CommonMark mode (boolean, default: false).

This is a paragraph
    and this is also part of the preceding paragraph.

Allows:

  • Empty lines to split block quotes
  • Parentheses, ( and ), around link and image titles
  • Any escaped ASCII punctuation character
  • A closing parenthesis, ), as an ordered list marker
  • URL definitions in block quotes

Disallows:

  • Indented code blocks directly following a paragraph
  • ATX headings (# Hash headings) without spacing after opening hashes or before closing hashes
  • Setext headings (Underline headings\n---) when following a paragraph
  • Newlines in link and image titles
  • White space in link and image URLs in auto-links (links in brackets, < and >)
  • Lazy block quote continuation, lines not preceded by a greater than character (>), for lists, code, and thematic breaks

options.pedantic

⚠️ Pedantic mode was previously used to mimic old-style Markdown: no tables, no fenced code, and many bugs. It still “works” for now, but please do not use it: it will be removed in the future.

options.blocks

Blocks (Array.<string>, default: list of block HTML elements).

<block>foo
</block>

Defines which HTML elements are seen as block level.

parse.Parser

Access to the parser, if you need it.

Extending the Parser

Typically, using transformers to manipulate a syntax tree produces the desired output. Sometimes, such as when introducing new syntactic entities with a certain precedence, interfacing with the parser is necessary.

If the remark-parse plugin is used, it adds a Parser constructor function to the processor. Other plugins can add tokenizers to its prototype to change how Markdown is parsed.

The plugin below adds a tokenizer for at-mentions.

module.exports = mentions

function mentions() {
  var Parser = this.Parser
  var tokenizers = Parser.prototype.inlineTokenizers
  var methods = Parser.prototype.inlineMethods

  // Add an inline tokenizer (defined in the following example).
  tokenizers.mention = tokenizeMention

  // Run it just before `text`.
  methods.splice(methods.indexOf('text'), 0, 'mention')
}

Parser#blockTokenizers

Map of names to tokenizers (Object.<Function>). These tokenizers (such as fencedCode, table, and paragraph) eat from the start of a value to a line ending.

See #blockMethods below for a list of methods that are included by default.

Parser#blockMethods

List of blockTokenizers names (Array.<string>). Specifies the order in which tokenizers run.

Precedence of default block methods is as follows:

  • blankLine
  • indentedCode
  • fencedCode
  • blockquote
  • atxHeading
  • thematicBreak
  • list
  • setextHeading
  • html
  • definition
  • table
  • paragraph
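To see why this order matters, here is a minimal sketch of first-match dispatch. The matchers object and the firstMatch function are hypothetical and heavily simplified. They are not remark-parse's tokenizers, only an illustration of how an ordered list of names decides which tokenizer wins.

```javascript
// Simplified, hypothetical matchers; the real tokenizers are far more involved.
var matchers = {
  atxHeading: function (line) {
    return /^#{1,6}\s/.test(line)
  },
  thematicBreak: function (line) {
    return /^(\*{3,}|-{3,}|_{3,})$/.test(line)
  },
  paragraph: function (line) {
    return line.length > 0
  }
}

// Return the name of the first method whose matcher accepts the line.
function firstMatch(methods, line) {
  for (var index = 0; index < methods.length; index++) {
    if (matchers[methods[index]](line)) {
      return methods[index]
    }
  }

  return null
}

// Same relative order as in the list above.
var defaultOrder = ['atxHeading', 'thematicBreak', 'paragraph']
// A deliberately wrong order: the catch-all paragraph matcher runs first.
var paragraphFirst = ['paragraph', 'atxHeading', 'thematicBreak']

var withDefaultOrder = firstMatch(defaultOrder, '---')
var withParagraphFirst = firstMatch(paragraphFirst, '---')
```

With the default order the line is recognised as a thematic break. With paragraph first, the catch-all matcher claims it before the more specific one gets a chance.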

Parser#inlineTokenizers

Map of names to tokenizers (Object.<Function>). These tokenizers (such as url, reference, and emphasis) eat from the start of a value. To increase performance, they depend on locators.

See #inlineMethods below for a list of methods that are included by default.

Parser#inlineMethods

List of inlineTokenizers names (Array.<string>). Specifies the order in which tokenizers run.

Precedence of default inline methods is as follows:

  • escape
  • autoLink
  • url
  • email
  • html
  • link
  • reference
  • strong
  • emphasis
  • deletion
  • code
  • break
  • text
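As a rough illustration of placing a custom method relative to these defaults, the sketch below works on a plain copy of the list above rather than on Parser.prototype.inlineMethods. The insertBefore helper is hypothetical and not part of remark-parse.

```javascript
// Plain copy of the default inline method order listed above.
var inlineMethods = [
  'escape', 'autoLink', 'url', 'email', 'html', 'link', 'reference',
  'strong', 'emphasis', 'deletion', 'code', 'break', 'text'
]

// Hypothetical helper: insert `name` so that it runs just before `existing`.
function insertBefore(methods, existing, name) {
  var index = methods.indexOf(existing)

  if (index === -1) {
    throw new Error('Unknown method: ' + existing)
  }

  methods.splice(index, 0, name)
  return methods
}

insertBefore(inlineMethods, 'text', 'mention')
```

Because text accepts anything, a custom method placed after it would presumably never get a chance to run, which is why the mentions plugin splices its name in before text.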

function tokenizer(eat, value, silent)

There are two types of tokenizers: block level and inline level. Both are functions, and work the same, but inline tokenizers must have a locator.

The following example shows an inline tokenizer that is added by the mentions plugin above.

tokenizeMention.notInLink = true
tokenizeMention.locator = locateMention

function tokenizeMention(eat, value, silent) {
  var match = /^@(\w+)/.exec(value)

  if (match) {
    if (silent) {
      return true
    }

    return eat(match[0])({
      type: 'link',
      url: 'https://social-network/' + match[1],
      children: [{type: 'text', value: match[0]}]
    })
  }
}

Tokenizers test whether a document starts with a certain syntactic entity. In silent mode, they return whether that test passes. In normal mode, they consume that token, a process which is called “eating”.

Locators enable inline tokenizers to function faster by providing where the next entity may occur.

Signatures

  • Node? = tokenizer(eat, value)
  • boolean? = tokenizer(eat, value, silent)

Parameters

  • eat (Function) - Eat, when applicable, an entity
  • value (string) - Value which may start an entity
  • silent (boolean, optional) - Whether to detect or consume

Properties

  • locator (Function) - Required for inline tokenizers
  • onlyAtStart (boolean) - Whether nodes can only be found at the beginning of the document
  • notInBlock (boolean) - Whether nodes cannot be in block quotes or lists
  • notInList (boolean) - Whether nodes cannot be in lists
  • notInLink (boolean) - Whether nodes cannot be in links

Returns

  • boolean?, in silent mode - Whether a node can be found at the start of value
  • Node?, in normal mode - The node, if one can be found at the start of value
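The two modes can be tried outside the parser with a stand-in for eat. In this sketch, stubEat is a hypothetical stub that only records what was eaten and hands the node back unchanged. The real eat also tracks positions, and the name alice is made up for the example.

```javascript
// The mention tokenizer from the example above.
function tokenizeMention(eat, value, silent) {
  var match = /^@(\w+)/.exec(value)

  if (match) {
    if (silent) {
      return true
    }

    return eat(match[0])({
      type: 'link',
      url: 'https://social-network/' + match[1],
      children: [{type: 'text', value: match[0]}]
    })
  }
}

// Hypothetical stub for `eat`: records the eaten text, returns an `add`
// function that simply gives the node back.
var eaten = []

function stubEat(subvalue) {
  eaten.push(subvalue)

  return function add(node) {
    return node
  }
}

// Silent mode: only detects, nothing is eaten.
var detected = tokenizeMention(stubEat, '@alice says hi', true)
// Normal mode: eats `@alice` and returns the node.
var node = tokenizeMention(stubEat, '@alice says hi', false)
// No mention at the start of the value: returns undefined.
var missed = tokenizeMention(stubEat, 'hi @alice', false)
```

Note that the last call finds nothing even though the value contains a mention, because tokenizers only test the start of the value. Finding where to try next is the job of the locator.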

tokenizer.locator(value, fromIndex)

Locators are required for inline tokenizers. Their role is to keep parsing performant.

The following example shows a locator that is added by the mentions tokenizer above.

function locateMention(value, fromIndex) {
  return value.indexOf('@', fromIndex)
}

Locators enable inline tokenizers to function faster by providing information on where the next entity may occur. Locators may be wrong: it’s OK if there isn’t actually a node at the index they return.

Parameters

  • value (string) - Value which may contain an entity
  • fromIndex (number) - Position to start searching at

Returns

number - Index at which an entity may start, or -1 if none is found.
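The sketch below shows how a caller might combine a locator with a match test, including the case where the locator is wrong. The findMentions driver is hypothetical and is not remark-parse's actual loop. It only illustrates jumping between candidate indices instead of testing every character.

```javascript
// The locator from the example above.
function locateMention(value, fromIndex) {
  return value.indexOf('@', fromIndex)
}

// Hypothetical driver: jump to each candidate index and test whether a
// mention really starts there.
function findMentions(value) {
  var found = []
  var index = locateMention(value, 0)
  var match

  while (index !== -1) {
    match = /^@(\w+)/.exec(value.slice(index))

    if (match) {
      found.push(match[1])
      index = locateMention(value, index + match[0].length)
    } else {
      // False positive: the locator pointed at an `@` that starts no mention.
      index = locateMention(value, index + 1)
    }
  }

  return found
}

var names = findMentions('meet @ noon, then ping @alice and @bob')
```

The first @ is followed by a space, so the locator's hint is a false positive. The driver skips past it and still finds alice and bob.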

eat(subvalue)

var add = eat('foo')

Eat subvalue, which is a string at the start of the tokenized value.

Parameters

  • subvalue (string) - Value to eat

Returns

add.

add(node[, parent])

var add = eat('foo')

add({type: 'text', value: 'foo'})

Add positional information to node and add node to parent.

Parameters

  • node (Node) - Node to patch position on and to add
  • parent (Parent, optional) - Place to add node to in the syntax tree. Defaults to the currently processed node

Returns

Node - The given node.
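To make the relationship between eat and add more concrete, here is a deliberately simplified model. The createEat function is hypothetical and is not remark-parse's implementation. It tracks only a character offset, whereas real positions also carry line and column, and it requires an explicit parent instead of defaulting to the currently processed node.

```javascript
// Hypothetical, simplified model of `eat` and `add`.
function createEat(value) {
  var offset = 0

  return function eat(subvalue) {
    // `subvalue` must sit at the start of what is left of the value.
    if (value.slice(offset, offset + subvalue.length) !== subvalue) {
      throw new Error('Cannot eat `' + subvalue + '`: not at the start of the remaining value')
    }

    var start = offset
    offset += subvalue.length
    var end = offset

    return function add(node, parent) {
      // Patch positional information, then attach the node if a parent is given.
      node.position = {start: {offset: start}, end: {offset: end}}

      if (parent) {
        parent.children.push(node)
      }

      return node
    }
  }
}

var root = {type: 'root', children: []}
var eat = createEat('foo bar')
var first = eat('foo')({type: 'text', value: 'foo'}, root)
var second = eat(' bar')({type: 'text', value: ' bar'}, root)
```

Each call to eat advances the offset, so the second node starts where the first one ended.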

add.test()

Get the positional information that would be patched on node by add.

Returns

Position.

add.reset(node[, parent])

Like add, but resets the internal position. Useful, for example, in lists, where the same content is first eaten for a list, and later for list items.

Parameters

  • node (Node) - Node to patch position on and insert
  • parent (Node, optional) - Place to add node to in the syntax tree. Defaults to the currently processed node

Returns

Node - The given node.

Turning off a tokenizer

In some situations, you may want to turn off a tokenizer to avoid parsing that syntactic feature.

Preferably, use the remark-disable-tokenizers plugin to turn off tokenizers.

Alternatively, this can be done by replacing the tokenizer in blockTokenizers (or blockMethods) or inlineTokenizers (or inlineMethods).

The following example turns off indented code blocks:

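var remarkParse = require('remark-parse')

// Replace the indented code tokenizer with a stub that never eats anything.
remarkParse.Parser.prototype.blockTokenizers.indentedCode = indentedCode

function indentedCode() {
  return true
}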

Security

As Markdown is sometimes used for HTML, and improper use of HTML can open you up to a cross-site scripting (XSS) attack, use of remark can also be unsafe. When going to HTML, use remark in combination with the rehype ecosystem, and use rehype-sanitize to make the tree safe.

Use of remark plugins could also open you up to other attacks. Carefully assess each plugin and the risks involved in using them.

Contribute

See contributing.md in remarkjs/.github for ways to get started. See support.md for ways to get help. Ideas for new plugins and tools can be posted in remarkjs/ideas.

A curated list of awesome remark resources can be found in awesome remark.

This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

License

MIT © Titus Wormer