code-blocks

v1.1.0

Published

3 years ago

Parse fenced code blocks from markdown with useful metadata

shawnbot

code blocks markdown

code-blocks

Parse fenced code blocks from Markdown with useful metadata.

npm install [--save | --save-dev] code-blocks

Usage

// CommonJS
const codeBlocks = require('code-blocks')
// Or, as an ES module (ES2015+, Babel, etc.), use this line instead:
// import codeBlocks from 'code-blocks'

codeBlocks.fromFile('README.md')
  .then(blocks => {
    // do stuff with blocks here
  })

See the API documentation for more examples.

How it works

This library uses remark to parse Markdown into a unist tree, then finds all of the fenced code blocks. Those with a language identifier after the opening ``` or ~~~ get some additional properties.
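As a rough illustration of the "find all of the code blocks" step, the sketch below walks a unist-style tree by hand and collects every node whose type is code. The helper name collectCodeNodes is hypothetical and is not part of this library's API; the sketch assumes only that parent nodes carry a children array, as mdast trees do, and it does not try to tell fenced blocks from indented ones.

```javascript
// Hypothetical helper, not part of the code-blocks API.
// Depth-first walk over a unist-style tree, collecting `code` nodes
// in document order.
function collectCodeNodes(node, found = []) {
  if (node.type === 'code') {
    found.push(node)
  }
  // Only parent nodes have a `children` array; leaves are skipped.
  if (Array.isArray(node.children)) {
    for (const child of node.children) {
      collectCodeNodes(child, found)
    }
  }
  return found
}

// A tiny hand-built tree standing in for a parsed Markdown document.
const exampleTree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Hi' }] },
    { type: 'code', lang: 'js', value: 'let a = 1' },
    {
      type: 'blockquote',
      children: [{ type: 'code', lang: null, value: 'nested' }]
    }
  ]
}

const exampleCodeNodes = collectCodeNodes(exampleTree)
```

Here exampleCodeNodes holds both code nodes, including the one nested inside the blockquote.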

Code block info

According to the CommonMark Spec:

The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string.

In other words, CommonMark-compliant parsers can safely ignore everything after the language identifier. That is where we attach additional key/value pairs, which are parsed into each code block's info object, as in:

```html title="A dumb example" foo=bar "x.y.z"="1 2 3"
<h1>Hello, world!</h1>
```

When parsed with code-blocks, this would yield an array with one object:

[{
  type: 'code',
  lang: 'html',
  value: '<h1>Hello, world!</h1>',
  info: {
    title: 'A dumb example',
    foo: 'bar',
    'x.y.z': '1 2 3'
  },
  title: 'A dumb example',
  source: {
    file: 'README.md',
    line: 1
  },
  position: {
    // see https://github.com/syntax-tree/unist#position
  }
}]
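To make the mapping from info string to lang and info concrete, here is a minimal sketch of one way such a string could be split. The function parseInfoString and its regular expression are assumptions made for illustration; they are not taken from this library's source, and the real parser may handle quoting and edge cases differently.

```javascript
// Hypothetical helper, not part of the code-blocks API.
// Splits an info string into a language identifier and key/value pairs.
function parseInfoString(infoString) {
  const trimmed = infoString.trim()

  // The first whitespace-delimited word is the language identifier.
  const firstSpace = trimmed.search(/\s/)
  const lang = firstSpace === -1 ? trimmed : trimmed.slice(0, firstSpace)
  const rest = firstSpace === -1 ? '' : trimmed.slice(firstSpace)

  // key=value pairs; either side may be bare or wrapped in double quotes.
  const pair = /(?:"([^"]*)"|([^\s="]+))=(?:"([^"]*)"|([^\s"]+))/g
  const info = {}
  for (const match of rest.matchAll(pair)) {
    const key = match[1] !== undefined ? match[1] : match[2]
    const value = match[3] !== undefined ? match[3] : match[4]
    info[key] = value
  }

  return { lang: lang || null, info }
}

const parsedExample = parseInfoString(
  'html title="A dumb example" foo=bar "x.y.z"="1 2 3"'
)
// parsedExample.lang is 'html'
// parsedExample.info is { title: 'A dumb example', foo: 'bar', 'x.y.z': '1 2 3' }
```

Words after the language identifier that are not part of a key=value pair are simply skipped in this sketch.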

Node properties

The unist node objects returned by all of the block parsing functions are "enhanced" with the following properties:

- lang contains only the first "word" of the info string.
- info is an object of key/value pairs parsed from the remainder of the info string.
- title is the title of the code block, as determined by the algorithm described under "Code block titles".
- source is an object with two keys:
  - file is the path provided as the first argument to fromFile() and fromFileSync(), or as the last argument to fromString() or fromAST(). (If no file is provided, this value will be buffer.)
  - line is the starting line of the code block in the Markdown input.
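As a small consumer-side illustration of these properties, the sketch below formats a one-line label for a block. The describeBlock helper is hypothetical, and the sample object is hand-written to mirror the shape described above rather than produced by the library; the fallback to 'text' for blocks without a language is also an assumption.

```javascript
// Hypothetical helper, not part of the code-blocks API.
// Builds a short label from the enhanced properties of a block.
function describeBlock(block) {
  // Blocks without a language identifier may have no `lang`.
  const lang = block.lang || 'text'
  return `${block.title} [${lang}] (${block.source.file}:${block.source.line})`
}

// Hand-written sample mirroring the node shape described above.
const sampleBlock = {
  type: 'code',
  lang: 'html',
  value: '<h1>Hello, world!</h1>',
  info: { title: 'A dumb example', foo: 'bar' },
  title: 'A dumb example',
  source: { file: 'README.md', line: 1 }
}

const sampleLabel = describeBlock(sampleBlock)
// sampleLabel is 'A dumb example [html] (README.md:1)'
```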

See the mdast documentation for more info about the Code nodes generated by remark, and the unist documentation for more on the underlying structures.

Code block titles

Because code blocks are often meaningless without at least some context, every block parsed gets a title according to the following algorithm:

1. If a title key is found in the block's info object, use that.
2. Otherwise, find the previous heading in the Markdown and use its text.
   - If two or more code blocks share the same heading, add a numeric suffix: (2) for the second, (3) for the third, and so on.
3. If no previous heading is found, provide a title that describes where it comes from, in the form:
   ```
   Code block {n} from {filename}:{line}
   ```
   Where {n} is the 1-based index of the code block in the parsed file, {filename} is the parsed file (or buffer), and {line} is the line at which the code block starts.
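The title rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: assignTitles is a hypothetical name, the input is a simplified flat list of heading and code objects rather than a real unist tree, and the sketch assumes that blocks with an explicit title do not count toward a heading's numeric suffix.

```javascript
// Hypothetical sketch, not part of the code-blocks API.
// `nodes` is a flat, document-ordered list of
// { type: 'heading', text } and { type: 'code', info, line } objects.
function assignTitles(nodes, filename = 'buffer') {
  let heading = null      // text of the most recent heading, if any
  let blockIndex = 0      // 1-based index of the code block in the file
  const seen = new Map()  // heading text -> blocks titled from it so far
  const titled = []

  for (const node of nodes) {
    if (node.type === 'heading') {
      heading = node.text
      continue
    }
    if (node.type !== 'code') continue
    blockIndex += 1

    let title
    if (node.info && node.info.title) {
      // 1. An explicit title in the info object wins.
      title = node.info.title
    } else if (heading !== null) {
      // 2. Use the previous heading, with a numeric suffix for repeats.
      const count = (seen.get(heading) || 0) + 1
      seen.set(heading, count)
      title = count === 1 ? heading : `${heading} (${count})`
    } else {
      // 3. No heading yet: describe where the block came from.
      title = `Code block ${blockIndex} from ${filename}:${node.line}`
    }
    titled.push({ ...node, title })
  }
  return titled
}

const exampleTitles = assignTitles([
  { type: 'code', info: {}, line: 1 },
  { type: 'heading', text: 'Usage' },
  { type: 'code', info: {}, line: 5 },
  { type: 'code', info: {}, line: 9 },
  { type: 'code', info: { title: 'Custom' }, line: 12 }
], 'README.md').map(block => block.title)
// exampleTitles is
// ['Code block 1 from README.md:1', 'Usage', 'Usage (2)', 'Custom']
```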
