@flex-development/mdast-util-from-markdown
v1.0.0
mdast utility to parse markdown
mdast-util-from-markdown
mdast utility that turns markdown into a syntax tree
Contents
- What is this?
- When should I use this?
- Install
- Use
- API
- List of extensions
- Syntax
- Syntax tree
- Types
- Security
- Related
- Contribute
What is this?
This package is a utility that takes markdown input and turns it into a markdown abstract syntax tree.
This utility uses micromark, which turns markdown into tokens, and then turns those tokens into nodes.
When should I use this?
If you want to handle syntax trees manually, use this.
When you just want to turn markdown into HTML, use micromark instead.
For an easier time processing content, use the remark ecosystem instead.
Install
This package is ESM only.
In Node.js (version 18+) with yarn:
yarn add @flex-development/mdast-util-from-markdown
In Deno with esm.sh:
import { fromMarkdown } from 'https://esm.sh/@flex-development/mdast-util-from-markdown'
In browsers with esm.sh:
<script type="module">
import { fromMarkdown } from 'https://esm.sh/@flex-development/mdast-util-from-markdown'
</script>
Use
Say we have the following markdown file example.md:
## Hello, *World*!
…and our module example.mjs looks as follows:
import { fromMarkdown } from '@flex-development/mdast-util-from-markdown'
import { inspect } from '@flex-development/unist-util-inspect'
import { read } from 'to-vfile'
const file = await read('example.md')
const tree = fromMarkdown(String(file))
console.log(inspect(tree))
…now running node example.mjs yields:
root[1] (1:1-2:1, 0-19)
└─0 heading[3] (1:1-1:19, 0-18)
    │ depth: 2
    ├─0 text "Hello, " (1:4-1:11, 3-10)
    ├─1 emphasis[1] (1:11-1:18, 10-17)
    │   └─0 text "World" (1:12-1:17, 11-16)
    └─2 text "!" (1:18-1:19, 17-18)
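Handling a tree manually usually comes down to walking children recursively. The sketch below rebuilds the tree printed above as a plain object (positions omitted) and collects its text. The collectText helper is hypothetical and is not exported by this package.

```javascript
// Hand-written copy of the tree printed above, with positions omitted.
const tree = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 2,
      children: [
        { type: 'text', value: 'Hello, ' },
        { type: 'emphasis', children: [{ type: 'text', value: 'World' }] },
        { type: 'text', value: '!' }
      ]
    }
  ]
}

/**
 * Concatenate the `value` of every literal node under `node`.
 * Hypothetical helper for illustration; not part of this package.
 */
function collectText(node) {
  if (typeof node.value === 'string') return node.value
  if (!Array.isArray(node.children)) return ''
  return node.children.map(collectText).join('')
}

console.log(collectText(tree)) // prints "Hello, World!"
```

For anything beyond a quick walk like this, the unist utilities in the remark ecosystem are likely a better fit than hand-rolled recursion.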
API
fromMarkdown(value[, encoding][, options])
Turn markdown into a syntax tree.
Overloads
(value: Value | null | undefined, encoding?: Encoding | null | undefined, options?: Options) => Root
(value: Value | null | undefined, options?: Options | null | undefined) => Root
Parameters
- value (Value | null | undefined) — markdown to parse
- encoding (Encoding | null | undefined, optional) — character encoding for when value is Uint8Array
  - default: 'utf8'
- options (Options | null | undefined, optional) — configuration
Returns
(Root) mdast.
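The two overloads differ only in whether the second argument is an encoding or an options object. One way such arguments could be sorted out is sketched below. The normalizeArguments helper is hypothetical, and the package's actual internals may differ.

```javascript
/**
 * Sort out the call shapes `(value, encoding, options)` and
 * `(value, options)`. Hypothetical helper; not this package's code.
 */
function normalizeArguments(encoding, options) {
  // An object in the second position is treated as `options`.
  if (typeof encoding === 'object' && encoding !== null) {
    options = encoding
    encoding = undefined
  }

  // Fall back to the documented default encoding and to empty options.
  return { encoding: encoding ?? 'utf8', options: options ?? {} }
}

console.log(normalizeArguments('utf-16le', { from: null }))
console.log(normalizeArguments({ extensions: [] }))
console.log(normalizeArguments(null, undefined))
```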
compiler([options])
Create an mdast compiler.
👉 The compiler only understands complete buffering, not streaming.
Parameters
- options (Options | null | undefined, optional) — configuration
Returns
(Compiler) mdast compiler.
handles
(Handles) Token types mapped to default token handlers.
👉 Default handlers are also exported by name. See src/handles.ts for more info.
CompileContext
mdast compiler context (TypeScript type).
Properties
- buffer ((this: CompileContext) => undefined) — capture some of the output data
- config (Config) — configuration
- data (CompileData) — info passed around; key/value store
- enter ((this: CompileContext, node: Nodes, token: Token, onError?: OnEnterError) => undefined) — enter a node
- exit ((this: CompileContext, token: Token, onError?: OnExitError) => undefined) — exit a node
- resume ((this: CompileContext) => string) — stop capturing and access the output data
- sliceSerialize (TokenizeContext['sliceSerialize']) — get the string value of a token
- stack (StackedNode[]) — stack of nodes
- tokenStack (TokenTuple[]) — stack of tokens
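To make the roles of enter, exit, buffer, resume, and stack more concrete, here is a simplified model of a compile context. It is a sketch under assumptions, not this package's implementation: tokens, positions, and error handlers are left out, and createContext is a hypothetical name.

```javascript
/** Create a tiny stand-in for a compile context (illustrative only). */
function createContext() {
  const root = { type: 'root', children: [] }

  return {
    stack: [root],
    // Open `node`: attach it to the current parent, then make it current.
    enter(node) {
      const parent = this.stack[this.stack.length - 1]
      parent.children.push(node)
      this.stack.push(node)
    },
    // Close the current node.
    exit() {
      return this.stack.pop()
    },
    // Start capturing: nodes entered from now on land in a temporary fragment.
    buffer() {
      this.stack.push({ type: 'fragment', children: [] })
    },
    // Stop capturing and return the captured text.
    resume() {
      const fragment = this.stack.pop()
      return fragment.children.map((child) => child.value ?? '').join('')
    }
  }
}

const context = createContext()
context.enter({ type: 'heading', depth: 2, children: [] })
context.buffer()
context.enter({ type: 'text', value: 'Hello' })
context.exit()
const captured = context.resume()
context.exit()

console.log(captured) // prints "Hello"
```

In this model the captured text never reaches the heading, because it was collected in the temporary fragment instead.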
CompileData
Interface of tracked data (TypeScript interface).
interface CompileData {/* see code */}
When developing extensions that use more data, augment CompileData to register custom fields:
declare module '@flex-development/mdast-util-from-markdown' {
  interface CompileData {
    mathFlowInside?: boolean | undefined
  }
}
Compiler
Turn micromark events into a syntax tree (TypeScript type).
Parameters
- events (Event[]) — list of events
Returns
(Root) mdast.
Config
Configuration (TypeScript type).
Properties
- canContainEols (string[]) — token types where line endings are used
- enter (Handles) — opening handles
- exit (Handles) — closing handles
- transforms (Transform[]) — tree transforms
Encoding
Encodings supported by TextDecoder (TypeScript type).
See micromark-util-types for more info.
type Encoding =
  | 'utf-8' // always supported in node
  | 'utf-16le' // always supported in node
  | 'utf-16be' // not supported when ICU is disabled
  | (string & {}) // everything else (depends on browser, or full ICU data)
Event
The start or end of a token amongst other events (TypeScript type).
See micromark-util-types for more info.
type Event = ['enter' | 'exit', Token, TokenizeContext]
Extension
Change how tokens are turned into nodes (TypeScript type).
See Config for more info.
type Extension = Partial<Config>
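Because an extension is a partial config, several extensions have to be combined into one complete config before compiling. The sketch below shows one plausible merge strategy: list fields accumulate, and handle maps are merged so that later extensions win. The configure helper is hypothetical, and the strategy is an assumption rather than documented behaviour.

```javascript
/** Merge `extensions` into a fresh config (hypothetical helper). */
function configure(extensions) {
  const config = { canContainEols: [], enter: {}, exit: {}, transforms: [] }

  // Entries may be one extension or a list of them, so flatten one level.
  for (const extension of extensions.flat()) {
    // List fields accumulate.
    config.canContainEols.push(...(extension.canContainEols ?? []))
    config.transforms.push(...(extension.transforms ?? []))
    // Handle maps are merged; later extensions win for the same token type.
    Object.assign(config.enter, extension.enter)
    Object.assign(config.exit, extension.exit)
  }

  return config
}

const noop = function () {}
const merged = configure([
  { canContainEols: ['delete'], enter: { strikethrough: noop } },
  [{ exit: { strikethrough: noop } }, { canContainEols: ['mathText'] }]
])

console.log(merged.canContainEols) // prints [ 'delete', 'mathText' ]
```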
Fragment
Temporary node (TypeScript type).
type Fragment = Omit<mdast.Parent, 'children' | 'type'> & {
  children: mdast.PhrasingContent[]
  type: 'fragment'
}
Properties
- children (mdast.PhrasingContent[]) — list of children
- type ('fragment') — node type
Handle
Handle a token (TypeScript type).
Parameters
- this (CompileContext) — compiler context
- token (Token) — token to handle
Returns
(undefined | void) Nothing.
Handles
Token types mapped to handles (TypeScript type).
type Handles = Record<string, Handle>
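A record like this lets a compiler look up a handle by token type for each event and call it with the compile context as this. The sketch below is a simplified model, not this package's code: events omit their third element (the tokenize context), and the value field on tokens is an assumption that stands in for sliceSerialize. The dispatch helper is hypothetical.

```javascript
/** Run each event through the matching handle, if any (illustrative). */
function dispatch(events, config, context) {
  for (const [kind, token] of events) {
    const handles = config[kind] // `config.enter` or `config.exit`

    if (Object.hasOwn(handles, token.type)) {
      handles[token.type].call(context, token)
    }
  }

  return context
}

const config = {
  enter: {
    emphasis() { this.log.push('open emphasis') }
  },
  exit: {
    data(token) { this.log.push('text: ' + token.value) },
    emphasis() { this.log.push('close emphasis') }
  }
}

const events = [
  ['enter', { type: 'emphasis' }],
  ['enter', { type: 'data', value: 'World' }], // no `enter.data` handle: skipped
  ['exit', { type: 'data', value: 'World' }],
  ['exit', { type: 'emphasis' }]
]

const { log } = dispatch(events, config, { log: [] })
console.log(log) // prints [ 'open emphasis', 'text: World', 'close emphasis' ]
```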
OnEnterError
Handle the case where the right token is open, but is closed by the left token or because the end of the file was reached (TypeScript type).
Parameters
- this (Omit<CompileContext, 'sliceSerialize'>) — compiler context
- left (Token | undefined) — left token
- right (Token) — open token
Returns
(undefined) Nothing.
OnExitError
Handle the case where the right token is open, but is closed by exiting the left token (TypeScript type).
Parameters
- this (Omit<CompileContext, 'sliceSerialize'>) — compiler context
- left (Token) — left token
- right (Token) — open token
Returns
(undefined) Nothing.
Options
Configuration options (TypeScript type).
Properties
- extensions? (micromark.Extension[] | null | undefined) — micromark extensions to change how markdown is parsed
- from? (StartPoint | null | undefined) — point before first character in markdown value; node positions will be relative to this point
- mdastExtensions? ((Extension | Extension[])[] | null | undefined) — extensions for this utility to change how tokens are turned into nodes
Point
A location in the source document and chunk (TypeScript type).
See micromark-util-types for more info.
StackedNode
A node on the compiler context stack (TypeScript type).
type StackedNode = Fragment | mdast.Nodes
StartPoint
Point before first character in a markdown value (TypeScript type).
type StartPoint = Omit<Point, '_bufferIndex' | '_index'>
TokenTuple
List containing an open token on the stack, and an optional error handler to use if the token isn't closed properly (TypeScript type).
type TokenTuple = [token: Token, handler: OnEnterError | undefined]
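The tuple pairs each open token with the handler to call when it is closed by a different token. A minimal sketch of that bookkeeping follows, under stated assumptions: the open, close, and defaultOnError helpers are hypothetical, and handlers are called without a this context, unlike the OnEnterError signature documented here.

```javascript
/** Open `token`, remembering an optional error handler (illustrative). */
function open(tokenStack, token, onEnterError) {
  tokenStack.push([token, onEnterError])
}

/** Close `token`; report a mismatch through the stored or default handler. */
function close(tokenStack, token) {
  const tuple = tokenStack.pop()

  if (!tuple) {
    throw new Error('Cannot close `' + token.type + '`: it is not open')
  }

  const [opened, handler] = tuple

  if (opened.type !== token.type) {
    const onError = handler ?? defaultOnError
    onError(token, opened) // left: closing token, right: open token
  }
}

/** Fallback used when no handler was stored with the open token. */
function defaultOnError(left, right) {
  throw new Error('Cannot close `' + left.type + '`: `' + right.type + '` is open')
}

const problems = []
const tokenStack = []

open(tokenStack, { type: 'emphasis' }, (left, right) => {
  problems.push(right.type + ' closed by ' + left.type)
})
close(tokenStack, { type: 'paragraph' })

console.log(problems) // prints [ 'emphasis closed by paragraph' ]
```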
Token
A span of chunks (TypeScript interface).
See micromark-util-types for more info.
TokenizeContext
A context object that helps with tokenizing markdown constructs (TypeScript interface).
See micromark-util-types for more info.
Transform
Extra transform, to change the AST afterwards (TypeScript type).
Parameters
- tree (Root) — tree to transform
Returns
(Root | null | undefined | void) New tree or nothing (in which case the current tree is used).
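The return contract means a transform can either mutate the tree in place and return nothing, or hand back a replacement. A small sketch of how a list of transforms could be applied under that contract is shown below. The applyTransforms helper is hypothetical.

```javascript
/** Apply `transforms` in order (hypothetical helper for illustration). */
function applyTransforms(tree, transforms) {
  for (const transform of transforms) {
    // A transform may return a new tree, or nothing to keep the current one.
    tree = transform(tree) ?? tree
  }

  return tree
}

const original = { type: 'root', children: [{ type: 'text', value: 'hi' }] }

const result = applyTransforms(original, [
  // Mutates in place and returns nothing.
  (tree) => {
    tree.children[0].value = tree.children[0].value.toUpperCase()
  },
  // Returns a replacement tree.
  (tree) => ({ type: 'root', children: [...tree.children, { type: 'break' }] })
])

console.log(result.children.map((node) => node.type)) // prints [ 'text', 'break' ]
```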
Value
Contents of a file (TypeScript type).
See micromark-util-types for more info.
type Value = Uint8Array | string
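When a value is a Uint8Array, it has to be decoded with some encoding before it can be parsed as text. The sketch below shows how that could be done with the standard TextDecoder. The toText helper is hypothetical, and treating null or undefined as an empty document is an assumption.

```javascript
/**
 * Turn a `Value` into a string. Hypothetical helper; not this package's code.
 */
function toText(value, encoding) {
  if (typeof value === 'string') return value
  // Assumption: a missing value is treated as an empty document.
  if (value === null || value === undefined) return ''
  return new TextDecoder(encoding ?? 'utf-8').decode(value)
}

// "# hi" as UTF-8 bytes.
console.log(toText(new Uint8Array([0x23, 0x20, 0x68, 0x69]))) // prints "# hi"

// "hi" as UTF-16LE bytes, which needs the encoding to be passed along.
console.log(toText(new Uint8Array([0x68, 0x00, 0x69, 0x00]), 'utf-16le')) // prints "hi"
```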
List of extensions
- mdast-util-directive — directives
- mdast-util-frontmatter — frontmatter (YAML, TOML, more)
- mdast-util-gfm — GFM
- mdast-util-gfm-autolink-literal — GFM autolink literals
- mdast-util-gfm-footnote — GFM footnotes
- mdast-util-gfm-strikethrough — GFM strikethrough
- mdast-util-gfm-table — GFM tables
- mdast-util-gfm-task-list-item — GFM task list items
- syntax-tree/mdast-util-math — math
- syntax-tree/mdast-util-mdx — MDX
- syntax-tree/mdast-util-mdx-expression — MDX expressions
- syntax-tree/mdast-util-mdx-jsx — MDX JSX
- syntax-tree/mdast-util-mdxjs-esm — MDX ESM
Syntax
Markdown is parsed according to CommonMark. Extensions can add support for other syntax. If you’re interested in extending markdown, more information is available in micromark’s readme.
Syntax tree
The syntax tree is mdast.
Types
This package is fully typed with TypeScript.
Security
As markdown is sometimes used for HTML, and improper use of HTML can open you up to a cross-site scripting (XSS) attack, use of mdast-util-from-markdown can also be unsafe.
When going to HTML, use this utility in combination with hast-util-sanitize to make the tree safe.
Related
- mdast-util-to-markdown — serialize mdast as markdown
- micromark — parse markdown
- remark — process markdown
Contribute
See CONTRIBUTING.md.