npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

@sinclair/parsebox

v0.8.2

Published

Parser Combinators in the TypeScript Type System

Downloads

131

Readme

npm version Build License

Install

$ npm install @sinclair/parsebox

Example

ParseBox provides combinators for parsing in Runtime and Static environments.

Runtime

import { Runtime } from '@sinclair/parsebox'

const T = Runtime.Tuple([
  Runtime.Const('X'),
  Runtime.Const('Y'),
  Runtime.Const('Z')
])

const R = Runtime.Parse(T, 'X Y Z W')               // const R = [['X', 'Y', 'Z'], ' W']

Static

import { Static } from '@sinclair/parsebox'

type T = Static.Tuple<[
  Static.Const<'X'>,
  Static.Const<'Y'>,
  Static.Const<'Z'>
]>

type R = Static.Parse<T, 'X Y Z W'>                 // type R = [['X', 'Y', 'Z'], ' W']

Overview

ParseBox is a parsing system designed to embed domain-specific languages (DSLs) within the TypeScript type system. It provides a set of type-level and runtime combinators that directly map to BNF notation, which can be used to parse content either at runtime or statically within the type system.

ParseBox is written as a parsing infrastructure for the TypeBox and LinqBox projects, enabling these libraries to support advanced runtime parsing for their respective domains.

License MIT

Contents

Combinators

ParseBox provides a set of combinators that map to notation expressible in (E)BNF (Backus-Naur Form). These combinators are named Const, Tuple, Union, Array, and Optional, respectively, to describe the types produced by each combinator. These combinators serve as building blocks for constructing parsers.

Const

The Const combinator parses for the next occurrence of the specified string. Whitespace and newline characters are ignored during parsing, unless the specified string explicitly matches those characters.

BNF

<T> ::= "X"

TypeScript

const T = Runtime.Const('X')                        // const T = {
                                                    //   type: 'Const',
                                                    //   value: 'X'
                                                    // }

const R = Runtime.Parse(T, 'X Y Z')                 // const R = ['X', ' Y Z']

Tuple

The Tuple parser matches a sequence of interior parsers. An empty tuple can be used to represent Epsilon (the empty production).

BNF

<T> ::= "X" "Y" "Z"

TypeScript

const T = Runtime.Tuple([                           // const T = {
  Runtime.Const('X'),                               //   type: 'Tuple',
  Runtime.Const('Y'),                               //   parsers: [
  Runtime.Const('Z'),                               //     { type: 'Const', value: 'X' },
])                                                  //     { type: 'Const', value: 'Y' },
                                                    //     { type: 'Const', value: 'Z' },
                                                    //   ]
                                                    // }


const R = Runtime.Parse(P, 'X Y Z W')               // const R = [['X', 'Y', 'Z'], ' W']

Union

The Union combinator parses using one of the interior parsers, attempting each in sequence until one matches.

BNF

<T> ::= "X" | "Y" | "Z"

TypeScript

const T = Runtime.Union([                           // const T = {
  Runtime.Const('X'),                               //   type: 'Union',
  Runtime.Const('Y'),                               //   parsers: [
  Runtime.Const('Z')                                //     { type: 'Const', value: 'X' },
])                                                  //     { type: 'Const', value: 'Y' },
                                                    //     { type: 'Const', value: 'Z' }
                                                    //   ]
                                                    // }

const R1 = Runtime.Parse(P, 'X E')                  // const R1 = ['X', ' E']

const R2 = Runtime.Parse(P, 'Y E')                  // const R2 = ['Y', ' E']

const R3 = Runtime.Parse(P, 'Z E')                  // const R3 = ['Z', ' E']

Array

The Array combinator will parse for zero or more the interior parser. This combinator will always return a result with an empty array given for no matches.

EBNF

<T> ::= { "X" }

TypeScript

const T = Runtime.Array(                             // const T = {
  Runtime.Const('X')                                 //   type: 'Array',
)                                                    //   parser: { type: 'Const', value: 'X' } 
                                                     // }

const R1 = Runtime.Parse(T, 'X Y Z')                 // const R1 = [['X'], ' Y Z']

const R2 = Runtime.Parse(T, 'X X X Y Z')             // const R2 = [['X', 'X', 'X'], ' Y Z']

const R3 = Runtime.Parse(T, 'Y Z')                   // const R3 = [[], 'Y Z']

Optional

The Optional combinator will parse for zero or one of the interior parser. This combinator always succeeds, returning either a tuple with one element, or zero elements for no match.

EBNF

<T> ::= [ "X" ]

TypeScript

const T = Runtime.Optional(                         // const T = {
  Runtime.Const('X')                                //   type: 'Optional',
)                                                   //   parser: { type: 'Const', value: 'X' }
                                                    // }

const R1 = Runtime.Parse(T, 'X Y Z')                // const R1 = ['X', ' Y Z']

const R2 = Runtime.Parse(T, 'Y Z')                  // const R2 = [[], 'Y Z']

Terminals

ParseBox provides combinators that can be used to parse common terminals.

Number

The Number combinator will parse a numeric literal.

const T = Runtime.Number()

// ...

const R1 = Runtime.Parse(T, '1')                    // const R1 = ['1', '']

const R2 = Runtime.Parse(T, '3.14')                 // const R2 = ['3.14', '']

const R3 = Runtime.Parse(T, '.1')                   // const R3 = ['.1', '']

const E = Runtime.Parse(T, '01')                    // const E = []

String

The String combinator will parse for quoted string literals. Thgit is combinator accepts an array of permissable quote characters. The result of this parser is the interior wrapped string.

const T = Runtime.String(['"'])

// ...

const R = Runtime.Parse(T, '"hello"')              // const R = ['hello', '']

Ident

The Ident combinator will parse for a valid JavaScript identifiers. The following parses a let statement.

<T> ::= "let" <ident> "=" <number>
const Let = Runtime.Tuple([                         //  const Let = {
  Runtime.Const('let'),                             //    type: 'Tuple',
  Runtime.Ident(),                                  //    parsers: [
  Runtime.Const('='),                               //      { type: 'Const', value: 'let' },
  Runtime.Number()                                  //      { type: 'Ident' },
])                                                  //      { type: 'Const', value: '=' },
                                                    //      { type: 'Number' },
                                                    //    ]
                                                    //  }

const R = Runtime.Parse(Let, 'let n = 10')          // const R = [[ 'let', 'n', '=', '10'], '' ]

Mapping

ParseBox supports semantic actions (i.e., mapping) for both Static and Runtime parsing. These actions allow parsed content to be transformed into complex structures, such as abstract syntax trees (ASTs). Below is an explanation of how mapping works in both Runtime and Static environments.

Runtime

In Runtime parsing, combinators can accept an optional callback function as their last argument. This callback receives the parsed elements, which can then be mapped to arbitrary return values. The following example demonstrates how a let statement is parsed and how a mapping function is used to transform the result into a syntax node.

const LetMapping = (_0: 'let', _1: string, _2: '=', _3: string) => {
  return {
    type: 'Let',
    ident: _1,
    value: parseFloat(_3)
  }
}
const Let = Runtime.Tuple([                           
  Runtime.Const('let'), // _0
  Runtime.Ident(),      // _1
  Runtime.Const('='),   // _2
  Runtime.Number()      // _3
], values => LetMapping(...values)) 

const R = Runtime.Parse(Let, 'let value = 10')      // const R = [{
                                                    //   type: 'Let',
                                                    //   ident: 'value',
                                                    //   value: 10
                                                    // }, '' ]

Static

In Static combinators, an optional type of IMapping is provided as the last generic argument. Unlike Runtime callbacks, which receive parsed values directly as parameters, Static actions use the this['input'] property to access input values, and they store the mapped results in the output property. The following example demonstrates how to implement the Let parser using Static actions.

type ParseFloat<Value extends string> = (
  Value extends `${infer Value extends number}` ? Value : never
)
interface LetMapping extends Static.IMapping {
  output: this['input'] extends ['let', infer Ident, '=', infer Value extends string] ? {
    type: 'Let',
    ident: Ident
    value: ParseFloat<Value>
  } : never
}

type Let = Static.Tuple<[
  Static.Const<'let'>, 
  Static.Ident,     
  Static.Const<'='>, 
  Static.Number
], LetMapping>

type R = Static.Parse<Let, 'let value = 10'>        // type R = [{
                                                    //   type: 'Let',
                                                    //   ident: 'value',
                                                    //   value: 10
                                                    // }, '' ]

Context

ParseBox provides a context mechanism that allows parsed content to interact with the host environment. A context value can be passed as the last argument to the Parse function, and ParseBox will propagate the context to each mapping function, enabling more dynamic parsing behavior.

Runtime

In Runtime parsing, the context is passed as the second argument to the mapping functions. This allows the parser to access external data or state during the parsing process.

import { Runtime } from '@sinclair/parsebox'

// use matched input as indexer on context
const OptionMapping = (input: 'A' | 'B' | 'C', context: Record<string, string>) => {
  return input in context 
    ? context[input] 
    : void 0
}
const Option = Runtime.Union([
  Runtime.Const('A'),
  Runtime.Const('B'),
  Runtime.Const('C')
], OptionMapping)

const R = Runtime.Parse(Option, 'A', {              // const R = ['Matched Foo', '']
  A: 'Matched Foo',
  B: 'Matched Bar',
  C: 'Matched Baz',
})

Static

In Static combinators, the context is accessible via the this['context'] property within the mapping action type.

import { Static } from '@sinclair/parsebox'

// use input as indexer on context
interface OptionMapping extends Static.IMapping {
  output: this['input'] extends keyof this['context']
    ? this['context'][this['input']]
    : never
}
type Option = Static.Union<[
  Static.Const<'A'>,
  Static.Const<'B'>,
  Static.Const<'C'>
], OptionMapping>

type R = Static.Parse<Option, 'A', {                // type R = ['Matched Foo', '']
  A: 'Matched Foo',
  B: 'Matched Bar',
  C: 'Matched Baz',
}>

Modules

ParseBox modules serve as containers for parsers, enabling recursive and mutually recursive parsing. When designing parsers, one common challenge is the order in which parsers are defined—particularly when parsers need to reference each other but haven’t been defined yet. This creates topological ordering issues. Modules resolve this problem by allowing parsers to reference each other within a contained scope, enabling mutual recursion or recursion within a single parser, even when the parsers are defined in an arbitrary order.

Modules are only applicable for Runtime parsers only as Static parsers do not encounter topological ordering issues, this is because TypeScript types are non order dependent.

Recursive List Parsing

In the following example, we define a List parser that can recursively parse sequences of Value elements. The List parser is defined as either a tuple of a Value followed by another List (recursive) or an empty tuple (base case, representing an empty list). The Value parser is a simple union of strings, and the recursive nature of List is achieved by referencing the Value and List parsers within the same module.

import { Runtime } from '@sinclair/parsebox'

const Module = new Runtime.Module({
  Value: Runtime.Union([
    Runtime.Const('X'),
    Runtime.Const('Y'),
    Runtime.Const('Z'),
  ]),

  // List: Will match Value then try to match another via a recursive ref to
  // List. This will repeat until no Value can be matched. When no Value can
  // be matched, the fall through [] (or Epsilon) will match. 
  List: Runtime.Union([
    Runtime.Tuple([Runtime.Ref('Value'), Runtime.Ref('List')]),
    Runtime.Tuple([])
  ], values => values.flat())
})

const R = Module.Parse('List', 'X Y Z Y X E')       // const R = [['X', 'Y', 'Z', 'Y', 'X'], ' E']

Advanced

ParseBox is an LL(1) parser. When building parsers for complex grammars, care must be taken to avoid infinite left recursion, which can occur if a recursive grammar refers back to itself in a way that causes the parser to enter an infinite loop. This is particularly common in expression parsers.

Expression Parsing

The following shows a reference expression parser using LL(1) techniques to avoid left recursion. The following parser respects operator precedence and grouping.

import { Static } from '@sinclair/parsebox'

// BinaryMapping: Reduces matched Term and Expr into binary expression node
type BinaryReduce<Left extends unknown, Right extends unknown[]> = (
  Right extends [infer Operator, infer Right, infer Rest extends unknown[]]
    ? { left: Left, operator: Operator, right: BinaryReduce<Right, Rest> }
    : Left
)
interface BinaryMapping extends Static.IMapping {
  output: this['input'] extends [infer Left, infer Right extends unknown[]]
    ? BinaryReduce<Left, Right>
    : never
}
// FactorMapping: Map either grouped Expr or Operand
interface FactorMapping extends Static.IMapping {
  output: ( 
    this['input'] extends ['(', infer Expr, ')'] ? Expr :
    this['input'] extends [infer Operand] ? Operand :
    never
  )
}

// ...

type Operand = Static.Ident

type Factor = Static.Union<[
  Static.Tuple<[Static.Const<'('>, Expr, Static.Const<')'>]>,
  Static.Tuple<[Operand]>,
], FactorMapping>

type TermTail = Static.Union<[
  Static.Tuple<[Static.Const<'*'>, Factor, TermTail]>,
  Static.Tuple<[Static.Const<'/'>, Factor, TermTail]>,
  Static.Tuple<[]>,
]>

type ExprTail = Static.Union<[
  Static.Tuple<[Static.Const<'+'>, Term, ExprTail]>,
  Static.Tuple<[Static.Const<'-'>, Term, ExprTail]>,
  Static.Tuple<[]>,
]>

type Term = Static.Tuple<[Factor, TermTail], BinaryMapping>

type Expr = Static.Tuple<[Term, ExprTail], BinaryMapping>

// ...

type R = Static.Parse<Expr, 'x * (y + z)'>          // type R = [{
                                                    //   left: "x";
                                                    //   operator: "*";
                                                    //   right: {
                                                    //       left: "y";
                                                    //       operator: "+";
                                                    //       right: "z";
                                                    //   };
                                                    // }, ""]

Contribute

ParseBox is open to community contribution. Please ensure you submit an open issue before submitting your pull request. The ParseBox project prefers open community discussion before accepting new features.