npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

grammar-composer

v0.2.3

Published

Defines and generates parsers from composable grammar definitions. Includes advanced features like lexer-free parsing, selective packrat memoization and static analysis.

Downloads

337

Readme

Grammar composer

A library to define, build and efficiently parse context-free grammars.

  • Grammars are defined using TypeScript class declarations
  • No need for separate tokenization step. Tokenization is defined as part of the grammar via embedded Pattern objects that are internally processed through the regexp-composer regular expression library
  • The generated parser accepts raw characters as input, meaning it's a form of lexer-free, or hybrid parser, supporting contextual tokenization - that is, low-level character patterns can be specialized to different high-level parser contexts, and sub-patterns captured in the low-level regular expressions are directly embedded as part of the resulting parse tree
  • Top-down parsing (roughly equivalent to PEG parsing), with optional "packrat" caching that can be enabled or disabled for individual productions
  • Supports right-recursion, but will currently error when left-recursion is detected
  • Uses sophisticated static analysis to automatically identify and annotate optional productions
  • Provides useful parse-time error reporting, identifying the exact production involved and most likely alternatives at the failed position

Installation

npm install grammar-composer

And also the related regular expression builder package:

npm install regexp-composer

Example: XML grammar

The grammar is defined within a container class XmlGrammar. It contains a mixture of higher-level, context-free productions and lower-level, regular expression productions.

  • Context-free grammar productions are defined by anonymous functions () => ...
  • Regular expression productions are defined by pattern(...)

In this example, context-free operators are prefixed with G, and regular expression operators are prefixed with R, to avoid confusion between similarly named operators:

import * as G from 'grammar-composer'
import * as R from 'regexp-composer'

export class XmlGrammar {
	document = () => [
		G.zeroOrMore(
			G.anyOf(
				this.textFragment,
				this.openingTag,
				this.closingTag,
				this.comment,
				this.declarationTag,
			)
		)
	]

	textFragment = G.pattern([
		R.oneOrMore(R.notAnyOfChars('<'))
	])

	openingTag = () => [
		this.openingTagStart,

		G.zeroOrMore(this.attribute),

		this.tagEnd
	]

	openingTagStart = G.pattern([
		'<',

		R.possibly('?'),

		R.captureAs('tagName',
			R.oneOrMore(R.notAnyOfChars(R.whitespace, '"', "'", '?', '!', '/', '>'))
		),

		R.zeroOrMore(R.whitespace),
	])

	tagEnd = G.pattern([
		R.zeroOrMore(R.whitespace),

		R.possibly(R.anyOf('/', '?')),

		'>'
	])

	attribute = G.pattern([
		R.zeroOrMore(R.whitespace),

		R.captureAs('attributeName',
			R.oneOrMore(R.notAnyOfChars(R.whitespace, '=', '"', "'", '?', '/', '>'))
		),

		R.zeroOrMore(R.whitespace),

		R.possibly([
			'=',

			R.zeroOrMore(R.whitespace),

			quotedString,

			R.zeroOrMore(R.whitespace),
		])
	])

	closingTag = G.pattern([
		'</',

		R.zeroOrMore(R.whitespace),

		R.captureAs('tagName',
			R.oneOrMore(R.notAnyOfChars(R.whitespace, '/', '>'))
		),

		R.zeroOrMore(R.whitespace),

		'>'
	])

	declarationTag = () => [
		this.declarationTagOpening,

		G.zeroOrMore(this.declarationTagAttribute),

		this.tagEnd
	]

	declarationTagOpening = G.pattern([
		'<!',

		R.captureAs('tagName',
			R.oneOrMore(R.notAnyOfChars(R.whitespace, '"', "'", '/', '>'))
		),

		R.zeroOrMore(R.whitespace)
	])

	declarationTagAttribute = G.pattern([
		R.zeroOrMore(R.whitespace),

		R.anyOf(
			R.captureAs('attributeName',
				R.oneOrMore(R.notAnyOfChars(R.whitespace, '"', "'", '/', '!', '?', '>'))
			),

			quotedString,
		),

		R.zeroOrMore(R.whitespace),
	])

	comment = G.pattern([
		'<!--',

		R.captureAs('commentBody',
			R.zeroOrMoreNonGreedy(R.anyChar)
		),

		'-->'
	])
}

const quotedString = R.anyOf(
	[
		'"',
		R.captureAs('doubleQuotedStringContent',
			R.zeroOrMore(R.notAnyOfChars('"'))
		),
		'"'
	],
	[
		"'",
		R.captureAs('singleQuotedStringContent',
			R.zeroOrMore(R.notAnyOfChars("'"))
		),
		"'"
	],
)

Building and parsing using the XML grammar:

import { buildGrammar } from 'grammar-composer'

	const xmlString = `
<!DOCTYPE web-app>

<menu>
    <header>Adobe SVG Viewer</header>
    <item action="Open" id="Open">Open</item>
    <item action="OpenNew" id="OpenNew">Open New</item>
    <separator/>
    <item action="ZoomIn" id="ZoomIn">Zoom In</item>
    <item action="ZoomOut" id="ZoomOut">Zoom Out</item>
    <separator/>
    <item action="Quality" id="Quality">Quality</item>
    <item action="Pause" id="Pause">Pause</item>
    <item action="Mute" id="Mute">Mute</item>
    <separator/>
    <item action="Find" id="Find">Find...</item>
    <item action="FindAgain" id="FindAgain">Find Again</item>
    <item action="Copy" id="Copy">Copy</item>
</menu>
`

// Build the grammar. 'document' is the starting production.
//
// Although `XmlGrammar` is defined as a class, there's no need to instantiate it,
// just pass it as it is.
const grammar = buildGrammar(XmlGrammar, 'document')

// Parse the XML string with the built grammar
const parseTree = grammar.parse(xmlString)

The resulting parse tree looks like:

[
    {
        "name": "document",
        "startOffset": 0,
        "endOffset": 644,
        "sourceText": "\n<!DOCTYPE web-app>\n\n<menu>\n    <header>Adobe SVG Viewer</header>\n    <it
em action=\"Open\" id=\"Open\">Open</item>\n    <item action=\"OpenNew\" id=\"OpenNew\">Open New</ite
m>\n    <separator/>\n    <item action=\"ZoomIn\" id=\"ZoomIn\">Zoom In</item>\n    <item action=\"Zo
omOut\" id=\"ZoomOut\">Zoom Out</item>\n    <separator/>\n    <item action=\"Quality\" id=\"Quality\"
>Quality</item>\n    <item action=\"Pause\" id=\"Pause\">Pause</item>\n    <item action=\"Mute\" id=\
"Mute\">Mute</item>\n    <separator/>\n    <item action=\"Find\" id=\"Find\">Find...</item>\n    <ite
m action=\"FindAgain\" id=\"FindAgain\">Find Again</item>\n    <item action=\"Copy\" id=\"Copy\">Copy
</item>\n</menu>\n\n",
        "children": [
            {
                "name": "textFragment",
                "startOffset": 0,
                "endOffset": 1,
                "sourceText": "\n",
                "children": []
            },
            {
                "name": "declarationTag",
                "startOffset": 1,
                "endOffset": 19,
                "sourceText": "<!DOCTYPE web-app>",
                "children": [
                    {
                        "name": "declarationTagOpening",
                        "startOffset": 1,
                        "endOffset": 11,
                        "sourceText": "<!DOCTYPE ",
                        "children": [
                            {
                                "name": "tagName",
                                "startOffset": 3,
                                "endOffset": 10,
                                "sourceText": "DOCTYPE"
                            }
                        ]
                    },
                    {
                        "name": "declarationTagAttribute",
                        "startOffset": 11,
                        "endOffset": 18,
                        "sourceText": "web-app",
                        "children": [
                            {
                                "name": "attributeName",
                                "startOffset": 11,
                                "endOffset": 18,
                                "sourceText": "web-app"
                            }
                        ]
                    },
                    {
                        "name": "tagEnd",
                        "startOffset": 18,
                        "endOffset": 19,
                        "sourceText": ">",
                        "children": []
                    }
                ]
            },
            {
                "name": "textFragment",
                "startOffset": 19,
                "endOffset": 21,
                "sourceText": "\n\n",
                "children": []
            },
            {
                "name": "openingTag",
                "startOffset": 21,
                "endOffset": 27,
                "sourceText": "<menu>",
                "children": [
                    {
                        "name": "openingTagStart",
                        "startOffset": 21,
                        "endOffset": 26,
                        "sourceText": "<menu",
                        "children": [
                            {
                                "name": "tagName",
                                "startOffset": 22,
                                "endOffset": 26,
                                "sourceText": "menu"
                            }
                        ]
                    },
                    {
                        "name": "tagEnd",
                        "startOffset": 26,
                        "endOffset": 27,
                        "sourceText": ">",
                        "children": []
                    }
                ]
            },
            {
                "name": "textFragment",
                "startOffset": 27,
                "endOffset": 32,
                "sourceText": "\n    ",
                "children": []
            },
            {
                "name": "openingTag",
                "startOffset": 32,
                "endOffset": 40,
                "sourceText": "<header>",
                "children": [
                    {
                        "name": "openingTagStart",
                        "startOffset": 32,
                        "endOffset": 39,
                        "sourceText": "<header",
                        "children": [
                            {
                                "name": "tagName",
                                "startOffset": 33,
                                "endOffset": 39,
                                "sourceText": "header"
                            }
                        ]
                    },
                    {
                        "name": "tagEnd",
                        "startOffset": 39,
                        "endOffset": 40,
                        "sourceText": ">",
                        "children": []
                    }
                ]
            },

...

Operators

Context-free operators are mostly named similarly to the ones in regexp-composer.

zeroOrMore(grammarElement)

Match the grammar element zero or more times.

oneOrMore(grammarElement)

Match the grammar element one or more times.

anyOf(grammarElement1, grammarElement2, grammarElement3, ...)

Match any of the grammar elements. The first successful match, in order, would be accepted without trying subsequent ones.

bestOf(grammarElement1, grammarElement2, grammarElement3, ...)

Match the best grammar element. All possibilities would be tried, and the the longest match (in terms of character count) would be chosen.

possibly(grammarElement)

Optionally accept the grammar element, or skip if it doesn't match.

pattern(regexpPattern)

Accept a regular expression pattern compatible with regexp-composer Pattern type (either a simple string, pattern object, or array of pattern objects).

cached(grammarElement)

Store the result of parsing using this grammar element and reuse when it's subsequently evaluated at the same text position.

uncached(grammarElement)

Don't cache this grammar element.

Future

  • Allow raw regular expressions as part of the grammar
  • Allow to include user-provided parser functions

License

MIT