clarity-pattern-parser
v10.0.2
Parsing Library for Typescript and Javascript.
Installation
npm install clarity-pattern-parser
Overview
Leaf Patterns
- Literal
- Regex
Composing Patterns
- And
- Or
- Repeat
- Reference
- Not
The Not pattern is a negative lookahead and is used with the And pattern. This is illustrated in more detail in the Not pattern section.
Literal
The Literal pattern uses a string literal to match patterns.
import { Literal } from "clarity-pattern-parser";
const firstName = new Literal("first-name", "John");
const { ast } = firstName.exec("John");
ast.toJson(2)
{
"type": "literal",
"name": "first-name",
"value": "John",
"firstIndex": 0,
"lastIndex": 3,
"startIndex": 0,
"endIndex": 4,
"children": []
}
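The four index fields in the output above appear to follow a simple convention: firstIndex and lastIndex are inclusive bounds, while endIndex is one past lastIndex, so it can be passed straight to slice. This reading is inferred from the sample outputs in this README, not from a documented guarantee. A minimal sketch, using a hypothetical helper named spanFor that is not part of the library:

```typescript
// Hypothetical helper, not part of clarity-pattern-parser.
// It reproduces the index convention seen in the sample outputs.
interface Span {
  firstIndex: number; // index of the first matched character (inclusive)
  lastIndex: number;  // index of the last matched character (inclusive)
  startIndex: number; // where the match begins
  endIndex: number;   // one past the last matched character (exclusive)
}

function spanFor(start: number, value: string): Span {
  if (value.length === 0) throw new Error("value must be non-empty");
  return {
    firstIndex: start,
    lastIndex: start + value.length - 1,
    startIndex: start,
    endIndex: start + value.length,
  };
}

const source = "John";
const john = spanFor(0, source);

// An exclusive endIndex makes the span usable with String.prototype.slice.
const sliced = source.slice(john.startIndex, john.endIndex);
```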
Regex
The Regex pattern uses regular expressions to match patterns.
import { Regex } from "clarity-pattern-parser";
const digits = new Regex("digits", "\\d+");
const { ast } = digits.exec("12");
ast.toJson(2);
{
"type": "regex",
"name": "digits",
"value": "12",
"firstIndex": 0,
"lastIndex": 1,
"startIndex": 0,
"endIndex": 2,
"children": []
}
Regex Caveats
Do not use "^" at the beginning or "$" at the end of your regular expression. If your expression needs to anchor to the beginning and end of the whole text, you should probably just use a plain regular expression instead of a Regex pattern.
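One way to see why anchors get in the way: a pattern inside a larger grammar has to match at whatever position the parser has reached, not only at the edges of the input. The sketch below illustrates this with JavaScript's sticky flag. It is an assumption about the general technique, not the library's actual implementation, and matchAt is a hypothetical helper.

```typescript
// Hypothetical helper: match `source` starting exactly at `index` in `text`.
function matchAt(text: string, index: number, source: string): string | null {
  const regex = new RegExp(source, "y"); // sticky: the match must begin at lastIndex
  regex.lastIndex = index;
  const result = regex.exec(text);
  return result === null ? null : result[0];
}

// The digits start at index 2 of "x=12,3".
const plain = matchAt("x=12,3", 2, "\\d+");       // matches "12"
const withCaret = matchAt("x=12,3", 2, "^\\d+");  // null: "^" only holds at index 0
const withDollar = matchAt("x=12,3", 2, "\\d+$"); // null: "12" is not at the end of the text
```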
And
The And pattern is a way to make a sequence pattern. And accepts all other patterns as children.
import { And, Literal } from "clarity-pattern-parser";
const jane = new Literal("first-name", "Jane");
const space = new Literal("space", " ");
const doe = new Literal("last-name", "Doe");
const fullName = new And("full-name", [jane, space, doe]);
const { ast } = fullName.exec("Jane Doe");
ast.toJson(2); // Output is shown below
{
"type": "and",
"name": "full-name",
"value": "Jane Doe",
"firstIndex": 0,
"lastIndex": 7,
"startIndex": 0,
"endIndex": 8,
"children": [
{
"type": "literal",
"name": "first-name",
"value": "Jane",
"firstIndex": 0,
"lastIndex": 3,
"startIndex": 0,
"endIndex": 4,
"children": []
},
{
"type": "literal",
"name": "space",
"value": " ",
"firstIndex": 4,
"lastIndex": 4,
"startIndex": 4,
"endIndex": 5,
"children": []
},
{
"type": "literal",
"name": "last-name",
"value": "Doe",
"firstIndex": 5,
"lastIndex": 7,
"startIndex": 5,
"endIndex": 8,
"children": []
}
]
}
Or
The Or pattern matches any of the patterns given to the constructor.
import { Or, Literal } from "clarity-pattern-parser";
const jane = new Literal("jane", "Jane");
const john = new Literal("john", "John");
const firstName = new Or("first-name", [jane, john]);
const { ast } = firstName.exec("Jane");
ast.toJson(2)
{
"type": "literal",
"name": "jane",
"value": "Jane",
"firstIndex": 0,
"lastIndex": 3,
"startIndex": 0,
"endIndex": 4,
"children": []
}
Repeat
The Repeat pattern allows you to match repeating patterns with or without a divider.
For example, you may want to match a pattern like this:
1,2,3
Here is the code to do so.
import { Repeat, Literal, Regex } from "clarity-pattern-parser";
const digit = new Regex("digit", "\\d+");
const commaDivider = new Literal("comma", ",");
const numberList = new Repeat("number-list", digit, commaDivider);
const ast = numberList.exec("1,2,3").ast;
ast.type // ==> "repeat"
ast.name // ==> "number-list"
ast.value // ==> "1,2,3"
ast.children[0].value // ==> "1"
ast.children[1].value // ==> ","
ast.children[2].value // ==> "2"
ast.children[3].value // ==> ","
ast.children[4].value // ==> "3"
If the text ends with a divider that is not followed by another match of the repeating pattern, the trailing divider is not included in the result. Here is an example.
import { Repeat, Literal, Regex } from "clarity-pattern-parser";
const digit = new Regex("digit", "\\d+");
const commaDivider = new Literal("comma", ",");
const numberList = new Repeat("number-list", digit, commaDivider);
const ast = numberList.exec("1,2,").ast;
ast.type // ==> "repeat"
ast.name // ==> "number-list"
ast.value // ==> "1,2"
ast.children[0].value // ==> "1"
ast.children[1].value // ==> ","
ast.children[2].value // ==> "2"
ast.children.length // ==> 3
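The trailing-divider behaviour can be pictured as a loop that only commits a divider once the item after it has matched. The following is a minimal sketch of that idea, independent of the library. The function repeatWithDivider is hypothetical, and the real Repeat pattern may work differently internally.

```typescript
// Hypothetical sketch, not the library's implementation.
// Returns the matched item and divider values in order.
function repeatWithDivider(text: string, item: RegExp, divider: string): string[] {
  const children: string[] = [];
  let cursor = 0;

  while (cursor < text.length) {
    let next = cursor;

    // After the first item, a divider has to come before the next item.
    if (children.length > 0) {
      if (!text.startsWith(divider, next)) break;
      next += divider.length;
    }

    const sticky = new RegExp(item.source, "y"); // match exactly at `next`
    sticky.lastIndex = next;
    const match = sticky.exec(text);

    // No item after the divider: stop without recording the divider.
    if (match === null || match[0].length === 0) break;

    if (children.length > 0) children.push(divider);
    children.push(match[0]);
    cursor = next + match[0].length;
  }

  return children;
}

const full = repeatWithDivider("1,2,3", /\d+/, ",");    // ["1", ",", "2", ",", "3"]
const trailing = repeatWithDivider("1,2,", /\d+/, ","); // ["1", ",", "2"]
```

Joining the second result gives "1,2", which matches the ast.value shown in the example above.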
Reference
Reference is a way to handle cyclical patterns. An example of this would be arrays within arrays. Let's say we want to make a pattern that matches an array that can store numbers and arrays.
[[1, [1]], 1, 2, 3]
Here is an example of using Reference to parse this pattern.
import { Regex, Literal, Or, Repeat, And, Reference } from "clarity-pattern-parser";
const integer = new Regex("integer", "\\d+");
const commaDivider = new Regex("comma-divider", "\\s*,\\s*");
const openBracket = new Literal("open-bracket", "[");
const closeBracket = new Literal("close-bracket", "]");
const item = new Or("item", [integer, new Reference("array")]);
const items = new Repeat("items", item, commaDivider);
const array = new And("array", [openBracket, items, closeBracket]);
const { ast } = array.exec("[[1, [1]], 1, 2, 3]");
ast.toJson();
{
"type": "and",
"name": "array",
"value": "[[1, [1]], 1, 2, 3]",
"firstIndex": 0,
"lastIndex": 18,
"startIndex": 0,
"endIndex": 19,
"children": [
{
"type": "literal",
"name": "open-bracket",
"value": "[",
"firstIndex": 0,
"lastIndex": 0,
"startIndex": 0,
"endIndex": 1,
"children": []
},
{
"type": "repeat",
"name": "items",
"value": "[1, [1]], 1, 2, 3",
"firstIndex": 1,
"lastIndex": 17,
"startIndex": 1,
"endIndex": 18,
"children": [
{
"type": "and",
"name": "array",
"value": "[1, [1]]",
"firstIndex": 1,
"lastIndex": 8,
"startIndex": 1,
"endIndex": 9,
"children": [
{
"type": "literal",
"name": "open-bracket",
"value": "[",
"firstIndex": 1,
"lastIndex": 1,
"startIndex": 1,
"endIndex": 2,
"children": []
},
{
"type": "repeat",
"name": "items",
"value": "1, [1]",
"firstIndex": 2,
"lastIndex": 7,
"startIndex": 2,
"endIndex": 8,
"children": [
{
"type": "regex",
"name": "integer",
"value": "1",
"firstIndex": 2,
"lastIndex": 2,
"startIndex": 2,
"endIndex": 3,
"children": []
},
{
"type": "regex",
"name": "comma-divider",
"value": ", ",
"firstIndex": 3,
"lastIndex": 4,
"startIndex": 3,
"endIndex": 5,
"children": []
},
{
"type": "and",
"name": "array",
"value": "[1]",
"firstIndex": 5,
"lastIndex": 7,
"startIndex": 5,
"endIndex": 8,
"children": [
{
"type": "literal",
"name": "open-bracket",
"value": "[",
"firstIndex": 5,
"lastIndex": 5,
"startIndex": 5,
"endIndex": 6,
"children": []
},
{
"type": "repeat",
"name": "items",
"value": "1",
"firstIndex": 6,
"lastIndex": 6,
"startIndex": 6,
"endIndex": 7,
"children": [
{
"type": "regex",
"name": "integer",
"value": "1",
"firstIndex": 6,
"lastIndex": 6,
"startIndex": 6,
"endIndex": 7,
"children": []
}
]
},
{
"type": "literal",
"name": "close-bracket",
"value": "]",
"firstIndex": 7,
"lastIndex": 7,
"startIndex": 7,
"endIndex": 8,
"children": []
}
]
}
]
},
{
"type": "literal",
"name": "close-bracket",
"value": "]",
"firstIndex": 8,
"lastIndex": 8,
"startIndex": 8,
"endIndex": 9,
"children": []
}
]
},
{
"type": "regex",
"name": "comma-divider",
"value": ", ",
"firstIndex": 9,
"lastIndex": 10,
"startIndex": 9,
"endIndex": 11,
"children": []
},
{
"type": "regex",
"name": "integer",
"value": "1",
"firstIndex": 11,
"lastIndex": 11,
"startIndex": 11,
"endIndex": 12,
"children": []
},
{
"type": "regex",
"name": "comma-divider",
"value": ", ",
"firstIndex": 12,
"lastIndex": 13,
"startIndex": 12,
"endIndex": 14,
"children": []
},
{
"type": "regex",
"name": "integer",
"value": "2",
"firstIndex": 14,
"lastIndex": 14,
"startIndex": 14,
"endIndex": 15,
"children": []
},
{
"type": "regex",
"name": "comma-divider",
"value": ", ",
"firstIndex": 15,
"lastIndex": 16,
"startIndex": 15,
"endIndex": 17,
"children": []
},
{
"type": "regex",
"name": "integer",
"value": "3",
"firstIndex": 17,
"lastIndex": 17,
"startIndex": 17,
"endIndex": 18,
"children": []
}
]
},
{
"type": "literal",
"name": "close-bracket",
"value": "]",
"firstIndex": 18,
"lastIndex": 18,
"startIndex": 18,
"endIndex": 19,
"children": []
}
]
}
The Reference pattern traverses the pattern composition to find the pattern whose name matches the one given to it at construction. It then clones that pattern and tells the clone to parse the text. If it cannot find a pattern with the given name, it throws a runtime error.
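The lookup described above can be sketched as a walk up to the root of the composition followed by a search by name. The types and function below are hypothetical stand-ins, not the library's classes. The cloning step is left out, and the error message is invented for illustration.

```typescript
// Hypothetical minimal pattern node; the library's own classes are richer.
interface PatternNode {
  name: string;
  children: PatternNode[];
  parent: PatternNode | null;
}

// Build a node and wire up the parent links of its children.
function node(name: string, children: PatternNode[] = []): PatternNode {
  const created: PatternNode = { name, children, parent: null };
  for (const child of children) child.parent = created;
  return created;
}

// Climb to the root, then search depth-first for a pattern with that name.
function resolveReference(from: PatternNode, targetName: string): PatternNode {
  let root = from;
  while (root.parent !== null) root = root.parent;

  const stack: PatternNode[] = [root];
  while (stack.length > 0) {
    const current = stack.pop() as PatternNode;
    // Skip the reference itself, which shares the target's name.
    if (current.name === targetName && current !== from) return current;
    stack.push(...current.children);
  }
  throw new Error(`Could not find a pattern named "${targetName}".`);
}

// Same shape as the array example: array -> items -> item -> reference("array").
const reference = node("array"); // stands in for new Reference("array")
const item = node("item", [node("integer"), reference]);
const items = node("items", [item, node("comma-divider")]);
const arrayPattern = node("array", [node("open-bracket"), items, node("close-bracket")]);

const resolved = resolveReference(reference, "array");
```

Resolving by name at parse time, rather than at construction, is what lets item mention "array" before the array pattern itself exists.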
Not
Intellisense
Because the patterns are composed in a tree and the cursor remembers which patterns matched last, we can ask which tokens come next. We will discuss how you can use clarity-pattern-parser for text autocomplete and other interesting approaches to intellisense.
GetTokens
The getTokens method allows you to ask the pattern what tokens it is looking for. The Regex pattern was the only pattern that didn't already intrinsically know what tokens it was looking for, and we solved this by adding a setTokens method to its class. This allows you to define a regexp that can capture infinitely many patterns, but suggest a finite set. We will discuss this further in the setTokens section. For now we will demonstrate what getTokens does.
import { Or, Literal } from "clarity-pattern-parser";
const jane = new Literal("jane", "Jane");
const john = new Literal("john", "John");
const jack = new Literal("jack", "Jack");
const jill = new Literal("jill", "Jill");
const names = new Or("names", [jane, john, jack, jill]);
names.getTokens();
["Jane", "John", "Jack", "Jill"]
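A token list like the one above is the raw material for autocomplete. As a rough illustration, the sketch below narrows the list to the tokens that extend what the user has typed so far. The helper suggest is hypothetical and works on a plain string array, so it makes no assumptions about the library beyond the getTokens result shown above.

```typescript
// Hypothetical helper: keep the tokens that extend what was typed so far.
function suggest(tokens: string[], typed: string): string[] {
  return tokens.filter((token) => token.startsWith(typed) && token !== typed);
}

// The array returned by names.getTokens() in the example above.
const tokens = ["Jane", "John", "Jack", "Jill"];

const afterJa = suggest(tokens, "Ja"); // ["Jane", "Jack"]
const afterJi = suggest(tokens, "Ji"); // ["Jill"]
const afterX = suggest(tokens, "X");   // []
```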