@c-syn/regext
v0.0.7
Transforms regular expressions into dynamic templates for flexible text generation.
RegExp to Template Converter
This package provides a class that converts regular expressions into template strings. It currently supports two styles: Mustache and plain. The conversion depends heavily on the regexp-tree package, which parses the regular expression into an abstract syntax tree (AST). Its parser module is generated from a regexp grammar that is based on the regular expression grammar used in ECMAScript. After parsing, the AST is traversed to generate the template string.
Installation
npm install @c-syn/regext
Usage
import RegExT from "@c-syn/regext"
const regexp = /.../ // Your regular expression
const type = "Mustache" // or "plain"
const template = new RegExT(regexp, type).template
[!TIP] Try this code in the RegExT Playground.
RegExT is a class that takes two parameters: a regular expression and a type. It has a toString method that returns the template string, and it exposes the properties listed below. The convert function that is used to generate the template string is also exported: import { convert } from 'regex-template'.
| Property | Type | Description |
| --- | --- | --- |
| regexp | RegExp | The regular expression that was used to generate the template string. |
| type | string | The type of template string that was generated. |
| template | string | The template string that was generated. |
type
The type parameter is a string that specifies which kind of template string should be generated. If no type, or an unknown type, is specified, type is set to Mustache. The following types are supported.
| Type | Description |
| --- | --- |
| Mustache | A template string that uses the Mustache syntax. |
| plain | A plain template string. |
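The fallback rule and the difference between the two types can be illustrated with a small sketch. The helper names resolveType and placeholder are hypothetical and not part of this package, and the exact, case-sensitive comparison is an assumption.

```javascript
// Hypothetical helpers for illustration only; they are not exported by the package.
const SUPPORTED_TYPES = ["Mustache", "plain"];

// Fall back to "Mustache" when the type is missing or unknown.
// Assumption: the comparison is exact and case-sensitive.
function resolveType(type) {
  return SUPPORTED_TYPES.includes(type) ? type : "Mustache";
}

// Render a capture group name the way each template type shows it.
function placeholder(groupName, type) {
  return resolveType(type) === "plain" ? groupName : `{{${groupName}}}`;
}

console.log(placeholder("showtext", "Mustache")); // {{showtext}}
console.log(placeholder("showtext", "plain"));    // showtext
console.log(placeholder("showtext"));             // {{showtext}}
```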
Example
/(?:(?<=[^`\\])|^)\[(?=[^@\n\]]+\]\([^@)]*@[:a-z0-9_-]*\))(?<showtext>[^@\n\]]+)\]\((?:(?:(?<type>[a-z0-9_-]*):)?)(?:(?<term>[^@\n:#)]*?)?(?:#(?<trait>[^@\n:#)]*))?)?@(?<scopetag>[a-z0-9_-]*)(?::(?<vsntag>[a-z0-9_-]*))?\)/g
The regular expression above is used to match a markdown link with a specific format. Depending on the specified type, the regular expression will be converted into the following templates.
Mustache
[{{showtext}}]({{#type}}{{type}}:{{/type}}{{term}}{{#trait}}#{{trait}}{{/trait}}@{{scopetag}}{{#vsntag}}:{{vsntag}}{{/vsntag}})
Plain
[showtext](type:term#trait@scopetag:vsntag)
Use Cases
The generated template string can be used to produce text that matches the regular expression, by replacing the placeholders in the template with the corresponding values. For instance, after a text has been interpreted, its values can be rendered with Handlebars into a template generated from another regular expression, so that the result matches that expression.
The generated template string can also be used to quickly visualize the structure of the regular expression. This can be useful when debugging or when trying to understand the structure of a regular expression.
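The first use case, filling a generated template with values, could look roughly like the sketch below. The render function is a deliberately tiny stand-in that only understands {{name}} and {{#name}}...{{/name}}; it is not Mustache or Handlebars, and the values are made up for illustration. In practice a real template engine would take its place.

```javascript
// Mustache template from the example above.
const template =
  "[{{showtext}}]({{#type}}{{type}}:{{/type}}{{term}}{{#trait}}#{{trait}}{{/trait}}@{{scopetag}}{{#vsntag}}:{{vsntag}}{{/vsntag}})";

// Minimal stand-in renderer (hypothetical, not a full Mustache implementation).
function render(tpl, values) {
  // Sections: keep the body only when the named value is present and non-empty.
  const withSections = tpl.replace(
    /\{\{#(\w+)\}\}([\s\S]*?)\{\{\/\1\}\}/g,
    (_, name, body) => (values[name] ? body : "")
  );
  // Variables: substitute the value, or an empty string when it is missing.
  return withSections.replace(/\{\{(\w+)\}\}/g, (_, name) => values[name] ?? "");
}

const full = render(template, {
  showtext: "Terminology",
  type: "concept",
  term: "term",
  trait: "summary",
  scopetag: "docs",
  vsntag: "v1",
});
console.log(full); // [Terminology](concept:term#summary@docs:v1)

// Optional parts disappear together with their separators.
const minimal = render(template, { showtext: "Terminology", term: "term", scopetag: "docs" });
console.log(minimal); // [Terminology](term@docs)
```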
Process
After the regular expression is parsed into an abstract syntax tree (AST), the AST is traversed to generate the template string. The following steps are applied to every node, depending mostly on the node's type.
- Certain types of nodes are removed (i.e., ClassRange, Disjunction, Assertion) as they may contain Char nodes that should not be included in the template string. For instance, ranges of characters that the regular expression is supposed to match are removed.
- Repetition nodes are changed to Alternative nodes if the Quantifier's from and to properties are identical. Within the changed node, the repeated expression appears n times.
- Group nodes are checked for their capturing property. If it is set to true, the group is converted into a Char node based on the specified template string type. Either the name, if it exists, or the number property is used to identify the group in the template string.
- Alternative (or concatenation) nodes are handled according to the specified template string type. For instance, with the Mustache type, the Alternative node is wrapped in a section block (e.g., {{#person}}{{person}} exists{{/person}}).
After these steps have been walked through, the values of all Char nodes are concatenated, resulting in the template string.
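The steps above can be sketched as a small recursive walk. This is a simplified illustration, not the package's implementation: the node objects are hand-written stand-ins whose shapes are assumptions, the function name toTemplate is hypothetical, Mustache section wrapping is left out, and emitting an expression once when from and to differ is only one possible reading of the rules.

```javascript
// Node types that are dropped entirely, as described in the steps above.
const REMOVED = new Set(["ClassRange", "Disjunction", "Assertion"]);

// Hypothetical walker over simplified, hand-written AST nodes.
function toTemplate(node, type = "Mustache") {
  if (REMOVED.has(node.type)) return "";
  switch (node.type) {
    case "Char":
      // Meta characters such as . or \w stand for ranges of characters and are skipped.
      return node.kind === "meta" ? "" : node.value;
    case "Alternative":
      // Concatenate the results of all child expressions.
      return node.expressions.map((child) => toTemplate(child, type)).join("");
    case "Repetition": {
      const { from, to } = node.quantifier;
      const inner = toTemplate(node.expression, type);
      // Identical bounds give a known repeat count; otherwise emit once (assumption).
      return from !== undefined && from === to ? inner.repeat(from) : inner;
    }
    case "Group": {
      // Assumption: non-capturing groups are simply traversed.
      if (!node.capturing) return toTemplate(node.expression, type);
      // Prefer the group name, fall back to the group number.
      const id = node.name ?? String(node.number);
      return type === "plain" ? id : `{{${id}}}`;
    }
    default:
      // Node types not covered by this sketch contribute nothing.
      return "";
  }
}

// Hand-written stand-in for an expression like (?<key>...)=(...);{2}$
const ast = {
  type: "Alternative",
  expressions: [
    { type: "Group", capturing: true, number: 1, name: "key" },
    { type: "Char", kind: "simple", value: "=" },
    { type: "Group", capturing: true, number: 2 },
    {
      type: "Repetition",
      expression: { type: "Char", kind: "simple", value: ";" },
      quantifier: { type: "Quantifier", from: 2, to: 2 },
    },
    { type: "Assertion", kind: "$" },
  ],
};

console.log(toTemplate(ast, "Mustache")); // {{key}}={{2}};;
console.log(toTemplate(ast, "plain"));    // key=2;;
```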
Limitations
Due to the nature of the process used to generate the template string, the following limitations apply.
- Nodes of types ClassRange, Disjunction, and Assertion are removed from the AST. This means that the template string will not contain any information about these nodes.
- Backreferences are not (yet) supported. This means that the template string will not contain any information about backreferences.
- Quantifiers (repetitions) are only handled if the from and to properties of the node are identical. In this case, the number of times the expression is supposed to be repeated is clear. If the Quantifier applies to a Character class, it is ignored, as this means 'one of' a list of Chars is supposed to be picked, which can't be assumed. If the from and to properties are different, the corresponding expression is not repeated in the template string.
- Meta characters representing ranges of characters (e.g., ., \s, \w) are ignored. This means that the template string will not contain any information about these nodes.