syntax-source
v0.4.3
Generate high-resolution and _complete_ syntax definitions for languages by feeding this API a set of YAML input files that mimic and extend the [Sublime Text 3 Syntax Definition Schema](https://www.sublimetext.com/docs/3/syntax.html).
Syntax Source
Examples
API Documentation
Example build script:
#!/usr/bin/env node
const syntax_source = require('syntax-source');
const export_sublime_syntax = syntax_source.export['sublime-syntax'];
(async() => {
  // load syntax source from a path string
  let k_syntax = await syntax_source.transform({
    path: process.argv[2],
    // optional custom extensions
    extensions: {
      _customKey: (k_context, k_rule, s_version) => {
        let s_ext = k_context.syntax.ext;
        // remove source rule from context
        let i_rule = k_context.drop(k_rule);
        // create new rule
        k_context.insert(i_rule++, {
          match: `(myCustomRegex)(:)`,
          captures: [
            `keyword.other.word.my-custom-regex.SYNTAX`,
            `punctuation.separator.custom.colon.SYNTAX`,
          ],
          pop: true,
        });
        return i_rule;
      },
    },
  });
  // write to stdout
  process.stdout.write(export_sublime_syntax(k_syntax, {
    // optional post-processing transform
    post: g_yaml => ({
      ...g_yaml,
      name: `${g_yaml.name} (MyPackage)`,
    }),
  }));
})().catch((e_compile) => {
  console.error(e_compile.stack);
  process.exit(1);
});
Documentation for Input Files
The input files closely follow the .sublime-syntax format (an extension of YAML), but the file extension should be .syntax-source so that highlighting works and so that Sublime Text 3 does not try to load the source files as syntax definitions.
Primer
This API operates under the assumption that every context intends to match the full range of tokens expected at that state in the grammar. If none of the rules in a given context are matched by the input, an implicitly generated 'catch-all' rule at the end will mark the text invalid (via the invalid.illegal.token.expected.CONTEXT.SYNTAX scope) and pop the context from the stack (equivalent to the throw alias).
Top Level Key Extensions
In addition to the regular name, file_extensions, scope, etc., you can also use the following keys at the root of the definition structure:
extends: filename - inherit all the variables and contexts defined in the given .syntax-source file, overriding any duplicate keys.
Context Quantifiers
When using the set or push actions, you can essentially quantify a context (which creates a new ad-hoc context) by appending one of the following characters to the target name:
? - existential quantifier: include the given context and then pop if none of its rules matched.
* - zero or more quantifier: if the context's lookahead matches, repeatedly push it to the stack until it matches no more; then/otherwise pop.
^ - exactly one quantifier: will throw if the given context does not match.
+ - one or more quantifier: essentially the same as - goto: [CONTEXT*, CONTEXT^]
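To make the suffix convention concrete, here is a minimal sketch of how a quantified target name could be split and how the + quantifier could be rewritten using the equivalence stated above. The function names parseTarget and expandOneOrMore are hypothetical and are not part of the syntax-source API; the library's actual handling may differ.

```javascript
// Hypothetical helpers for illustration; not part of the syntax-source API.

// Split a push/set target such as 'triple*' into its context name and
// optional quantifier suffix (one of ? * ^ +).
function parseTarget(s_target) {
  const m_target = /^(.+?)([?*^+])?$/.exec(s_target);
  if (!m_target) throw new Error(`Invalid target: '${s_target}'`);
  return {context: m_target[1], quantifier: m_target[2] || null};
}

// Rewrite the one-or-more quantifier in terms of * and ^, following the
// equivalence given in the list above; other targets pass through unchanged.
function expandOneOrMore(s_target) {
  const {context, quantifier} = parseTarget(s_target);
  return '+' === quantifier ? [`${context}*`, `${context}^`] : [s_target];
}

console.log(parseTarget('triple?'));
console.log(expandOneOrMore('triple+'));
```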
Global Substitutions
Anytime the following placeholder text appears in a scope name, it will be automatically substituted:
SYNTAX - the syntax id (without the scope. / text. / markup. prefix)
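The substitution can be pictured with a short sketch. The helper name substituteSyntax is hypothetical, and the prefix list in the regex (including source., which Sublime Text commonly uses for root scopes) is an assumption for illustration rather than the library's actual logic.

```javascript
// Hypothetical helper for illustration; not part of the syntax-source API.
// Derives the syntax id by stripping an assumed root prefix from the
// definition's scope, then replaces every SYNTAX placeholder with it.
function substituteSyntax(s_scope, s_root_scope) {
  const s_id = s_root_scope.replace(/^(scope|source|text|markup)\./, '');
  return s_scope.replace(/\bSYNTAX\b/g, s_id);
}

console.log(substituteSyntax('keyword.you.SYNTAX', 'source.rq'));
```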
Rule Aliases
Instead of providing a mapping value for each item in the content of a context, you can use a rule alias to shortcut common patterns. The value must be a single string given by one of the following values:
alone
Declare that this context stands alone, i.e., it does not inherit the prototype context. Equivalent to:
- meta_include_prototype: false
bail
If none of the previous rules match, pop this context from the stack. Equivalent to:
- include: _OTHERWISE_POP
continue
Match a single character at a time to keep it in this context. Equivalent to:
- match: '.'
throw
If none of the previous rules match, mark the text invalid and pop this context from the stack. Equivalent to:
- match: '{{_SOMETHING}}'
  scope: invalid.illegal.token.expected.CONTEXT.SYNTAX
  pop: true
retry
If none of the previous rules match, mark the text invalid but do not pop this context from the stack. Equivalent to:
- match: '{{_SOMETHING}}'
  scope: invalid.illegal.token.expected.CONTEXT.SYNTAX
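The alias list above amounts to a lookup table from a string to one or more ordinary rules. The sketch below shows one way such an expansion could work; the names H_ALIASES and expandAliases are hypothetical, and the CONTEXT and SYNTAX placeholders are left unsubstituted. It is not the library's implementation.

```javascript
// Hypothetical lookup table mirroring the alias list above;
// not part of the syntax-source API.
const H_ALIASES = {
  alone: [{meta_include_prototype: false}],
  bail: [{include: '_OTHERWISE_POP'}],
  continue: [{match: '.'}],
  throw: [{
    match: '{{_SOMETHING}}',
    scope: 'invalid.illegal.token.expected.CONTEXT.SYNTAX',
    pop: true,
  }],
  retry: [{
    match: '{{_SOMETHING}}',
    scope: 'invalid.illegal.token.expected.CONTEXT.SYNTAX',
  }],
};

// Replace each string item in a context's rule list with its equivalent
// rule(s); mapping items are passed through unchanged.
function expandAliases(a_rules) {
  return a_rules.flatMap((z_rule) => {
    if ('string' !== typeof z_rule) return [z_rule];
    if (!Object.prototype.hasOwnProperty.call(H_ALIASES, z_rule)) {
      throw new Error(`Unknown rule alias: '${z_rule}'`);
    }
    // copy each rule so callers cannot mutate the shared table
    return H_ALIASES[z_rule].map(g_rule => ({...g_rule}));
  });
}

console.log(expandAliases([{match: 'a', scope: 'keyword.a'}, 'bail']));
```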
Key Extensions
Instead of, or in addition to, using the regular match, include, push, set, etc., the following keys within the content of a context have the following effects:
goto[.set|push]: context | array<context>
If this rule is reached, it will immediately change state to the given context(s).
Modifiers:
.set - DEFAULT: use the set action to change state.
.push - use the push action to change state.
Example:
contexts:
  main:
    - match: 'hi'
      scope: keyword.hi
    - goto: [root_MORE, root]
  root:
    - match: 'you'
      scope: keyword.you
    - match: 'world'
      scope: keyword.world
    - goto.push: other
  root_MORE:
    - match: ','
      scope: punctuation.separator
      set: main
Using the Sublime Syntax exporter yields:
variables:
  _ANYTHING_LOOKAHEAD: '(?=[\S\s])'
contexts:
  main:
    - match: 'hi'
      scope: keyword.hi
    - match: '{{_ANYTHING_LOOKAHEAD}}'
      set: [root_MORE, root]
  root:
    - match: 'you'
      scope: keyword.you
    - match: 'world'
      scope: keyword.world
    - match: '{{_ANYTHING_LOOKAHEAD}}'
      push: other
  root_MORE:
    - match: ','
      scope: punctuation.separator
      set: main
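The transformation shown in the exported output can be approximated in a few lines. This is only a sketch based on the example above: the function name expandGoto is hypothetical, and the real exporter may handle additional keys or edge cases.

```javascript
// Hypothetical helper for illustration; not part of the syntax-source API.
// Rewrites a rule containing `goto`, `goto.set` or `goto.push` into a plain
// rule that matches the catch-all lookahead and changes state.
function expandGoto(g_rule) {
  for (const s_key of Object.keys(g_rule)) {
    const m_goto = /^goto(?:\.(set|push))?$/.exec(s_key);
    if (m_goto) {
      // separate the goto target from any other keys on the rule
      const {[s_key]: z_target, ...g_rest} = g_rule;
      return {
        match: '{{_ANYTHING_LOOKAHEAD}}',
        ...g_rest,
        // the set action is the default modifier
        [m_goto[1] || 'set']: z_target,
      };
    }
  }
  // no goto key present: leave the rule untouched
  return g_rule;
}

console.log(expandGoto({goto: ['root_MORE', 'root']}));
console.log(expandGoto({'goto.push': 'other'}));
```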
switch[.set|push]: array<context | mappings>
Generate a series of rules where each one attempts to match a lookahead regex and consequently change state to the given context.
Modifiers:
.set - DEFAULT: use the set action to change state.
.push - use the push action to change state.
Example:
contexts:
  main:
    - switch:
        - hello
        - world
        - other: world
Using the Sublime Syntax exporter yields:
variables:
  _ANYTHING_LOOKAHEAD: '(?=[\S\s])'
contexts:
  main:
    - match: '{{hello_LOOKAHEAD}}'
      set: hello
    - match: '{{world_LOOKAHEAD}}'
      set: world
    - match: '{{other_LOOKAHEAD}}'
      set: world
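Reading the example above, a plain string names both the lookahead variable and the target context, while a single-key mapping separates the two. The sketch below reproduces that mapping; the function name expandSwitch and its action parameter are hypothetical and not part of the syntax-source API.

```javascript
// Hypothetical helper for illustration; not part of the syntax-source API.
// Turns the items of a `switch` into one lookahead rule per case.
function expandSwitch(a_cases, s_action='set') {
  return a_cases.map((z_case) => {
    // 'hello' => test hello's lookahead, go to hello
    // {other: 'world'} => test other's lookahead, go to world
    const [s_lookahead, s_target] = 'string' === typeof z_case
      ? [z_case, z_case]
      : Object.entries(z_case)[0];
    return {
      match: `{{${s_lookahead}_LOOKAHEAD}}`,
      [s_action]: s_target,
    };
  });
}

console.log(expandSwitch(['hello', 'world', {other: 'world'}]));
```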
word[.CASE]: text | array<text>
words[.CASE]: text | array<text>
Generate a series of rules where each one attempts to match a case-sensitive (or insensitive) variation of the given text.
This rule automatically adds the regex 'WORD{{_WORD_BOUNDARY}}' to the context's lookahead pattern variable, where WORD is the supplied text value.
Keep in mind that these lookaheads will not uselessly bloat your output because any unused variables are automatically removed during compilation, much like dead code removal. You can override the generated lookahead by adding a - lookahead: 'regex' rule to the context.
CASE (Modifiers):
.auto - DEFAULT: matches the best fit using either .mixed or .camel depending on the case of the text. If all characters given are lower case, it will use .mixed, otherwise it will use .camel.
.mixed - attempts to match the following cases in order: lower, upper, proper or mixed.
.camel - attempts to match the following cases in order: camel, pascal, lower, upper, proper or mixed.
.lower_camel - will only match camel.
.pascal - will only match pascal.
.lower - will only match lower.
.upper - will only match upper.
.exact - enables case-sensitivity.
Scope Substitutions: Scopes attached to this rule may use the following placeholder substitutions:
WORD - the word that was matched in lower case.
CASE - will be replaced by one of the following values depending on the text that was matched:
lower - e.g., strmatches
upper - e.g., STRMATCHES
proper - e.g., Strmatches
mixed - e.g., sTrmAtChEs
camel - e.g., strMatches
pascal - e.g., StrMatches
exact - only if the .exact modifier was used
Optional Keys:
The generated rule will add the default scope keyword.operator.word.TYPE.WORD.SYNTAX, where TYPE defaults to empty but can be overridden using the type key (or the whole scope overridden using the scope key -- see below).
The generated rule will also automatically add the supplementary scope meta.case.CASE.SYNTAX to the matched text, where CASE is given by one of the values listed above in Scope Substitutions.
The following optional keys can be used:
type: text - provides some scope subname to use for TYPE in the default scope that is applied. Has no effect if scope is used.
scope: scope - overrides the default scope.
boundary: regex - overrides the default boundary lookahead regex (_WORD_BOUNDARY) to match at the end of the keyword.
Example:
contexts:
  boolean:
    - words:
        - 'true'
        - 'false'
      scope: constant.language.boolean.WORD.SYNTAX
      pop: true
Using the Sublime Syntax exporter yields:
contexts:
  boolean:
    - match: 'TRUE(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.true.rq meta.case.upper.rq
      pop: true
    - match: 'true(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.true.rq meta.case.lower.rq
      pop: true
    - match: 'True(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.true.rq meta.case.proper.rq
      pop: true
    - match: '(?i)true(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.true.rq meta.case.mixed.rq
      pop: true
    - match: 'FALSE(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.false.rq meta.case.upper.rq
      pop: true
    - match: 'false(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.false.rq meta.case.lower.rq
      pop: true
    - match: 'False(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.false.rq meta.case.proper.rq
      pop: true
    - match: '(?i)false(?={{_WORD_BOUNDARY}})'
      scope: constant.language.boolean.false.rq meta.case.mixed.rq
      pop: true
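The exported rules above follow a regular pattern, which the sketch below reproduces for the .mixed modifier only. The function name mixedCaseRules is hypothetical, the rule order follows the exported example (upper, lower, proper, mixed), the word is assumed to contain no regex metacharacters, and extra keys such as pop are omitted. It is an illustration, not the library's generator.

```javascript
// Hypothetical helper for illustration; not part of the syntax-source API.
// Builds the four case variants seen in the exported example for one word and
// fills in the WORD, CASE and SYNTAX placeholders in the scope.
function mixedCaseRules(s_word, s_scope, s_syntax) {
  const s_lower = s_word.toLowerCase();
  const h_variants = {
    upper: s_lower.toUpperCase(),
    lower: s_lower,
    proper: s_lower[0].toUpperCase() + s_lower.slice(1),
    // fall back to a case-insensitive match for any other mix
    mixed: `(?i)${s_lower}`,
  };
  return Object.entries(h_variants).map(([s_case, s_pattern]) => ({
    match: `${s_pattern}(?={{_WORD_BOUNDARY}})`,
    // append the supplementary meta.case scope, then substitute placeholders
    scope: `${s_scope} meta.case.CASE.SYNTAX`
      .replace(/\bWORD\b/g, s_lower)
      .replace(/\bCASE\b/g, s_case)
      .replace(/\bSYNTAX\b/g, s_syntax),
  }));
}

console.log(mixedCaseRules('true', 'constant.language.boolean.WORD.SYNTAX', 'rq'));
```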
mask: scope
Before changing state, push a context to the stack that will apply the given scope to all contexts put on the stack by some action. This is useful when you want to apply different scopes to a token depending on where it is in the grammar while reusing the same context to match the token itself. Can only be used in combination with a rule that has a set or push action.
Example:
contexts:
  predicate:
    - goto: namedNode
      mask: meta.term.role.predicate.SYNTAX
  object:
    - goto: namedNode
      mask: meta.term.role.object.SYNTAX
Using the Sublime Syntax exporter yields:
contexts:
  predicate:
    - match: '{{_ANYTHING_LOOKAHEAD}}'
      set: [meta_term_role_predicate__MASK, namedNode]
  meta_term_role_predicate__MASK:
    - meta_include_prototype: false
    - meta_content_scope: meta.term.role.predicate.rq
    - match: '{{_ANYTHING_LOOKAHEAD}}'
      pop: true
  object:
    - match: '{{_ANYTHING_LOOKAHEAD}}'
      set: [meta_term_role_object__MASK, namedNode]
  meta_term_role_object__MASK:
    - meta_include_prototype: false
    - meta_content_scope: meta.term.role.object.rq
    - match: '{{_ANYTHING_LOOKAHEAD}}'
      pop: true
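Judging only from the exported example above, each mask produces a helper context whose name is derived from the scope. The sketch below mimics that output; the function name maskContext and the naming rule (drop the trailing SYNTAX segment, turn dots into underscores, append __MASK) are inferred from the example and may not match how the library actually derives names.

```javascript
// Hypothetical helper for illustration; not part of the syntax-source API.
// Builds the helper context that applies a mask scope to whatever is pushed
// on top of it, then pops once control returns to it.
function maskContext(s_mask, s_syntax) {
  // naming rule inferred from the example output, not from the library source
  const s_name = s_mask.replace(/\.?SYNTAX$/, '').replace(/\./g, '_') + '__MASK';
  return {
    name: s_name,
    rules: [
      {meta_include_prototype: false},
      {meta_content_scope: s_mask.replace(/\bSYNTAX\b/g, s_syntax)},
      {match: '{{_ANYTHING_LOOKAHEAD}}', pop: true},
    ],
  };
}

console.log(maskContext('meta.term.role.predicate.SYNTAX', 'rq'));
```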
add[.back|front]: scope
Rather than replacing the generated scope, add the given scope to the output either at the .back (binds tighter to the token) or at the .front (binds looser to the token).
open[.SYMBOL]: subscope
close[.SYMBOL]: subscope
Generates a rule that matches either the opening or closing of the given SYMBOL, and sets the scope according to the list below.
Each entry gives the SYMBOL and its corresponding open/close characters, respectively, followed by the scope it applies. SIDE is either begin or end accordingly:
paren - ( ) - punctuation.definition.SUBSCOPE.SIDE.SYNTAX
brace - { } - punctuation.section.SUBSCOPE.SIDE.SYNTAX
bracket - [ ] - punctuation.definition.SUBSCOPE.SIDE.SYNTAX
tag - < > - punctuation.definition.SUBSCOPE.SIDE.SYNTAX
irk - ' ' - punctuation.definition.string.SIDE.SUBSCOPE.SYNTAX
dirk - " " - punctuation.definition.string.SIDE.SUBSCOPE.SYNTAX
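The symbol list can be read as a small lookup table. The sketch below encodes it and builds a rule for one side of a symbol; the names H_SYMBOLS and symbolRule are hypothetical, and the regex escaping of the matched character is an assumption about what a generated match pattern would need, not a description of the library's output.

```javascript
// Hypothetical table and helper for illustration; not part of the
// syntax-source API. Entries mirror the list above.
const H_SYMBOLS = {
  paren: {open: '(', close: ')', scope: 'punctuation.definition.SUBSCOPE.SIDE.SYNTAX'},
  brace: {open: '{', close: '}', scope: 'punctuation.section.SUBSCOPE.SIDE.SYNTAX'},
  bracket: {open: '[', close: ']', scope: 'punctuation.definition.SUBSCOPE.SIDE.SYNTAX'},
  tag: {open: '<', close: '>', scope: 'punctuation.definition.SUBSCOPE.SIDE.SYNTAX'},
  irk: {open: "'", close: "'", scope: 'punctuation.definition.string.SIDE.SUBSCOPE.SYNTAX'},
  dirk: {open: '"', close: '"', scope: 'punctuation.definition.string.SIDE.SUBSCOPE.SYNTAX'},
};

// s_side is 'open' or 'close'; s_symbol is a key of H_SYMBOLS.
function symbolRule(s_side, s_symbol, s_subscope, s_syntax) {
  const g_symbol = H_SYMBOLS[s_symbol];
  if (!g_symbol) throw new Error(`Unknown symbol: '${s_symbol}'`);
  return {
    // escape regex metacharacters so e.g. '(' is matched literally
    match: g_symbol[s_side].replace(/[.*+?^${}()|[\]\\]/g, '\\$&'),
    scope: g_symbol.scope
      .replace('SUBSCOPE', s_subscope)
      .replace('SIDE', 'open' === s_side ? 'begin' : 'end')
      .replace('SYNTAX', s_syntax),
  };
}

console.log(symbolRule('open', 'paren', 'group', 'rq'));
console.log(symbolRule('close', 'irk', 'literal', 'rq'));
```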
lookahead[s]: string | array<string>
Override the generated lookahead regular expression.