tpwatson
v0.1.3
Published
This utility is a prototype design and implementation of the WAT# programming language. WAT# is a lightweight programming language that transpiles its output to WebAssembly text format. Its aim is that you can easily add powerful WebAssembly code to your
Downloads
3
Maintainers
Readme
WebAssembly transpiler from WAT# to WebAssembly text format
This utility is a prototype design and implementation of the WAT# programming language. WAT# is a lightweight programming language that transpiles its output to WebAssembly text format. Its aim is that you can easily add powerful WebAssembly code to your JavaScript code, which rely on the performance of native WA.
WAT# implements only simple programming constructs that do not need any WebAssembly Runtime. The programming language avoids concepts that are not straightforward in WebAssembly.
Main Features
You can use the standard WAT instructions within the WA#, plus a number of extensions:
- Additional simple types:
bool
,i8
(sbyte
),u8
(byte
),i16
(word
),u16
(uword
),i32
(int
),u32
(uint
),i64
(long
),u64
(ulong
),f32
(float
),f64
(double
) - Compound types: arrays, structs, pointers
- Constant value definitions
- Include directive (
#include
) to allow using include source files - Conditional directives (
#define
,#undef
,#if
,#else
,#elseif
,#endif
) to define conditional compilation based on symbols - Memory variable, memory arrays
- Simple pointer arithmetic
- Control flow statements:
if..else
,do..while
,while
, andfor
. - Expression evaluation
- Inline functions
These are the steps the WAT# transpiler generates its output:
- Preprocessing. The preprocessor detects the conditional and include directives. Using the conditions and included files, it merges the raw input to be used as the next phase input. This phase focuses only on preprocessing and does not check the WA# syntax at all. However, preprocessing recognizes WA# multi-line block comments and does not search for preprocessor directives within them.
- Syntax parsing. The compiler parses the syntax of the raw WA# source code and collects parsing errors.
- Semantic analysis. The compiler checks if the code semantics satisfy the language specification and prepare it for code emission.
- Code emission. The compiler generates the WAT output.
Comments and Whitespaces
The space, tabulator, carriage return (0x0d) and new line (0x0a) characters are all whitespaces.
WA# supports two types of comments:
Block comments can be multi-line but cannot nested into each other.
blockComment
: "/*" any* "*/"
;
End-of-line comments can start at any point of the current line and comple with the end of the line
eolComment
: ( "//" | ";;" ) (^newLine)*
;
WAT# Program Structure
WAT# follows the semantics of WebAssembly. The result of the WAT# compilation is a WebAssembly module that contains module fields (global declarations, type declarations, memory description, tables, data elements, functions, etc.). One particular field type is a function, which may declare parameters, result types, local variables, and instrcutions.
A WAT# code has this structure:
watsharpCode
: declaration*
;
declaration
: constDeclaration
| globalDeclaration
| typeDeclaration
| jumpTableDeclaration
| dataDeclaration
| variableDeclaration
| importedFunctionDeclaration
| functionDeclaration
| emptyDeclaration
;
functionDeclaration
: "function" (exportSpecification)? ("inline")? (intrinsicType | "void" )?
parameterList locals functionBody
;
emptyDeclaration
: ";"
;
locals
: local*
;
functionBody
: statement*
;
exportSpecification
: "export" (stringLiteral)?
;
declaration
: Declarations define language constructs that build up the structure of the code, but do not contain directly any executable code.constDeclaration
: You can assign a constant value to a name. During the compilation process, the transpiler replaces the occurrences of the constans's name with its value. Contansts do not generate any WebAssembly elements.globalDeclaration
: You can create variables that the transpiler handle asglobal
WebAssembly declarations. Though WebAssembly allows constantglobal
declarations, WAT# allows only mutables.typeDeclaration
: You can create type aliases or compound type declarations that are entirely WAT# constructs.variableDeclaration
: You can define variables that are stored in the linear memory of WebAssembly.jumpTableDeclaration
: WAT# provides programming constructs that allow indirect function calls based on thetable
concept of WebAssembly. Jump tables define the tables used for an indirect call.functionDeclaration
: The transpiler creates a WebAssembly function from each WAT# function. Thus, functions can have parameters, an optional result type, and local declarations. Functions are the only constructs that contain statements.
Preprocessor directives
When you start the compilation of a WA# code, you can pass predefined symbols to the compiler. Those symbols can be used in conditional preprocessor directives. WA# supports these preprocessor directives:
#define
: Defines a symbol#undef
: Removes a symbol (is if it were not defined)#if
: Checks if condition is satisfied#else
: A branch for an unsatisfied condition#elseif
: A branch for an alternative test#endif
: End of an#if
condition#include
: includes the contents of another file in the compilation
A directive must start at the first non-whitespace character of a source code line. Directives cannot contain comments, they must be completed on the same line they start.
#define
and #undef
Use these directives to define new symbols or remove exsiting symbols.
Syntax:
ppDefine
: "#define" ppIdentifier
;
ppUndef
: "#undef" ppIdentifier
;
ppIdentifier
: ppIdStart idCont*
;
ppIdStart
: "a" .. "z"
| "A" .. "Z"
| "_"
;
#if
, #else
, #elseif
, and #endif
These preprocessor directives allow you to define conditional compilation:
#if Symbol1
// Declare code when Symbol1 is declared
#elseif Symbol2 && !Symbol3
// Declare here code when Symbol2 is declared, but not Symbol3
#else
// All other cases
#endif
Syntax:
ppIf
: "#if" ppExpr
waSharpCode
(
"#elseif" ppExpr
waSharpCode
)*
(
"#else" ppExpr
waSharpCode
)?
"#endif"
;
ppExpr
: "(" ppExpr ")"
| ppOrExpr
;
ppOrExpr
: ppXorExpr ( "|" ppXorExpr )?
;
ppXorExpr
: ppAndExpr ( "^" ppAndExpr )?
;
ppAndExpr
: ppPrimaryuExpr ( "&" ppPrimaryExpr )?
;
ppPrimaryExpr
: ppIdentifier
| "!" ppIdentifier
;
#include
This directive includes the contents of another file in the compilation.
Syntax:
#include
: "#include" stringLiteral
;
WAT# Type System
WAT# provides additional types to the four intrinsic WebAssembly type (i32
, i64
, f32
, f64
) so that every extra type's operations can be implemented with the WA's types — without any runtime code.
WAT# deliberately does not support strings and characters.
Simple WAT# Types
WAT# supports these types:
Types that leverage the
i32
WA type:bool
: Boolean value. Any non-zeroi32
value is considered astrue
. Zero isfalse
.sbyte
ori8
: 8-bit signed integer (-128 .. 127)byte
oru8
: 8-bit unsigned integer (0 .. 255)short
ori16
: 16-bit signed integer (-32768 .. 32767)ushort
oru16
: 16-bit unsigned integer (0 .. 65535)int
ori32
: 32-bit signed integer (-2 147 483 648 .. -2 147 483 647)uint
oru32
: 32-bit unsigned integer (0 .. 4 294 967 295)
Types that leverage the
i64
WA type:long
ori64
: 32-bit signed integer (-9 223 372 036 854 775 808 .. 9 223 372 036 854 775 807)ulong
oru64
: 64-bit unsigned integer (0 .. 18 446 744 073 709 551 615)
Types that leverage the floating-point types in WA:
float
orf32
: 32-bit floating-point typedouble
orf64
: 64-bit floating point type
Arrays
WAT# allows declaring arrays with one or more dimension. Array items can be an instance of any types.
Structures
WAT# structures are records with fields. Each field has a name and a type. Record fields can be accessed individually.
Pointers
You can have pointers to instances of a specific type. A pointer is a 32-bit value that points to a WebAssembly linear memory location (using the i32
WebAssembly type). When you execute operations with a pointer, those consider the underlying type. For example, when you increment the value of a pointer, that leverage the byte-length of the underlying instance.
Compound Types
You can build compound types from single types by combining them. For example, you may have an array of pointers of structures. Structure fields may be arrays and pointer, or even other structures, and so on.
Type Declaration Syntax
typeDeclaration
: "type" identifier "=" typeSpecification ";"
;
typeSpecification
: primaryTypeSpecification ("[" expr "]")*
;
primaryTypeSpecification
: instrinsicType
| identifier
| pointerType
| "(" typeSpecification ")"
| structType
;
intrinsicType
: "i8" | "sbyte" | "u8" | "byte"
| "i16" | "short" | "u16" | "ushort"
| "i32" | "int" | "u32" | "uint"
| "i64" | "long" | "u64" | "ulong"
| "f32" | "float"
| "f64" | "double"
;
pointerType
: "*" primaryTypeSpecification
;
structType
: "struct" "{" structField ("," structField)* "}"
;
structField
: typeSpecification identifier
;
Constant values
Constants are name and value pairs. Their value is calculated during compilation time, and are their names are replaced with their value.
Syntax:
constDeclaration
: "const" intrinsicType identifier "=" expr
;
Globals
globalDeclaration
: "global" intrinsicType identifier ("=" expr)? ";"
;
Jump tables
Syntax:
jumpTableDeclaration
: "table" identifier "(" functionParam? ("," functionParam)* ")" "{" identifier? ("," identifier)* "}" ";"
;
Data declaration
Data declarations can be used to init memory the variables' data
Syntax:
dataDeclaration
: "data" (integralType)? identifier "[" expr? ("," expr)* "]"
;
Variables
Syntax:
variableDeclaration
: typeSpecification identifier ( "{" expr "}" ) ";"
;
Imported function declaration
Syntax:
importedFunctionDeclaration
: "import" ("void" | intrinsicType) identifier stringLiteral stringLiteral
"(" intrinsicType? ("," intrinsicType)* ")" ";"
;
Function declarations
Syntax:
functionDeclaration
: functionModifier? ("void" | intrinsicType) identifier
"(" functionParam? ("," functionParam)* ")"
"{" bodyStatement* "}"
;
functionModifier
: "export"
| "inline"
;
functionParam
: typeSpecification identifier
;
Statements
Syntax:
bodyStatement
: emptyStatement
| blockStatement
| localVariableDeclaration
| assignment
| localFunctionInvocation
| controlFlowStatement
;
emptyStatement
: ";"
;
blockStatement :=
"{" statement? (";" statement)* }
;
Local Variable Statement
Syntax:
localVariableStatement
: "local" (intrinsicType | pointerType) identifier ("=" expr) ";"
;
Assignments
Syntax:
assignment
: leftValue ( "=" | "+=" | "-=" | "*=" | "/=" | "%=" | "^|" | "&=" | "|=" | "~=" | "!=" )
expr ";"
;
leftValue
: "*" leftValue
| addressable
;
addressable
: identifier (("[" expr "]") | ("." addressable))*
;
Local Function Invocation
Syntax:
localFunctionInvocation
: identifier "(" expr? ("," expr)* ")" ";"
;
Control flow statements
Syntax:
controlFlowStatement
: ifStatement
| whileStatement
| doWhileStatement
| breakStatement
| continueStatement
| returnStatement
;
ifStatement
: "if" "(" condition ")" statement ("else" statement)?
;
whileStatement
: "while" "(" expr ")" statement
;
doWhileStatement
: "do" statement "while" "(" expr ")" ";"
;
breakStatement
: "break" ";"
;
continueStatement
: "continue" ";"
;
returnStatement
: "return" expr? ";"
;
Expressions
Syntax:
expr
: conditionalExpr
;
conditionalExpr
: orExpr ( "?" expr ":" expr )?
;
orExpr
: xorExpr ( "|" xorExpr )?
;
xorExpr
: andExpr ( "^" andExpr )?
;
andExpr
: equExpr ( "&" equExpr )?
;
equExpr
: relExpr ( ( "==" | "!=" ) relExpr )?
;
relExpr
: shiftExpr ( ( "<" | "<=" | ">" | ">=" ) shiftExpr )?
;
shiftExpr
: addExpr ( ( "<<" | ">>" | ">>>" ) addExpr )?
;
addExpr
: multExpr ( ( "+" | "-" ) multExpr )?
;
multExpr
: memberOrIndexExpr ( ( "*" | "/" | "%") memberOrIndexExpr )?
;
memberOrIndexExpression
: memberExpression
| indexExpression
;
memberExpression
: primaryExpression "." expr
;
indexExpression
: primaryExpression "[" expr "]"
;
primaryExpr
: "sizeof" "(" typeSpec ")"
| funcInvocation
| literal
| identifier
| unaryExpr
| "(" expr ")"
;
functionInvocation
: identifier "(" expr? ("," expr)* ")"
;
literal
: binaryLiteral
| decimalLiteral
| hexadecimalLiteral
| realLiteral
;
unaryExpr
: ( | "+" | "-" | "~" | "!" | "&" | "*" ) unaryExpr
;
binaryLiteral
: "0b" binaryDigit ("_"? binaryDigit)*
;
decimalLiteral
: decimalDigit ("_" ? decimalDigit)*
;
hexaDecimalLiteral
: "0x" hexaDigit ("_"? hexaDigit)*
;
realLiteral
: decimalDigit ("_"? decimalDigit)* "." decimalDigit+ (("e" | "E") ("+" | "-")? decimalDigit+)?
| decimalDigit+ (("e" | "E") ("+" | "-")? decimalDigit+)
;
binaryDigit
: "0" | "1"
;
decimalDigit
: "0" .. "9"
;
hexaDigit
: "0" .. "9"
| "a" .. "f"
| "A" .. "F"
;
Reference
Reserved keywords
bool
, byte
, const
, double
, f32
, f64
, float
, i16
, i32
, i64
, i8
, int
, long
, real
, sbyte
, short
, u8
, u16
, u32
, u64
, uint
, ulong
, ushort
, uword
, word
Syntax description
constDeclaration :=
"const" valueType identifier "=" expr
;
structDeclaration :=
"struct" structIdentifier "{" fieldDeclaration ("," fieldDeclaration)* "}"
;
fieldDeclaration :=
type identifier ";"
;
variableDeclaration :=
type identifier dimension? ( "=" expr)?
;
dimension :=
"[" expr "]"
;
type :=
| valueType
| structIdentifier
;
valueType :=
| boolType
| integralType
| floatingPointType
;
boolType := "bool" ;
integralType :=
| i8Type
| u8Type
| i16Type
| u16Type
| i32Type
| u32Type
| i64Type
| u64Type
;
floatingPointType :=
| f32Type
| f64Type
;
i8Type :=
| "i8"
| "sbyte"
;
u8Type :=
| "u8"
| "byte"
;
i16Type :=
| "i16"
| "word"
| "short"
;
u16Type :=
| "u16"
| "uword"
| "ushort"
;
i32Type :=
| "i32"
| "int"
;
u32Type :=
| "u32"
| "uint"
;
i64Type :=
| "i64"
| "long"
;
u64Type :=
| "u64"
| "ulong"
;
f32Type :=
| "f32"
| "float"
| "real"
;
f64Type :=
| "f64"
| "double"
;
identifier :=
"$" idCont+
idCont :=
| "a" .. "z"
| "A" .. "Z"
| "0" .. "9"
| "_"
| "."
;
structIdentifier :=
"#" idCont+
;