ts-binary
v0.11.0
Published
A collection of helper functions to serialize data structures in typescript to rust bincode format
Downloads
235
Readme
ts-binary
A collection of helper functions to serialize/deserialize primitive types in typescript. It is designed to be a building block rather than a standalone library.
Install
npm install ts-binary --save
Example
import { Sink, write_str, read_str, write_f32, read_f32 } from 'ts-binary';
let sink: Sink = Sink(new ArrayBuffer(100));
// Sink is just a convince function not a class ctor
// note that it resizes automatically if needed
sink = write_str(sink, 'abc');
sink = write_f32(sink, 3.14);
sink.pos = 0; // reset position to read from the beginning
const str = read_str(sink); // 'abc'
const num = read_f32(sink); // 3.14, actually 3.140000104904175 :)
Why
The project was created as a way to communicate between rust and typescript (checkout https://github.com/cztomsik/stain). For a while we used json, but it wasn't efficient enough (plus there were some pain points to maintain type compatibility between rust and typescript). Rust already had a standard way to serialize stuff: https://serde.rs/ + https://github.com/TyOverby/bincode. So the solution was to adopt serde type system + binary layout from bincode.
The idea:
- Codegen contract types based on the schema + codegen serializers/deserializers for them.
- Write a small library to serialize/deserialize primitive types (string, boolean, optional, sequence, f32, u32, ...)
This library is an implementation of 2)
Design goals
- Be compatible with bincode and serde (https://github.com/TyOverby/bincode)
- Fast -> try to be JIT friendly if possible
- Small -> optimized for js minification (it is just a collection of functions)
- Designed as a building block -> it doesn't do much outside of serializing/deserializing primitive types.
- Should work across all modern runtimes (browsers + node)
Api
the api is based on three concepts:
export type Sink = {
pos: number;
view: DataView;
littleEndian: boolean;
};
type Serializer<T> = (sink: Sink, val: T) => Sink;
type Deserializer<T> = (sink: Sink) => T;
Sink
is used for both serialization and deserialization. It is just a buffer with a current position to read/write from. Sink instance is designed to be reused (normally you will have a single buffer/sink to read and write from).
Serializer<T>
is a function that writes a value of type T to the sink
starting from sink.pos
(moves pos
in the process). It is assumed that it will resize the buffer if it needs more space (important if you want to implement custom serializers). Note that it always returns a Sink
which can be a brand new instance. You cannot rely on that the initial sink will be mutated, always use the returned value instead.
Deserializer<T>
is a function that reads a value of type T from the sink
starting from sink.pos
(moves pos
in the process). Deserializers in this library don't check for out of boundary cases. It is assumed that the correct data is just there.
And because these are just types/conventions it is easy to support custom types. Which is exactly how more complex data structures are handled in ts-rust-bridge-codegen
Supported primitives
At the moment only the list of supported types:
u8, u16, u32, u64, i32, f32, f64, string, boolean, Option<T>, Sequence<T>
.
Caveat: u64 is serialized as 8 bytes but only 4 bytes are actually being used. Which means that technically only u32 values are supported (but nothing stops you from implementing that yourself :) ).
Numbers:
const write_u8: Serializer<number>;
const write_u16: Serializer<number>;
const write_u32: Serializer<number>;
const write_u64: Serializer<number>;
const write_f32: Serializer<number>;
const write_f64: Serializer<number>;
const write_i32: Serializer<number>;
const read_u8: Deserializer<number>;
const read_u16: Deserializer<number>;
const read_u32: Deserializer<number>;
const read_u64: Deserializer<number>;
const read_f32: Deserializer<number>;
const read_f64: Deserializer<number>;
const read_i32: Deserializer<number>;
Strings
Note that strings are stored as UTF8 via TextEncoder
class (and TextDecoder
on the way back). It is serialized as u64 (number of bytes encoded) + UTF8 encoded string.
const write_str: Serializer<string>;
const read_str: Deserializer<string>;
Bool
Encoded as 1 or 0 in a single byte.
const write_bool: Serializer<boolean>;
const read_bool: Deserializer<boolean>;
Sequence
You can make new serializer and deserializer for T[]
out of serializer/deserializer of type T
. It encodes a sequence as u64 + serialized elements.
const seq_writer: <T>(serEl: Serializer<T>) => Serializer<T[]>;
const seq_reader: <T>(readEl: Deserializer<T>) => Deserializer<T[]>;
Optional
Optionals are described as T | undefined
(but not null
). Encoded as 1 (one byte) followed up by a serialized value T if !== undefined or just 0 (one byte).
// similar to sequence they produce new serializer/deserializer
const opt_writer: <T>(serEl: Serializer<T>) => Serializer<T | undefined>;
const opt_reader: <T>(readEl: Deserializer<T>) => Deserializer<T | undefined>;
Nullable
In case you need to deal with nulls, there is also a nullable version that is described as T | null
(but not undefined
). Encoded as 1 (one byte) followed up by a serialized value T if !== null or just 0 (one byte).
const nullable_writer: <T>(serEl: Serializer<T>) => Serializer<T | null>;
const nullable_reader: <T>(readEl: Deserializer<T>) => Deserializer<T | null>;
Caveat: Nullable<Nullable<T>>
equals to just Nullable<T>
in typescript, but in rust this is technically not true.
Simple benchmarks
I just copypasted generated code from examples and tried to construct a simple benchmark.
Code https://stackblitz.com/edit/ts-binaray-benchmark?file=index.ts
Version to try https://ts-binaray-benchmark.stackblitz.io
On complex data structure:
| Method | Serialization | Deserialization | | --------- | :-----------: | --------------: | | ts-binary | 74 ms | 91 ms | | JSON | 641 ms | 405 ms |
Simple data structure:
| Method | Serialization | Deserialization | | --------- | :-----------: | --------------: | | ts-binary | 2 ms | 1 ms | | JSON | 6 ms | 5 ms |
That was measured on latest Safari version.
Note you can run the benchmark yourself cloning the repo + running npm scripts
FAQ
Why _
in function names?
I wanted something that would signal if it is a library serializer vs custom in my generated code (I use camel casing for custom ones). Plus I didn't really like how writeF32
looked :)
License
MIT.