minson
v1.1.3
Published
Data serializer with minimal output.
Downloads
8
Maintainers
Readme
MINSON
Data serializer with minimal output.
Serializes an object (or other variable) using a predefined template into a non-human readable output string that uses a minimal amount of characters.
WARNING: This package is relatively new - use with vigilant caution.
Designed for storage of app configs in text files, therefore works particularly well with multiple-choice values such as booleans and enums. Also efficient with integers that have a small maximum value, and length-limited strings.
Will happily handle large strings, floats, big ints, and nested structures.
The catch is that to get the most out of Minson you have to create a template that defines your data types. The template is like a key to unlock your data.
Output
Given a JavaScript object value Minson creates a string that looks something like this:
HÿÿÿÓW@ux£×\n=q`╗ÅFRV6²═'&÷vâ═f÷═§V×2═÷fW"FR═Ƨ═Förà¶Fƶ£6Fƶ£&Ài§¢É#jËu ╔·²& öæR#£Â'Gvò#£"Â'F&VR#£7Þø qìÑÍÐÄèàÌ°ÑÍÐÈèÌÐÌÌÉô
MINSON vs JSON vs BSON
| | Output length in chars | Output length in bytes | |--------------------- |------------------------ |------------------------ | | Minified Object | 308 | 308 | | JSON | 351 | 351 | | BSON | 361 | 361 | | MINSON | 151 | 202 |
The test data used here is the input object from the test "should encode and decode" in this package. With a well designed template and input data the results for Minson will be even better.
Installation
This is a Node.JS module available from the Node Package Manager (NPM).
https://www.npmjs.com/package/minson
Here's the command to download and install from NPM:
npm install minson -S
or with Yarn:
yarn add minson
It is recommend to use a package locking system like Yarn in case a change is introduced into this project that makes it incompatible with your encoded data.
Usage
Include Minson in your project:
var Minson = require('Minson');
Create a template that describes your data.
For objects, the template is an object with the same keys as the data object, and the values are configurations strings containing the Minson type names (and optionally some additional config in parenthesis, brackets, and braces):
var myData = {
key: '3f18ac06',
title: 'My Data',
description: 'This is an example object',
timestamp: 1563797326,
enabled: true,
revision: 192,
status: 'published',
user: {
id: 3006,
name: 'Harry',
interests: ['fish', 'stamps'],
},
access: true,
};
var template = {
key: 'varchar(255)',
title: 'varchar(255)["untitled"]',
description: 'varchar',
timestamp: 'uint(32)',
enabled: 'bool',
revision: 'uint(16)',
status: 'enum("unpublished", "pending", "published")',
user: {
id: 'uint(16)',
name: 'varchar(255)',
interests: ['varchar(255)'],
},
access: 'bool',
};
If your data is an array, the template is also an array.
var primeDigits = [2, 3, 5, 7];
var template = ['uint(8)'];
If only one configuration string is given inside the template array Minson will store an array-length based on the length of the input data.
It is more efficient to have a fixed array length:
var topThreeSwordsmen = ['Sasaki Kojiro', 'El Cid', 'Ito Ittosai Kagehisa'];
var template = ['varchar(255)', 'varchar(255)', 'varchar(255)'];
For other variable types the template is just a configuration string:
var inputVar = 1587123;
var template = 'int(32)';
Encoding
Supply the template and the data variables to the encode() function:
var encodedString = Minson.encode(template, inputVar);
Decoding
Supply the template and the encoded string to the decode() function:
var decodedVariable = Minson.decode(template, encodedString);
The template must be identical to the one supplied during encoding, therefore since it's possible your data structure will change, you should consider maintaining revisions of these templates in your project in order to access older encoded data.
Data Structures
The following data structures are supported by Minson templates.
If this isn't sufficient, and your data is serializable with JSON, you can use the json data type to include the data structure into a Minson encoded string.
object
Object
Objects hold key-value pairs and are the most commonly used structure. The template will define each key and the value of the key will be the template or configuration of the value for that key in the corresponding input data.
Example:
// Configure an object with a string and a number.
{
aString: 'varchar',
aNumber: 'int(32)',
}
// Also configure an object with a string and a number.
'object<aString: varchar, aNumber: int(32)>'
array
Array
Arrays hold a list of values. The template will define each array element as a template or configuration string for the data type or structure that will be provided in that array position in the corresponding input data. If only a single element is templated it will be assumed that the input data array can be of any size and each element will be of the same data type or data structure.
Examples:
// Configure an array that contains 2 values; a string and a number.
['varchar', 'int(32)']
// Also configure an array that contains 2 values; a string and a number.
'array(2)<varchar, int(32)>'
// Configure an array of unlimited length that only contains numbers.
['int(32)']
// Also configure an array of unlimited length that only contains numbers.
'array<int(32)>'
// Configure an array of unlimited length that contains any type of variable.
['']
// Also configure an array of unlimited length that contains any type of variable.
'array'
map
Map
A JavaScript Map() is an alternative to using an Object to hold key-value pairs.
Example:
// Configure a map with a string and a number.
new Map([['aString', 'varchar'], ['aNumber', 'int(32)']])
// Also configure a map with a string and a number.
'map<aString: varchar, aNumber: int(32)>'
set
Set
A Set is like an Array, but duplicate values are automatically omitted.
Examples:
// Configure a set of unlimited length that only contains numbers.
new Set(['int(32)'])
// Configure a set of 100 numbers.
'set(100)<int(32)>'
// Configure a set of 5 numbers using a Set() template. Notice the cheeky
// capitilzation and whitespace changes to avoid Set() omitting duplicates.
new Set(['int(32)', 'Int(32)', 'INt(32)', 'int (32)', 'int(32) '])
// Configure a set of 5 numbers using a Set() template and the config
// generator features. (See "Generating Configuration")
new Set([
Minson.config(Minson.type.INT, 32),
Minson.config(Minson.type.INT, 32),
Minson.config(Minson.type.INT, 32),
Minson.config(Minson.type.INT, 32),
Minson.config(Minson.type.INT, 32),
])
Data Types
In addition to data structures, the following is a list of supported data types. Your goal should be to choose the smallest representation of your data. If you're storing data from a form that allows a selection from a predefined list of options; use an enum over a varchar, and choose integer types with a smaller param if your expected values always fit within the value range.
| type(size/param) | Description | Value range | |----------------------------------- |----------------------------------------------------------------------------------------------------------------------------- |--------------------------------- | | bool | Boolean true or false | true & false | | bool(value1, value2) | 2-value list | value1 & value2 | | enum(n) | Multiple-choice integers | 0 to n-1 | | enum(val1, val2, val3, ...) | Multiple-choice list | Any listed value | | int(8) | 1-byte signed integer | -128 to 127 | | uint(8) | 1-byte unsigned integer | 0 to 255 | | int(16) | 2-byte signed integer | -32,768 to 32,767 | | uint(16) | 2-byte unsigned integer | 0 to 65,535 | | int(32) | 4-byte signed integer | 0 to 4,294,967,295 | | uint(32) | 4-byte unsigned integer | -2,147,483,648 to 2,147,483,647 | | float(32) | 4-byte floating point number | 1.2x10^-38 to 3.4x10^38 | | float(64) | 8-byte floating point number | 5.0x10^-324 to 1.8x10^308 | | bigint(64) | 8-byte BigInt() signed integer | -2^63 to 2^63-1 | | biguint(64) | 8-byte BigInt() unsigned integer | 0 to 2^64-1 | | varchar(255) | A string with 255 bytes or less | 0 to 255 bytes | | varchar | Long string | unlimited | | char | Character | 0 to 255 (by default) | | wchar | Wide character up to 4 bytes | 0 to 4,294,967,295 | | json | Any value serializable with JSON | unlimited |
bool
Boolean
bool(param)[default]
Param: Two values separated by comma (optional - defaults to true, false)
Perfect choice for storing binary options like on/off checkboxes or switches.
Will not store null or undefined - they will be coerced to false, if
you need that consider using 'enum(true, false, null, undefined)'
instead.
Examples:
// Configure true or false values (same as default params)
'bool(true, false)'
// Configure strings "on" or "off"
'bool("on", "off")'
enum
Enumerated
enum(param)[default]
Param: One or more values separated by commas, or a single integer (required)
Many configs or select/radio forms limit value choices to a predefined list, and this is an ideal type for encoding those values.
Examples:
// Configure enum to expect any of the values: 0, 1, 2, or 3
'enum(4)'
// Configure enum to expect any of the values: "red", "blue", or "green"
'enum("red", "blue", "green")'
int
Signed integer
int(size)[default]
Size: 8, 16, or 32 (required)
Signed integers allow encoding negative integers. If you do not need to allow
for negative integers it may be preferable to use uint(size)
(Unsigned
integer) instead because it allows for larger values.
Example:
// Configure integer with values from -32,768 to 32,767
'int(16)'
uint
Unsigned integer
uint(size)[default]
Size: 8, 16, or 32 (required)
When choosing an integer size (i.e. the param) refer to the table above for the data ranges and choose the smallest size you could possibly need.
Example:
// Configure integer with values from 0 to 255
'uint(8)'
float
Floating-point number
float(size)[default]
Size: 32 or 64 (required)
For number values that have, or may need to have, a decimal point. Likely conforms to IEEE Standard for Floating-Point Arithmetic (IEEE 754):
- 32: Single Precision; Sign Bits = 1, Exponent Bits = 8, Significand Bits = 23
- 64: Double Precision; Sign Bits = 1, Exponent Bits = 11, Significand Bits = 52
Example:
// Configure a floating-point number
'float(32)'
bigint
Signed BigInt
bigint(size)[default]
Size: 64 (required)
For JavaScript BigInt numbers.
Example:
// Configure a BigInt number
'bigint(64)'
biguint
Unsigned BigInt
bigint(size)[default]
Size: 64 (required)
For JavaScript BigInt numbers.
Example:
// Configure an unsigned BigInt number
'biguint(64)'
varchar
Variable-sized character string
varchar(size)[default]{charset}
Size: 255 (optional)
Since many strings from input and generated functions are of a limited size, use the param 255 if the length is known to be 255 bytes or less. Since characters can be up to four bytes long, it should be reasonable to assume a string of 63 unicode characters or less is safe to use with the 255 param.
If the param is not supplied, a string of any length can be used.
- This type supports an optional charset (See "Custom Charset")
Example:
// Configure a varchar
'varchar'
char
Character
char[default]{charset}
Used for strings of exactly one character, which by default is 8-bits (1 byte) in length.
- This type supports an optional charset and it is possible to use it to encode values of less than one byte (See "Custom Charset")
Example:
// Configure a char
'char'
wchar
Wide character
wchar[default]
Param: (not applicable)
Used for strings of exactly one multibyte character, which may be up to 4-bytes in length.
Example:
// Configure a wide char
'wchar'
json
JSON
json[default]
Param: (not applicable)
Serializes any value with JSON and stores it as a varchar. This is less efficient than selecting another type, and carries the same limitations as JSON.
This is useful for encoding/decoding an object with unknown keys.
Example:
// Configure a json serializable variable
'json'
Typed Arrays
Minson can also handle variables of TypedArray types.
These are provided to Minson's templates not like Data Structures, but like Data Types, and will be exploded into their Array equivalent.
For example specifying a configuration string of 'int8array'
will convert it
to ['int(8)']
and specifying int8array(3)[5]
will convert it to
['int(8)[5]', 'int(8)[5]', 'int(8)[5]']
. The correct TypedArray type will be
restored during decoding.
Variables of unknown or mixed type
This isn't an ideal usage of Minson, but you can supply an empty configuration string:
// An object containing a property of unknown type:
var template = {
property: '',
}
// An array with mixed values:
var template = [''];
// A scalar variable of unknown type:
var template = '';
Minson will handle the value reasonably well if it is a scalar or array value, and objects will be handled using the json type (as Minson won't be templated to handle that object's keys).
Default Values
If your templated value is missing from your input data (i.e. it is undefined), you can supply a default value in the template by appending square brackets.
// Set the integer to 1 if it is missing.
int(32)[1]
The default value is stored during encoding, and changing the templated default value will have no effect during decoding.
Custom Charset
The varchar and char types support an optional custom charset.
Many string values contain a limited set of characters, for example a machine
key might only contain characters A-Za-z0-9 (i.e alphabet letters both upper
case and lower case as well as numeric digits), a string based on a
hexadecimal hash only contains the characters 0-9A-F, and some character sets
like the US-ASCII and GSM-7 (for SMS text) are limited to 128 characters.
Minson can use this to encode a smaller amount of data for varchar and char
types.
// Only use characters A-Za-z0-9
// That's 62 chars, Minson will encode 25% less data.
'varchar{ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789}'
// Only use characters 0-9A-F
// That's 16 chars, Minson will encode 50% less data.
'varchar{0123456789ABCDEF}'
// Only allow DNA nucleotides ACGT
// That's 4 chars, Minson will encode 75% less data.
'char{ACGT}'
Custom charsets are supplied as a list of characters between braces in the templated configuration string. The order of the list is important and should be consistent between encode() and decode().
Encoding Format
By default Minson escapes newline characters from its output, this can increase the length of the output slightly.
The entry functions for Minson allow an optional third parameter to set the
output format: Minson.encode(config, input, format)
and
Minson.decode(config, input, format)
.
Valid format values are the following strings:
- 'noescape' to allow shorter multiline output e.g.
Minson.encode(config, input, 'noescape')
- 'base64' for longer output without gibberish e.g.
Minson.encode(config, input, 'base64')
- 'bits' for an array of bit strings e.g.
Minson.encode(config, input, 'bits')
Just remember to set this the same way for encode() and decode().
Generating Configuration
An alternative to typing 'type(param)[default]{charset}<children>'
strings is to
generate configuration objects directly using Minson.config()
.
var cfgStr = Minson.config(Minson.type.TYPE, param, default, charset, children);
It may be preferable to use this in order to catch configuration issues early.
Allowed values for TYPE are: OBJECT, ARRAY, MAP, SET, BOOL, ENUM, INT, UINT, FLOAT, BIGINT, BIGUINT, CHAR, WCHAR, VARCHAR, JSON, INT8ARRAY, UINT8ARRAY, UINT8CLAMPEDARRAY, INT16ARRAY, UINT16ARRAY, INT32ARRAY, UINT32ARRAY, FLOAT32ARRAY, FLOAT64ARRAY, BIGINT64ARRAY, BIGUINT64ARRAY
Example:
// Configure an enum.
var cfgStr = Minson.config(Minson.type.ENUM, ['one', 'two', 'three'], 'three');
Notice how it's possible to supply an actual array to the param value. This also applies to the default value.
You can also supply an equivalent object like so:
var cfgStr = Minson.config({
type: Minson.type.ENUM,
param: ['one', 'two', 'three'],
default: 'three',
});
The param key can also be called size when that feels appropriate:
var cfgStr = Minson.config({
type: Minson.type.INT,
size: 32,
});
You can also use Minson.charset.CHARSET
to supply a predefined charset.
Available values for CHARSET are:
ALPHANUMERIC, NUMERIC, HEXADECIMAL, ALPHA, ALPHAUPPER, ALPHALOWER, SYMBOLS
You can concatenate multiple charsets or perform other string operations on them.
Function Aliases
If you prefer the terminology, Minson.encode() is aliased with Minson.stringify() and Minson.serialize(). Similarly Minson.decode() is aliased with Minson.parse() and Minson.unserialize().
Unexpected Values
If an object or map contains keys that are not configured in the template, they will be ignored, not included in the encoded output, and not present in the decoded variable. If you anticipate unexpected keys you should instead use the json data type.
It may be possible to handle this functionality in the future.
Invalid Values
There is currently no detection of invalid values, you will experience undefined behaviour if your values don't match your template configuration - including malforming your data.
It may be possible to handle this functionality in the future.
Tests
Tests are available in the github repo and can be executed with npm test
.
To check coverage you have to install istanbul globally:
npm install istanbul -g
and then execute: npm run coverage
A coverage summary will be displayed and a full coverage report will appear in the /coverage directory.
Contributing
https://github.com/braksator/minson
In lieu of a formal style guide, take care to maintain the existing coding style. Add tests for coverage and explicitly test bugs and features.