destruct-js

v0.2.13

Published

a year ago

Declarative specifications for parsing binary data from Buffers

Downloads

0High
0Medium
0Low

revbingo

nozdaizy

destruct-js

destruct-js is a Javascript library for reading and writing binary data from Buffers using a declarative specification, inspired by construct-js.

Usage

A quick example:

// Reading
const spec = new Spec();        // big endian by default

spec.field('count', UInt32)            // 4 byte unsigned integer
    .field('temperature', UInt8,       // 1 byte unsigned...
          { then: (f) => (f - 32) * (5/9)}) // ...which we convert from Farenheit to Celsius
    .skip(1)                                //skip a byte
    .field('stationId', Text, { size: 3 })  // 3 bytes of text, utf8 by default

const result = spec.read(Buffer.from([0xFF, 0x30, 0x19, 0xA0, 0xD4, 0xFF, 0x42, 0x48, 0x36]));

expect(result.count).toBe(4281342368);
expect(result.temperature).toBe(100);
expect(result.stationId).toBe('BH6');

We can take (almost) the same spec, and write the result back to get the same buffer as we started with:

// Writing
const spec = new Spec();        // big endian by default

spec.field('count', UInt32)            // 4 byte unsigned integer
    .field('temperature', UInt8)      // 1 byte unsigned - no conversion when writing (yet...)
    .skip(1)                                //skip a byte
    .field('stationId', Text, { size: 3 })  // 3 bytes of text, utf8 by default

const result = spec.write({ count: 4281342368, temperature: 212, stationId: 'BH6' });

expect(result.toString('hex')).toBe('ff3019a0d4ff424836')

Specifications are declared using the Spec object. To get started, create a Spec. You can pass options in the constructor.

const leSpec = new Spec({ mode: Mode.LE, lenient: true }); // little endian
const beSpec = new Spec();        // big endian, non-lenient is the default

Each field in the buffer is specified in order using the field method. Each field has a name, and a data type, and some options if you wish to specify them. Different options are relevant to different data types, as specified below. When you call spec.read(buffer), the buffer is read "left to right", filling a JSON object with field names as keys, which is returned to you once it's finished. Your spec does not need to read the whole buffer if you don't need to. By default, you will get an error if you try and read beyond the end of the buffer. If you specify lenient mode in the options, any attempt to read once the end of the buffer has been read returns undefined.

Writing works in a very similar way - the named fields are lookup up in the object that you pass, and written to the buffer in order. With some exceptions (noted below), the read and write operations should be symmetric, therefore the output of a read operation could be passed to a write operation on the same spec and give you back the original buffer.

Numeric Data Types

Numeric data types will read a number from the buffer with the specified size. All the numeric data types you would expect to see are supported, and if you're reading this probably do not need explanation - Int8, UInt8, Int16, UInt16, Int32, UInt32, Float, Double.

Numeric types also support an additional dp configuration, which limits the number of decimal places, either on values directly from the Buffer (for Float or Double types) or produced from a then option. All numeric types can also take a mode configuration to read a single field in a different mode to the rest of the buffer.

const result: any =
  new Spec()
    .field('count3dp', Float, { dp: 3 })
    .field('count1dp', Float, { dp: 1 })
    .read(Buffer.from([0x40, 0x49, 0x0F, 0xD0, 0x40, 0x49, 0x0F, 0xD0]));

expect(result.count3dp).toBe(3.142);
expect(result.count1dp).toBe(3.1);

Bool

The Bool type reads a single bit from the buffer, as a boolean value

const result =
    new Spec()
      .field('enabled', Bool)
      .field('ledOff', Bool)
      .field('releaseTheHounds', Bool)
      .read(Buffer.from([0xA0]));

expect(result.enabled).toBe(true);
expect(result.ledOff).toBe(false);
expect(result.releaseTheHounds).toBe(true);

Bit, Bits[2-16]

The Bit type reads a single bit from the buffer, as a 0 or 1. The types Bits2 through to Bits16 read the corresponding number of bits and returns them as unsigned integers. These are aliases for the Bits type with a size option, that can be used to read arbitrary sizes. Note that this reads across byte boundaries where necessary. This means that, for example, Bits8 is not the same as UInt8, which will throw an error if trying to read/write when not aligned to a byte boundary.

const result =
  new Spec()
    .field('enabled', Bit)
    .field('mode', Bits2)
    .field('frequency', Bits4)
    .field('days', Bits, { size: 5})  // same as Bits5
    .read(Buffer.from([0xD3, 0x3A]));

expect(result.enabled).toBe(1);
expect(result.mode).toBe(2);
expect(result.frequency).toBe(9);
expect(result.days).toBe(19);

Bytes & Text

The Bytes data type will simply return a Buffer. This is useful if you need to deal with groups of bytes in odd or larger numbers. You can specify the number of bytes to read with the size option.

const result =
  new Spec()
    .field('nameBytes', Bytes, { size: 3 })
    .read(Buffer.from([0x61, 0x62, 0x63, 0x64]));

expect(result.nameBytes.toString('hex')).toBe('616263');

Text is a specific implementation of Bytes that will handle encoding as well. You can specify the encoding with the encoding option, default is utf-8. Note that size is still the number of bytes to read/write, not code points.

const result =
  new Spec()
    .field('name', Text, { size: 3, encoding: 'utf-8' })
    .field('multiByte', Text, { size: 3, encoding: 'utf-8' })
    .read(Buffer.from([0x62, 0x6f, 0x62, 0xE3, 0x83, 0xA6]));

expect(result.name).toBe('bob');
expect(result.multiByte).toBe('ユ');

If the size is determined dynamically, you can pass a function to the size parameter that will resolve a value from the result map

const result =
  new Spec()
    .field('nameSize', UInt8)
    .field('name', Text, { size: r => r.nameSize })
    .read(Buffer.from([0x03, 0x62, 0x6f, 0x62, 0x65, 0x66, 0x67]));

expect(result.name).toBe('bob');

or a terminator character:

const result =
  new Spec()
    .field('name', Text, { terminator: 0x00 })
    .read(Buffer.from([0x62, 0x6f, 0x62, 0x00, 0x31, 0x32, 0x33]));

expect(result.name).toBe('bob');

or by default it will run to the end of the buffer:

const result =
  new Spec()
    .field('name', Text)
    .read(Buffer.from([0x62, 0x6f, 0x62, 0x31, 0x32, 0x33]));

expect(result.name).toBe('bob123');

When writing, the size option will truncate the text to the appropriate size (again, in bytes, not code points), and the terminator will be added to the end.

const result =
  new Spec()
    .field('name', Text, { size: 3, terminator: 0x00 })
    .write({ name: 'bobalobacus' })

expect(result.toString('hex')).toBe('626f6200');

Literals

If necessary, you can add a literal string, number or boolean value by specifying it as the second argument to a .field or .store call. A literal value does not consume any data from the Buffer.

const result =
  new Spec()
    .field('type', 'install')
    .store('pi', 3.14)
    .read(Buffer.from([0x00]))

expect(result.type).toBe('install')

Literal fields are ignored when writing.

Other options

These options apply to any data type, in addition to the type specific options noted above.

then: (any) => any - All data types support a then option to do some post processing on the value when reading. The then option should be a function that takes the value read from the buffer as input, and outputs some other value, which may or may not be of the same type. The equivalent of then when writing is before.

before: (any) => any - All data types support a before option to do some pre processing on the value when writing. The before option should be a function that takes the value read from the data object, and outputs a value, which may or may not be of the same type, to write to the buffer.

const result =
  new Spec()
    .field('numericText', Text, { size: 3, then: parseInt })
    .field('temperature', UInt8, { then: (f) => (f - 32) * (5/9), before: (c) => (c * 9/5) + 32 })
    .read(Buffer.from([0x31, 0x32, 0x33, 0xD4]))

expect(result.numericText).toBe(123);
expect(result.temperature).toBe(100);

shouldBe: (string | number | boolean) - All data types support a shouldBe option, that can be used to assert that a particular value should be fixed. For example, you might use this to check that a particular delimiter is present. If the value read from the buffer does not match the expected value, an Error will be thrown. shouldBe also applies when writing, and validates that the value in the object passed for writing matches the expected value.

const result =
  new Spec()
    .field('javaClassIdentifier', UInt32, { shouldBe: 0xCAFEBABE })
    .read(Buffer.from([0xCA, 0xFE, 0xBA, 0xBE]))

Payload Specs

The Spec object contains a number of methods that allow you to richly state the specification of your payload. As well as specifying fields using .field(), you can use other instructions to modify behaviour of the parser.

Variable storage

Internally, the spec maintains a result object, and also another map of intermediate variables that may be needed in later parsing but should not appear in the final result.

store(name: string, type: DataType) - fetches a value from the buffer in the same way as .field(), but stores the value internally instead of adding to the final output. .store() can be used in combination with .derive() to use values in later calculations. When writing, store instructions apply in the same way as field instructions - their value is written to the buffer.

const result =
  new Spec()
    .field('firstByte', UInt8)
    .store('ignoreMe', UInt8)
    .read(Buffer.from([0xFF, 0x01]));

expect(result.firstByte).toBe(255);
expect(result.ignoreMe).toBeUndefined();

derive(name: string, valueFunction: (r: any) => any) - calculates a value to add to the result, potentially using values already read from the buffer. The valueFunction will be passed an object containing all field and stored values that have been read to this point. Note that derive instructions are ignored during writing, as it is not possible to derive the inputs from the original output.

const result =
  new Spec()
    .field('count', Int8)
    .derive('doubleCount', (r) => r.count * 2)
    .read(Buffer.from([0x02]))

  expect(result.count).toBe(2);
  expect(result.doubleCount).toBe(4);

Control Flow

You can conditionally parse parts of the buffer using some control statements that mirror standard JS.

include(spec: Spec) - executes the specified Spec, and stores results at (or reads data from) the current level in the result object. All state in the original spec (variables, position etc.) is passed to the new spec, and control and state is returned to the original spec once the new spec (and any specs executed within that) are completed.

group(name: string, spec: Spec) - executes the specified Spec, but stores results under (or reads data from) the specified key in the result object.

const header = new Spec()
  .field('version', UInt8)
  .field('filesize', UInt16)

const dataBlock = new Spec()
  .field('identifier', UInt8)
  .field('dataLength', UInt8)

const mainSpec = new Spec()
  .group('header', header)
  .group('data', dataBlock)

const result = mainSpec.read(Buffer.from([0x02, 0x00, 0x80, 0xDE, 0x03]))

expect(result).toEqual({
  header: {
    version: 2,
    fileSize: 128
  },
  data: {
    identifier: 0xDE,
    dataLength: 3
  }
})

if((r: any) => boolean, Spec) - executes the specified Spec if the function evaluates to true. The function is passed the current result object.

switch((r: any) => string | number | boolean, {[k:string]: Spec}) - looks up the value returned from the function in a map, and executes the associated spec. Note that the function may return any primitive, but it will be converted to a string, and the keys of the map must be strings. If the value returned from the function is not found in the map, the option with a key of default will be used. No error will be thrown if neither the value nor 'default' exist in the map, the spec will simply continue from the next instruction.

loop(string, number | ((r:any) => number) | null, Spec) - repeats the specified Spec the given number of times, either a literal number, a function that returns a number, or null to indicate that the loop should continue until the end of the buffer. Specs are returned as a nested entry in the result under the given name. When writing, loops can also be used, and expects to find data in the same nested structure as would be produced when reading.

For example

// Reading
 const loopSpec =
      new Spec()
        .loop('level1', 2, new Spec()
          .field('l1Size', UInt8)
          .loop('level2', (r) => r.l1Size, new Spec()
            .field('l2Value', UInt8))
        )

    const readResult = loopSpec.read(Buffer.from([0x02, 0xFF, 0xFE, 0x03, 0x10, 0x11, 0x12]));

    expect(readResult).toEqual({
      level1: [  // results from the loop are in a array with the specified key
        {
          l1Size: 2,
          level2: [
            { l2Value: 255 },
            { l2Value: 254 },
          ]
        },
        {
          l1Size: 3,
          level2: [
            { l2Value: 16 },
            { l2Value: 17 },
            { l2Value: 18 },
          ]
        }
      ]
    })

  // write it back
    const writeResult = loopSpec.write(readResult);
    expect(writeResult.toString('hex')).toBe('02fffe03101112');

Position control

When parsing the buffer, you may need to explicitly set the current position

skip(bytes: number | NumericDataType) - skips the specified number of bytes, or the size of the specified numeric data type. When writing, skipped bytes will be filled with zeroes.

const spec =
  new Spec()
    .field('firstByte', UInt8)
    .skip(UInt16)               // same as .skip(2)
    .field('lastByte', UInt8)

const result = spec.read(Buffer.from([0xFF, 0xAB, 0xCD, 0x01]));

expect(result.firstByte).toBe(255);
expect(result.lastByte).toBe(1);

const writeResult = spec.read(result);

expect(writeResult.toString('hex')).toBe('ff000001'); // Note that the 0xABCD bytes are *not* retained

pad() - moves the buffer position to the next byte boundary, if you've read or written Bits that are not a multiple of 8. Note that if you try and read or write a byte type (e.g. Int8, UInt16) when the buffer is not at a byte boundary, an Error will be thrown. If you need to write bytes that are not aligned to boundaries, you will need to use e.g. Bits8 or Bits16.

const result =
  new Spec()
    .field('enabled', Bit)
    .pad()
    .field('count', Int8)
    .read(Buffer.from([0x80, 0x02]))

expect(result.enabled).toBe(true);
expect(result.count).toBe(2);

endianness(mode: Mode) - switches the buffer to reading/writing the specified endianness from this point i.e. it does not apply to previously read values.

const result =
  new Spec({ mode: Mode.BE })
    .field('countBE', UInt16)
    .endianness(Mode.LE)
    .field('countLE', UInt16)
    .read(Buffer.from([0xFF, 0x30, 0x30, 0xFF, 0xFF, 0x30]))

expect(result.countBE).toBe(65328);
expect(result.countLE).toBe(65328);

Extras

tap((Buffer, ReaderState) => void) - use .tap() to perform some action at some point in the parsing/writing, such as printing something to the console for debug purposes. The buffer is a PosBuffer which has an offset property that you can use to check the current position in the buffer. The offset.bytes property gives the position in bytes, and the offset.bits property gives a bit offset within the current byte, if Bits or Bools have been read. The ReaderState contains the current result i.e. any data from a field or derive statement, and also storedVars for any store operations i.e. data that has been fetched/calculated but will not appear in the final result.

Note that both the Buffer and ReaderState are mutable - by changing them you may either wield great power or wreak great havoc. For example, you could insert values manually into the result map when reading, or to the buffer when writing.

const readSpec = new Spec()
  .field('one', UInt8)
  .tap((buffer, readerState) => readerState.results.onePointFive = 1.5)
  .field('two', UInt8)

const readResult = readSpec.read(Buffer.from([0x01, 0x02]));

expect(readResult).toEqual({
  one: 1,
  onePointFive: 1.5,
  two: 2
})

const writeSpec = new Spec()
  .field('one', UInt8)
  .tap((buffer, readerState) => buffer.write(UInt8, 255))
  .field('two', UInt8)

const writeResult = spec.write({ one: 1, two: 2);

expect(writeResult.toString('hex')).toBe('01FF02')

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

destruct-js

Usage

Numeric Data Types

Bool

Bit, Bits[2-16]

Bytes & Text

Literals

Other options

Payload Specs

Variable storage

Control Flow

Position control

Extras