
binary-object

Encode JSON objects into a binary format. Inspired by msgpack, with reduced memory usage.


Features

  • Encodes any value to a binary format (and decodes it back)

  • Memory-efficient

  • Auto-detects the object schema to make the binary format more compact (optional)

  • Highly composable: it can be extended/reused to support different encoding schemas and input/output channels

  • Supports various encode/decode pipes:

    • [x] Binary Object
    • [x] Binary JSON
    • [x] Object Schema (stores object keys only once, similar to CSV, but nested)
    • [x] compress-json
    • [x] msgpack
  • Supports various input/output channels:

    • [ ] Buffer
    • [x] File
    • [ ] Stream (e.g. fs / net stream)
    • [x] Callback (for producer / consumer pattern)
    • [x] Array (for in-memory mock test)

A wide range of JavaScript data types is supported:

  • string
  • number
  • bigint
  • boolean
  • Buffer
  • Array
  • Map
  • Set
  • Date
  • object
  • symbol
  • function
  • undefined
  • null
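
As a quick illustration, here is a hypothetical round trip through the in-memory Array channel. The ArraySink/ArraySource names and the write()/close()/iterator() methods are assumptions inferred from the channel list above, not a confirmed API:

    import { ArraySink, ArraySource, BinaryObjectSink, BinaryObjectSource } from 'binary-object'

    // Values that plain JSON cannot represent directly.
    const input = {
      when: new Date(),
      tags: new Set(['a', 'b']),
      scores: new Map([['alice', 1n]]),
    }

    // Encode into an in-memory array of chunks (hypothetical class names).
    const chunks: Buffer[] = []
    const sink = new BinaryObjectSink(new ArraySink(chunks))
    sink.write(input)
    sink.close()

    // Decode the chunks back into values; the result should round-trip,
    // including the Date/Set/Map/bigint members.
    const source = new BinaryObjectSource(new ArraySource(chunks))
    for (const value of source.iterator()) {
      console.log(value)
    }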

Why not MsgPack?

MsgPack does not release the memory it uses while encoding.

When I use MsgPack to encode ~211k objects and write them to a file (via fs.writeSync, fs.appendFileSync, or fs.createWriteStream), Node.js runs out of memory.

In the test, each object comes from LMDB, a sync-mode embedded database. The key and value of each object are packed and written to the file separately. No out-of-memory error occurs if I load all the objects into memory and pack them as a single array (then write it to the file), but this batching approach doesn't work for a long stream of data.

I tried using setTimeout to slow down the writing, and even explicitly calling global.gc(), but the problem persists.
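
The failing pattern looks roughly like this. It is only a sketch: the report above does not name the exact msgpack package, so @msgpack/msgpack is an assumption here, and the records() generator stands in for the LMDB cursor:

    import { encode } from '@msgpack/msgpack'
    import { appendFileSync } from 'fs'

    // Stand-in for iterating ~211k key/value records out of LMDB.
    function* records(): Generator<[unknown, unknown]> {
      for (let i = 0; i < 211_000; i++) {
        yield [`key-${i}`, { body: 'some forum post', index: i }]
      }
    }

    for (const [key, value] of records()) {
      // Each key and value is packed and appended separately; under this
      // pattern, memory usage reportedly keeps growing until Node.js
      // exhausts its heap.
      appendFileSync('out.msgpack', encode(key))
      appendFileSync('out.msgpack', encode(value))
    }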

Why not BON?

BON does not support parsing from a file or stream.

The current implementation of BON requires the complete binary data to be passed into the decode() or decode_all() function.

Also, the data schema of BON does not specify the length of lists and strings, which keeps the resulting binary data compact. However, without this information the decoder cannot precisely fetch the right number of bytes from a file or readable stream.

As a result, BON cannot support continuous decoding from a large file or long stream.

How does this library work?

This library reuses its buffer across multiple encoding calls, effectively object pooling, but for buffers. A simplified sketch of the idea follows.
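
This is a self-contained illustration of the technique, not the library's actual internals:

    // A single scratch buffer reused across encode calls, grown only when
    // a value needs more room; like object pooling, but for Buffers.
    let scratch = Buffer.alloc(64 * 1024)

    function encodeString(value: string): Buffer {
      const needed = Buffer.byteLength(value, 'utf8')
      if (needed > scratch.length) {
        scratch = Buffer.alloc(needed) // grow once; later calls reuse it
      }
      const written = scratch.write(value, 0, 'utf8')
      // Hand out a view over the scratch buffer; the caller must copy or
      // flush it before the next encode call overwrites the contents.
      return scratch.subarray(0, written)
    }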

Also, the length of each data chunk can be determined from its header (data type and length), so the decoder knows exactly how many bytes to read.

In addition, some source types support an *iterator() generator method, which helps decode the binary data iteratively. A sketch combining both ideas follows.
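
Here is a self-contained sketch of header-based framing plus iterative decoding. The one-byte type tag and 4-byte little-endian length are illustrative choices, not the library's actual wire format:

    const TYPE_STRING = 1

    // Encode: [type: u8][length: u32 LE][payload]
    function frameString(value: string): Buffer {
      const payload = Buffer.from(value, 'utf8')
      const chunk = Buffer.alloc(5 + payload.length)
      chunk.writeUInt8(TYPE_STRING, 0)
      chunk.writeUInt32LE(payload.length, 1)
      payload.copy(chunk, 5)
      return chunk
    }

    // Decode iteratively: the header says exactly how many bytes to take,
    // so the generator never over-reads.
    function* iterator(data: Buffer): Generator<string> {
      let offset = 0
      while (offset < data.length) {
        const type = data.readUInt8(offset)
        const length = data.readUInt32LE(offset + 1)
        if (type !== TYPE_STRING) throw new Error('unknown type ' + type)
        yield data.toString('utf8', offset + 5, offset + 5 + length)
        offset += 5 + length
      }
    }

    const data = Buffer.concat([frameString('hello'), frameString('world')])
    console.log([...iterator(data)]) // prints [ 'hello', 'world' ]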

This library implements the I/O and encode/decode steps as pipes, namely Sink and Source.

Multiple sinks or sources can be stacked together in a flexible manner, e.g.:

    const sink = new SchemaSink(new BinaryJsonSink(FileSink.fromFile('db.log')))
    const source = new SchemaSource(new BinaryJsonSource(FileSource.fromFile('db.log')))
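
Once stacked, you interact with the outermost pipe. Continuing the snippet above, a hypothetical usage (the write()/close() and iterator() method names are assumptions, matching the earlier sketch):

    // Write objects through the whole pipe: schema > binary JSON > file.
    sink.write({ user: 'alice', text: 'hello' })
    sink.close()

    // Read them back, with each layer decoding in reverse.
    for (const object of source.iterator()) {
      console.log(object)
    }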

Or you can use the pre-defined factory functions to construct the pipes:

    const sink = BinaryObjectFileSink.fromFile('db.log')
    const source = BinaryJsonFileSource.fromFile('db.log')

Does this work?

The correctness is covered by tests, and they pass.

Comprehensive benchmarking is not done yet; some initial numbers are in the next section.

Combination & Performance

Sample Data

266,430 sample JSON records crawled from an online forum.

Total size: 843M.

The objects have a consistent shape.

Some data are duplicated, e.g. user names and some common comments.

Benchmarks

The sample data are read line by line, instead of being loaded into memory all at once, to avoid out-of-memory errors.

Mode                                  | Pipeline                        | Size | Write time | Read time
--------------------------------------|---------------------------------|------|------------|----------
high-write-speed                      | data > json > raw-line-file     | 843M | 38.07s     | 38.74s
high-read-speed                       | data > binary-json > file       | 843M | 42.95s     | 30.02s
high-read-speed-disk-space-efficient  | data > schema > msgpack > file  | 640M | 51.80s     | 18.34s
more-disk-space-efficient             | data > unique-value > line-file | 506M | 151.64s    | 59.26s

The more-disk-space-efficient numbers are estimated from a 50% sample to avoid out-of-memory errors.

LICENSE

BSD-2-Clause (Free and Open Source Software)