ozonep-bson-to-json
v1.3.4
Direct BSON to JSON transcoder.
Directly and quickly converts a BSON buffer to a JSON string stored in a Buffer. Useful for sending MongoDB database query results to a client over JSON+HTTP.
Benchmark with a ~2500-element array of medium objects (9MB BSON):
| Method | Time (ms) |
| ------ | --------: |
| `JSON.stringify(BSON.deserialize(arr))`<sup>1</sup> | 226.0 |
| this, JS | 39.7 |
| this, portable C++ | 20.6 |
| this, SSE2 | 15.2 |
| this, SSE4.2 | 11.5 |
| this, AVX2 | 10.6 |
<sup>1</sup> `BSON.deserialize` is the official MongoDB js-bson implementation.
Installation
The C++ implementations require a C++ compiler. See instructions here. If you do not have a C++ compiler, the slower JS version will be used.
yarn add zbjornson/bson-to-json
# or
npm install zbjornson/bson-to-json
Usage
bsonToJson
const {bsonToJson} = require("bson-to-json");
bsonToJson(bson: Uint8Array, isArray?: boolean = true): Buffer
// (note that Buffers extend Uint8Arrays, so `bson` can be a Buffer)
Transcodes a BSON document to a JSON string stored in a Buffer.
`isArray` specifies whether the input is an array. BSON doesn't differentiate between arrays and objects at the top level, so this must be provided if `bson` is an array.
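To illustrate why the flag is needed, the sketch below hand-encodes a tiny BSON document following the BSON specification, in which an array is stored as a document whose keys are `"0"`, `"1"`, and so on. `encodeInt32List` is a hypothetical helper written for this illustration (it is not part of this module) and only supports int32 values.

```javascript
// Hypothetical helper: encode int32 values under the keys "0", "1", ...
function encodeInt32List(values) {
  const parts = [];
  values.forEach((v, i) => {
    const key = Buffer.from(String(i) + "\0", "latin1"); // NUL-terminated key
    const el = Buffer.alloc(1 + key.length + 4);
    el[0] = 0x10;                        // BSON type tag for int32
    key.copy(el, 1);
    el.writeInt32LE(v, 1 + key.length);  // little-endian value
    parts.push(el);
  });
  const body = Buffer.concat(parts);
  // int32 total length, then the elements, then a trailing 0x00
  const doc = Buffer.alloc(4 + body.length + 1);
  doc.writeInt32LE(doc.length, 0);
  body.copy(doc, 4);
  return doc;
}

const bytes = encodeInt32List([7, 8]);
console.log(bytes.toString("hex"));
// These same bytes are a valid encoding of both [7, 8] and {"0": 7, "1": 8}.
// Nothing in the buffer says which one was meant, hence the `isArray` argument.
```

With the module installed, the buffer could be passed as `bsonToJson(bytes, true)` or `bsonToJson(bytes, false)` to choose between the two interpretations.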
The output should be identical to `JSON.stringify(BSON.deserialize(v))`, with two exceptions:
- This module writes full-precision (64-bit signed) BSON Longs to the JSON buffer. This is valid because JSON does not specify a maximum numeric precision, but js-bson instead writes an object with low and high bits.
- This module performs stricter input bounds checking than js-bson, so it may throw different errors. (js-bson seems to rely, intentionally or not, on indexing past the end of a typed array returning `undefined`.)
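A practical consequence of the first exception is that a JavaScript consumer using plain `JSON.parse` will round large Longs, because they are parsed as Numbers. The sketch below shows the effect and one possible way to recover the exact value. The regular-expression extraction and the `longFromParts` helper are illustrative assumptions, not part of this module.

```javascript
// JSON text of the kind described above for a BSON Long (here, the Int64 maximum).
const json = '{"n":9223372036854775807}';

// JSON.parse yields a Number, which cannot hold this value exactly.
const asNumber = JSON.parse(json).n;
console.log(Number.isSafeInteger(asNumber)); // false

// Reading the digits as a BigInt preserves every bit.
const exact = BigInt(/"n":(-?\d+)/.exec(json)[1]);
console.log(exact === 9223372036854775807n); // true

// Hypothetical helper: recombine the low/high 32-bit halves that the
// js-bson style output carries into one signed 64-bit BigInt.
function longFromParts(low, high) {
  return BigInt.asIntN(64, (BigInt(high) << 32n) | BigInt(low >>> 0));
}
console.log(longFromParts(-1, 2147483647) === exact); // true
```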
send
const {send} = require("bson-to-json");
send(cursor: MongoDbCursor, ostr: Stream.Writable): Promise<void>
Efficiently sends the contents of a MongoDB cursor to a writable stream (e.g. an HTTP response). The returned Promise resolves when the cursor is drained, or rejects in case of an error.
Example usage in an HTTP handler
const {send} = require("bson-to-json");
async function handler(req, res) {
  // `raw: true` asks the driver for unparsed BSON buffers instead of JS objects
  const cursor = await db.collection("mycol").find({name: "Zach"}, {raw: true});
  res.setHeader("Content-Type", "application/json");
  await send(cursor, res);
}
This is the fastest way to transfer results from MongoDB to a client. MongoDB's `cursor.forEach` and `for await (const doc of cursor)` both have much higher CPU and memory overhead.
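For readers who want a mental model of what such a function has to do, here is a minimal sketch of streaming already-transcoded documents to a writable stream while respecting backpressure. It is not this module's implementation: `sendSketch`, the `transcode` parameter (a stand-in for something like `bsonToJson`), the JSON-array framing with `[`, `,` and `]`, and ending the stream are all assumptions made for illustration. It also uses `for await` for readability, which is exactly the per-document overhead the real `send` avoids.

```javascript
const { once } = require("events");
const { PassThrough } = require("stream");

// Hypothetical sketch: write each transcoded document, comma-separated,
// inside a JSON array, pausing whenever the stream signals backpressure.
async function sendSketch(rawDocs, ostr, transcode) {
  let first = true;
  ostr.write("[");
  for await (const raw of rawDocs) {
    if (!first) ostr.write(",");
    first = false;
    // write() returns false when the internal buffer is full; wait for "drain".
    if (!ostr.write(transcode(raw))) await once(ostr, "drain");
  }
  ostr.end("]");
}

// Demo with a fake cursor and an in-memory sink instead of MongoDB and HTTP.
async function demo() {
  const sink = new PassThrough();
  const chunks = [];
  sink.on("data", (c) => chunks.push(c));
  const ended = once(sink, "end");
  async function* fakeCursor() {
    yield Buffer.from('{"a":1}');
    yield Buffer.from('{"a":2}');
  }
  // The identity function stands in for a real BSON-to-JSON transcoder.
  await sendSketch(fakeCursor(), sink, (raw) => raw);
  await ended;
  return Buffer.concat(chunks).toString();
}

demo().then((json) => console.log(json)); // [{"a":1},{"a":2}]
```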
ISE
const {ISE} = require("bson-to-json");
ISE: string
A constant indicating which instruction set extension is in use (based on your CPU's available features). One of `"AVX512"`, `"AVX2"`, `"SSE4.2"`, `"SSE2"`, `"Baseline"` (portable C++) or `"JavaScript"`.
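One possible use of this constant is a startup check that warns when the slower JavaScript fallback is active. The sketch below is an illustration only: `isNativeBuild` is a hypothetical helper, and a literal stand-in value replaces the real import so the snippet runs without the module installed.

```javascript
// The values listed above.
const KNOWN_ISE = ["AVX512", "AVX2", "SSE4.2", "SSE2", "Baseline", "JavaScript"];

// Hypothetical helper: true when a compiled (non-JavaScript) build is active.
function isNativeBuild(ise) {
  if (!KNOWN_ISE.includes(ise)) throw new RangeError(`Unknown ISE value: ${ise}`);
  return ise !== "JavaScript";
}

// In an application this would be: const {ISE} = require("bson-to-json");
const ISE = "JavaScript"; // stand-in value for this sketch

if (!isNativeBuild(ISE)) {
  console.warn("Using the JavaScript fallback; check that a C++ compiler was available at install time.");
}
```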
Performance notes
Major reasons it's fast
- Direct UTF8 to JSON-escaped string transcoding.
- No wasted temporary objects created for the GC to clean up.
- SSE2, SSE4.2 or AVX2-accelerated JSON string escaping.
- AVX2-accelerated ObjectId hex string encoding, using the technique from zbjornson/fast-hex.
- Fast integer encoding, using the method from fmtlib/fmt.
- Fast double encoding, using the same double-conversion library used in V8.
- Skips decoding array keys (which BSON stores as ASCII numbers) and instead advances by the known number of bytes in the key.
- The `send` method has a tight call stack and avoids allocating a Promise for each document (compared to `for await...of`).
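The array-key point in the list above can be sketched in plain JavaScript. Because the key of array element `i` is the decimal text of `i` followed by a NUL byte, a reader that trusts the input to be a well-formed array can compute the key length from the index instead of scanning or parsing it. The function names below are hypothetical and the sketch handles only int32 elements; it is not the module's actual code.

```javascript
// Bytes occupied by the key of array element `index`:
// its decimal digits plus the NUL terminator.
function arrayKeyByteLength(index) {
  let digits = 1;
  for (let n = index; n >= 10; n = Math.floor(n / 10)) digits++;
  return digits + 1;
}

// Walk a BSON array of int32 elements (given as a Node Buffer) and collect
// the values without ever reading a key.
function readInt32Array(bson) {
  const out = [];
  let pos = 4; // skip the int32 document length
  for (let i = 0; bson[pos] !== 0; i++) { // 0x00 terminates the document
    if (bson[pos] !== 0x10) throw new TypeError("this sketch handles int32 only");
    pos += 1 + arrayKeyByteLength(i); // type tag + key, skipped by arithmetic
    out.push(bson.readInt32LE(pos));
    pos += 4;
  }
  return out;
}

// BSON bytes for the array [7, 8]
const sample = Buffer.from("13000000103000070000001031000800000000", "hex");
console.log(readInt32Array(sample)); // [ 7, 8 ]
```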
Benchmarks by BSON type (ops/sec):
| Type | js-bson | this, JS | this, CPP (AVX2) |
| ---- | ---: | ---: | ---: |
| long | 1,760 | 1,236 | 28,031 |
| int | 1,503 | 1,371 | 17,264 |
| ObjectId | 1,048 | 13,322 | 37,079 |
| date | 445 | 663 | 10,686 |
| number | 730 | 1,228 | 1,929 |
| boolean | 444 | 4,839 | 9,283 |
| null | 482 | 7,487 | 14,709 |
| `string<len=1000, esc=0.00>`<sup>1</sup> | 12,304 | 781 | 55,502 |
| `string<len=1000, esc=0.01>` | 12,720 | 748 | 56,145 |
| `string<len=1000, esc=0.05>` | 12,320 | 756 | 43,867 |
<sup>1</sup> String transcoding performance depends on the length of the string (`len`) and the number of characters that must be escaped in the JSON output (`esc`, a fraction from 0 to 1).
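To make the `esc` parameter concrete, here is a scalar sketch of escaping UTF-8 bytes straight into a JSON string literal without first decoding them into a JavaScript string. It is a simplified stand-in for the SIMD-accelerated routine, not the module's code: both function names are hypothetical, every control character is written as `\u00XX`, and `escapeFraction` counts bytes rather than characters.

```javascript
// Escape a UTF-8 byte sequence as a quoted JSON string, byte by byte.
function escapeUtf8ToJson(utf8) {
  const HEX = "0123456789abcdef";
  // Worst case: every byte becomes a 6-byte \u00XX escape, plus two quotes.
  const out = Buffer.allocUnsafe(utf8.length * 6 + 2);
  let o = 0;
  out[o++] = 0x22; // opening quote
  for (const b of utf8) {
    if (b === 0x22 || b === 0x5c) { // '"' or '\' get a backslash prefix
      out[o++] = 0x5c;
      out[o++] = b;
    } else if (b < 0x20) { // control characters
      o += out.write("\\u00" + HEX[b >> 4] + HEX[b & 15], o, "latin1");
    } else {
      out[o++] = b; // copied as-is, including UTF-8 multibyte bytes (>= 0x80)
    }
  }
  out[o++] = 0x22; // closing quote
  return out.subarray(0, o);
}

// Fraction of input bytes that need escaping (a byte-based reading of `esc`).
function escapeFraction(utf8) {
  let n = 0;
  for (const b of utf8) if (b < 0x20 || b === 0x22 || b === 0x5c) n++;
  return utf8.length ? n / utf8.length : 0;
}

const text = 'he said "hi"\n\\ é';
const escaped = escapeUtf8ToJson(Buffer.from(text, "utf8"));
console.log(JSON.parse(escaped.toString()) === text); // true
```

The fast path is the final branch: when `esc` is low, almost every byte is copied unchanged, which is the case that vectorized scanning speeds up.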
Future Plans
- Iterator-based (streaming) interface. It's mostly working in the C++ version, but crashes on GC. See documentation in the `iterator` branch. I also experimented with C++20 coroutines in the `coroutines` branch.
- Drop `long` dependency when Node 10 support is dropped.
- Consider adding an option to prepend a comma to the output so it can be used with MongoDB cursors more efficiently.