@vascosantos/carbites

v0.8.0

carbites

Chunking for CAR files. Split a single CAR into multiple CARs.

Install

npm install carbites

Usage

Carbites supports 3 different strategies:

  1. Simple (default) - fast but naive: only the first CAR output has a root CID; subsequent CARs have a placeholder "empty" CID.
  2. Rooted - like simple, but creates a custom root node to ensure all blocks in a CAR are referenced.
  3. Treewalk - walks the DAG to pack sub-graphs into each CAR file that is output. Every CAR has the same root CID, but contains a different portion of the DAG.
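
To make the simple strategy concrete, here is a minimal sketch of size-based packing in plain JavaScript. It is an illustration only, not the carbites implementation: the `packBlocks` function and the `{ cid, bytes }` block shape are assumptions made for this example, and real CAR framing (headers and length prefixes) is ignored.

```javascript
// Conceptual sketch only: greedily pack blocks into chunks of roughly
// `targetSize` bytes. `packBlocks` and the { cid, bytes } shape are
// hypothetical names for this example, not part of the carbites API.
function packBlocks (blocks, targetSize) {
  const chunks = []
  let current = []
  let currentSize = 0
  for (const block of blocks) {
    const size = block.bytes.length
    // Start a new chunk when adding this block would exceed the target,
    // but never emit an empty chunk (an oversized block gets its own chunk).
    if (current.length > 0 && currentSize + size > targetSize) {
      chunks.push(current)
      current = []
      currentSize = 0
    }
    current.push(block)
    currentSize += size
  }
  if (current.length > 0) chunks.push(current)
  return chunks
}

const blocks = [
  { cid: 'bafy1', bytes: new Uint8Array(60) },
  { cid: 'bafy2', bytes: new Uint8Array(30) },
  { cid: 'bafy3', bytes: new Uint8Array(50) },
  { cid: 'bafy4', bytes: new Uint8Array(120) }
]
const chunks = packBlocks(blocks, 100)
console.log(chunks.map(chunk => chunk.map(b => b.cid)))
// [ [ 'bafy1', 'bafy2' ], [ 'bafy3' ], [ 'bafy4' ] ]
```

The target is approximate by design: a block larger than `targetSize` still ends up in a chunk of its own, which is why the usage examples below speak of "~100MB" CARs.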

Simple

import { CarSplitter } from 'carbites'
import { CarReader } from '@ipld/car'
import fs from 'fs'

const bigCar = await CarReader.fromIterable(fs.createReadStream('/path/to/big.car'))
const targetSize = 1024 * 1024 * 100 // chunk to ~100MB CARs
const splitter = new CarSplitter(bigCar, targetSize) // (simple strategy)

for await (const car of splitter.cars()) {
  // Each `car` is an AsyncIterable<Uint8Array>
}

⚠️ Note: The first CAR output has the roots in its header; subsequent CARs have the empty root CID bafkqaaa, as recommended.
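
Each `car` yielded by the splitter is an `AsyncIterable<Uint8Array>`, so it has to be drained before its bytes can be stored or uploaded. The following is a minimal sketch of one way to do that by buffering in memory. `collect` is a hypothetical helper written for this example, not part of carbites, and the `fakeCar` generator stands in for a real split CAR.

```javascript
// Hypothetical helper: drain an AsyncIterable<Uint8Array> into one Uint8Array.
async function collect (iterable) {
  const parts = []
  let length = 0
  for await (const part of iterable) {
    parts.push(part)
    length += part.length
  }
  // Copy every part into a single contiguous buffer.
  const out = new Uint8Array(length)
  let offset = 0
  for (const part of parts) {
    out.set(part, offset)
    offset += part.length
  }
  return out
}

// Stand-in for one `car` from `splitter.cars()`.
async function * fakeCar () {
  yield new Uint8Array([1, 2, 3])
  yield new Uint8Array([4, 5])
}

collect(fakeCar()).then(bytes => console.log(bytes.length)) // 5
```

Buffering keeps a whole chunk in memory, which is usually acceptable at the ~100MB target used above; for much larger targets, streaming each part to its destination as it arrives is likely the better choice.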

Rooted

Instead of an empty CID, carbites can generate a special root node for each split CAR that references all the blocks and the original roots (only in the first CAR). To do this, use the RootedCarSplitter constructor. When reading/extracting data from the CARs, the root node should be discarded.

import { RootedCarSplitter } from 'carbites/rooted'
import { CarReader } from '@ipld/car/reader'
import * as dagCbor from '@ipld/dag-cbor'
import fs from 'fs'

const bigCar = await CarReader.fromIterable(fs.createReadStream('/path/to/big.car'))
const targetSize = 1024 * 1024 * 100 // chunk to ~100MB CARs
const splitter = new RootedCarSplitter(bigCar, targetSize)

const cars = splitter.cars()

// Every CAR has a single root - a CBOR node that is a tuple of `/carbites/1`,
// an array of root CIDs and an array of block CIDs.
// e.g. ['/carbites/1', ['bafkroot'], ['bafy1', 'bafy2']]

const { done, value: car } = await cars.next()
const reader = await CarReader.fromIterable(car)
const rootCids = await reader.getRoots()
const rootNode = dagCbor.decode(await reader.get(rootCids[0]))

console.log(rootNode[0]) // /carbites/1
console.log(rootNode[1]) // Root CIDs (only in first CAR)
/*
[
  CID(bafybeictvyf6polqzgop3jt32owubfmsg3kl226omqrfte4eyidubc4rpq)
]
*/
console.log(rootNode[2]) // Block CIDs (all blocks in this CAR)
/*
[
  CID(bafybeictvyf6polqzgop3jt32owubfmsg3kl226omqrfte4eyidubc4rpq),
  CID(bafyreihcsxqhd6agqpboc3wrlvpy5bwuxctv5upicdnt3u2wojv4exxl24),
  CID(bafyreiasq7d2ihbqm5xvhjjzlmzsensuadrpmpt2tkjsuwq42xpa34qevu)
]
*/

The root node is limited to 4MB in size (the largest message IPFS will bitswap). Depending on the settings used to construct the DAG in the CAR, this may mean a split CAR size limit of around 30GiB.
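
A figure of roughly this size can be reproduced with a back-of-the-envelope estimate. The constants below are assumptions for illustration, not values taken from carbites: a 36-byte binary CIDv1 with a sha2-256 multihash, 256KiB blocks throughout the CAR (a common IPFS chunker default), and no allowance for CBOR framing overhead.

```javascript
// Rough estimate; all constants are assumptions for illustration.
const maxRootNodeBytes = 4 * 1024 * 1024 // 4MB root node limit
// CIDv1 + sha2-256: 1 (version) + 1 (codec) + 2 (hash header) + 32 (digest)
const cidBytes = 36
const blockBytes = 256 * 1024 // assumed size of every block in the CAR

// How many block CIDs fit in the root node, and how much data they cover.
const maxCids = Math.floor(maxRootNodeBytes / cidBytes)
const maxCarBytes = maxCids * blockBytes
const maxCarGiB = maxCarBytes / 1024 ** 3

console.log(maxCids) // 116508
console.log(maxCarGiB.toFixed(1)) // 28.4
```

Under these assumptions the limit lands a little under 30GiB. Smaller blocks or per-CID encoding overhead lower it, and larger blocks raise it, which is why the limit depends on how the DAG was built.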

Treewalk

Every CAR file has the same root CID but a different portion of the DAG. The DAG is traversed from the root node and each block is decoded and links extracted in order to determine which sub-graph to include in each CAR.

import { TreewalkCarSplitter } from 'carbites/treewalk'
import { CarReader } from '@ipld/car/reader'
import fs from 'fs'

const bigCar = await CarReader.fromIterable(fs.createReadStream('/path/to/big.car'))
const [rootCid] = await bigCar.getRoots()
const targetSize = 1024 * 1024 * 100 // chunk to ~100MB CARs
const splitter = new TreewalkCarSplitter(bigCar, targetSize)

for await (const car of splitter.cars()) {
  // Each `car` is an AsyncIterable<Uint8Array>
  const reader = await CarReader.fromIterable(car)
  const [splitCarRootCid] = await reader.getRoots()
  console.assert(rootCid.equals(splitCarRootCid)) // all cars will have the same root
}
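
The idea of giving every CAR the same root but a different portion of the DAG can be sketched without any IPLD dependencies. This is a conceptual illustration only and should not be read as the carbites algorithm: the in-memory `dag` map, the `walk` generator and `treewalkChunks` are all hypothetical, and the sketch assumes a tree in which no block is linked from more than one parent.

```javascript
// Conceptual sketch only, not the carbites implementation. The DAG is a
// hypothetical in-memory tree: id -> { size, links }.
const dag = new Map([
  ['root', { size: 10, links: ['a', 'b'] }],
  ['a', { size: 10, links: ['a1', 'a2'] }],
  ['a1', { size: 40, links: [] }],
  ['a2', { size: 40, links: [] }],
  ['b', { size: 10, links: ['b1', 'b2'] }],
  ['b1', { size: 40, links: [] }],
  ['b2', { size: 40, links: [] }]
])

// Depth-first walk that also reports the path of ancestors from the root.
function * walk (dag, id, path = []) {
  const node = dag.get(id)
  yield { id, size: node.size, path }
  for (const link of node.links) yield * walk(dag, link, [...path, id])
}

function treewalkChunks (dag, rootId, targetSize) {
  const chunks = []
  let current = new Set()
  let currentSize = 0
  let fresh = 0 // blocks in this chunk that are not re-included ancestors

  const add = id => {
    if (current.has(id)) return false
    current.add(id)
    currentSize += dag.get(id).size
    return true
  }

  for (const { id, size, path } of walk(dag, rootId)) {
    if (fresh > 0 && currentSize + size > targetSize) {
      chunks.push([...current])
      current = new Set()
      currentSize = 0
      fresh = 0
      // Re-include the ancestors so the new chunk is still rooted at `rootId`.
      for (const ancestor of path) add(ancestor)
    }
    if (add(id)) fresh++
  }
  if (fresh > 0) chunks.push([...current])
  return chunks
}

console.log(treewalkChunks(dag, 'root', 100))
// [ [ 'root', 'a', 'a1', 'a2' ], [ 'root', 'b', 'b1', 'b2' ] ]
```

The trade-off shown here is that ancestor blocks are repeated across chunks, so the chunks together are somewhat larger than the original, in exchange for each chunk being a connected sub-graph under the shared root.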

CLI

npm i -g carbites

# Split a big CAR into many smaller CARs
carbites split big.car --size 100MB --strategy simple # (default size & strategy)

# Join many split CARs back into a single CAR.
carbites join big-0.car big-1.car ...
# Note: not a tool for joining arbitrary CARs together! The split CARs MUST
# belong to the same CAR and big-0.car should be the first argument.

API

class CarSplitter

Split a CAR file into several smaller CAR files.

Import in the browser:

import { CarSplitter } from 'https://cdn.skypack.dev/carbites'

Import in Node.js:

import { CarSplitter } from 'carbites'

Note: This is an alias of SimpleCarSplitter - the default strategy for splitting CARs.

constructor(car: CarReader, targetSize: number)

Create a new CarSplitter for the passed CAR file, aiming to generate CARs of around targetSize bytes in size.

cars(): AsyncGenerator<AsyncIterable<Uint8Array> & RootsReader>

Split the CAR file and create multiple smaller CAR files. Returns an AsyncGenerator that yields the split CAR files (of type AsyncIterable<Uint8Array>).

The CAR files output also implement the RootsReader interface from @ipld/car which means you can call getRoots(): Promise<CID[]> to obtain the root CIDs.
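
Since each output is both an `AsyncIterable<Uint8Array>` and a `RootsReader`, code that consumes it can be written against that shape alone. Below is a minimal sketch using a stand-in object instead of a real split CAR. `makeFakeCar` and `describeCar` are hypothetical names for this example, and the roots are plain strings here rather than the `CID` objects a real CAR returns.

```javascript
// Hypothetical stand-in with the same shape as one split CAR:
// AsyncIterable<Uint8Array> plus getRoots(). Real roots are CID objects;
// strings are used here only to keep the example dependency-free.
function makeFakeCar (roots, parts) {
  return {
    async getRoots () { return roots },
    async * [Symbol.asyncIterator] () { yield * parts }
  }
}

// Reads the roots and counts the bytes of any object with that shape.
async function describeCar (car) {
  const roots = await car.getRoots()
  let size = 0
  for await (const part of car) size += part.length
  return { roots, size }
}

const fake = makeFakeCar(['bafkroot'], [new Uint8Array(3), new Uint8Array(2)])
describeCar(fake).then(info => console.log(info)) // { roots: [ 'bafkroot' ], size: 5 }
```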

static async fromBlob(blob: Blob, targetSize: number): Promise<CarSplitter>

Convenience function to create a new CarSplitter from a blob of CAR file data.

static async fromIterable(iterable: AsyncIterable<Uint8Array>, targetSize: number): Promise<CarSplitter>

Convenience function to create a new CarSplitter from an AsyncIterable<Uint8Array> of CAR file data.

class CarJoiner

Join together split CAR files into a single big CAR.

Import in the browser:

import { CarJoiner } from 'https://cdn.skypack.dev/carbites'

Import in Node.js:

import { CarJoiner } from 'carbites'

Note: This is an alias of SimpleCarJoiner - a joiner for the default CAR splitting strategy.

constructor(cars: Iterable<CarReader>)

Create a new CarJoiner for joining the passed CAR files together.

car(): AsyncGenerator<Uint8Array>

Join the CAR files together and return the joined CAR.

class RootedCarSplitter

Split a CAR file into several smaller CAR files ensuring every CAR file contains a single root node that references all the blocks and the original roots (only in the first CAR). When reading/extracting data from the CARs, the root node should be discarded.

Import in the browser:

import { RootedCarSplitter } from 'https://cdn.skypack.dev/carbites/rooted'

Import in Node.js:

import { RootedCarSplitter } from 'carbites/rooted'

The API is the same as for CarSplitter.

Root Node Format

The root node is a dag-cbor node that is a tuple of the string /carbites/1, an array of root CIDs (only seen in first CAR) and an array of block CIDs (all the blocks in the CAR). e.g. ['/carbites/1', ['bafkroot'], ['bafy1', 'bafy2']].

Note: The root node is limited to 4MB in size (the largest message IPFS will bitswap). Depending on the settings used to construct the DAG in the CAR, this may mean a split CAR size limit of around 30GiB.
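
The tuple format described above is simple enough to build and check by hand once a node has been decoded. The following is a minimal sketch that works on the decoded value only. `buildRootNode` and `parseRootNode` are hypothetical helpers, not carbites exports, and the CIDs are plain strings here, whereas a node decoded with dag-cbor holds `CID` objects.

```javascript
const VERSION = '/carbites/1'

// Hypothetical helper: assemble the [version, roots, blocks] tuple.
function buildRootNode (roots, blocks) {
  return [VERSION, roots, blocks]
}

// Hypothetical helper: validate a decoded root node and unpack its fields.
function parseRootNode (node) {
  if (!Array.isArray(node) || node.length !== 3) {
    throw new Error('not a carbites root node')
  }
  const [version, roots, blocks] = node
  if (version !== VERSION) throw new Error(`unexpected version: ${version}`)
  if (!Array.isArray(roots) || !Array.isArray(blocks)) {
    throw new Error('roots and blocks must be arrays')
  }
  return { roots, blocks }
}

const node = buildRootNode(['bafkroot'], ['bafy1', 'bafy2'])
const { roots, blocks } = parseRootNode(node)
console.log(roots) // [ 'bafkroot' ]
console.log(blocks) // [ 'bafy1', 'bafy2' ]
```

Checking the version string first gives a cheap way to tell a carbites root node apart from ordinary data before discarding it during extraction.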

class RootedCarJoiner

Join together CAR files that were split using RootedCarSplitter.

The API is the same as for CarJoiner.

class TreewalkCarSplitter

Split a CAR file into several smaller CAR files. Every CAR file has the same root CID but a different portion of the DAG. The DAG is traversed from the root node and each block is decoded and links extracted in order to determine which sub-graph to include in each CAR.

Import in the browser:

import { TreewalkCarSplitter } from 'https://cdn.skypack.dev/carbites/treewalk'

Import in Node.js:

import { TreewalkCarSplitter } from 'carbites/treewalk'

The API is the same as for CarSplitter.

class TreewalkCarJoiner

Join together CAR files that were split using TreewalkCarSplitter.

The API is the same as for CarJoiner.

Releasing

You can publish by either running npm publish in the dist directory or using npm run publish.

Contribute

Feel free to dive in! Open an issue or submit PRs.

License

Dual-licensed under MIT + Apache 2.0