@vascosantos/ipfs-unixfs-importer
v0.0.12
JavaScript implementation of the UnixFs importer used by IPFS
ipfs-unixfs-importer
JavaScript implementation of the layout and chunking mechanisms used by IPFS to handle Files
Install
> npm install ipfs-unixfs-importer
Usage
Example
Let's create a little directory to import:
> cd /tmp
> mkdir foo
> echo 'hello' > foo/bar
> echo 'world' > foo/quux
And write the importing logic:
const fs = require('fs')
const { importer } = require('ipfs-unixfs-importer')

// Import the two files created above
const source = [{
  path: '/tmp/foo/bar',
  content: fs.createReadStream('/tmp/foo/bar')
}, {
  path: '/tmp/foo/quux',
  content: fs.createReadStream('/tmp/foo/quux')
}]

// You need to create and pass an IPLD Resolver instance
// https://github.com/ipld/js-ipld-resolver
// (for await must run inside an async function)
for await (const entry of importer(source, ipld, options)) {
  console.info(entry)
}
When run, metadata about DAGNodes in the created tree is printed until the root:
{
cid: CID, // see https://github.com/multiformats/js-cid
path: 'tmp/foo/bar',
unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
{
cid: CID, // see https://github.com/multiformats/js-cid
path: 'tmp/foo/quux',
unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
{
cid: CID, // see https://github.com/multiformats/js-cid
path: 'tmp/foo',
unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
{
cid: CID, // see https://github.com/multiformats/js-cid
path: 'tmp',
unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
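In the example above the root is the last entry yielded, so a caller that only needs the root can drain the iterator and keep the final value. The helper below is a minimal sketch of that idea; lastEntry is a hypothetical name, not part of this module, and it works with any async iterable.

```javascript
// Hypothetical helper (not part of ipfs-unixfs-importer):
// drains an async iterable and returns the final value it yielded,
// or undefined if nothing was yielded.
async function lastEntry (iterable) {
  let last
  for await (const entry of iterable) {
    last = entry
  }
  return last
}

// Usage sketch, inside an async function:
//   const root = await lastEntry(importer(source, ipld, options))
//   console.info(root.cid)
```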
API
const { importer } = require('ipfs-unixfs-importer')
const stream = importer(source, ipld [, options])
The importer function returns an async iterator. It takes a source async iterator that yields objects of the form:
{
path: 'a name',
content: (Buffer or iterator emitting Buffers),
mtime: (Number representing seconds since (positive) or before (negative) the Unix Epoch),
mode: (Number representing ugo-rwx, setuid, setgid and sticky bit)
}
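The source does not have to come from disk. As a rough sketch, the async generator below yields entries of this form from in-memory data. The name inMemorySource and the sample mtime and mode values are made up for illustration.

```javascript
// Hypothetical source: yields { path, content, mtime, mode } entries
// built from a plain object that maps paths to strings.
async function * inMemorySource (files) {
  for (const [path, text] of Object.entries(files)) {
    yield {
      path,
      content: Buffer.from(text), // content may be a Buffer
      mtime: 1577836800, // seconds since the Unix Epoch (2020-01-01T00:00:00Z)
      mode: 0o644 // rw-r--r--
    }
  }
}

// Usage sketch, inside an async function:
//   const source = inMemorySource({ 'foo/bar': 'hello\n', 'foo/quux': 'world\n' })
//   for await (const entry of importer(source, ipld, options)) { ... }
```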
stream will output file info objects as files get stored in IPFS. When stats on a node are emitted, they are guaranteed to have been written.
ipld is an instance of the IPLD Resolver.
The input's file paths and directory structure will be preserved in the created dag-pb nodes.
options is a JavaScript object that might include the following keys:
- wrapWithDirectory (boolean, defaults to false): if true, a wrapping node will be created
- shardSplitThreshold (positive integer, defaults to 1000): the number of directory entries above which we decide to use a sharding directory builder (instead of the default flat one)
- chunker (string, defaults to "fixed"): the chunking strategy. Supports:
  - fixed
  - rabin
- avgChunkSize (positive integer, defaults to 262144): the average chunk size (rabin chunker only)
- minChunkSize (positive integer): the minimum chunk size (rabin chunker only)
- maxChunkSize (positive integer, defaults to 262144): the maximum chunk size
- strategy (string, defaults to "balanced"): the DAG builder strategy name. Supports:
  - flat: flat list of chunks
  - balanced: builds a balanced tree
  - trickle: builds a trickle tree
- maxChildrenPerNode (positive integer, defaults to 174): the maximum children per node for the balanced and trickle DAG builder strategies
- layerRepeat (positive integer, defaults to 4): the maximum repetition of parent nodes for each layer of the tree (only applicable to the trickle DAG builder strategy)
- reduceSingleLeafToSelf (boolean, defaults to true): optimization that, when reducing a set of nodes containing a single node, reduces it to that node
- hamtHashFn (async function(string) Buffer): a function that hashes file names to create HAMT shards
- hamtBucketBits (positive integer, defaults to 8): the number of bits at each bucket of the HAMT
- progress (function): a function that will be called with the byte length of chunks as a file is added to ipfs
- onlyHash (boolean, defaults to false): only chunk and hash, do not write to disk
- hashAlg (string): multihash hashing algorithm to use
- cidVersion (integer, defaults to 0): the CID version to use when storing the data (storage keys are based on the CID, including its version)
- rawLeaves (boolean, defaults to false): when a file would span multiple DAGNodes, if this is true the leaf nodes will not be wrapped in UnixFS protobufs and will instead contain the raw file bytes
- leafType (string, defaults to 'file'): what type of UnixFS node leaves should be, can be 'file' or 'raw' (ignored when rawLeaves is true)
- blockWriteConcurrency (positive integer, defaults to 10): how many blocks to hash and write to the block store concurrently. For small numbers of large files this should be high (e.g. 50).
- fileImportConcurrency (number, defaults to 50): how many files to import concurrently. For large numbers of small files this should be high (e.g. 50).
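As an illustration, the sketch below builds an options object from a few of the keys listed above and uses progress to count the bytes added. The values are arbitrary examples rather than recommendations, and createOptions and state are hypothetical names.

```javascript
// Hypothetical helper: returns an options object plus a small state
// object that the progress callback updates.
function createOptions () {
  const state = { bytesAdded: 0 }
  const options = {
    wrapWithDirectory: true, // create a wrapping node
    strategy: 'trickle', // DAG builder strategy
    rawLeaves: true, // leaves hold raw file bytes
    cidVersion: 1,
    onlyHash: true, // only chunk and hash, do not write to disk
    // called with the byte length of each chunk as a file is added
    progress: (chunkLength) => {
      state.bytesAdded += chunkLength
    }
  }
  return { options, state }
}

// Usage sketch, inside an async function:
//   const { options, state } = createOptions()
//   for await (const entry of importer(source, ipld, options)) { ... }
//   console.info(`${state.bytesAdded} bytes processed`)
```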
Overriding internals
Several aspects of the importer are overridable by specifying functions as part of the options object with these keys:
- chunkValidator (function): Optional function that supports the signature async function * (source, options)
  - This function takes input from the content field of imported entries. It should transform them into Buffers, throwing an error if it cannot.
  - It should yield Buffer objects constructed from the source or throw an Error
- chunker (function): Optional function that supports the signature async function * (source, options) where source is an async generator and options is an options object
  - It should yield Buffer objects.
- bufferImporter (function): Optional function that supports the signature async function * (entry, ipld, options)
  - This function should read Buffers from entry.content and persist them using ipld.put or similar
  - entry is the { path, content } entry, where entry.content is an async generator that yields Buffers
  - It should yield functions that return a Promise that resolves to an object with the properties { cid, unixfs, size } where cid is a CID, unixfs is a UnixFS entry and size is a Number that represents the serialized size of the IPLD node that holds the buffer data.
  - Values will be pulled from this generator in parallel - the amount of parallelisation is controlled by the blockWriteConcurrency option (default: 10)
- dagBuilder (function): Optional function that supports the signature async function * (source, ipld, options)
  - This function should read { path, content } entries from source and turn them into DAGs
  - It should yield a function that returns a Promise that resolves to { cid, path, unixfs, node } where cid is a CID, path is a string, unixfs is a UnixFS entry and node is a DAGNode.
  - Values will be pulled from this generator in parallel - the amount of parallelisation is controlled by the fileImportConcurrency option (default: 50)
- treeBuilder (function): Optional function that supports the signature async function * (source, ipld, options)
  - This function should read { cid, path, unixfs, node } entries from source and place them in a directory structure
  - It should yield an object with the properties { cid, path, unixfs, size } where cid is a CID, path is a string, unixfs is a UnixFS entry and size is a Number.
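To make the first two signatures more concrete, here is a sketch of a chunkValidator and a fixed-size chunker written as async generators. Both are simplified illustrations, not the module's built-in implementations. The names validateChunks and fixedSizeChunker are hypothetical, and accepting strings in the validator is an assumption made for this example.

```javascript
// Sketch of a chunkValidator: passes Buffers through, converts strings
// to Buffers and throws on anything else.
async function * validateChunks (source, options) {
  for await (const chunk of source) {
    if (Buffer.isBuffer(chunk)) {
      yield chunk
    } else if (typeof chunk === 'string') {
      yield Buffer.from(chunk)
    } else {
      throw new Error('Content was invalid')
    }
  }
}

// Sketch of a chunker: re-slices incoming Buffers into pieces of
// options.maxChunkSize bytes (262144 if unset); the last piece may be shorter.
async function * fixedSizeChunker (source, options) {
  const size = options.maxChunkSize || 262144
  let pending = Buffer.alloc(0)
  for await (const buf of source) {
    pending = Buffer.concat([pending, buf])
    while (pending.length >= size) {
      yield pending.subarray(0, size)
      pending = pending.subarray(size)
    }
  }
  if (pending.length > 0) {
    yield pending
  }
}

// Usage sketch:
//   importer(source, ipld, {
//     chunkValidator: validateChunks,
//     chunker: fixedSizeChunker,
//     maxChunkSize: 1024
//   })
```

Buffering with Buffer.concat keeps the sketch short; an implementation meant for large files would likely avoid copying the pending bytes on every read.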
Contribute
Feel free to join in. All welcome. Open an issue!
This repository falls under the IPFS Code of Conduct.