untar-csv
v1.0.2
Read a tar of multiple csv files as a stream of objects
untar-csv is a Node.js library for reading data from a tar archive containing multiple similarly structured CSV files as a single stream of objects.
Installation
npm install untar-csv
Usage
const fs = require ('fs')
const zlib = require ('zlib')
const {TarCsvReader} = require ('untar-csv')
const reader = TarCsvReader ({
  // test: entry => entry.name.indexOf ('.csv') > -1,
  // delimiter: ',',
  // skip: 0, // header lines
  // fileNumField: '#', // how to name the file # property
  // rowNumField: '##', // how to name the line # property
  // empty: null,
  columns: ['id', 'name'],
})
fs.createReadStream ('lots-of-data.tar.gz').pipe (zlib.createGunzip ()).pipe (reader)
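// `for await` needs an async context in a CommonJS module
async function main () {
  for await (const {id, name} of reader) {
    // do something with `id` and `name`
  }
}
main ().catch (console.error)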
Options
Most options are effectively passed through to CSVReader; see its documentation for details.
|Name|Default value|Description|
|-|-|-|
|test|({name}) => true|Tar entry filter; the entry structure is described at tar-stream|
|columns| |Array of column definitions|
|delimiter|','|Column delimiter|
|skip|0|Number of header lines to ignore|
|fileNumField|null|The name of the file # property (null for no numbering)|
|rowNumField|null|The name of the line # property (null for no numbering)|
|empty|null|The value corresponding to zero-length cell content|