buffered-csv

v1.0.1

Generate csv files with buffered output.

Downloads

5

buffered-csv

  1. Introduction
  2. Installation
  3. Usage
     3.1 Minimal
     3.2 Buffering output
     3.3 Dynamic filenames
     3.4 All options
  4. Field handling
     4.1 Autodetect
     4.2 Prior specification
  5. API
     5.1 Class: buffered-csv.File
         5.1.1 Event: 'data'
         5.1.2 Event: 'error'
         5.1.3 file.add(data)
         5.1.4 file.complete()
         5.1.5 file.flush()

Introduction

Generating a csv file often involves appending to it line by line as the data is produced. Some use cases benefit from keeping the data in memory until a threshold is reached, at which point all lines in memory are appended to the file at once. This module does just that:

  • Buffering generated csv lines in memory:
    • Writing to file only after the buffer accumulates a specified number of lines.
    • Writing to file at preset intervals.
    • Or traditionally writing line-by-line as data is generated.
  • Dynamic filename generation for each write.
  • Automatic detection of field headers based on buffer data on file creation.

Installation

$ npm install buffered-csv

Usage

Minimal

The following displays the bare minimum usage using all defaults and no buffering. Lines are written to file line-by-line, similar to classic CSV generation.

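const csv = require('buffered-csv');

// No flush options: each added line is written to file immediately.
var file = new csv.File({
    path: 'celebrities.csv'
});
file.add({
    Name: 'Albert Einstein',
    Expertise: 'Relativity'
});
// Birthyear is a new field, first seen on this second line.
file.add({
    Name: 'Galileo Galilei',
    Expertise: 'Gravity',
    Birthyear: 1564
});
file.add({
    Name: 'Shen Kuo',
    Expertise: 'Astronomy'
});
// Flush anything left in the buffer and stop any timers.
file.complete();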

Buffering output

Turning on buffering only requires passing flushing parameters to the constructor, like so:

var file = new csv.File({
    path: 'celebrities.csv',
    flushInterval: 5000,
    flushLines: 50
});

This flushes data from memory to file:

  • every 5000 milliseconds.
  • or as soon as the buffer contains 50 lines.
  • or as soon as file.complete() or file.flush() is called.

In our minimal usage example we only add three lines, fewer than the specified flushLines of 50. We also add them well within 5000ms. If we used these settings in the example, the data would be written to file no earlier than when file.complete() is called.
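
The line-count behaviour described above can be illustrated with a small standalone sketch. This is not buffered-csv's implementation: `LineBuffer`, `sink` and `written` are hypothetical names, and the timer behind flushInterval is left out to keep the example short.

```javascript
// Illustrative sketch only: NOT the buffered-csv source.
// LineBuffer and sink are hypothetical names used for this example.
class LineBuffer {
    constructor(flushLines, sink) {
        this.flushLines = flushLines; // 0 means "write every line immediately"
        this.sink = sink;             // called with an array of lines on each flush
        this.lines = [];
    }
    add(line) {
        this.lines.push(line);
        // Flush when unbuffered, or when the threshold is reached.
        if (this.flushLines === 0 || this.lines.length >= this.flushLines) {
            this.flush();
        }
    }
    flush() {
        if (this.lines.length === 0) return;
        this.sink(this.lines);
        this.lines = [];
    }
    complete() {
        this.flush(); // write whatever is still in memory
    }
}

const written = [];
const buffer = new LineBuffer(50, (lines) => written.push(lines.slice()));
buffer.add('Albert Einstein');
buffer.add('Galileo Galilei');
buffer.add('Shen Kuo');
const writesBeforeComplete = written.length; // threshold of 50 not reached yet
buffer.complete();
const writesAfterComplete = written.length;  // remaining lines flushed together
```

With three lines and a threshold of 50, the sink is not called until `complete()` runs, which mirrors the timing described for the minimal example.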

Dynamic filenames

The location of the output file is given through the path option and may be specified as a string or as a callback function. This allows for the use case where the writes are not only buffered, but also distributed across different files:

var file = new csv.File({
    path: function() {
        return 'celebrities_' + (new Date()).getTime() + '.csv';
    },
    flushInterval: 5000,
    flushLines: 50
});

If headers are enabled, they are written to each file. Buffered-csv maintains header information and field layout across writes and files. Concatenating the resulting output files into a single csv will generate a valid file.
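
A millisecond timestamp yields a new file for nearly every write. As a sketch of a coarser split, the callback could return one filename per calendar day instead. `dailyPath` and `options` are hypothetical names, and the example assumes, as in the snippet above, that the path callback is called without arguments.

```javascript
// Hypothetical helper: builds one filename per UTC calendar day.
function dailyPath(date) {
    const day = date.toISOString().slice(0, 10); // e.g. '2024-01-05'
    return 'celebrities_' + day + '.csv';
}

// Options object in the shape used above; the path callback takes no arguments.
const options = {
    path: function() {
        return dailyPath(new Date());
    },
    flushInterval: 5000,
    flushLines: 50
};
// Assumed usage: var file = new csv.File(options);
```

All writes on the same day then target the same file, and a new file starts at midnight UTC.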

All options

The following is a full list of all options that may be passed to the constructor:

|Option |Default |Description |
|:------------------|:-------------|:------------------------------------------|
|encoding |'utf-8' |Character encoding for the output file. |
|delimeter |',' |The character used to separate values on a single line. |
|quote |'"' |The character used to enclose quoted values. |
|escape |'\' |The escape character to use if the quote character is present in a quoted value. |
|nullValue |'NULL' |A string to insert for fields that have an undefined value. This string is inserted verbatim in the csv output without any quoting or escaping. |
|eol |System default|A string to designate the end of line. Typically '\n' or '\r\n'. |
|headers |true |True if the first line of each output file must contain header information. False otherwise. |
|overwrite |true |True if the output files must be overwritten upon the first write to that file. If dynamic filenames are used each file will be overwritten upon the first write to that file. |
|flushInterval |0 (disabled) |The contents of the csv buffer are written to file after each specified interval in milliseconds. Ignored if 0. * |
|flushLines |0 (disabled) |The contents of the csv buffer are written to file as soon as the buffer contains the specified number of lines. Ignored if 0. * |
|fields |{} |A specification of fields and how to handle them, in the following format: { <name>: { quoted: true \| false } } |

Note (*): If both flushInterval and flushLines are 0, buffered-csv operates in classic mode: lines are written to file immediately when added.
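
To make the quote, escape and nullValue descriptions concrete, here is a rough standalone sketch of how one line could be formatted under those rules. It is not the library's code: `formatLine`, `quotedFlags` and `defaults` are hypothetical names, and the sketch escapes only the quote character because that is all the option descriptions state. The `delimeter` key follows the spelling of the option name.

```javascript
// Illustrative sketch only: NOT the buffered-csv source.
// formatLine is a hypothetical function mirroring the option descriptions.
function formatLine(values, quotedFlags, opts) {
    return values.map(function(value, i) {
        if (value === undefined) {
            return opts.nullValue; // inserted verbatim, never quoted
        }
        const text = String(value);
        if (!quotedFlags[i]) {
            return text;
        }
        // Prefix each quote character inside the value with the escape character.
        const escaped = text.split(opts.quote).join(opts.escape + opts.quote);
        return opts.quote + escaped + opts.quote;
    }).join(opts.delimeter);
}

// Same values as the documented defaults.
const defaults = { delimeter: ',', quote: '"', escape: '\\', nullValue: 'NULL' };
const line = formatLine(
    ['Galileo "Il Saggiatore" Galilei', undefined, 1564],
    [true, true, false],
    defaults
);
```

Here the first value is quoted with its inner quotes escaped, the undefined second value becomes a bare NULL, and the unquoted third value is written as is.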

Field handling

buffered-csv tracks field order and field typing across all writes and all files. If data is added as an array, buffered-csv always assumes that the entries in the array match up with any previously established field order. If data is added as a key/value map, buffered-csv ensures that fields are always sent to file in the same order.

Autodetect

If data is added as a key/value map, previously unknown fields are automatically detected and assumed to be quoted. This works well for many simple csv scenarios, but there are drawbacks:

In our minimal usage example a Birthyear...

  • ... is not specified for Albert Einstein. His csv line will have two fields.
  • ... is specified for Galileo Galilei. His csv line will have three fields.
  • ... is not specified for Shen Kuo. His csv line will have three fields with a nullValue for Birthyear.

Without buffering, data for Albert Einstein is sent to file immediately, which requires immediate generation of a header line based on the fields known so far. This excludes the Birthyear field, which is only introduced later.

In our buffering output example data is only sent to file after Galileo Galilei has been added to the buffer. As a consequence...

  • ... the header line will include the Birthyear field.
  • ... Albert Einstein's csv line will have three fields with a nullValue for Birthyear.

With dynamic filenames, field information is maintained across all files. If a field is added during the 5th write on the 3rd file, it will show up in the headers of the 6th file and any subsequent files. It will not show in the headers of files 1 and 2. It will also not show in the headers of file 3 (for that, it would have to be detected on the 1st write of file 3).
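
The difference between the unbuffered and buffered header can be reproduced with a short standalone sketch. It only models which fields are known at the time of the first write. `headerAtFirstWrite` is a hypothetical helper, not part of buffered-csv, and timers are ignored.

```javascript
// Illustrative sketch only: NOT the buffered-csv source.
// headerAtFirstWrite is a hypothetical helper.
function headerAtFirstWrite(records, flushLines) {
    const fields = [];
    // With flushLines === 0 the first write happens after one record;
    // otherwise it happens once flushLines records are buffered (or at the end).
    const firstBatch = flushLines === 0 ? 1 : Math.min(flushLines, records.length);
    records.slice(0, firstBatch).forEach(function(record) {
        Object.keys(record).forEach(function(key) {
            if (!fields.includes(key)) fields.push(key); // keep first-seen order
        });
    });
    return fields;
}

const records = [
    { Name: 'Albert Einstein', Expertise: 'Relativity' },
    { Name: 'Galileo Galilei', Expertise: 'Gravity', Birthyear: 1564 },
    { Name: 'Shen Kuo', Expertise: 'Astronomy' }
];
const unbufferedHeader = headerAtFirstWrite(records, 0);  // Name, Expertise
const bufferedHeader = headerAtFirstWrite(records, 50);   // Name, Expertise, Birthyear
```

The unbuffered case sees only Einstein's two fields before the header must be produced, while the buffered case has already seen Birthyear.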

Prior specification

Prior specification of fields solves the drawbacks of autodetect at the expense of a requirement for prior knowledge. Fields may be specified to the constructor like so:

var file = new csv.File({
    path: 'celebrities.csv',
    fields: {
        Expertise: {
            quoted: true
        },
        Name: {
            quoted: true
        },
        Birthyear: {
            quoted: false
        }
    }
});

In this case Birthyear will show up in all header lines in all files. Additionally, this method specifies which fields are quoted. Finally, it also sets the field order, so that Expertise will be the first field listed on each line.

Using prior specification does not disable autodetect and does not enable an error mechanism if unknown field names are added.
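
If unknown field names should be rejected, one option is a caller-side check before handing data to file.add(). The sketch below is such a guard. `assertKnownFields` is a hypothetical helper written for this example and is not provided by buffered-csv.

```javascript
// Hypothetical caller-side guard; buffered-csv itself does not provide this.
function assertKnownFields(record, fields) {
    const unknown = Object.keys(record).filter(function(key) {
        return !Object.prototype.hasOwnProperty.call(fields, key);
    });
    if (unknown.length > 0) {
        throw new Error('Unknown field(s): ' + unknown.join(', '));
    }
    return record;
}

// Same specification as passed to the constructor above.
const fields = {
    Expertise: { quoted: true },
    Name: { quoted: true },
    Birthyear: { quoted: false }
};

// Passes: every key is part of the specification.
const checked = assertKnownFields({ Name: 'Shen Kuo', Expertise: 'Astronomy' }, fields);
// Assumed usage: file.add(assertKnownFields(record, fields));
```

Because the guard returns the record unchanged, it can wrap the argument to file.add() directly.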

API

Class: buffered-csv.File

Event: 'data'

'data' is emitted each time the contents of the buffer are written to file. path contains the actual file targeted and csv the actual content written to the file.

file.on('data', function(path, csv) {
    // path: the file that was targeted; csv: the content written to it
});

Event: 'error'

'error' is emitted if writing to a file fails. buffered-csv leaves handling this error up to you. One way to handle it could be to retry the write yourself or to save the contents at an alternative location.

file.on('error', function(err, path, csv) {
    // err: the write error; path and csv: the target file and the unwritten content
});

An 'error' event is always preceded by a 'data' event.
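
As a sketch of the "alternative location" approach, the handler below appends the failed content to a file in a fallback directory. `saveFallback` and `fallbackDir` are hypothetical names, the temp-directory location is only an assumption for illustration, and csv is assumed to be a string or Buffer.

```javascript
const fs = require('fs');
const os = require('os');
const nodePath = require('path');

// Hypothetical fallback directory; pick a location that suits your setup.
const fallbackDir = nodePath.join(os.tmpdir(), 'buffered-csv-fallback');

// Handler with the (err, path, csv) signature of the 'error' event.
function saveFallback(err, path, csv) {
    fs.mkdirSync(fallbackDir, { recursive: true });
    const target = nodePath.join(fallbackDir, nodePath.basename(path));
    fs.appendFileSync(target, csv); // append so repeated failures are not lost
    return target;
}
// Assumed usage: file.on('error', saveFallback);
```

The handler keeps the original filename, so the rescued content can later be matched to the file it was meant for.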

file.add(data)

Adds a line of data to the csv file. The line is added to the buffer initially and once the flush thresholds are met, the data is sent to file.

If data is an array then the data is interpreted as order-first. Array items are interpreted as fields according to their position in the array. This influences their representation in the file (e.g. quoted or not).

If data is a key/value map then the data is interpreted as name-first. Each key is interpreted as a field name. Data is sent to the buffer with a field order that matches any earlier established order of fields. Previously unseen keys are autodiscovered as headers.

file.complete()

Finishes generation of the csv file by flushing any remaining data in the buffer and terminating any running timers.

file.flush()

Explicitly flushes the buffer to file.