@jrc03c/js-csv-helpers

v0.0.31

Intro

This is a little helper library to complement @jrc03c/js-math-tools and @jrc03c/js-data-science-helpers. It's a relatively thin wrapper around papaparse. All it does is load CSV files or strings as DataFrame objects and save DataFrame objects as CSV files or strings.

Installation

npm install --save https://github.com/jrc03c/js-csv-helpers

Usage

Node & bundlers:

const { loadCSV, saveCSV } = require("@jrc03c/js-csv-helpers")

async function doStuff() {
  // load
  const df = await loadCSV("path/to/my-data.csv")

  // save
  await saveCSV("path/to/other-data.csv", df)
}

doStuff()

Browser:

<script src="path/to/dist/js-csv-helpers"></script>
<script>
  const { loadCSV, saveCSV } = JSCSVHelpers

  async function doStuff() {
    // load
    const df = await loadCSV("path/to/my-data.csv")

    // save
    await saveCSV("path/to/other-data.csv", df)
  }

  doStuff()
</script>

NOTE: Usage in both environments is basically identical except for one thing: In the browser, saveCSV takes a filename and a DataFrame; but in Node, saveCSV takes a path and a DataFrame. That's because the browser can only download files; it can't (in JS) specify where the files ought to be saved.

API

loadCSV(path, config)

Given a path (and optionally a config object), this function returns a Promise that resolves to a DataFrame.

parse(csvString, config)

Given a CSV string and a config object, this function returns a DataFrame (synchronously).

saveCSV(path, df, config)

Given a path and a DataFrame (and optionally a config object), this function returns a Promise that resolves to undefined.

streamLoadCSV(path, config)

NOTE: This function currently only works for streaming files from disk. I plan to add support for streaming files over the web but just haven't gotten to it yet.

Given a path and a config object, this function returns chunks of a DataFrame asynchronously. A "chunk" just means a subset of the entire CSV file containing just a few rows. Chunks are returned in the same order in which they appear in the CSV file; i.e., if you're streaming the file 10 rows at a time, then the first chunk will contain rows 1-10, the second chunk will contain rows 11-20, and so on. The number of rows in each chunk can be defined using a rowsPerChunk property on the config object.

For example, if I wanted to stream a large CSV 10 rows at a time, I'd do this:

const { streamLoadCSV } = require("@jrc03c/js-csv-helpers")

!(async () => {
  const stream = streamLoadCSV("my-data.csv", {
    inferTypes: true,
    rowsPerChunk: 10,
  })

  for await (const chunk of stream) {
    chunk.print()
  }
})()
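
To illustrate the chunking behavior described above (in-order chunks of at most rowsPerChunk rows, with a shorter final chunk), here is a minimal, self-contained sketch. It is not this library's implementation: chunkRows and collectChunks are hypothetical helpers invented for illustration, and they work on any iterable or async iterable of rows rather than on a CSV file.

```javascript
// Hypothetical helper (not part of @jrc03c/js-csv-helpers): groups items from
// any iterable or async iterable into arrays of at most `rowsPerChunk` items,
// preserving the original order.
async function* chunkRows(rows, rowsPerChunk) {
  let chunk = []

  for await (const row of rows) {
    chunk.push(row)

    if (chunk.length === rowsPerChunk) {
      yield chunk
      chunk = []
    }
  }

  // Emit the leftover rows when the total isn't a multiple of `rowsPerChunk`.
  if (chunk.length > 0) {
    yield chunk
  }
}

// Hypothetical helper: gathers every chunk into one array. This defeats the
// purpose of streaming, so it's only sensible for small inputs and tests.
async function collectChunks(rows, rowsPerChunk) {
  const out = []

  for await (const chunk of chunkRows(rows, rowsPerChunk)) {
    out.push(chunk)
  }

  return out
}
```

With five rows and a rowsPerChunk of 2, this sketch yields three chunks: two full chunks followed by a final chunk holding the single remaining row.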

unparse(df, config)

Given a DataFrame and a config object, this function returns a CSV string (synchronously).

Configuration

Loading & parsing

This library is basically a thin wrapper around papaparse. Any configuration object you could pass into this library's functions will be passed directly into papaparse's functions. See their documentation for more info.

As of today, the default configuration values for parsing are Papa's defaults plus the changes and additions noted in the comments:

{
  beforeFirstChunk: undefined,
  chunk: undefined,
  chunkSize: undefined,
  comments: false,
  complete: undefined,
  delimiter: "",
  delimitersToGuess: [",", "\t", "|", ";", papa.RECORD_SEP, papa.UNIT_SEP],
  download: false,
  downloadRequestBody: undefined,
  downloadRequestHeaders: undefined,
  dynamicTyping: false,
  encoding: "",
  error: undefined,
  escapeChar: '"',
  fastMode: undefined,
  newline: "",
  preview: 0,
  quoteChar: '"',
  skipEmptyLines: false,
  step: undefined,
  transform: undefined,
  transformHeader: undefined,
  withCredentials: undefined,
  worker: false,

  // I've changed this value from the Papa defaults because, at least for my
  // purposes, I anticipate that most datasets will include a header row.
  header: true,

  // I'm adding this option in case a dataset has (or should have) an index
  // column (i.e., a first column filled with row names).
  index: false,

  // I'm also adding my own option to infer types using my `inferType` function
  // in @jrc03c/js-math-tools. Papa offers a "dynamicTyping" option, but I
  // think maybe mine is a little more extensive (i.e., I think it infers more
  // data types, but may not necessarily be more robust). I'm willing to be
  // wrong about that, though. By default, this value is set to `false`, which
  // means that the returned `DataFrame` will only contain strings.
  inferTypes: false,
}
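
To make the header and index options a bit more concrete, here is a rough sketch of how they might reshape a parsed matrix of strings. The function shapeRows and the object it returns are hypothetical, made up for illustration; the library itself returns a DataFrame, and the sketch assumes (based only on the comments above) that header treats the first row as column names and index treats the first column as row names.

```javascript
// Hypothetical sketch (not this library's code): splits a matrix of strings
// into column names, row names, and values.
function shapeRows(matrix, { header = true, index = false } = {}) {
  let rows = matrix.map(row => row.slice())
  let columns = null
  let rowNames = null

  if (header) {
    // Treat the first row as the column names.
    columns = rows[0] || []
    rows = rows.slice(1)
  }

  if (index) {
    // Treat the first column as the row names, and drop its header cell too.
    rowNames = rows.map(row => row[0])
    rows = rows.map(row => row.slice(1))
    if (columns) columns = columns.slice(1)
  }

  return { columns, index: rowNames, values: rows }
}
```

With both options enabled, the top-left cell of the file (the header of the index column) is discarded in this sketch; whether the library keeps or drops it is not something the text above specifies.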

As the comments above indicate, this library changes papaparse's header default and adds two options of its own: "index" and "inferTypes". Setting "inferTypes" to true or false enables or disables dynamic type inference. By default, papaparse doesn't try to figure out what kinds of data your CSV file contains; it merely returns a matrix of strings. Papaparse provides an option called "dynamicTyping", which I think asks it to try to infer data types, but I don't think it's quite as extensive as the one I've written here.

Here's an example of how to use it:

// use this library's type inference
loadCSV("path/to/my-data.csv", { inferTypes: true })

// use papaparse's type inference
loadCSV("path/to/my-data.csv", { dynamicTyping: true })
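
For readers unfamiliar with the idea, type inference here just means converting the strings read from a CSV into more specific values. The sketch below is a deliberately small illustration and is not the inferType function from @jrc03c/js-math-tools: inferValue and inferRow are hypothetical names, and the set of types handled (null, booleans, numbers) is an assumption chosen to keep the example short.

```javascript
// Hypothetical sketch of string-to-value inference. The real `inferType`
// function in @jrc03c/js-math-tools may recognize more types and use
// different rules.
function inferValue(raw) {
  const s = raw.trim()

  // Treat empty cells as missing values.
  if (s === "") return null

  const lower = s.toLowerCase()
  if (lower === "true") return true
  if (lower === "false") return false

  // `Number` accepts integers, decimals, exponents, hex literals, and
  // "Infinity"; anything else becomes NaN.
  const n = Number(s)
  if (!Number.isNaN(n)) return n

  // Fall back to the original string.
  return raw
}

// Applies `inferValue` to every field of a row object.
function inferRow(row) {
  const out = {}

  for (const [key, value] of Object.entries(row)) {
    out[key] = inferValue(value)
  }

  return out
}
```

In this sketch, a row such as { a: "1.5", b: "x" } becomes { a: 1.5, b: "x" }, while a value that only partly looks numeric (for example "12abc") is left as a string.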

Unparsing & saving

As of today, the default configuration values for unparsing are Papa's defaults plus the one change noted in the comment:

{
  columns: null,
  delimiter: ",",
  escapeChar: '"',
  header: true,
  quoteChar: '"',
  quotes: false,
  skipEmptyLines: false,

  // This is the only value that's been changed from Papa's defaults.
  newline: "\n",
}

Here's an example of how to use it:

saveCSV("path/to/my-data.csv", myDataFrame, { delimiter: "\t" })
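
To show roughly what the delimiter, newline, and quoteChar options control, here is a simplified, self-contained sketch of turning rows into a CSV string. It is not how this library or papaparse implements unparsing: toCSV is a hypothetical function, it takes plain arrays instead of a DataFrame, and it only quotes fields that contain the delimiter, the quote character, or a line break.

```javascript
// Hypothetical sketch (not papaparse's implementation): joins column names
// and rows into a CSV string.
function toCSV(
  columns,
  rows,
  { delimiter = ",", newline = "\n", quoteChar = '"' } = {},
) {
  const escapeField = value => {
    // Write missing values as empty fields.
    const s = value == null ? "" : String(value)

    const needsQuotes =
      s.includes(delimiter) ||
      s.includes(quoteChar) ||
      s.includes("\n") ||
      s.includes("\r")

    if (!needsQuotes) return s

    // Escape embedded quote characters by doubling them.
    return quoteChar + s.split(quoteChar).join(quoteChar + quoteChar) + quoteChar
  }

  return [columns, ...rows]
    .map(row => row.map(escapeField).join(delimiter))
    .join(newline)
}
```

With the default comma delimiter, a field like x,y has to be wrapped in quotes; with { delimiter: "\t" }, the same field can be written as-is because the comma is no longer special.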