@agen/dsv

v0.0.8

Split lines to arrays using Async Iterators

@agen/dsv

This package contains utility methods for splitting lines into arrays and serializing arrays back into lines. These methods are thin wrappers around the d3-dsv library.

List of methods:

arraysFromDsv method

This method splits individual lines into arrays and returns an AsyncIterator over them.

Parameters:

  • delimiter - a delimiter symbol (like ';', '\t', ',', '|' ) or a function returning the symbol

Returns an AsyncGenerator over arrays.

Example 1: Splitting lines to arrays:

import { arraysFromDsv } from '@agen/dsv';

const f = arraysFromDsv(',');
const lines = [
  'John,Doe,120 jefferson st.,Riverside, NJ, 08075',
  'Jack,McGinnis,220 hobo Av.,Phila, PA,09119',
  '"John ""Da Man""",Repici,120 Jefferson St.,Riverside, NJ,08075',
  'Stephen,Tyler,"7452 Terrace ""At the Plaza"" road",SomeTown,SD, 91234',
  ',Blankman,,SomeTown, SD, 00298',
  '"Joan ""the bone"", Anne",Jet,"9th, at Terrace plc",Desert City,CO,00123'
]

for await (let array of f(lines)) {
  console.log(array);
}

// Output:
// [ "John", "Doe", "120 jefferson st.", "Riverside", " NJ", " 08075" ]
// ["Jack", "McGinnis", "220 hobo Av.", "Phila", " PA", "09119"]
// ['John "Da Man"', "Repici", "120 Jefferson St.", "Riverside", " NJ", "08075"]
// [ "Stephen", "Tyler", '7452 Terrace "At the Plaza" road', "SomeTown", "SD", " 91234" ]
// ["", "Blankman", "", "SomeTown", " SD", " 00298"]
// [ 'Joan "the bone", Anne', "Jet", "9th, at Terrace plc", "Desert City", "CO", "00123" ]
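
The quoting rules visible in the output above (doubled quotes collapse to one, delimiters inside a quoted field are kept) can be illustrated with a small standalone sketch. It does not use @agen/dsv or d3-dsv; `splitLine` and `arraysFromLines` are hypothetical names, and the parser is a simplified approximation of CSV quoting, not the library's implementation.

```javascript
// Hypothetical helper: split one line into cells, honouring double-quoted fields.
function splitLine(line, delimiter = ',') {
  const cells = [];
  let cell = '';
  let quoted = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (quoted) {
      if (ch === '"' && line[i + 1] === '"') {
        cell += '"'; // a doubled quote inside a quoted field is a literal quote
        i++;
      } else if (ch === '"') {
        quoted = false; // closing quote
      } else {
        cell += ch; // delimiters inside quotes are ordinary characters
      }
    } else if (ch === '"' && cell === '') {
      quoted = true; // opening quote at the start of a cell
    } else if (ch === delimiter) {
      cells.push(cell);
      cell = '';
    } else {
      cell += ch;
    }
  }
  cells.push(cell);
  return cells;
}

// Hypothetical async generator: accepts any sync or async iterable of lines.
async function* arraysFromLines(lines, delimiter = ',') {
  for await (const line of lines) {
    yield splitLine(line, delimiter);
  }
}
```

Such a generator would be consumed the same way as in the example above, with `for await (const array of arraysFromLines(lines, ','))`. Whitespace around cells is preserved, which matches the leading spaces in values such as " NJ".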

Example 2: Reading binary content from the stream, decoding and splitting to arrays:


import fs from 'fs';
import { compose } from '@agen/utils';
import { lines } from '@agen/encoding';
import { arraysFromDsv } from '@agen/dsv';

const f = compose(
  lines(), // Decode binary stream to text and split to lines
  arraysFromDsv(';') // Split individual lines to arrays using the ';' separator
);
const input = fs.createReadStream('./data.csv');

for await (let array of f(input)) {
  console.log(array);
}

arraysToDsv method

This method serializes arrays as individual lines.

Parameters:

  • delimiter - a delimiter symbol (like ';', '\t', ',', '|' ) or a function returning the delimiter

Returns an AsyncGenerator over serialized lines.

Example: Serialize arrays to lines:

import { arraysToDsv } from '@agen/dsv';

const f = arraysToDsv(';');
const arrays = [
  [ "John", "Doe", "120 jefferson st.", "Riverside", " NJ", " 08075" ],
  ["Jack", "McGinnis", "220 hobo Av.", "Phila", " PA", "09119"],
  ['John "Da Man"', "Repici", "120 Jefferson St.", "Riverside", " NJ", "08075 "],
  [ "Stephen", "Tyler", '7452 Terrace "At the Plaza" road', "SomeTown", "SD", " 91234" ],
  ["", "Blankman", "", "SomeTown", " SD", " 00298"],
  ['Joan "the bone", Anne', "Jet", "9th, at Terrace plc", "Desert City", "CO", "00123" ]
]

for await (let line of f(arrays)) {
  console.log(line);
}

// Output:
// John;Doe;120 jefferson st.;Riverside; NJ; 08075
// Jack;McGinnis;220 hobo Av.;Phila; PA;09119
// "John ""Da Man""";Repici;120 Jefferson St.;Riverside; NJ;08075 
// Stephen;Tyler;"7452 Terrace ""At the Plaza"" road";SomeTown;SD; 91234
// ;Blankman;;SomeTown; SD; 00298
// "Joan ""the bone"", Anne";Jet;9th, at Terrace plc;Desert City;CO;00123
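
Note that in the output above "9th, at Terrace plc" is not quoted: with ';' as the delimiter, a comma is an ordinary character. A minimal sketch of that rule, assuming the usual CSV convention (quote a value only if it contains the delimiter, a double quote, or a line break, and double any embedded quotes). `formatCell` and `formatRow` are hypothetical names and are not part of @agen/dsv.

```javascript
// Hypothetical helper: serialize one value, quoting it only when necessary.
function formatCell(value, delimiter) {
  const text = value == null ? '' : String(value);
  const needsQuotes =
    text.includes(delimiter) || text.includes('"') || /[\r\n]/.test(text);
  // Embedded quotes are escaped by doubling them.
  return needsQuotes ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: serialize one array of cells as a single line.
function formatRow(cells, delimiter = ',') {
  return cells.map((cell) => formatCell(cell, delimiter)).join(delimiter);
}

const row = ['Joan "the bone", Anne', 'Jet', '9th, at Terrace plc'];
console.log(formatRow(row, ';')); // "Joan ""the bone"", Anne";Jet;9th, at Terrace plc
console.log(formatRow(row, ',')); // "Joan ""the bone"", Anne",Jet,"9th, at Terrace plc"
```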

guessDelimiter method

This method guesses which delimiter is used in a DSV file. It splits the provided stream of lines using each candidate delimiter and compares the results. The delimiter that produces the largest number of cells per line is returned as the winner.

Parameters:

  • len - number of lines to analyse
  • delimiters - list of possible delimiters to check

Returns:

  • delimiter - an async function resolving the delimiter
  • delimiter.guess : an AsyncGenerator (async function* guess(it)) accepting strings and analysing them to guess the best delimiter; all consumed lines are yielded from the internal buffer
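
The "internal buffer" behaviour described above means that lines consumed for analysis are not lost: they are replayed to downstream consumers. A rough standalone sketch of that pattern, not the package's actual code; `peekLines` and `onSample` are hypothetical names introduced only for illustration.

```javascript
// Hypothetical helper: buffer the first `len` lines, hand them to `onSample`
// for analysis, then yield the buffered lines followed by the rest of the input.
function peekLines(len, onSample) {
  return async function* (lines) {
    const buffer = [];
    let sampled = false;
    for await (const line of lines) {
      if (sampled) {
        yield line; // after sampling, pass lines straight through
        continue;
      }
      buffer.push(line);
      if (buffer.length >= len) {
        sampled = true;
        onSample(buffer);
        yield* buffer; // replay the lines that were held back
      }
    }
    if (!sampled) {
      // The input was shorter than `len`: analyse and replay what we have.
      onSample(buffer);
      yield* buffer;
    }
  };
}
```

With this shape, a downstream step never sees any output before the sample has been analysed, which is what allows a later stage to rely on the detected delimiter.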

Example 1:

import { guessDelimiter } from '@agen/dsv';

const lines = [
  'John;Doe;120 jefferson st.;Riverside; NJ; 08075',
  'Jack;McGinnis;220 hobo Av.;Phila; PA;09119',
  '"John ""Da Man""";Repici;120 Jefferson St.;Riverside; NJ;08075',
  'Stephen;Tyler;"7452 Terrace ""At the Plaza"" road";SomeTown;SD; 91234',
  ';Blankman;;SomeTown; SD; 00298',
  '"Joan ""the bone"", Anne";Jet;9th, at Terrace plc;Desert City;CO;00123'
]
const f = guessDelimiter(3); // Use the first 3 lines to guess the best delimiter

for await (let line of f.guesser(lines)) {
  // Do nothing. Just consume lines...
}

const delimiter = await f();
console.log(`Delimiter: "${delimiter}"`);

// Output:
// Delimiter: ";"

Example 2: Use automatic delimiter detection to split DSV files:


import fs from 'fs';
import { compose } from '@agen/utils';
import { lines } from '@agen/encoding';
import { arraysFromDsv, guessDelimiter } from '@agen/dsv';

const delimiter = guessDelimiter(10); // Use the first 10 lines to detect delimiter
const f = compose(
  lines(), // Decode binary stream to text and split to lines
  delimiter.guesser(), // Detect the delimiter and notify "arraysFromDsv"
  arraysFromDsv(delimiter) // Split individual lines to arrays using the detected delimiter
);
const input = fs.createReadStream('./data.csv');

for await (let array of f(input)) {
  console.log(array);
}

newDelimiterGuesser method

This method creates a 'guesser' trying to detect the best delimiter for the provided lines.

Parameters:

  • delimiters - list of possible delimiters to check; by default it checks the following values: ; (semicolon), , (comma), | (pipe), \t (tab)

Returns an object with two methods:

  • update(line) - this method is used to push new lines to analyse
  • done() - this method returns the delimiter providing the best splitting results

Example:


import { newDelimiterGuesser } from '@agen/dsv';

const { update, done } = newDelimiterGuesser();

// Provide three lines to analyse:
update('John;Doe;120 jefferson st.;Riverside; NJ; 08075');
update('Jack;McGinnis;220 hobo Av.;Phila; PA;09119');
update('"John ""Da Man""";Repici;120 Jefferson St.;Riverside; NJ;08075');

// Get the best guess:
const delimiter = done();
console.log(`Delimiter: "${delimiter}"`);
// Output:
// Delimiter: ";"
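
For intuition, the "most cells wins" heuristic can be sketched in a few lines. This is a simplified standalone illustration, not the package's implementation: `makeGuesser` is a hypothetical name, and the naive `split` below ignores quoted fields, which a real guesser would presumably handle.

```javascript
// Hypothetical sketch: count cells per candidate delimiter and pick the largest total.
function makeGuesser(delimiters = [';', ',', '|', '\t']) {
  const totals = new Map(delimiters.map((d) => [d, 0]));
  return {
    // Add one line to the statistics.
    update(line) {
      for (const d of delimiters) {
        totals.set(d, totals.get(d) + line.split(d).length);
      }
    },
    // Return the candidate with the highest total; ties go to the earlier candidate.
    done() {
      let best = delimiters[0];
      for (const d of delimiters) {
        if (totals.get(d) > totals.get(best)) best = d;
      }
      return best;
    },
  };
}

const guesser = makeGuesser();
guesser.update('John;Doe;120 jefferson st.;Riverside; NJ; 08075');
guesser.update('Jack;McGinnis;220 hobo Av.;Phila; PA;09119');
console.log(JSON.stringify(guesser.done())); // ";"
```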