npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

read-excel-file-forwps

v4.0.2

Published

Read `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.

Downloads

3

Readme

read-excel-file

Read *.xlsx files in a browser or Node.js. Parse to JSON with a strict schema.

Demo

Install

npm install read-excel-file --save

Browser

<input type="file" id="input" />
import readXlsxFile from 'read-excel-file'

const input = document.getElementById('input')

input.addEventListener('change', () => {
  readXlsxFile(input.files[0]).then((rows) => {
    // `rows` is an array of rows
    // each row being an array of cells.
  })
})

Node.js

const readXlsxFile = require('read-excel-file/node');

// File path.
readXlsxFile('/path/to/file').then((rows) => {
  // `rows` is an array of rows
  // each row being an array of cells.
})

// Readable Stream.
readXlsxFile(fs.createReadStream('/path/to/file')).then((rows) => {
  ...
})

Dates

XLSX format has no dedicated "date" type so dates are stored internally as simply numbers along with a "format" (e.g. "MM/DD/YY"). When using readXlsx() with schema parameter all dates get parsed correctly in any case. But if using readXlsx() without schema parameter (to get "raw" data) then this library attempts to guess whether a cell value is a date or not by examining the cell "format" (e.g. "MM/DD/YY"), so in most cases dates are detected and parsed automatically. For exotic cases one can pass an explicit dateFormat parameter (e.g. "MM/DD/YY") to instruct the library to parse numbers with such "format" as dates.

JSON

To convert rows to JSON pass schema option to readXlsxFile(). It will return { rows, errors } object instead of just rows.

// An example *.xlsx document:
// -----------------------------------------------------------------------------
// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |    CONTACT     |
// -----------------------------------------------------------------------------
// | 03/24/2018 |         123        |   true  |  Chemistry   | (123) 456-7890 |
// -----------------------------------------------------------------------------

const schema = {
  'START DATE': {
    prop: 'date',
    type: Date
    // Excel stores dates as integers.
    // E.g. '24/03/2018' === 43183.
    // Such dates are parsed to UTC+0 timezone with time 12:00 .
  },
  'NUMBER OF STUDENTS': {
    prop: 'numberOfStudents',
    type: Number,
    required: true
  },
  'COURSE': {
    prop: 'course',
    type: {
      'IS FREE': {
        prop: 'isFree',
        type: Boolean
        // Excel stored booleans as numbers:
        // `1` is `true` and `0` is `false`.
        // Such numbers are parsed to booleans.
      },
      'COURSE TITLE': {
        prop: 'title',
        type: String
      }
    }
  },
  'CONTACT': {
    prop: 'contact',
    required: true,
    parse(value) {
      const number = parsePhoneNumber(value)
      if (!number) {
        throw new Error('invalid')
      }
      return number
    }
  }
}

readXlsxFile(file, { schema }).then(({ rows, errors }) => {
  // `errors` have shape `{ row, column, error, value }`.
  errors.length === 0

  rows === [{
    date: new Date(2018, 2, 24),
    numberOfStudents: 123,
    course: {
      isFree: true,
      title: 'Chemistry'
    },
    contact: '+11234567890',
  }]
})

There are also some additional exported types:

  • "Integer" for parsing integer Numbers.
  • "URL" for parsing URLs.
  • "Email" for parsing email addresses.

A schema entry for a column can also have a validate(value) function for validating the parsed value. It must throw an Error if the value is invalid.

A React component for displaying error info could look like this:

import { parseExcelDate } from 'read-excel-file'

function ParseExcelError({ children: error }) {
  // Human-readable value.
  let value = error.value
  if (error.type === Date) {
    value = parseExcelDate(value).toString()
  }
  // Error summary.
  return (
    <div>
      <code>"{error.error}"</code>
      {' for value '}
      <code>"{value}"</code>
      {' in column '}
      <code>"{error.column}"</code>
      {error.type && ' of type '}
      {error.type && <code>"{error.type.name}"</code>}
      {' in row '}
      <code>"{error.row}"</code>
    </div>
  )
}

When using a schema there's also an optional transformData(data) parameter which can be used for the cases when the spreadsheet rows/columns aren't in the correct format. For example, the heading row may be missing, or there may be some purely presentational or empty rows. Example:

readXlsxFile(file, {
  schema,
  transformData(data) {
    // Adds header row to the data.
    return ['ID', 'NAME', ...].concat(data)
    // Removes empty rows.
    return data.filter(rows => row.filter(column => column !== null).length > 0)
  }
})

Browser compatibility

Node.js *.xlxs parser uses xpath and xmldom packages for XML parsing. The same packages could be used in a browser because all modern browsers (except IE 11) have native DOMParser built-in which could is used instead (meaning smaller footprint and better performance) but since Internet Explorer 11 support is still required the browser version doesn't use the native DOMParser and instead uses xpath and xmldom packages for XML parsing just like the Node.js version.

Advanced

By default it reads the first sheet in the document. If you have multiple sheets in your spreadsheet then pass either sheet: number (sheet index, starting from 1) or sheet: string (sheet name) as part of the options argument (options.sheet is 1 by default):

readXlsxFile(file, { sheet: 2 }).then((data) => {
  ...
})
readXlsxFile(file, { sheet: 'Sheet1' }).then((data) => {
  ...
})

To get the list of sheets one can pass getSheets: true option:

readXlsxFile(file, { getSheets: true }).then((sheets) => {
  // sheets === [{ name: 'Sheet1' }, { name: 'Sheet2' }]
})

References

For XML parsing xmldom and xpath are used.

License

MIT