@scalableminds/saxophone

v0.5.0

Fast and lightweight event-driven XML parser in pure JavaScript

Saxophone 🎷

Fast and lightweight event-driven streaming XML parser in pure JavaScript.

Saxophone is inspired by SAX parsers like sax-js and easysax: it does not generate any DOM while parsing documents. Instead, it reports parsing events for each tag or text node encountered. This means that Saxophone has a really low memory footprint.

The parser does not keep track of the document state while parsing and does not check whether the document is well-formed or valid, making it super-fast (see benchmarks below).

This library is best suited to extracting simple data from an XML document that you know is well-formed, such as a response from an API endpoint. The parser will not report precise errors in case of syntax problems.

Installation

This library works both in Node.js ≥4.0 and in recent browsers. To install with npm:

$ npm install --save saxophone

Benchmarks

| Library            | Operations per second (higher is better) |
|--------------------|-----------------------------------------:|
| Saxophone          |                     1,099 ops/sec ±2.16% |
| EasySax            |                     1,033 ops/sec ±2.52% |
| node-expat         |                       360 ops/sec ±3.76% |
| libxmljs.SaxParser |                       236 ops/sec ±3.94% |
| sax-js             |                       113 ops/sec ±3.87% |

To run the benchmarks yourself, use the following commands:

$ git clone https://github.com/matteodelabre/saxophone.git
$ cd saxophone
$ npm install
$ npm install easysax node-expat libxmljs sax
$ npm run benchmark

Tests and coverage

To run tests and check coverage, use the following commands:

$ git clone https://github.com/matteodelabre/saxophone.git
$ cd saxophone
$ npm install
$ npm test
$ npm run coverage

Usage

Example

const Saxophone = require('saxophone');
const parser = Saxophone();

// called whenever an opening tag is found in the document,
// such as <example id="1" /> - see below for a list of events
parser.on('tagopen', tag => {
    console.log(
        `Open tag "${tag.name}" with attributes: ${JSON.stringify(Saxophone.parseAttrs(tag.attrs))}.`
    );
});

// called when parsing the document is done
parser.on('end', () => {
    console.log('Parsing finished.');
});

// triggers parsing - remember to set up listeners before
// calling this method
parser.parse('<root><example id="1" /><example id="2" /></root>');

Output:

Open tag "root" with attributes: {}.
Open tag "example" with attributes: {"id":"1"}.
Open tag "example" with attributes: {"id":"2"}.
Parsing finished.

Example (streaming)

Same example as above but with Streams.

const Saxophone = require('saxophone');
const parser = Saxophone();

// called whenever an opening tag is found in the document,
// such as <example id="1" /> - see below for a list of events
parser.on('tagopen', tag => {
    console.log(
        `Open tag "${tag.name}" with attributes: ${JSON.stringify(Saxophone.parseAttrs(tag.attrs))}.`
    );
});

// called when parsing the document is done
parser.on('end', () => {
    console.log('Parsing finished.');
});


// stdin is '<root><example id="1" /><example id="2" /></root>'
process.stdin.setEncoding('utf8');
process.stdin.pipe(parser);

Output:

Open tag "root" with attributes: {}.
Open tag "example" with attributes: {"id":"1"}.
Open tag "example" with attributes: {"id":"2"}.
Parsing finished.

API

Saxophone()

Returns a new Saxophone instance. This is a factory method, so you must not prefix it with the new keyword.

Saxophone#on(), Saxophone#removeListener(), ...

Saxophone instances expose the EventEmitter methods. To work with listeners, check out Node's documentation on EventEmitter.
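As a rough illustration of working with listeners, the sketch below stops listening once the first matching tag has been seen. The `firstTag` helper is hypothetical and not part of the library. So that the snippet runs on its own, a plain Node `EventEmitter` stands in for a Saxophone instance and the events are emitted by hand, using the `tagopen` payload shape described under Events.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: call `callback` with the first opening tag named
// `tagName`, then detach the listener so later tags are ignored.
function firstTag(emitter, tagName, callback) {
    const listener = tag => {
        if (tag.name !== tagName) return;
        emitter.removeListener('tagopen', listener);
        callback(tag);
    };
    emitter.on('tagopen', listener);
}

// Stand-in for a Saxophone instance (assumption: a real parser would emit
// the same `tagopen` objects while parsing).
const standIn = new EventEmitter();
const found = [];
firstTag(standIn, 'example', tag => found.push(tag.attrs));

standIn.emit('tagopen', { name: 'root', attrs: '', isSelfClosing: false });
standIn.emit('tagopen', { name: 'example', attrs: 'id="1"', isSelfClosing: true });
standIn.emit('tagopen', { name: 'example', attrs: 'id="2"', isSelfClosing: true });

// Only the first <example> tag was recorded.
console.log(found);
```

With a real parser, you would pass the Saxophone instance instead of `standIn` and then call `parse`.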

Saxophone#parse(xml)

Triggers the actual parsing of a whole document. This method will fire registered listeners so you need to set them up before calling it.

xml is a string containing the XML that you want to parse. At this time, Saxophone does not support Buffers.

Saxophone#write(xml)

Triggers one step of the parsing. This method will fire registered listeners so you need to set them up before calling it.

xml is a string containing a chunk of the XML that you want to parse.

Saxophone#end(xml = "")

Triggers the last step of the parsing. This method will fire registered listeners so you need to set them up before calling it. Same as Saxophone#write(xml), but closes the stream.

xml is an optional string containing the final chunk of the XML that you want to parse. It defaults to an empty string.
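The sketch below shows one way `write` and `end` could be combined to feed a document piece by piece. The `feedInChunks` helper and the recording object are hypothetical; the recorder only mimics the two documented method names so that the snippet runs without the library. It also assumes that a real parser copes with chunks that split a tag in the middle, which is worth verifying for your input.

```javascript
// Hypothetical helper: split `xml` into fixed-size chunks, pass each one to
// parser.write(), then signal completion with parser.end().
function feedInChunks(parser, xml, chunkSize) {
    for (let i = 0; i < xml.length; i += chunkSize) {
        parser.write(xml.slice(i, i + chunkSize));
    }
    parser.end();
}

// Stand-in that records calls (assumption: a Saxophone instance would be
// passed here instead, with listeners registered beforehand).
const calls = [];
const recorder = {
    write: chunk => calls.push(['write', chunk]),
    end: (chunk = '') => calls.push(['end', chunk]),
};

const xml = '<root><example id="1" /></root>';
feedInChunks(recorder, xml, 8);

// Every character was written exactly once, followed by a single end() call.
console.log(calls);
```

Remember to set up listeners before the first `write` call, since events can fire as soon as a chunk is processed.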

Saxophone.parseAttrs(attrs)

Parses a string list of XML attributes, as produced by the main parsing algorithm. This is not done automatically because it may not be required for every tag and it takes some time.

The result is an object associating the attribute names (as object keys) to their attribute values (as object values).

Saxophone.parseEntities(text)

Parses a piece of XML text and expands all XML entities inside it to the character they represent. Just like attributes, this is not parsed automatically because it takes some time.

This ignores invalid entities, including unrecognized ones, leaving them as-is.

Events

tagopen

Emitted when an opening tag is parsed. This encompasses both regular tags and self-closing tags. An object is passed with the following data.

  • name: name of the parsed tag.
  • attrs: attributes of the tag (as a string). To parse this string, use Saxophone.parseAttrs.
  • isSelfClosing: true if the tag is self-closing.

tagclose

Emitted when a closing tag is parsed. An object containing the name of the tag is passed.
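Since the parser does not keep track of the document state, you may want to track nesting yourself. The sketch below builds the path of each element from `tagopen` and `tagclose` events. The `trackPath` helper is hypothetical, and it assumes that a self-closing tag emits `tagopen` only, without a matching `tagclose`. A plain `EventEmitter` stands in for the parser so the snippet runs on its own.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: report the slash-separated path of every element.
function trackPath(emitter, onElement) {
    const stack = [];
    emitter.on('tagopen', tag => {
        onElement([...stack, tag.name].join('/'));
        // Assumption: self-closing tags never produce a tagclose event,
        // so they must not be pushed onto the stack.
        if (!tag.isSelfClosing) stack.push(tag.name);
    });
    emitter.on('tagclose', () => {
        stack.pop();
    });
}

// Stand-in for a Saxophone instance, emitting the events that
// <root><example /><group><example /></group></root> would produce.
const standIn = new EventEmitter();
const paths = [];
trackPath(standIn, path => paths.push(path));

standIn.emit('tagopen', { name: 'root', attrs: '', isSelfClosing: false });
standIn.emit('tagopen', { name: 'example', attrs: '', isSelfClosing: true });
standIn.emit('tagopen', { name: 'group', attrs: '', isSelfClosing: false });
standIn.emit('tagopen', { name: 'example', attrs: '', isSelfClosing: true });
standIn.emit('tagclose', { name: 'group' });
standIn.emit('tagclose', { name: 'root' });

console.log(paths);
```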

error

Emitted when a parsing error is encountered while reading the XML stream such that the rest of the XML cannot be correctly interpreted.

Because this library's goal is not to provide accurate error reports, the passed error will only contain a short description of the syntax error (without giving the position, for example).
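It is usually worth registering an `error` listener before parsing, because Node's `EventEmitter` throws when an `error` event is emitted and nobody listens for it. The sketch below demonstrates that behaviour with a plain `EventEmitter` standing in for the parser; the error message is made up for illustration and is not one the library is known to produce.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a Saxophone instance (assumption: the parser reports syntax
// problems through the standard EventEmitter 'error' event).
const standIn = new EventEmitter();

// Without a listener, emitting 'error' throws the error object itself.
let thrown = null;
try {
    standIn.emit('error', new Error('hypothetical syntax error'));
} catch (err) {
    thrown = err;
}

// With a listener, the error is delivered to the handler instead.
const reported = [];
standIn.on('error', err => reported.push(err.message));
standIn.emit('error', new Error('hypothetical syntax error'));

console.log(thrown.message, reported);
```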

processinginstruction

Emitted when a processing instruction (such as <? contents ?>) is parsed. An object with the contents of the processing instruction is passed.

text

Emitted when a text node between two tags is parsed. An object with the contents of the text node is passed. You might need to expand XML entities inside the contents of the text node, using Saxophone.parseEntities.
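As an illustration, the sketch below gathers the text found inside every element with a given name. The `collectText` helper is hypothetical, it does not handle the target element being nested inside itself, and it assumes the object passed to `text` listeners exposes the text under a `contents` property. A plain `EventEmitter` stands in for the parser so the snippet runs without the library.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: return an array that fills up with the text of each
// <tagName> element as events arrive.
function collectText(emitter, tagName) {
    const results = [];
    let current = null; // null while outside the target element

    emitter.on('tagopen', tag => {
        if (tag.name === tagName && !tag.isSelfClosing) current = '';
    });
    emitter.on('text', node => {
        // Assumption: the text is available as `node.contents`.
        if (current !== null) current += node.contents;
    });
    emitter.on('tagclose', tag => {
        if (tag.name === tagName && current !== null) {
            results.push(current);
            current = null;
        }
    });
    return results;
}

// Stand-in emitting the events for
// <list><title>Tom &amp; Jerry</title><title>Other</title></list>
const standIn = new EventEmitter();
const titles = collectText(standIn, 'title');

standIn.emit('tagopen', { name: 'list', attrs: '', isSelfClosing: false });
standIn.emit('tagopen', { name: 'title', attrs: '', isSelfClosing: false });
standIn.emit('text', { contents: 'Tom &amp; Jerry' });
standIn.emit('tagclose', { name: 'title' });
standIn.emit('tagopen', { name: 'title', attrs: '', isSelfClosing: false });
standIn.emit('text', { contents: 'Other' });
standIn.emit('tagclose', { name: 'title' });
standIn.emit('tagclose', { name: 'list' });

console.log(titles);
```

Note that the entity `&amp;` is left as-is in the collected text; with the real library you would pass each string through Saxophone.parseEntities to expand it.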

cdata

Emitted when a CDATA section (such as <![CDATA[ contents ]]>) is parsed. An object with the contents of the CDATA section is passed.

comment

Emitted when a comment (such as <!-- contents -->) is parsed. An object with the contents of the comment is passed.

end

Emitted after all events, without arguments.

License

Released under the MIT license.
See the full license text.