xml-flow-legible

v2.0.0

An XML/HTML stream reader, now with less suck.

xml-flow

Dealing with XML data can be frustrating, especially if you have a whole lot of it. Most XML readers work on the entire XML document as a String, which can be problematic if you need to read very large XML files. With xml-flow, you can use streams to load only a small part of an XML document into memory at a time.

xml-flow has only one dependency, sax-js. This means it will run nicely on Windows environments.

Installation

$ npm install xml-flow

Getting started

xml-flow tries to keep the parsed output as simple as possible. Here's an example:

Input File

<root>
  <person>
    <name>Bill</name>
    <id>1</id>
    <age>27</age>
  </person>
  <person>
    <name>Sally</name>
    <id>2</id>
    <age>29</age>
  </person>
  <person>
    <name>Kelly</name>
    <id>3</id>
    <age>37</age>
  </person>
</root>

Usage

var fs = require('fs')
  , flow = require('xml-flow')
  , inFile = fs.createReadStream('./your-xml-file.xml')
  , xmlStream = flow(inFile)
;

xmlStream.on('tag:person', function(person) {
  console.log(person);
});

Output

{name: 'Bill', id: '1', age: '27'}
{name: 'Sally', id: '2', age: '29'}
{name: 'Kelly', id: '3', age: '37'}

Features

Attribute-only Tags

The above example shows the output of an XML document with no attributes. What about the opposite?

Input
<root>
    <person name="Bill" id="1" age="27"/>
    <person name="Sally" id="2" age="29"/>
    <person name="Kelly" id="3" age="37"/>
</root>
Output
{name: 'Bill', id: '1', age: '27'}
{name: 'Sally', id: '2', age: '29'}
{name: 'Kelly', id: '3', age: '37'}

Both Attributes and Subtags

When you have tags that have both Attributes and subtags, here's how the output looks:

Input
<root>
    <person name="Bill" id="1" age="27">
        <friend id="2"/>
    </person>
    <person name="Sally" id="2" age="29">
        <friend id="1"/>
        <friend id="3"/>
    </person>
    <person name="Kelly" id="3" age="37">
        <friend id="2"/>
        Kelly likes to ride ponies.
    </person>
</root>
Output
{
    $attrs: {name: 'Bill', id: '1', age: '27'},
    friend: '2'
}
{
    $attrs: {name: 'Sally', id: '2', age: '29'},
    friend: ['1', '3']
}
{
    $attrs: {name: 'Kelly', id: '3', age: '37'},
    friend: '2',
    $text: 'Kelly likes to ride ponies.'
}
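
Because the default useArrays setting only produces an array when a subtag repeats, friend is a String for Bill and Kelly but an Array for Sally. A minimal sketch of one way to handle that on the consuming side; toArray and friendCount are hypothetical helpers, not part of xml-flow:

```javascript
// Hypothetical helper (not part of xml-flow): always return an array,
// whether the parser produced nothing, a single value, or an array.
function toArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Objects shaped like the output shown above.
var bill = { $attrs: { name: 'Bill', id: '1', age: '27' }, friend: '2' };
var sally = { $attrs: { name: 'Sally', id: '2', age: '29' }, friend: ['1', '3'] };

// Count friends without caring which shape the parser emitted.
function friendCount(person) {
  return toArray(person.friend).length;
}

console.log(friendCount(bill), friendCount(sally));
```

Setting useArrays to flow.ALWAYS (see Options) should make a helper like this unnecessary, at the cost of wrapping every value in an array.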

Read as Markup

If you need to keep track of sub-tag order within a tag, or if it makes sense to have a more markup-style object model, here's how it works:

Input
<div class="science">
    <h1>Title</h1>
    <p>Some introduction</p>
    <h2>Subtitle</h2>
    <p>Some more text</p>
    This text is not inside a p-tag.
</div>
Output
{
    $attrs: {class: 'science'},
    $markup: [
        {$name: 'h1', $text: 'Title'},
        {$name: 'p', $text: 'Some introduction'},
        {$name: 'h2', $text: 'Subtitle'},
        {$name: 'p', $text: 'Some more text'},
        'This text is not inside a p-tag.'
    ]
}
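
Since $markup mixes objects and bare strings, code that consumes it has to check each entry's type. A small sketch of walking the structure in document order; markupToText is a hypothetical helper, and it assumes each $text is a single String as in the output above:

```javascript
// Hypothetical helper (not part of xml-flow): collect the text of a
// markup-style node in document order, one entry per line.
function markupToText(node) {
  if (typeof node === 'string') return node;   // bare text entry
  if (Array.isArray(node.$markup)) {
    return node.$markup.map(markupToText).join('\n');
  }
  return node.$text || '';                     // simple {$name, $text} entry
}

// Object shaped like the output shown above.
var sample = {
  $attrs: { class: 'science' },
  $markup: [
    { $name: 'h1', $text: 'Title' },
    { $name: 'p', $text: 'Some introduction' },
    { $name: 'h2', $text: 'Subtitle' },
    { $name: 'p', $text: 'Some more text' },
    'This text is not inside a p-tag.'
  ]
};

var text = markupToText(sample);
console.log(text);
```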

Options

You may pass an options object as a second argument when calling the function, as flow(stream, options). Every option is optional:

  • strict - Boolean. Default = false. Refer to sax-js documentation for more info.
  • lowercase - Boolean. Default = true. When not in strict mode, all tags are lowercased, or uppercased when set to false.
  • trim - Boolean. Default = true. Whether or not to trim leading and trailing whitespace from text.
  • normalize - Boolean. Default = true. Turns all whitespace into a single space.
  • preserveMarkup - One of flow.ALWAYS, flow.SOMETIMES (default), or flow.NEVER. When set to ALWAYS, all subtags and text are stored in the $markup property with their original order preserved. When set to NEVER, all subtags are collected as separate properties. When set to SOMETIMES, markup is preserved only when subtags have non-contiguous repetition.
  • simplifyNodes - Boolean. Default = true. Whether to drop empty $attrs, pull properties out of the $attrs when there are no subtags, or to only use a String instead of an object when $text is the only property.
  • useArrays - One of flow.ALWAYS, flow.SOMETIMES (default), or flow.NEVER. When set to ALWAYS, all subtags and text are enclosed in arrays, even if there's only one found. When set to NEVER, only the first instance of a subtag or text node is kept. When set to SOMETIMES, arrays are used only when multiple items are found. NOTE: When set to NEVER, preserveMarkup is ignored.
  • cdataAsText - Boolean. Default = false. Appends CDATA text to other nearby text instead of putting it into its own $cdata property. NOTE: If you plan to run the toXml() function on data that has CDATA in it, you might consider a more robust escape function than the one provided. See below for more information.
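
As a rough illustration of the options above, here is a sketch of building an options object. buildOptions is a hypothetical helper; it takes the flow module as a parameter so that the sketch makes no assumption about the values behind flow.ALWAYS and flow.NEVER:

```javascript
// Hypothetical helper: `flow` is whatever require('xml-flow') returns.
// Only option names listed above are used.
function buildOptions(flow) {
  return {
    strict: true,                // keep tag case as written in the document
    trim: true,                  // drop leading and trailing whitespace
    normalize: true,             // collapse whitespace runs to one space
    preserveMarkup: flow.NEVER,  // collect subtags as separate properties
    useArrays: flow.ALWAYS,      // every subtag value arrives in an array
    cdataAsText: true            // fold CDATA into the surrounding text
  };
}

// Usage, assuming the package is installed:
//   var fs = require('fs');
//   var flow = require('xml-flow');
//   var inFile = fs.createReadStream('./your-xml-file.xml');
//   var xmlStream = flow(inFile, buildOptions(flow));
```

With useArrays set to flow.ALWAYS, listeners can treat every subtag value as an array without checking its type first.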

Events

All events can be listened to via the standard Node.js EventEmitter syntax.

tag:<<TAG_NAME>> - Fires when any <<TAG_NAME>> is parsed. Note that this is case sensitive. If the lowercase option is set, make sure you listen to lowercase tag names. If the strict option is set, match the case of the tags in your document.

end - Fires when the end of the stream has been reached.

error - Fires when there are errors.

query:<<QUERY>> - Coming soon...
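
The three available events can be combined to gather parsed tags and report once when the stream finishes or fails. A minimal sketch: collectPeople is a hypothetical helper, and a plain EventEmitter stands in for the object returned by flow(inFile) so that the example runs without an XML file:

```javascript
var EventEmitter = require('events');

// Hypothetical helper: attach the documented listeners to anything that
// emits 'tag:person', 'end' and 'error', and call done() exactly once.
function collectPeople(xmlStream, done) {
  var people = [];
  var finished = false;

  function finish(err) {
    if (finished) return;        // ignore anything after the first outcome
    finished = true;
    done(err, err ? undefined : people);
  }

  xmlStream.on('tag:person', function (person) { people.push(person); });
  xmlStream.on('error', function (err) { finish(err); });
  xmlStream.on('end', function () { finish(null); });
}

// Stand-in emitter; with the real library, pass flow(inFile) instead.
var fake = new EventEmitter();
var result = null;
collectPeople(fake, function (err, people) { result = err || people; });

fake.emit('tag:person', { name: 'Bill', id: '1', age: '27' });
fake.emit('tag:person', { name: 'Sally', id: '2', age: '29' });
fake.emit('end');

console.log(result);
```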

toXml Utility

toXml(node, options) - Returns a string, the XML encoding of an object. Encodes $name, $attrs, $text, and $markup as you would expect. The following options are available:

  • indent – How to indent tags for pretty-printing. Use '  ' for two spaces, or '\t' for tabs. If no indent value is provided, output will not be pretty-printed.
  • selfClosing – Whether to self close tags (like <br/> instead of <br></br>) whenever possible. Defaults to true.
  • escape – Optionally provide an escape function for all text, to prevent malformed XML. As a default, a very simplistic escape function has been provided. You can provide a more robust escape function that suits your needs. For example, take a look at he. To turn escaping off, provide a simple, non-escaping function like this: function(str) { return str; }
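
For cases where pulling in a library such as he is more than you need, a hand-written escape function covering the five predefined XML entities might look like the sketch below. escapeXml is a hypothetical name, and the usage comment assumes toXml is exposed on the module object as flow.toXml:

```javascript
// Hypothetical escape function: replaces the five characters that have
// predefined XML entities. '&' is handled first so that the entities
// produced by the later replacements are not escaped a second time.
function escapeXml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Usage, assuming the package is installed and exposes flow.toXml:
//   var flow = require('xml-flow');
//   var xml = flow.toXml(node, { indent: '\t', escape: escapeXml });
```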

Authors

License

MIT