xml-parse

v0.4.0

Published

3 years ago

XML DOM, Parser & Stringifier

Downloads

15,206

0High
0Medium
0Low

mauriceconrad

xml xml parser xml dom xml stringify xml parse xml stringifier dom html html parser html parsing xml parsing

XML Parser, Stringifier and DOM

Parse XML, HTML and more with a very tolerant XML parser and convert it into a DOM.

These three components are separated from each other as own modules.

|Component|Size| |---|---| |Parser|4.7 KB| |Stringifier|1.3 KB| |DOM|3.1 KB|

Install

npm install xml-parse

Require

const xml = require("xml-parse");

Parser

Parsing is very simple.

Just call the parse method of the xml-parse instance.

const xml = require("xml-parse");

// Valid XML string
var parsedXML = xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
                          '<root>Root Element</root>');
console.log(parsedXML);

// Invalid XML string
var parsedInavlidXML = xml.parse('<root></root>' +
                                 '<secondRoot>' +
                                   '<notClosedTag>' +
                                 '</secondRoot>');
console.log(parsedInavlidXML);

Parsed Object Structure

The result of parse is an object that maybe looks like this:

(In this case we have the xml string of the given example)

[
  {
    type: 'element',
    tagName: '?xml',
    attributes: {
      version: '1.0',
      encoding: 'UTF-8'
    },
    childNodes: [],
    innerXML: '>',
    closing: false,
    closingChar: '?'
  },
  {
    type: 'element',
    tagName: 'root',
    attributes: {},
    childNodes: [
      {
        type: 'text',
        text: 'Root Element'
      }
    ],
    innerXML: 'Root Element',
    closing: true,
    closingChar: null
  }
]

The root object is always an array because of the fact that it handles invalid xml with more than one root element.

Object Nodes

There are two kinds of objects. element and text. An object has always the property type. The other keys depend from this type.

'Element' Object Node

{
  type: [String], // "element"
  tagName: [String], // The tag name of the tag
  attributes: [Object], // Object containing attributes as properties
  childNodes: [Array], // Array containing child nodes as object nodes ("element" or "text")
  innerXML: [String], // The inner XML of the tag
  closing: [Boolean], // If the tag is closed typically (</tagName>)
  closingChar: [String] || null // If it is not closed typically, the char that is used to close it ("!" or "?")
}

'Text' Object Node

{
  type: [String], // "text"
  text: [String] // Text contents of the text node
}

Stringifier

The stringifier is the simplest component. Just pass a parsed object structure.

const xml = require("xml-parse");

var xmlDoc = [
  {
    type: 'element',
    tagName: '?xml',
    attributes: {
      version: '1.0',
      encoding: 'UTF-8'
    },
    childNodes: [],
    innerXML: '>',
    closing: false,
    closingChar: '?'
  },
  {
    type: 'element',
    tagName: 'root',
    attributes: {},
    childNodes: [
      {
        type: 'text',
        text: 'Root Element'
      }
    ],
    innerXML: 'Root Element',
    closing: true,
    closingChar: null
  }
]

var xmlStr = xml.stringify(xmlDoc, 2); // 2 spaces

console.log(xmlStr);

DOM

The DOM method of xml-parser instance returns a Document-Object-Model with a few methods. It is oriented on the official W3 DOM but not complex as the original.

const xml = require("xml-parse");

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
                                     '<root>Root Element</root>')); // Can also be a file path.

xmlDoc.document; // Document Object. (Root)

'Element' Object Node

// An element (e.g the 'document' object) has the following prototype methods and properties:

var objectNode = document.childNodes[1]; // Just an example

// This is the return of a object node element

objectNode = {
  type: 'element',
  tagName: 'tagName',
  attributes: [Object],
  childNodes: [Object],
  innerXML: 'innerXML',
  closing: true,
  closingChar: null,
  getElementsByTagName: [Function], // Returns all child nodes with a specific tagName
  getElementsByAttribute: [Function], // Returns all child nodes with a specific attribute value
  removeChild: [Function], // Removes a child node
  appendChild: [Function], // Appends a child node
  insertBefore: [Function], // Inserts a child node
  getElementsByCheckFunction: [Function], // Returns all child nodes that are validated by validation function
  parentNode: [Circular] // Parent Node
}

Handling with child nodes

With appendChild or insertBefore methods of every object node, you are allowed to append a child node. You do not have to do something like createElement.

Because a child node is just an object literal, with some properties like type, tagName, attributesand more you just have to pass such an object to the function.

appendChild

element.appendChild(childNode); // ChildNode is just a object node

Example

const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
                                     '<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

root.appendChild({
  type: "element",
  tagName: "appendedElement",
  childNodes: [
    {
      type: "text",
      text: "Hello World :) I'm appended!"
    }
  ]
});

insertBefore

element.insertBefore(childNode, elementAfter); // ChildNode is just an object literal, 'elementAfter' is just a child node of the parent element

Example

const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
                                     '<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

root.insertBefore({
  type: "element",
  tagName: "insertedElement",
  childNodes: [
    {
      type: "text",
      text: "Hello World :) I'm appended!"
    }
  ]
}, root.childNodes[0]);

removeChild

element.removeChild(childNode); // 'childNode' is just a children of the parent element ('element')

Example

const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
                                     '<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

root.removeChild(root.childNodes[0]);

parentNode

The parentNode of a object node represents its parent element. It's a [Circular] reference.

const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
                                     '<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

console.log(root.childNodes[0].parentNode); // Returns the 'root' element

Get child nodes

getElementsByTagName

element.getElementsByTagName("myTagName"); // Returns all elements whose tag name is 'myTagName'

getElementsByAttribute

element.getElementsByAttribute("myAttribute", "myAttributeValue"); // Returns all elements whose attribute 'myAttribute' is 'myAttributeValue'

getElementsByCheckFunction

// With this method you can set custom 'get' methods.
element.getElementsByCheckFunction(function(element) {
  if (element.type === "element" && element.childNodes.length == 30) {
    return true;
  }
}); // Returns all elements that have exactly 30 childNodes

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

XML Parser, Stringifier and DOM

Install

Require

Parser

Parsed Object Structure

Object Nodes

'Element' Object Node

'Text' Object Node

Stringifier

DOM

'Element' Object Node

Handling with child nodes

appendChild

Example

insertBefore

Example

removeChild

Example

parentNode

Get child nodes

getElementsByTagName

getElementsByAttribute

getElementsByCheckFunction