jaxine
v3.0.0
Another XML to JSON converter with inheritable attributes and custom element consolidation
jaxine
Another XML to JSON converter but with additional attribute inheritance and element consolidation. Jaxine uses xmldom and xpath as its primary dependencies in performing XML to JSON conversion. It also takes a slightly different approach to parsing XML in that it is selective in nature, reflecting how clients would use an xpath expression to selectively access certain portions of a document rather than processing the document as a whole.
Install
npm install jaxine
Transformation
:cyclone: Attributes on an element will appear as a member variable of the same name on the current object.
:snowflake: The element's name will be populated as a member variable named "_" on the current object.
:high_brightness: Descendants will be constructed into an array keyed by the literal "_children" on the current object.
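The three rules above can be sketched in plain JavaScript. This is only an illustration of the resulting shape, not jaxine's implementation: `toJsonShape` and its `{ name, attributes, children }` input are hypothetical stand-ins for a parsed XML element.

```javascript
// Hypothetical helper (not part of jaxine): applies the three transformation
// rules to a plain description of an element.
function toJsonShape(element) {
  // Rule 1: each attribute becomes a member of the same name.
  const result = { ...element.attributes };
  // Rule 2: the element's name is stored under "_".
  result['_'] = element.name;
  // Rule 3: descendants are collected into an array under "_children".
  const children = element.children || [];
  if (children.length > 0) {
    result['_children'] = children.map(toJsonShape);
  }
  return result;
}

const application = toJsonShape({
  name: 'Application',
  attributes: { name: 'pez' },
  children: [{ name: 'Cli', attributes: {}, children: [] }]
});

console.log(JSON.stringify(application));
// {"name":"pez","_":"Application","_children":[{"_":"Cli"}]}
```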
Examples
Simple XML element
Given a string containing XML content:
const data = `<?xml version="1.0"?>
<Application name="pez">
<Cli>
<Commands>
<Command name="leaf" describe="this is a leaf command" type="native"/>
</Commands>
</Cli>
</Application>`;
This (building element '/Application') will be translated into the following JSON structure:
{
'name': 'pez', // @attribute
'_': 'Application', // *element-name
'_children': [{ // <-- "_children" array
'_': 'Cli', // *
'_children': [{ // <--
'_': 'Commands', // *
'_children': [{ // <--
'name': 'leaf', // @
'describe': 'this is a leaf command', // @
'type': 'native', // @
'_': 'Command' // *
}]
}]
}]
};
Inherited and Consolidated Elements via id attribute (in this case id is set to "name", see API)
Given a string containing XML content:
const data = `<?xml version="1.0"?>
<Application name="pez">
<Cli>
<Commands>
<Command name="base-command" abstract="true" source="filesystem-source" enc="mp3">
<Arguments>
<ArgumentRef name="loglevel"/>
<ArgumentRef name="logfile"/>
</Arguments>
</Command>
<Command name="domain-command" abstract="true">
<Arguments>
<ArgumentRef name="aname"/>
<ArgumentRef name="afname"/>
<ArgumentRef name="header"/>
</Arguments>
<ArgumentGroups>
<Conflicts>
<ArgumentRef name="name"/>
<ArgumentRef name="fname"/>
</Conflicts>
<Implies>
<ArgumentRef name="aname"/>
<ArgumentRef name="afname"/>
</Implies>
<Conflicts>
<ArgumentRef name = "header"/>
<ArgumentRef name = "gname"/>
<ArgumentRef name = "cgname"/>
</Conflicts>
</ArgumentGroups>
</Command>
<Command name="uni-command" abstract="true">
<Arguments>
<ArgumentRef name="path"/>
<ArgumentRef name="filesys"/>
<ArgumentRef name="tree"/>
</Arguments>
</Command>
<Command name="rename"
enc="flac"
describe="Rename albums according to arguments specified (write)."
inherits="base-command,domain-command,uni-command"> <!-- multiple inheritance -->
<Arguments>
<ArgumentRef name="with"/>
<ArgumentRef name="put"/>
</Arguments>
</Command>
</Commands>
</Cli>
</Application>`;
... (building element '/Application/Cli/Commands/Command[@name="rename"]') translates to the following JSON:
{
'name': 'rename',
'source': 'filesystem-source', // inherited from Command name="base-command"
'enc': 'flac', // Overrides the enc value in inherited Command name="base-command"
'_': 'Command',
'_children': [{ // These children are those of the current element Command name="rename"
'_': 'Arguments',
'_children': [{
'name': 'with',
'_': 'ArgumentRef'
}, {
'name': 'put',
'_': 'ArgumentRef'
}]
}, { // <-- Children of element Command name="base-command"
'_': 'Arguments',
'_children': [{
'name': 'loglevel',
'_': 'ArgumentRef'
}, {
'name': 'logfile',
'_': 'ArgumentRef'
}]
}, { // <-- first child of element Command name="domain-command"
'_': 'Arguments',
'_children': [{
'name': 'aname',
'_': 'ArgumentRef'
}, {
'name': 'afname',
'_': 'ArgumentRef'
}, {
'name': 'header',
'_': 'ArgumentRef'
}]
}, { // <-- second child of element Command name="domain-command"
'_': 'ArgumentGroups',
'_children': [{
'_': 'Conflicts',
'_children': [{
'name': 'name',
'_': 'ArgumentRef'
}, {
'name': 'fname',
'_': 'ArgumentRef'
}]
}, {
'_': 'Implies',
'_children': [{
'name': 'aname',
'_': 'ArgumentRef'
}, {
'name': 'afname',
'_': 'ArgumentRef'
}]
}, {
'_': 'Conflicts',
'_children': [{
'name': 'header',
'_': 'ArgumentRef'
}, {
'name': 'gname',
'_': 'ArgumentRef'
}, {
'name': 'cgname',
'_': 'ArgumentRef'
}]
}]
}, { // <-- Children of element Command name="uni-command"
'_': 'Arguments',
'_children': [{
'name': 'path',
'_': 'ArgumentRef'
}, {
'name': 'filesys',
'_': 'ArgumentRef'
}, {
'name': 'tree',
'_': 'ArgumentRef'
}]
}],
'describe': 'Rename albums according to arguments specified (write).'
};
:exclamation: Points of note in this example are:
- Element consolidation and attribute inheritance are driven by the id attribute, here "name". Any attribute defined on an inherited element is also defined on the inheriting element. In this example, the Command element with name="rename" declares inherits="base-command,domain-command,uni-command", so it inherits all attributes defined on the Command elements named "base-command", "domain-command" and "uni-command".
- An attribute defined on an element overrides an attribute of the same name from an inherited element, so "enc" is set to "flac", not the "mp3" that would have been inherited from Command name="base-command".
- When multiple inheritance is encountered, the rightmost reference takes precedence, so for inherits="base-command,domain-command,uni-command", attributes in uni-command take precedence over those in domain-command.
- The children of inherited elements become separate entries in the _children array of the current element.
- Notice how the abstract attribute has been dropped from the generated JSON representation. This is because the user can also define a set of discarded attributes (see API).
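The attribute precedence described in these points can be sketched with plain objects. This is a minimal illustration under stated assumptions, not jaxine's code: `mergeAttributes` is a hypothetical helper, and the attribute objects below are hand-copied from the XML above (with the describe text shortened).

```javascript
// Hypothetical helper (not part of jaxine): merges inherited attributes,
// then the element's own attributes, then drops the discarded ones.
function mergeAttributes(own, inherited, discards) {
  // Later sources win in Object.assign, so the rightmost inherited element
  // takes precedence and the element's own attributes override them all.
  const merged = Object.assign({}, ...inherited, own);
  for (const name of discards) {
    delete merged[name];
  }
  return merged;
}

const baseCommand = { name: 'base-command', abstract: 'true', source: 'filesystem-source', enc: 'mp3' };
const domainCommand = { name: 'domain-command', abstract: 'true' };
const uniCommand = { name: 'uni-command', abstract: 'true' };
const rename = {
  name: 'rename',
  enc: 'flac',
  describe: 'Rename albums',
  inherits: 'base-command,domain-command,uni-command'
};

const renameAttributes = mergeAttributes(
  rename,
  [baseCommand, domainCommand, uniCommand], // same order as the inherits list
  ['inherits', 'abstract']
);

console.log(renameAttributes);
```

The result keeps "source" from base-command, takes "enc" from the rename element itself, and contains neither "inherits" nor "abstract".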
Text nodes
Given a string of XML content containing text of different types (raw and CDATA):
const data = `<?xml version="1.0"?>
<Application name="pez">
<Expressions name="content-expressions">
<Expression name="meta-prefix-expression">
<Pattern eg="TEXT"> SOME-RAW-TEXT <![CDATA[ .SOME-CDATA-TEXT ]]> <![CDATA[ .SOME-MORE-CDATA-TEXT ]]></Pattern>
</Expression>
</Expressions>
</Application>`;
... (building element '/Application/Expressions/Expression[@name="meta-prefix-expression"]') translates to the following JSON:
{
'name': 'meta-prefix-expression',
'_': 'Expression',
'_children': [{
'eg': 'TEXT',
'_': 'Pattern',
'_text': 'SOME-RAW-TEXT.SOME-CDATA-TEXT.SOME-MORE-CDATA-TEXT' // consolidated text from multiple text nodes
}]
}
:exclamation: Points of note in this example are:
- Raw text and CDATA text nodes defined for the same element are combined into a single text field on the current object.
- Using xmldom, there is no distinction between Comment nodes and raw/CDATA text nodes, so unfortunately, comments are read in and appear as text. (As long as there are no comments in places where real text is expected, this shouldn't be an issue.)
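The text consolidation described above can be sketched as follows. `consolidateText` is a hypothetical helper, and trimming each part before joining is an assumption chosen because it reproduces the documented result; jaxine may arrive at the same output differently.

```javascript
// Hypothetical helper (not part of jaxine): combines the raw and CDATA text
// parts of one element into a single string.
function consolidateText(parts, trim = true) {
  // Assumption: each part is trimmed individually before concatenation.
  return parts.map((part) => (trim ? part.trim() : part)).join('');
}

// The text parts of the Pattern element above, in document order:
// raw text, CDATA, whitespace-only raw text, CDATA.
const patternText = consolidateText([
  ' SOME-RAW-TEXT ',
  ' .SOME-CDATA-TEXT ',
  ' ',
  ' .SOME-MORE-CDATA-TEXT '
]);

console.log(patternText);
// SOME-RAW-TEXT.SOME-CDATA-TEXT.SOME-MORE-CDATA-TEXT
```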
The API
Consists of functions: buildElement, buildElementWithSpec, validateSpec, validateOptions and a list of predefined spec objects: specs
buildElement
buildElement(elementNode, parentNode, getOptions)
- elementNode: the element as selected via xpath, which needs to be translated
- parentNode: the parent node of elementNode
- getOptions: a callback function which accepts a single string argument, the name of the element currently being built. It is invoked for all descendant elements and allows options to be defined on a per element type basis. The options returned by getOptions can define the following members:
- id: The name of the attribute that serves as an identifier to distinguish elements of the same type. This is also the attribute used for index/group by (see descendants below)
- recurse: The name of the attribute through which inheritance is invoked.
- discards: An array containing a list of strings defining the attributes which should be discarded and not be present on the resultant JSON representation.
- descendants: (Sub-object, which can contain the following members):
- by: Determines how descendants are structured ("index" | "group"). By default, descendants are stored as an array. Alternatively, they can be restructured into a map object in which each descendant is keyed by the value of the attribute. When by = "index", the attribute value must be unique; when by = "group", it does not have to be unique, and descendants with the same value are grouped into an array. If by is not present, the index/group by function is not executed and descendants are built as an array rather than an indexable map.
- throwIfCollision: If multiple child elements have the same key value (descendants.attribute), the groupBy/indexBy function is not invoked and they are returned as an array (and are therefore not indexable). If throwIfCollision is true, an exception is thrown instead (does not apply to groupBy/by="group").
- throwIfMissing: Similar to throwIfCollision, but an exception is thrown if a child element does not contain the attribute defined in descendants.attribute.
The following shows an example using the buildElement function:
const DOMParser = require('xmldom').DOMParser;
const parser = new DOMParser();
const xpath = require('xpath');
const R = require('ramda');
const jaxine = require('jaxine');

const optionsMap = {
  'DEFAULT': { },
  'Command': {
    id: 'name',
    recurse: 'inherits',
    discards: ['inherits', 'abstract'],
    descendants: {
      by: 'index',
      throwIfMissing: true,
      throwIfCollision: true
    }
  },
  'Tree': {
    id: 'alias',
    descendants: {
      by: 'index',
      throwIfMissing: true,
      throwIfCollision: true
    }
  }
};

// Return the options for the named element type, falling back to DEFAULT.
const getTestOptions = (el) => {
  return R.includes(el, R.keys(optionsMap)) ? optionsMap[el] : optionsMap['DEFAULT'];
};

// "it" is provided by the test runner (for example Mocha).
it('An Example', () => {
  const data = `<?xml version="1.0"?>
    <Application name="pez">
      <Cli>
        <Commands>
          <Command name="leaf" describe="this is a leaf command" type="native"/>
        </Commands>
      </Cli>
    </Application>`;

  const document = parser.parseFromString(data, 'text/xml');
  // xpath.select returns an array of matching nodes, so take the first one.
  const applicationNode = xpath.select(`.//Application[@name="pez"]`, document)[0];

  if (applicationNode) {
    const spec = jaxine.specs.default;
    let application = jaxine.buildElement(applicationNode, document, getTestOptions, spec);
    console.log(`>>> APPLICATION: ${JSON.stringify(application)}`);
  }
});
:exclamation: Points of note in this example are:
- The parent node in this case was the document root (as obtained from the XML parser). We could easily have selected a different parent node, by using the xpath API to select a different node and using that as the parent.
- The getOptions function here uses the optionsMap as illustrated. The optionsMap has an entry for the Command element, so that options object is used for any Command element encountered. getTestOptions uses the DEFAULT entry for any processed element that is neither Command nor Tree. If you invoke buildElement on the Command node (in this case), the callback (getOptions) is invoked just once, with the element name set to 'Command'. The callback allows you to define a different options object for each element type encountered whilst processing the descendants of the element you originally invoked buildElement on.
- The additional build options that come as part of the spec apply to the root element (i.e. the element that buildElement is called upon) and to the entire descendant tree derived from it. This is in contrast to getOptions (explained above), which is invoked for all descendants of the root element.
- Depending on the XML being processed, it is highly likely that different options objects will be needed to parse different parts of the document.
buildElementWithSpec
buildElementWithSpec(elementNode, parentNode, spec, getOptions)
A curried function which allows a custom buildElementXXX function, bound to a custom spec, to be defined. Use it when the default spec is not suitable; the caller can pass in one of the predefined specs or define their own.
- elementNode: (see buildElement)
- parentNode: (see buildElement)
- spec: Describes structure of the built JSON object and additional build options
- getOptions: (see buildElement)
validateOptions
validateOptions(options)
- options: The options object to validate
An example of the options object is as follows:
{
id: 'name',
recurse: 'inherits',
discards: ['abstract', 'describe'],
descendants: {
by: 'group',
throwIfCollision: true,
throwIfMissing: true
}
}
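As a rough illustration of the kind of checks an options object like the one above might be subjected to, here is a minimal sketch. `checkOptions` is hypothetical and is not jaxine's validateOptions: the rules are inferred from the member descriptions in this README, and the sketch returns a list of problems rather than assuming how the real function reports failures.

```javascript
// Hypothetical checker (not jaxine's validateOptions): returns a list of
// problems found in an options object, or an empty array if none are found.
function checkOptions(options) {
  const problems = [];
  for (const key of ['id', 'recurse']) {
    if (key in options && typeof options[key] !== 'string') {
      problems.push(`${key} must be a string`);
    }
  }
  if ('discards' in options) {
    const ok = Array.isArray(options.discards) &&
      options.discards.every((d) => typeof d === 'string');
    if (!ok) problems.push('discards must be an array of strings');
  }
  const descendants = options.descendants || {};
  if ('by' in descendants && !['index', 'group'].includes(descendants.by)) {
    problems.push('descendants.by must be "index" or "group"');
  }
  for (const key of ['throwIfCollision', 'throwIfMissing']) {
    if (key in descendants && typeof descendants[key] !== 'boolean') {
      problems.push(`descendants.${key} must be a boolean`);
    }
  }
  return problems;
}

const validProblems = checkOptions({
  id: 'name',
  recurse: 'inherits',
  discards: ['abstract', 'describe'],
  descendants: { by: 'group', throwIfCollision: true, throwIfMissing: true }
});
const invalidProblems = checkOptions({ id: 'name', descendants: { by: 'map' } });

console.log(validProblems, invalidProblems);
```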
indexBy (descendants .by="index")
Given the following XML fragment:
<Arguments>
<ArgumentRef name="name"/>
<ArgumentRef name="header"/>
<ArgumentRef name="producer"/>
<ArgumentRef name="director"/>
</Arguments>
using spec.descendants .by = 'index' yields the following JSON:
{
'_': 'Arguments',
'_children': {
'name': {
'name': 'name',
'_': 'ArgumentRef'
},
'header': {
'name': 'header',
'_': 'ArgumentRef'
},
'producer': {
'name': 'producer',
'_': 'ArgumentRef'
},
'director': {
'name': 'director',
'_': 'ArgumentRef'
}
}
}
This shows that the Arguments element contains a _children property which is a map object, keyed by the value of the spec.descendants.attribute, in this case "name". Note how the values of "name"s are indeed unique and map to their corresponding ArgumentRef objects.
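The restructuring from an array into a keyed map can be sketched as below. `indexChildren` is a hypothetical helper, not jaxine's code. Falling back to the original array on a collision follows the throwIfCollision description in this README; doing the same for a missing attribute is an assumption.

```javascript
// Hypothetical helper (not part of jaxine): turns an array of built children
// into a map keyed by the value of the given attribute.
function indexChildren(children, attribute, flags = {}) {
  const { throwIfCollision = false, throwIfMissing = false } = flags;
  const index = {};
  for (const child of children) {
    const key = child[attribute];
    if (key === undefined) {
      if (throwIfMissing) {
        throw new Error(`child is missing attribute "${attribute}"`);
      }
      return children; // assumption: fall back to the plain array
    }
    if (Object.prototype.hasOwnProperty.call(index, key)) {
      if (throwIfCollision) {
        throw new Error(`duplicate value "${key}" for attribute "${attribute}"`);
      }
      return children; // not unique, so the children stay as an array
    }
    index[key] = child;
  }
  return index;
}

const argumentRefs = [
  { name: 'name', _: 'ArgumentRef' },
  { name: 'header', _: 'ArgumentRef' },
  { name: 'producer', _: 'ArgumentRef' },
  { name: 'director', _: 'ArgumentRef' }
];

const indexed = indexChildren(argumentRefs, 'name', { throwIfCollision: true });
console.log(Object.keys(indexed));
// [ 'name', 'header', 'producer', 'director' ]
```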
groupBy (descendants .by="group")
Given the following XML fragment:
<Arguments>
<ArgumentRef name="producer"/>
<ArgumentRef name="director" discriminator="A"/>
<ArgumentRef name="director" discriminator="B"/>
</Arguments>
using spec.descendants .by = 'group' yields the following JSON:
{
'_': 'Arguments',
'_children': {
'producer': [{
'name': 'producer',
'_': 'ArgumentRef'
}],
'director': [{
'name': 'director',
'discriminator': 'A',
'_': 'ArgumentRef'
}, {
'name': 'director',
'discriminator': 'B',
'_': 'ArgumentRef'
}]
}
}
This shows that the Arguments element contains a _children property which is a map object, keyed by the value of the spec.descendants.attribute, in this case "name". This time, however, each key maps to an array containing the ArgumentRef children whose attribute equals that key, i.e. the "name" attribute. In this case, there are 2 children whose "name" attributes are identical ("director"), so they both appear in the array whose key is "director".
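The grouping described here can be sketched in a few lines. `groupChildren` is a hypothetical helper used only to illustrate the resulting structure; it is not taken from jaxine.

```javascript
// Hypothetical helper (not part of jaxine): groups built children into a map
// of arrays, keyed by the value of the given attribute.
function groupChildren(children, attribute) {
  const groups = {};
  for (const child of children) {
    const key = child[attribute];
    if (!Object.prototype.hasOwnProperty.call(groups, key)) {
      groups[key] = [];
    }
    groups[key].push(child);
  }
  return groups;
}

const grouped = groupChildren([
  { name: 'producer', _: 'ArgumentRef' },
  { name: 'director', discriminator: 'A', _: 'ArgumentRef' },
  { name: 'director', discriminator: 'B', _: 'ArgumentRef' }
], 'name');

console.log(grouped.producer.length, grouped.director.length);
// 1 2
```

Unlike indexing, grouping never collides: every key maps to an array, even when only one child carries that value.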
descendants.by not specified
Omit the descendants.by setting when the children of a particular element should simply be populated as an array. For example, the following XML:
<Arguments>
<ArgumentRef name="producer"/>
<ArgumentRef name="director" discriminator="A"/>
<ArgumentRef name="director" discriminator="B"/>
</Arguments>
yields the following JSON:
{
'_': 'Arguments',
'_children': [{
'name': 'producer',
'_': 'ArgumentRef'
}, {
'name': 'director',
'discriminator': 'A',
'_': 'ArgumentRef'
}, {
'name': 'director',
'discriminator': 'B',
'_': 'ArgumentRef'
}]
}
This shows that the Arguments element contains a _children property which is simply an array containing all of the ArgumentRef children of Arguments.
validateSpec
Ensures that spec object is valid, throws if not. The user can invoke validateSpec independently of building an element.
validateSpec(spec)
- spec: The spec to validate
:cyclone: (sub-object) descendants:
:high_brightness: (string) by: Determines how descendants are structured ("index" | "group"). By default, descendants will be stored as an array. Alternatively, they can be restructured into a map object where each descendant is keyed by the attribute. (When by = "index", the value of the attribute must be unique; when by = "group", the attribute value does not have to be unique, and descendants with the same value will be grouped into an array.) See the indexBy and groupBy sections above for more details.
:high_brightness: (string) attribute: The name of the attribute to index/group by.
:high_brightness: (boolean) [default:false] throwIfCollision: If there are multiple child elements that have the same key value (descendants.attribute), then the groupBy/indexBy function will not be invoked and they will be returned as an array (and hence not indexable). If throwIfCollision is true, then an exception will be thrown (does not apply to groupBy/by="group").
:high_brightness: (boolean) [default:false] throwIfMissing: Similar to throwIfCollision, but an exception will be thrown if the child element does not contain the attribute as defined in descendants.attribute.
The Spec
An example of the spec object is as follows:
{
labels: {
element: '_',
descendants: '_children',
text: '_text',
attributes: '_attributes'
},
trim: true
}
:cyclone: (sub-object) labels:
:high_brightness: (string) element: The name of the property that stores the XML element name.
:high_brightness: (string) descendants: The name of the property holding descendant XML elements structure. (typically set to "_children")
:high_brightness: (string) text: The name of the property holding text local to the XML element as raw text or CDATA text. (typically set to "_text")
:high_brightness: (string) [optional] attributes: The name of the property that stores the attributes array.
:cyclone: (boolean) [default:true] trim: Removes leading and trailing spaces from element text.
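To show what the labels and trim settings control, here is a small sketch that builds output from a plain element description using a custom spec. `buildWithLabels`, its `{ name, attributes, text, children }` input and the `$`-prefixed label values are all hypothetical; the optional attributes label is left out because its exact output shape is not described here.

```javascript
// Hypothetical helper (not part of jaxine): builds an object whose property
// names are taken from spec.labels instead of being hard-coded.
function buildWithLabels(node, spec) {
  const { labels, trim = true } = spec;
  const result = { ...node.attributes };
  result[labels.element] = node.name;
  if (typeof node.text === 'string') {
    // trim removes leading and trailing spaces from element text.
    result[labels.text] = trim ? node.text.trim() : node.text;
  }
  if (node.children && node.children.length > 0) {
    result[labels.descendants] = node.children.map((c) => buildWithLabels(c, spec));
  }
  return result;
}

const customSpec = {
  labels: { element: '$name', descendants: '$kids', text: '$text' },
  trim: true
};

const expression = buildWithLabels({
  name: 'Expression',
  attributes: { name: 'meta-prefix-expression' },
  children: [{ name: 'Pattern', attributes: { eg: 'TEXT' }, text: '  SOME-TEXT  ' }]
}, customSpec);

console.log(JSON.stringify(expression));
```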
Pre-defined specs
const jaxine = require('jaxine');
const attributesAsArraySpec = jaxine.specs.attributesAsArraySpec;
const optionsMap = {
'DEFAULT': { },
'Command': {
id: 'name',
recurse: 'inherits',
discards: ['inherits', 'abstract'],
descendants: {
by: 'index',
throwIfMissing: true,
throwIfCollision: true
}
}
};
const getOptions = (el) => {
return (optionsMap[el] || optionsMap['DEFAULT']);
};
jaxine.buildElementWithSpec( ..., attributesAsArraySpec, getOptions);
Using the default spec
The buildElement function uses the default spec:
const jaxine = require('jaxine');
jaxine.buildElement( ..., getOptions);