jhtml
v0.5.0
JHTML
The JHTML format is a strict subset of HTML used to encode arbitrary JSON (or full JavaScript objects) within HTML. This library seeks to provide conversions while simultaneously validating the indicated JHTML structure(s).
Possible use cases include:
- Hierarchical data storage in a faithful, readily portable and readily viewable format.
- Building data files within (schema-constrained) WYSIWYG editors.
- Transforming JSON to XHTML for applying XSL, running XPath, CSS Selector, or DOM queries, etc.
JHTML ought to round-trip with canonical JSON, with one exception: when converting object-containing JSON to JHTML, the ECMAScript/JSON interpreter might not iterate the properties in definition order (as ECMAScript interpreters are not obliged to do), so the order of keys can change.
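As a small illustration of the ordering caveat, the sketch below parses a JSON object and lists its keys. It assumes an engine that follows the current ECMAScript ordering rules (such as V8 in Node), where keys that look like array indices are visited first; the sample data is made up for the example.

```javascript
// Index-like keys ("1", "2") are visited first in ascending numeric order;
// the remaining string keys follow in insertion order.
var parsed = JSON.parse('{"b": 1, "2": 2, "a": 3, "1": 4}');
var visited = Object.keys(parsed);
console.log(visited); // [ '1', '2', 'b', 'a' ]

// Re-serializing therefore does not reproduce the definition order.
var roundTripped = JSON.stringify(parsed);
console.log(roundTripped); // {"1":4,"2":2,"b":1,"a":3}
```

Any converter that walks the parsed object, rather than the original JSON text, would emit its `<dt>`/`<dd>` pairs in this visited order.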
Note that when script tags of custom type are available (e.g., <script type="application/json">) it is probably easier to use them with JSON directly.
For representing XML as HTML, see hxml.
See a demo here.
Rules
Currently, comments, processing instructions, and whitespace-only text nodes are allowed throughout, but any elements must be constrained to the expected types. For canonicalization, attributes beyond those explicitly allowed should not be present. Microdata might not care about hierarchy, but this specification adds such constraints.
- A top-level JSON string primitive will be encoded by the presence of <span> whose contents will be stringified into JSON upon serialization.
- Other JSON primitives (null, boolean, or number) will be encoded within <i>, whether at the top-level or elsewhere, with the exact type determined by the contained value (i.e., "null", "true", "false", and any of the allowable formats for a JSON number are the possible values).
- JSON arrays (in whatever context) will be encoded as <ol start="0"> whose individual child items (if any) will be represented by <li>. Pure text content will indicate a string, whereas a single <i> child will indicate null or a boolean or number type (as per the previous rule). A single <dl> or <ol start="0"> child will indicate a child object or array respectively (see the next rule for object rules).
- JSON objects (in whatever context) will be encoded as <dl> whose individual child items (if any) will be represented by alternating <dt>/<dd> pairs (only single instances are allowed for each within a pair). <dt> will represent the keys of the object, whereas <dd> will represent the values. Pure text content within <dd> will indicate a string, whereas a single <i> child of <dd> will indicate null or a boolean or number type (as per the second rule). A single <dl> or <ol> child will indicate a child object or array respectively (see the previous rule for array rules).
- The top-level element SHOULD include an XHTML namespace declaration (xmlns="http://www.w3.org/1999/xhtml") for polyglot compatibility and MUST contain the attributes, itemscope="" itemtype="http://brett-zamir.me/ns/microdata/json-as-html/2".
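To make the rules above concrete, here is a minimal sketch of a JSON-to-JHTML serializer. It is not the jhtml package's implementation: `sketchToJHTML`, `encodeElement`, `encodeChild`, and `escapeText` are hypothetical names introduced for illustration, and the sketch assumes its input is plain JSON data (no `undefined`, functions, or non-finite numbers).

```javascript
// Hypothetical sketch of the encoding rules; not the jhtml package's own code.
var ROOT_ATTRS = ' xmlns="http://www.w3.org/1999/xhtml" itemscope=""' +
  ' itemtype="http://brett-zamir.me/ns/microdata/json-as-html/2"';

// Escapes the characters that would otherwise be read as markup.
function escapeText(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Encodes a value as an element; `attrs` is non-empty only for the root.
function encodeElement(value, attrs) {
  if (typeof value === 'string') {
    return '<span' + attrs + '>' + escapeText(value) + '</span>';
  }
  if (value === null || typeof value === 'boolean' || typeof value === 'number') {
    return '<i' + attrs + '>' + JSON.stringify(value) + '</i>';
  }
  if (Array.isArray(value)) {
    return '<ol start="0"' + attrs + '>' + value.map(function (item) {
      return '<li>' + encodeChild(item) + '</li>';
    }).join('') + '</ol>';
  }
  return '<dl' + attrs + '>' + Object.keys(value).map(function (key) {
    return '<dt>' + escapeText(key) + '</dt><dd>' + encodeChild(value[key]) + '</dd>';
  }).join('') + '</dl>';
}

// Inside <li> and <dd>, strings are bare text; other values are elements.
function encodeChild(value) {
  return typeof value === 'string' ? escapeText(value) : encodeElement(value, '');
}

// Only the top-level element carries the namespace and Microdata attributes.
function sketchToJHTML(json) {
  return encodeElement(json, ROOT_ATTRS);
}

console.log(encodeChild({ list: [1, 'one', null] }));
// <dl><dt>list</dt><dd><ol start="0"><li><i>1</i></li><li>one</li><li><i>null</i></li></ol></dd></dl>
```

Note how a string only receives a `<span>` wrapper at the root, where there is no `<li>` or `<dd>` to hold its text.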
Design choices
- Requiring null, boolean, and numbers (if not object keys) to be within <i> visually distinguishes them from strings of the same value. Although this adds some verbosity, and it would technically be possible with CSS to overcome this need, without it, bare HTML would not allow distinguishing between primitive types.
- I did not require (or even allow) itemprop usage in this version, as it is unnecessarily cumbersome, and would also not be visible within WYSIWYG editors (and thus more prone to error).
- It should potentially be able to accommodate other JavaScript objects (e.g., undefined, function (via toString()), non-finite numbers, date objects, and regular expression objects ought to appear within <i> without ambiguity).
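The distinction that <i> provides can be sketched as follows: the text inside <i> is read as a JSON literal, while the same characters as bare text would be a string. `parseItalicPrimitive` is a hypothetical helper written for this illustration, not a function exported by the package, and it covers only the JSON primitives of the current rules.

```javascript
// Hypothetical helper: interprets the text content of an <i> element.
// Only null, booleans, and JSON numbers are accepted.
function parseItalicPrimitive(text) {
  var value;
  try {
    value = JSON.parse(text);
  } catch (err) {
    throw new SyntaxError('Not a JSON literal: ' + text);
  }
  if (value !== null && typeof value !== 'boolean' && typeof value !== 'number') {
    throw new TypeError('<i> may only hold null, a boolean, or a number');
  }
  return value;
}

console.log(parseItalicPrimitive('true')); // true (a boolean, not the string "true")
console.log(parseItalicPrimitive('1e3'));  // 1000
```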
Node usage
npm install jhtml
var JHTML = require('jhtml');
Browser usage
<script src="jhtml.js"></script>
// The following code will look for all elements within the document
// belonging to the JHTML itemtype namespace (currently:
// http://brett-zamir.me/ns/microdata/json-as-html/1 ).
// Alternatively, one may supply the items as the first (and only)
// argument (there is no validation for namespace currently
// in such a case).
// These return a JSON array if multiple elements are found or a single object otherwise
JHTML.toJSONObject(); // returns a JSON object
JHTML.toJSONString(); // returns a JSON string
Note that if you wish to store the JHTML without displaying it, you can enclose it within a <script type="jhtml"> element and obtain the content via script (though you could also obtain regular JSON in a similar manner or simply use JSON within your JavaScript). Do not merely add the style display:none as this will still cause your JHTML content to display for users who have disabled CSS.
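Retrieving markup stored this way might look like the sketch below. `readHiddenJHTML` is a hypothetical helper, not part of the package; it relies only on the standard `querySelector` and `textContent` DOM members and returns the raw markup as a string, leaving parsing and conversion to a later step.

```javascript
// Hypothetical helper: returns the markup stored inside the first
// <script type="jhtml"> element, or null if there is none.
// `doc` is any object exposing querySelector, e.g. the browser's `document`.
function readHiddenJHTML(doc) {
  var holder = doc.querySelector('script[type="jhtml"]');
  return holder ? holder.textContent : null;
}
```

In a browser this would be called as `readHiddenJHTML(document)`.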
If you intend to support older browsers, you will need polyfills for:
- Array.prototype.map
- Array.prototype.reduce
- Element.prototype.textContent
- Element.prototype.itemProp
- HTMLDocument.prototype.getItems
- Element.prototype.firstElementChild
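One way to find out which of these features a given environment lacks is a small feature-detection routine. The sketch below is an assumption about how one might check, not something the package ships; `missingFeatures` and `REQUIRED` are hypothetical names, and firstElementChild is looked up on Element.prototype, where browsers define it.

```javascript
// Feature paths taken from the list above.
var REQUIRED = [
  'Array.prototype.map',
  'Array.prototype.reduce',
  'Element.prototype.textContent',
  'Element.prototype.itemProp',
  'HTMLDocument.prototype.getItems',
  'Element.prototype.firstElementChild'
];

// Returns the entries of REQUIRED that cannot be found on `root`
// (pass `window` in a browser).
function missingFeatures(root) {
  return REQUIRED.filter(function (path) {
    var parts = path.split('.');
    var name = parts.pop();
    var target = root;
    for (var i = 0; i < parts.length; i++) {
      if (target == null) { return true; }
      target = target[parts[i]];
    }
    // `in` avoids invoking DOM getters directly on a prototype object.
    return target == null || !(name in target);
  });
}
```

The `in` operator is used instead of reading the property because accessing a DOM accessor such as textContent directly on a prototype can throw in some browsers.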
Possible future todos
- Reimplement JHTML.toJHTMLDOM() using JTLT (when ready)
- Reimplement JHTML.toJHTMLString() using JTLT (when ready)
- Define as ECMAScript 6 Module with polyfill plug-in
- Allow equivalents to JSON.parse's reviver or JSON.stringify's replacer and space arguments?
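For reference, this is what the standard arguments mentioned in the last item do in plain JSON. The sketch uses only the built-in `JSON.stringify` and `JSON.parse`; the sample object is made up for illustration, and nothing here implies that JHTML currently accepts such arguments.

```javascript
var data = { title: 'example', when: '2000-01-01T00:00:00.000Z', draft: true };

// replacer (array form) keeps only the listed keys; space = 2 pretty-prints.
var pretty = JSON.stringify(data, ['title', 'when'], 2);

// reviver converts the ISO string stored under "when" back into a Date.
var revived = JSON.parse(pretty, function (key, value) {
  return key === 'when' ? new Date(value) : value;
});

console.log(pretty);
console.log(revived.when instanceof Date); // true
```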
Possible future spec additions
The following might perhaps be allowed in conjunction with JSON Schema, although I would also like to allow optional encoding of non-JSON JavaScript objects as well.
- This could be expanded to support types like: URL, Date, etc.
- Support a special HTML-aware string type to allow arbitrary nested HTML where JSON strings are expected (which might be encapsulated say by a <a>). This could still convert to JSON, but as a string.
- Could use itemid/itemref to encode linked references
Possible future spec modifications
The following may loosen requirements, but may not be desirable as they would allow expansion of the size of JHTML files.
- Loosen requirements to allow dropping the start attribute in <ol start="0">? For portable proper structural readability, however, this seems like it should stay, even though CSS can mimic the correct 0-indexed display.
- Loosen requirements to allow <span> on string primitives (for parity with a string at the root) within object keys or within object or array values. Currently, the shortest possible expression is required behavior.
- Allow <table> to be used in place of nested <ol> arrays especially when there are only two dimensions and the arrays are known to be of equal length at each level (any <thead> for visual purposes only but not converted to JSON?).
The following are possible tightening or other breaking changes:
- Disallow comment and processing instruction nodes? Despite the precedent with JSON disallowing comments, I am partial to allowing comment nodes in JHTML, despite the burden on implementers, as it is extremely convenient to be able to include such information within data files. Of course, they will not be round-trippable with JSON (unless encoded as a legitimate part of the JSON object) since JSON disallows comments.
- Require primitives to be within <data> elements (but the HTML spec currently requires a value attribute which would be redundant with the human-readable value).
- Change the Microdata attributes on the root to "data-*" attributes since the information is not necessarily semantic (and if it is, it is semantic to the specific JSON format). Although the "data-*" attributes are supposed to only have meaning within the application (e.g., not to be interpreted in a special way by search engines perhaps), their use would not imply that tools could not parse them in a similar manner.
- Move the itemtype properties to a container element such as <a> to avoid the need for an inconsistency with string requiring <span> at the top level.
The following are other possible changes:
- Change the itemtype namespace if standardized
- Allow multiple <dd>'s if taken to mean array children? (Probably more confusing even if more succinct than requiring a child <ol>).
- Anything else that comes up out of consultation with others (although I intend to change the namespace upon any breaking changes).
Development
npm install
npm test
or, with nodeunit installed globally:
npm install
nodeunit test
For browser testing, open test/test.html.
Inspiration
JHTML was inspired by Netscape bookmark files as used when exporting bookmarks in Firefox. They brought to my attention that <dl> could be used to represent nestable key-value data hierarchies as also found in JSON objects.