
lean-he

v2.1.2

A robust HTML entities encoder/decoder with full Unicode support.

Downloads

10,902

lean-he

lean-he (for “HTML entities”) is a robust HTML entity encoder/decoder written in JavaScript. It supports all standardized named character references as per HTML, handles ambiguous ampersands and other edge cases just like a browser would, has an extensive test suite, and, contrary to many other JavaScript solutions, handles astral Unicode symbols just fine. You can get a hint of how it works from the online demo made by the creator of the original he library.

lean-he was created with bundling in mind. It helps produce a leaner bundle by letting developers import only the specific function they need. For example, if a use case requires only encoding, using lean-he/encode bundles only the encode file and leaves out the rest of the code.

It is forked from he with minor changes to make it leaner, and all thanks go to its author.

Installation

Via npm:

npm install lean-he

Via Bower:

Coming soon

Via Component:

Coming soon

In Node.js, io.js, Narwhal, and RingoJS:

var leanHe = require('lean-he');

In Rhino:

load('lean-he.js');

API

leanHe.encode(text, options)

This function takes a string of text and encodes (by default) any symbols that aren’t printable ASCII symbols and &, <, >, ", ', and `, replacing them with character references.

You can also use var encode = require('lean-he/encode'); instead to reduce the imported file size if you only need to encode.

leanHe.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'
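
To make the handling of astral symbols such as 𝌆 concrete, here is a minimal sketch of the hexadecimal-escape step. The helper name toHexReferences is hypothetical and this is not lean-he's actual implementation; it only escapes non-ASCII code points and leaves the unsafe ASCII symbols alone.

```javascript
// Hypothetical helper for illustration only; not part of lean-he.
// Replaces every non-ASCII code point with a hexadecimal character reference.
function toHexReferences(text) {
  let output = '';
  // for...of iterates by code point, so a surrogate pair stays one symbol.
  for (const symbol of text) {
    const codePoint = symbol.codePointAt(0);
    output += codePoint > 0x7F
      ? '&#x' + codePoint.toString(16).toUpperCase() + ';'
      : symbol;
  }
  return output;
}

console.log(toHexReferences('foo © bar ≠ baz 𝌆 qux'));
// prints: foo &#xA9; bar &#x2260; baz &#x1D306; qux
```

Iterating with charAt or a plain index loop would split 𝌆 into two surrogate halves and emit two references, which is the usual way astral symbols get mangled.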

As long as the input string contains allowed code points only, the return value of this function is always valid HTML. Any (invalid) code points that cannot be represented using a character reference in the input are not encoded:

leanHe.encode('foo \0 bar');
// → 'foo \0 bar'

However, enabling the strict option causes encode to throw an exception when it encounters invalid code points. With strict enabled, leanHe.encode either throws (if the input contains invalid code points) or returns a string of valid HTML.

The options object is optional. It recognizes the following properties:

useNamedReferences

The default value for the useNamedReferences option is false. This means that encode() will not use any named character references (e.g. &copy;) in the output — hexadecimal escapes (e.g. &#xA9;) will be used instead. Set it to true to enable the use of named references.

Note that if compatibility with older browsers is a concern, this option should remain disabled.

// Using the global default setting (defaults to `false`):
leanHe.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly disallow named references:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'useNamedReferences': false
});
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly allow named references:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'useNamedReferences': true
});
// → 'foo &copy; bar &ne; baz &#x1D306; qux'

decimal

The default value for the decimal option is false. If the option is enabled, encode will generally use decimal escapes (e.g. &#169;) rather than hexadecimal escapes (e.g. &#xA9;). Apart from this replacement, the basic behavior remains the same when combined with other options. For example, if both the useNamedReferences and decimal options are enabled, named references (e.g. &copy;) are used over decimal escapes. Symbols without a named reference are encoded using decimal escapes.

// Using the global default setting (defaults to `false`):
leanHe.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly disable decimal escapes:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'decimal': false
});
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly enable decimal escapes:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'decimal': true
});
// → 'foo &#169; bar &#8800; baz &#119558; qux'

// Passing an `options` object to `encode`, to explicitly allow named references and decimal escapes:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'useNamedReferences': true,
  'decimal': true
});
// → 'foo &copy; bar &ne; baz &#119558; qux'
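
The precedence between useNamedReferences and decimal can be sketched as a simple chain of checks. Everything in this block is an assumption made for illustration: the encodeSketch function, the two-entry NAMED table, and the choice to pass all ASCII through unchanged. It is not how lean-he is implemented internally.

```javascript
// Hypothetical, deliberately tiny lookup table; the real list of named
// character references is far longer.
const NAMED = new Map([['©', 'copy'], ['≠', 'ne']]);

// Illustrative only; not lean-he's implementation.
function encodeSketch(text, options = {}) {
  const { useNamedReferences = false, decimal = false } = options;
  let output = '';
  for (const symbol of text) {
    const codePoint = symbol.codePointAt(0);
    if (codePoint <= 0x7F) {
      output += symbol; // ASCII passes through in this sketch
    } else if (useNamedReferences && NAMED.has(symbol)) {
      output += '&' + NAMED.get(symbol) + ';'; // a named reference wins
    } else if (decimal) {
      output += '&#' + codePoint + ';'; // decimal escape
    } else {
      output += '&#x' + codePoint.toString(16).toUpperCase() + ';'; // hex escape
    }
  }
  return output;
}

console.log(encodeSketch('foo © bar ≠ baz 𝌆 qux', {
  useNamedReferences: true,
  decimal: true
}));
// prints: foo &copy; bar &ne; baz &#119558; qux
```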

encodeEverything

The default value for the encodeEverything option is false. This means that encode() will not use any character references for printable ASCII symbols that don’t need escaping. Set it to true to encode every symbol in the input string. When set to true, this option takes precedence over allowUnsafeSymbols (i.e. setting the latter to true in such a case has no effect).

// Using the global default setting (defaults to `false`):
leanHe.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly encode all symbols:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'encodeEverything': true
});
// → '&#x66;&#x6F;&#x6F;&#x20;&#xA9;&#x20;&#x62;&#x61;&#x72;&#x20;&#x2260;&#x20;&#x62;&#x61;&#x7A;&#x20;&#x1D306;&#x20;&#x71;&#x75;&#x78;'

// This setting can be combined with the `useNamedReferences` option:
leanHe.encode('foo © bar ≠ baz 𝌆 qux', {
  'encodeEverything': true,
  'useNamedReferences': true
});
// → '&#x66;&#x6F;&#x6F;&#x20;&copy;&#x20;&#x62;&#x61;&#x72;&#x20;&ne;&#x20;&#x62;&#x61;&#x7A;&#x20;&#x1D306;&#x20;&#x71;&#x75;&#x78;'

strict

The default value for the strict option is false. This means that encode() will encode any HTML text content you feed it, even if it contains symbols that cause parse errors. To throw an error when such invalid HTML is encountered, set the strict option to true. This option makes it possible to use lean-he as part of HTML parsers and HTML validators.

// Using the global default setting (defaults to `false`, i.e. error-tolerant mode):
leanHe.encode('\x01');
// → '&#x1;'

// Passing an `options` object to `encode`, to explicitly enable error-tolerant mode:
leanHe.encode('\x01', {
  'strict': false
});
// → '&#x1;'

// Passing an `options` object to `encode`, to explicitly enable strict mode:
leanHe.encode('\x01', {
  'strict': true
});
// → Parse error
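
Because strict mode reports problems by throwing, callers usually wrap the call. The following is a sketch of one way to do that; tryStrictEncode and fakeEncode are hypothetical names, and fakeEncode is only a stand-in so the block runs without installing anything. In real code you would pass leanHe.encode as the first argument.

```javascript
// Hypothetical wrapper; `encodeFn` is any function with the
// encode(text, options) shape, for example leanHe.encode.
function tryStrictEncode(encodeFn, text) {
  try {
    return { ok: true, value: encodeFn(text, { strict: true }) };
  } catch (error) {
    return { ok: false, error: error };
  }
}

// Stand-in encoder for this sketch. It rejects U+0001 only, which is far
// narrower than what a real strict mode checks.
function fakeEncode(text, options) {
  if (options.strict && text.includes('\x01')) {
    throw new Error('Parse error');
  }
  return text;
}

console.log(tryStrictEncode(fakeEncode, 'foo').ok);  // prints: true
console.log(tryStrictEncode(fakeEncode, '\x01').ok); // prints: false
```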

allowUnsafeSymbols

The default value for the allowUnsafeSymbols option is false. This means that characters that are unsafe for use in HTML content (&, <, >, ", ', and `) will be encoded. When set to true, only non-ASCII characters will be encoded. If the encodeEverything option is set to true, this option will be ignored.

leanHe.encode('foo © and & ampersand', {
  'allowUnsafeSymbols': true
});
// → 'foo &#xA9; and & ampersand'

Overriding default encode options globally

The global default setting can be overridden by modifying the leanHe.encode.options object. This saves you from passing in an options object for every call to encode if you want to use the non-default setting.

// Read the global default setting:
leanHe.encode.options.useNamedReferences;
// → `false` by default

// Override the global default setting:
leanHe.encode.options.useNamedReferences = true;

// Using the global default setting, which is now `true`:
leanHe.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &copy; bar &ne; baz &#x1D306; qux'
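
Overriding a global default affects every later call in the process, so it can be useful to scope the change. The helper below is a sketch of that idea, not something lean-he provides; withOptions and stubEncode are hypothetical names, and stubEncode merely imitates a function that exposes a mutable options object.

```javascript
// Hypothetical helper; `fn` is any function that exposes a mutable
// `options` object, as described above for encode.
function withOptions(fn, overrides, callback) {
  const previous = Object.assign({}, fn.options);
  Object.assign(fn.options, overrides);
  try {
    return callback();
  } finally {
    // Restore the saved defaults even if the callback throws.
    for (const key of Object.keys(fn.options)) {
      delete fn.options[key];
    }
    Object.assign(fn.options, previous);
  }
}

// Stand-in for this sketch: reports which mode the global default selects.
function stubEncode() {
  return stubEncode.options.useNamedReferences ? 'named' : 'hex';
}
stubEncode.options = { useNamedReferences: false };

console.log(withOptions(stubEncode, { useNamedReferences: true }, stubEncode));
// prints: named
console.log(stubEncode());
// prints: hex
```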

leanHe.decode(html, options)

This function takes a string of HTML and decodes any named and numerical character references in it using the algorithm described in section 12.2.4.69 of the HTML spec.

You can also use var decode = require('lean-he/decode'); instead to reduce the imported file size if you only need to decode.

leanHe.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// → 'foo © bar ≠ baz 𝌆 qux'
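
For intuition about the numeric half of decoding, here is a small sketch that handles only well-formed numeric references ending in a semicolon. The name decodeNumericReferences is hypothetical. It skips named references, references without a semicolon, and the special-case remapping that the HTML spec defines for certain code points, all of which a full decoder has to handle.

```javascript
// Illustrative only; not lean-he's implementation.
function decodeNumericReferences(html) {
  return html.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references untouched instead of throwing.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });
}

console.log(decodeNumericReferences('baz &#x1D306; qux &#169;'));
// prints: baz 𝌆 qux ©
```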

The options object is optional. It recognizes the following properties:

isAttributeValue

The default value for the isAttributeValue option is false. This means that decode() will decode the string as if it were used in a text context in an HTML document. HTML has different rules for parsing character references in attribute values — set this option to true to treat the input string as if it were used as an attribute value.

// Using the global default setting (defaults to `false`, i.e. HTML text context):
leanHe.decode('foo&ampbar');
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly assume an HTML text context:
leanHe.decode('foo&ampbar', {
  'isAttributeValue': false
});
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly assume an HTML attribute value context:
leanHe.decode('foo&ampbar', {
  'isAttributeValue': true
});
// → 'foo&ampbar'
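
The difference between the two contexts comes down to one rule: in an attribute value, a reference that lacks its trailing semicolon and is followed by = or an ASCII alphanumeric is left as literal text. The sketch below applies that rule to &amp only; decodeAmpSketch is a hypothetical name and the single-entity scope is an assumption made to keep the example short.

```javascript
// Illustrative only: handles just `&amp`, with or without a trailing
// semicolon. A real decoder covers every named and numeric reference.
function decodeAmpSketch(html, isAttributeValue = false) {
  return html.replace(/&amp(;?)([=a-zA-Z0-9]?)/g, (match, semicolon, next) => {
    // In an attribute value, a reference without `;` that is followed by
    // `=` or an ASCII alphanumeric stays as literal text.
    if (isAttributeValue && semicolon === '' && next !== '') {
      return match;
    }
    return '&' + next;
  });
}

console.log(decodeAmpSketch('foo&ampbar'));        // prints: foo&bar
console.log(decodeAmpSketch('foo&ampbar', true));  // prints: foo&ampbar
console.log(decodeAmpSketch('foo&amp;bar', true)); // prints: foo&bar
```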

strict

The default value for the strict option is false. This means that decode() will decode any HTML text content you feed it, even if it contains entities that cause parse errors. To throw an error when such invalid HTML is encountered, set the strict option to true. This option makes it possible to use lean-he as part of HTML parsers and HTML validators.

// Using the global default setting (defaults to `false`, i.e. error-tolerant mode):
leanHe.decode('foo&ampbar');
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly enable error-tolerant mode:
leanHe.decode('foo&ampbar', {
  'strict': false
});
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly enable strict mode:
leanHe.decode('foo&ampbar', {
  'strict': true
});
// → Parse error

Overriding default decode options globally

The global default settings for the decode function can be overridden by modifying the leanHe.decode.options object. This saves you from passing in an options object for every call to decode if you want to use a non-default setting.

// Read the global default setting:
leanHe.decode.options.isAttributeValue;
// → `false` by default

// Override the global default setting:
leanHe.decode.options.isAttributeValue = true;

// Using the global default setting, which is now `true`:
leanHe.decode('foo&ampbar');
// → 'foo&ampbar'

leanHe.escape(text)

This function takes a string of text and escapes it for use in text contexts in XML or HTML documents. Only the following characters are escaped: &, <, >, ", ', and `.

You can also use var escape = require('lean-he/escape'); instead to reduce the imported file size if you only need to escape.

leanHe.escape('<img src=\'x\' onerror="prompt(1)">');
// → '&lt;img src=&#x27;x&#x27; onerror=&quot;prompt(1)&quot;&gt;'
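
Since escape touches only six characters, the idea fits in a lookup table and one replace call. This is a sketch under assumptions, not lean-he's source: the names ESCAPE_MAP and escapeSketch are hypothetical, and the replacement chosen for the backtick is a guess since the example above does not show it.

```javascript
// Hypothetical table for illustration. The entries for &, <, >, " and '
// match the output shown above; the backtick entry is an assumption.
const ESCAPE_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '`': '&#x60;'
};

function escapeSketch(text) {
  return text.replace(/[&<>"'`]/g, (symbol) => ESCAPE_MAP[symbol]);
}

console.log(escapeSketch('<img src=\'x\' onerror="prompt(1)">'));
// prints: &lt;img src=&#x27;x&#x27; onerror=&quot;prompt(1)&quot;&gt;
```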

leanHe.unescape(html, options)

leanHe.unescape is an alias for leanHe.decode. It takes a string of HTML and decodes any named and numerical character references in it.

You can also use var unescape = require('lean-he/unescape'); instead to reduce the imported file size if you only need to unescape.

Unit tests & code coverage

After cloning this repository, run npm install to install the dependencies needed for lean-he development and testing.

Once that’s done, you can run the unit tests in Node using npm test.

A code coverage report is generated in the coverage directory. The coverage data is presented in the following formats:

  • html
  • json
  • text

Acknowledgements

Thanks to Mathias Bynens for creating he.

Author

twitter/adnaan1703

License

lean-he is available under the MIT license.