characterset
v2.0.0
A library for working with Unicode character sets
CharacterSet
CharacterSet is a library for creating and manipulating Unicode character sets in JavaScript. Its main purpose is to help in building regular expressions for validation and matching. It fully supports all Unicode characters and correctly handles surrogate pairs in JavaScript strings and regular expressions.
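To illustrate why surrogate pairs need special handling, the following sketch uses only built-in JavaScript string and RegExp features; it does not call CharacterSet. It shows that a code point above U+FFFF occupies two UTF-16 code units, and that a character class without the `u` flag treats those two units as separate members. The variable names are illustrative only.

```javascript
// Illustration with built-in JavaScript only; CharacterSet is not involved.
var emoji = String.fromCodePoint(0x1F600); // U+1F600, outside the Basic Multilingual Plane

// One code point, but two UTF-16 code units in the string.
var units = emoji.length;             // 2
var high = emoji.charCodeAt(0);       // 0xD83D (high surrogate)
var low = emoji.charCodeAt(1);        // 0xDE00 (low surrogate)
var codePoint = emoji.codePointAt(0); // 0x1F600

// Without the `u` flag the class contains two separate code units,
// so it can only match one half of the character at a time.
var naive = new RegExp('^[' + emoji + ']$').test(emoji); // false

// Writing the surrogate pair as a sequence matches the full character.
var explicit = /^\uD83D\uDE00$/.test(emoji); // true
```

This is the kind of mismatch a regular expression built from a character set has to account for when it targets engines without the `u` flag.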
Installation
If you are using Node.js you can install it using npm:
$ npm install characterset
If you want to use CharacterSet in the browser, use the global CharacterSet constructor or include CharacterSet as an AMD module.
API
The constructor takes a single input value, which can be a number (a code point), a string, or a range. A range is an array whose items are either single code points or [start, end] pairs; as the examples show, both ends of a pair are included.
// Creates a character set with the single code point [97]
var cs = new CharacterSet(97);
// Creates a character set for the code points [97, 98, 99]
var cs = new CharacterSet('abc');
// Creates a character set for the code points [97, 98, 99]
var cs = new CharacterSet([97, 98, 99]);
// Creates a character set for the code points [97, 98, 99] using a range
var cs = new CharacterSet([[97, 99]]);
// Combines pairs and numbers in ranges for [48, 97, 98, 99]
var cs = new CharacterSet([48, [97, 99]]);
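As a rough sketch of how a range maps to code points, the helper below expands an array of numbers and [start, end] pairs into a flat, sorted list. `expandRange` is a hypothetical function written for this illustration; it is not part of CharacterSet and makes no claim about how the library stores ranges internally.

```javascript
// Hypothetical helper, not part of CharacterSet: expands a range array
// (numbers and inclusive [start, end] pairs) into sorted, unique code points.
function expandRange(range) {
  var seen = {};
  range.forEach(function (item) {
    var start = Array.isArray(item) ? item[0] : item;
    var end = Array.isArray(item) ? item[1] : item;
    for (var cp = start; cp <= end; cp += 1) {
      seen[cp] = true;
    }
  });
  return Object.keys(seen).map(Number).sort(function (a, b) {
    return a - b;
  });
}

var points = expandRange([48, [97, 99]]);            // [48, 97, 98, 99]
var text = String.fromCodePoint.apply(null, points); // '0abc'
```

Note that 48 is the code point of the character '0', which is why the resulting string starts with a zero digit.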
Or you can use the parseUnicodeRange method to return a CharacterSet instance from a comma-delimited Unicode range string.
// Creates a character set for the code points [34, 35]
var cs = CharacterSet.parseUnicodeRange('u+23,u+22');
// Creates a character set for the code points [34, 35, 36, 37]
var cs = CharacterSet.parseUnicodeRange('u+22-25');
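The range string format used above can be read as hexadecimal code points, optionally joined by a hyphen to form an inclusive span. The sketch below parses that format into a plain list of code points. `parseRangeString` is a hypothetical stand-in written for illustration, not the library's implementation, and it only handles the two forms shown in the examples (single values and hyphenated spans).

```javascript
// Hypothetical parser, not the library's implementation: turns a string such
// as 'u+23,u+22' or 'u+22-25' into a sorted list of unique code points.
function parseRangeString(input) {
  var seen = {};
  input.split(',').forEach(function (part) {
    var match = /^\s*u\+([0-9a-f]+)(?:-([0-9a-f]+))?\s*$/i.exec(part);
    if (!match) {
      throw new Error('Invalid unicode range: ' + part);
    }
    var start = parseInt(match[1], 16);
    var end = match[2] ? parseInt(match[2], 16) : start;
    for (var cp = start; cp <= end; cp += 1) {
      seen[cp] = true;
    }
  });
  return Object.keys(seen).map(Number).sort(function (a, b) {
    return a - b;
  });
}

var single = parseRangeString('u+23,u+22'); // [34, 35]
var span = parseRangeString('u+22-25');     // [34, 35, 36, 37]
```

The values are hexadecimal, so `u+22` is code point 34 and `u+25` is code point 37.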
Once you have an instance of CharacterSet you can use the following methods on it:
License
CharacterSet is licensed under the three-clause BSD license (see BSD.txt).