tibetan-sort-js
v2.1.4
Tibetan string sorting according to customary native order
JS Library to sort Tibetan
After exploring different options for Tibetan collation in JavaScript, there seems to be no viable option to sort Unicode Tibetan strings. This library hopes to fulfill this purpose in an elegant, modern and efficient manner.
State of the art
The most logical option to sort Tibetan would be to use Intl.Collator. The problem is that all browsers seem to use ICU to implement this object, and ICU has a bug in Tibetan collation which won't be fixed in the short term. It will take even more time for the fix to appear in mainstream browsers, so it's not even a medium-term solution. Bugs have been filed for Firefox, ChakraCore, Chrome and Safari.
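For reference, the Intl.Collator approach discussed above looks roughly like the sketch below. The `bo` locale tag and the sample words are chosen for illustration only. Because of the ICU issue, the resulting order should not be relied on, so no expected output is shown.

```javascript
// Sketch: sorting with the built-in collator. 'bo' is the BCP 47 tag for Tibetan.
// If the runtime has no Tibetan collation data, it falls back to a default locale.
const collator = new Intl.Collator('bo');

// Sample words chosen for illustration only.
const words = ['\u0F42', '\u0F40', '\u0F41']; // ག, ཀ, ཁ

// Copy before sorting so the input array is left untouched.
const sortedByCollator = [...words].sort(collator.compare);

// Reports which locale the collator actually resolved to.
const resolvedLocale = collator.resolvedOptions().locale;
console.log(resolvedLocale, sortedByCollator);
```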
Pure JavaScript implementations of Intl.Collator don't seem to exist, as the only Intl polyfill doesn't support it.
The only library we found that would be of possible use is lasca, but it proved very buggy and extremely inefficient.
This implementation
This implementation aims at being very efficient, at the cost of not handling some difficult corner cases in Tibetan. As a consequence:
- it does not normalize strings (\u0F77 is not treated like \u0FB2\u0F71\u0F80)
- it does not handle Sanskrit stacks very precisely (the ICU rule &ཀར<ཀརྐ is too difficult to handle)
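Callers who need the two spellings in the first point to be treated alike could normalize strings themselves before comparing. The sketch below is one possible approach and is not part of this library. The `preNormalize` helper is hypothetical, and it relies on Unicode NFKD, which also rewrites other compatibility characters and so may not suit every text.

```javascript
// Hypothetical helper, not part of this library: apply Unicode NFKD before comparing.
// NFKD also rewrites other compatibility characters, which may not always be wanted.
function preNormalize(s) {
  return s.normalize('NFKD');
}

const precomposed = '\u0F77';            // TIBETAN VOWEL SIGN VOCALIC RR
const decomposed = '\u0FB2\u0F71\u0F80'; // subjoined RA + vowel sign AA + reversed I

// As raw strings the two spellings differ, so a non-normalizing
// comparator sees two different inputs.
console.log(precomposed === decomposed);

// After pre-normalization both spellings can be compared as the same sequence.
console.log(preNormalize(precomposed) === preNormalize(decomposed));
```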
Installation
yarn add tibetan-sort-js --save
API
compare
Compares two strings in Tibetan Unicode. It can be passed as the comparison function to Array.prototype.sort(). The behavior is undefined if the arguments are not strings. It doesn't work well with non-Tibetan strings.
Parameters
- a (string): the first string to compare
- b (string): the second string to compare
Returns a number: 0 if equivalent, 1 if a > b, -1 if a < b
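A function with this contract plugs directly into Array.prototype.sort(). The sketch below illustrates that, with two assumptions. `standInCompare` is a placeholder that orders by code unit, not by customary Tibetan order, and stands where the library's compare would go. `sortStrings` is a hypothetical wrapper that rejects non-string input up front, since behavior on such input is undefined.

```javascript
// Stand-in comparator with the same contract as compare (returns -1, 0 or 1).
// It orders by UTF-16 code unit, which is NOT the customary Tibetan order;
// pass the library's compare instead in real code.
function standInCompare(a, b) {
  if (a === b) return 0;
  return a < b ? -1 : 1;
}

// Hypothetical wrapper: validate input first, then sort a copy.
function sortStrings(items, compareFn) {
  for (const item of items) {
    if (typeof item !== 'string') {
      throw new TypeError('expected only strings');
    }
  }
  return [...items].sort(compareFn);
}

const sortedByStandIn = sortStrings(['\u0F42', '\u0F40', '\u0F41'], standInCompare);
console.log(sortedByStandIn);
```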
compareEwts
Compares two strings in EWTS. It has the same arguments and return value as compare. The function only works on customary EWTS and doesn't handle oddly encoded cases such as b.r+g+ya (instead of brgya).
TODO
- add an option to normalize strings?
Release history
See change log.
License
The code is Copyright 2017-2019 Buddhist Digital Resource Center, and is provided under the MIT License.