@hugov/shorter-string
v6.1.0
short string compression using Burrows-Wheeler transform, move to front and Elias-gamma variable length encoding
shorter-string
small string to string compression for short strings using Burrows-Wheeler transform (BWT), move to front (MTF) and Elias-gamma variable length encoding
• Example • API • Notes • License
Example
import {encode, decode} from './index.js'
const text = `Si six chasseurs savent chasser sans six chiens, soixante-six chasseurs savent chasser sans soixante-six chiens.`
const code = encode(text)
// 'PFp_Dd#sCa;F/f+QjYY/IEqySuqvq2JQ&=a2*?org~9T+r.:qp,yW6q'
console.log(decode(code) === text)
// true, and the code is about 50% shorter than the text in character count
API
exports | Note
------------------ | -------------------------------
string constants |
BASE62 | 0-9A-Za-z
BASE64 | BASE62 + -_
UNRESERVED | BASE62 + -._~ ; RFC 3986, base:66
PCHAR | UNRESERVED + %!$&'()*+,;=:@ ; RFC 3986, base:80
QUERY | PCHAR without ' for chrome, base:79
RFC1924 | RFC1924, base:85
HASH | PCHAR + /?# ; base:83
functions |
encode | ( text:string, [keys:string=HASH] ) => code:string
decode | ( code:string, [keys:string=HASH] ) => text:string
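The base counts in the table can be checked by rebuilding the alphabets from their descriptions. This is only a sketch: the order of characters inside each constant is an assumption, the constants are redefined locally rather than imported, and RFC1924 is left out because the table does not spell out its characters.

```javascript
// Local re-creation of the alphabets described in the table.
// Assumption: character order follows the table text; the package may differ.
const BASE62 =
  '0123456789' +
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
  'abcdefghijklmnopqrstuvwxyz'
const BASE64 = BASE62 + '-_'
const UNRESERVED = BASE62 + '-._~'
const PCHAR = UNRESERVED + "%!$&'()*+,;=:@"
const QUERY = PCHAR.replace("'", '') // PCHAR without the single quote
const HASH = PCHAR + '/?#'

const alphabets = { BASE62, BASE64, UNRESERVED, PCHAR, QUERY, HASH }

// An alphabet is only usable as a set of digits if no character repeats.
const isUnique = s => new Set(s).size === s.length

console.log(
  Object.entries(alphabets)
    .map(([name, chars]) => `${name}:${chars.length}`)
    .join(' ')
)
// BASE62:62 BASE64:64 UNRESERVED:66 PCHAR:80 QUERY:79 HASH:83
console.log(Object.values(alphabets).every(isUnique)) // true
```

A larger alphabet carries more bits per output character, which is presumably why the default is HASH, the largest URI-oriented set in the table.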
Notes
- inspired by the blog post reddad.ca/2020/09/27/burrows-wheeler-revisited
- modified to facilitate URI friendly encoding
- optimized for short strings
- not optimized for large inputs
- alternatives considered:
  - lz-string (small but no es6 exports and not the best compression for URI components)
  - lzbase62 (better compression)
  - lzutf8 (best compression, too big at 68.5 kb minified, no es6 exports)
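The three stages named in the description (BWT, then MTF, then Elias-gamma) can be sketched as below. This is a naive illustration of the textbook techniques, not this package's implementation: the function names, the `EOT` end marker, and the bit-string output are all assumptions, and the final step of mapping bits onto a `keys` alphabet is left out.

```javascript
// Hypothetical end marker; assumed never to occur in the input text.
const EOT = '\u0000'

// Burrows-Wheeler transform: last column of the sorted rotations.
function bwt(text) {
  const s = text + EOT
  const rotations = Array.from({ length: s.length }, (_, i) => s.slice(i) + s.slice(0, i))
  rotations.sort()
  return rotations.map(r => r[r.length - 1]).join('')
}

// Naive inverse: rebuild the sorted rotation table one column at a time.
function inverseBwt(last) {
  let table = Array(last.length).fill('')
  for (let i = 0; i < last.length; i++) {
    table = table.map((row, j) => last[j] + row).sort()
  }
  return table.find(row => row.endsWith(EOT)).slice(0, -1)
}

// Move-to-front: replace each symbol by its rank in a self-adjusting list.
function mtfEncode(text, alphabet) {
  const list = [...alphabet]
  return text.split('').map(ch => {
    const rank = list.indexOf(ch)
    list.unshift(list.splice(rank, 1)[0])
    return rank
  })
}

function mtfDecode(ranks, alphabet) {
  const list = [...alphabet]
  return ranks.map(rank => {
    const ch = list.splice(rank, 1)[0]
    list.unshift(ch)
    return ch
  }).join('')
}

// Elias-gamma: floor(log2 n) zeros, then n in binary. Ranks start at 0,
// so rank + 1 is encoded to keep every value positive.
function gammaEncode(numbers) {
  return numbers.map(n => {
    const bin = (n + 1).toString(2)
    return '0'.repeat(bin.length - 1) + bin
  }).join('')
}

function gammaDecode(bits) {
  const numbers = []
  let i = 0
  while (i < bits.length) {
    let zeros = 0
    while (bits[i + zeros] === '0') zeros++
    numbers.push(parseInt(bits.slice(i + zeros, i + 2 * zeros + 1), 2) - 1)
    i += 2 * zeros + 1
  }
  return numbers
}

// Round trip on a short string.
const text = 'six chasseurs savent chasser'
const last = bwt(text)
const alphabet = [...new Set(last.split(''))].sort() // initial MTF list
const bits = gammaEncode(mtfEncode(last, alphabet))
const back = inverseBwt(mtfDecode(gammaDecode(bits), alphabet))
console.log(back === text) // true
```

BWT tends to group equal characters together, MTF turns those runs into small ranks, and Elias-gamma spends the fewest bits on small numbers, which is why the stages are chained in this order.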