tibetan-ewts-converter
v2.0.0
Tibetan transliteration (Wylie, EWTS) and approximate phonetics
Tibetan Phonetics and Transliteration
This JavaScript package implements two things:
- conversion between Unicode Tibetan text and Extended Wylie transliteration (EWTS)
- approximate Tibetan phonetics according to THL and other systems.
Installation
npm install tibetan-ewts-converter
As of version 2, this is a pure ES module.
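Because the package is ESM-only, CommonJS code cannot `require()` it directly on older Node.js versions. One common workaround is a dynamic `import()`, which is available in both module systems. The following is a minimal sketch under that assumption: `loadEwtsConverter` is a hypothetical helper name, not part of the package, and the import path is the one shown in the Usage section.

```javascript
// Hypothetical helper (not part of the package): load the ES module lazily.
// Dynamic import() works from CommonJS and from ES modules alike.
async function loadEwtsConverter(options = {}) {
  // import() resolves to the module namespace object; the named export
  // and subpath are the ones used in this README's Usage section.
  const { EwtsConverter } = await import('tibetan-ewts-converter/EwtsConverter');
  return new EwtsConverter(options);
}

// Usage, with the package installed:
// loadEwtsConverter().then((ewts) => console.log(ewts.to_unicode('sangs rgyas')));
```

The import only runs when the helper is called, so defining it has no cost at load time.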
Usage
Wylie/EWTS conversion:
import { EwtsConverter } from 'tibetan-ewts-converter/EwtsConverter';
const ewts = new EwtsConverter();
console.log(ewts.to_unicode("sangs rgyas"));
console.log(ewts.to_ewts("སངས་རྒྱས"));
Approximate phonetics:
import { get_phonetics } from 'tibetan-ewts-converter';
const pho = get_phonetics({ style: "lotsawahouse", lang: "en" });
console.log(pho.phonetics("sangs rgyas", { autosplit: true }));
EwtsConverter options
The constructor accepts an optional object with named options:
- check: generate warnings for illegal consonant sequences and the like; default is true.
- check_strict: stricter checking, examine the whole stack; default is true.
- fix_spacing: remove spaces after newlines, collapse multiple tseks into one, fix case, etc.; default is true.
- sloppy: silently fix a number of common Wylie mistakes when converting to Unicode; default is false.
- leave_dubious: when converting to Unicode, leave dubious syllables unprocessed, between [brackets], instead of doing a best attempt; default is false.
- pass_through: when converting to EWTS, pass through non-Tibetan characters instead of converting to [comments]; default is false.
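To see which settings are in effect for a given set of overrides, the documented defaults can be written out as a plain object and merged with the overrides. This is only an illustrative sketch: `EWTS_DEFAULTS` and `resolveEwtsOptions` are hypothetical names, the unknown-key check is an addition of this sketch to catch typos, and none of it is taken from the converter's own code.

```javascript
// Defaults as documented in the option list above (not read from the library).
const EWTS_DEFAULTS = Object.freeze({
  check: true,
  check_strict: true,
  fix_spacing: true,
  sloppy: false,
  leave_dubious: false,
  pass_through: false,
});

// Hypothetical helper: reject misspelled option names, then overlay the
// caller's overrides on top of the documented defaults.
function resolveEwtsOptions(overrides = {}) {
  for (const key of Object.keys(overrides)) {
    if (!Object.prototype.hasOwnProperty.call(EWTS_DEFAULTS, key)) {
      throw new Error(`Unknown EwtsConverter option: ${key}`);
    }
  }
  return { ...EWTS_DEFAULTS, ...overrides };
}

const resolved = resolveEwtsOptions({ check_strict: false, leave_dubious: true, sloppy: true });
console.log(resolved);
```

The resulting object can be passed to the `EwtsConverter` constructor unchanged, since it only contains the documented option names.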
let ewts = new EwtsConverter({ check_strict: false, leave_dubious: true, sloppy: true });
TibetanPhonetics options
get_phonetics accepts an optional object with named options:
- style: one of 'thl', 'lotsawahouse', 'rigpa', 'lhasey', 'padmakara'
- lang: 2-letter language code, for styles that have language variants (e.g. 'en', 'es')
The phonetics method takes a string (Tibetan Unicode or EWTS), and an optional options object.
Unless you're using a better external tokenizer, always pass the option { autosplit: true }.
See the code for lots of other options allowing fine control of phonetics generation. You can also directly import and use the classes TibetanPhonetics, TibetanPhoneticsRigpa, TibetanPhoneticsLhasey and TibetanPhoneticsPadmakara.
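The style list and the autosplit advice above can be captured in a small wrapper so that callers do not have to remember them. This is a sketch only: `PHONETICS_STYLES` and `phoneticsConfig` are hypothetical names, and the checks reflect what this README states, not what the library itself enforces.

```javascript
// Style names as listed in this README.
const PHONETICS_STYLES = ['thl', 'lotsawahouse', 'rigpa', 'lhasey', 'padmakara'];

// Hypothetical helper: validate style and lang, and always request autosplit.
function phoneticsConfig(style, lang) {
  if (!PHONETICS_STYLES.includes(style)) {
    throw new Error(`Unknown phonetics style: ${style}`);
  }
  if (lang !== undefined && !/^[a-z]{2}$/.test(lang)) {
    throw new Error(`lang must be a 2-letter code, got: ${lang}`);
  }
  return {
    // Options for get_phonetics(); lang is omitted when not given.
    factoryOptions: lang === undefined ? { style } : { style, lang },
    // Options for the phonetics() method.
    callOptions: { autosplit: true },
  };
}

const cfg = phoneticsConfig('lotsawahouse', 'en');
console.log(cfg);

// With the package installed, the result would be used as:
// const pho = get_phonetics(cfg.factoryOptions);
// console.log(pho.phonetics('sangs rgyas', cfg.callOptions));
```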
Code and history
The first version of this code was written in Perl around 2008. In 2010 the EWTS/Unicode converter was ported to Java at the request of TBRC, now BDRC.
The Java code for phonetics was then ported to other languages by different groups:
- Python port by Esukhia
- C# port by radiantspace
- Another Python port by radiantspace
- JavaScript ports from BDRC, Ksana Forge and Karmapa Digital Toolbox
- This JavaScript port of 2021, going back to the original Perl code but incorporating some of the improvements made by various groups.
Phonetics generation was added to this project in 2025, also ported from the original Perl code with the help of AI.
License
Apache 2.0.
