translitro
v0.2.3
Translitro
Normalise and transform special characters and non-latin characters (including Chinese and Japanese) to basic latin characters.
Useful when you need to transform data for searching, sorting, or inserting into systems that support only the basic latin character set.
About
This package stands on the shoulders of giants, including pinyin, kuroshiro and kuroshiro-analyzer-kuromoji.
Note: this library is not intended for use within the browser, due to the size of the dependencies (the Japanese data files are ~19 MB). If you need to normalise/sanitise data, it is best to do it server-side anyway!
Installation
npm i translitro
yarn add translitro
Usage
Within your code, translitro returns a promise that resolves to the transliterated output:
const translitro = require("translitro").default;
translitro("åéîøü").then((o) => console.log(o));
// "aeiou"
import translitro from "translitro";
console.log(await translitro("åéîøü"));
// "aeiou"
It's also possible to transliterate values within arrays and objects with a single call:
const translitro = require("translitro").default;
// Array
translitro(["åéîøü", "مرحبا", "γεια σας", "여보세요"]).then((o) =>
  console.log(o)
);
/*
[
  "aeiou",
  "mrHb",
  "geia sas",
  "yeoboseyo"
]
*/
// Object
translitro({
  specialCharacters: "åéîøü",
  arabic: "مرحبا",
  greek: "γεια σας",
  korean: "여보세요",
}).then((o) => console.log(o));
/*
{
  specialCharacters: "aeiou",
  arabic: "mrHb",
  greek: "geia sas",
  korean: "yeoboseyo"
}
*/
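As a rough illustration of the searching and sorting use case, the array form can be used to build lookup keys in one call. This is only a sketch: buildSearchKeys and fakeTransliterate are hypothetical names that are not part of translitro, and the transliterator is passed in as a parameter so the example runs without the package installed. It assumes the resolved array keeps the same length and order as the input, as the array example above suggests.

```javascript
// Hypothetical helper: pairs each original value with a lower-cased latin key.
// `transliterate` is expected to behave like translitro called with an array:
// it takes an array of strings and resolves to an array in the same order.
async function buildSearchKeys(names, transliterate) {
  const latin = await transliterate(names);
  return names.map((original, i) => ({
    original,
    key: latin[i].toLowerCase(),
  }));
}

// Stand-in so the sketch runs on its own. It only strips combining accents
// and is NOT a real transliterator.
const fakeTransliterate = async (values) =>
  values.map((s) => s.normalize("NFD").replace(/[\u0300-\u036f]/g, ""));

buildSearchKeys(["Zoë", "José"], fakeTransliterate).then((keys) =>
  console.log(keys.map((k) => k.key))
);
// [ 'zoe', 'jose' ]
```

With the real package, the second argument would presumably be require("translitro").default in place of the stand-in.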
Post-processing
After transliterating your input, you can apply some post-processes:
// Object
translitro(
  {
    specialCharacters: "åéîøü",
    arabic: "مرحبا",
    greek: "γεια σας",
    korean: "여보세요",
  },
  {
    postProcess: ["upper"],
  }
).then((o) => console.log(o));
/*
{
  specialCharacters: "AEIOU",
  arabic: "MRHB",
  greek: "GEIA SAS",
  korean: "YEOBOSEYO"
}
*/
You can also specify multiple post-processes, including custom functions which take a single string input and return a string:
// Object
translitro(
  {
    specialCharacters: "åéîøü",
    arabic: "مرحبا",
    greek: "γεια σας",
    korean: "여보세요",
  },
  {
    postProcess: [
      "upper",
      (i) =>
        i
          .split(/\s+/g)
          .map((o) => o[0])
          .join(" "),
    ],
  }
).then((o) => console.log(o));
/*
{
  specialCharacters: "A",
  arabic: "M",
  greek: "G S",
  korean: "Y"
}
*/
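Since a custom post-process is just a function from string to string, it can be written and tested on its own before being handed to translitro. The sketch below is one possible example; slugify is a hypothetical name, not something the package provides.

```javascript
// Hypothetical custom post-process: turns an already transliterated
// string into a URL-style slug.
const slugify = (input) =>
  input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens

console.log(slugify("Geia Sas!"));
// "geia-sas"
```

Following the options shape shown above, it would then be passed as postProcess: [slugify].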
Currently supported post-processes are:
normal (aka normalize, normalise): converts any special characters to their basic latin versions
upper (aka uppercase): converts the output to upper case
lower (aka lowercase): converts the output to lower case
title (aka titlecase, capital, capitalcase): converts the output to title case
Note: post-processes are performed in the order that they are declared.
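To see why the order matters, here is a minimal sketch of applying steps one after another, each receiving the previous step's result. It is an illustration only, not translitro's actual implementation: applyInOrder and the named lookup table are hypothetical, and only two of the named steps are mimicked.

```javascript
// Illustrative stand-ins for two named post-processes.
// translitro's own implementations may differ.
const named = {
  upper: (s) => s.toUpperCase(),
  lower: (s) => s.toLowerCase(),
};

// Applies each step to the result of the previous one, in declaration order.
function applyInOrder(value, steps) {
  return steps.reduce((current, step) => {
    const fn = typeof step === "function" ? step : named[step];
    if (!fn) throw new Error(`Unknown post-process: ${step}`);
    return fn(current);
  }, value);
}

console.log(applyInOrder("Geia Sas", ["upper", "lower"]));
// "geia sas"
console.log(applyInOrder("Geia Sas", ["lower", "upper"]));
// "GEIA SAS"
```

The same two steps give different results depending on which is declared last.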
Transliterate Chinese and Japanese
Chinese and Japanese values have to be transliterated in separate calls, because each needs the from option to specify the input language:
Chinese
It's necessary to specify { from: "zh" } in options when transliterating Chinese characters:
translitro(["欢迎", "歡迎"], { from: "zh" }).then((o) => console.log(o));
// ["huan ying", "huan ying"]
You can also affect the output by specifying the to option (accepts "normal" | "tone2" | "to3ne" | "initials" | "firstLetter"):
translitro(["欢迎", "歡迎"], { from: "zh", to: "initials" }).then((o) =>
  console.log(o)
);
// ["h", "h"]
Both Simplified and Traditional Chinese characters are handled, thanks to the might of pinyin.
Japanese
translitro(["良い一日", "こんにちは", "こうし", "フョ"], {
  from: "ja",
}).then((o) => console.log(o));
// ["yoi ichi nichi", "konnichiwa", "ko shi", "fuyo"]
Note: there are three romaji systems supported: passport, hepburn and nippon. passport is the default, as hepburn and nippon can also include special characters outside the standard basic latin set, such as ô and ō.
It's also possible to transliterate Japanese to hiragana and katakana:
translitro(["良い一日", "こんにちは", "こうし" /*, "フョ" */], {
  from: "ja",
  to: "hiragana",
}).then((o) => console.log(o));
// ["よいいちにち", "こんにちは", "こうし" /*, "ふょ" */]
Note: there's an outstanding issue on kuroshiro where in some cases it can't convert katakana to hiragana.
translitro(["良い一日", "こんにちは", "こうし", "フョ"], {
  from: "ja",
  to: "katakana",
}).then((o) => console.log(o));
// ["ヨイイチニチ", "コンニチハ", "コウジ", "フョ"]
Japanese katakana, hiragana and kanji forms are handled thanks to the might of kuroshiro and kuroshiro-analyzer-kuromoji.
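Because Chinese and Japanese values each need their own from option, a mixed data set has to be split into one call per language. The following is a sketch of one way to do that while keeping the original order. transliterateMixed and the { text, lang } item shape are hypothetical, and the transliterator is passed in so the example can run with a stand-in instead of the real package.

```javascript
// Hypothetical helper: groups texts by language, makes one call per group
// with the matching `from` option, then restores the input order.
// `transliterate(values, options)` is expected to behave like translitro.
async function transliterateMixed(items, transliterate) {
  const groups = new Map();
  items.forEach((item, index) => {
    const lang = item.lang || ""; // "" means no `from` option is needed
    if (!groups.has(lang)) groups.set(lang, []);
    groups.get(lang).push({ text: item.text, index });
  });

  const output = new Array(items.length);
  for (const [lang, group] of groups) {
    const texts = group.map((g) => g.text);
    const results = lang
      ? await transliterate(texts, { from: lang })
      : await transliterate(texts);
    group.forEach((g, i) => {
      output[g.index] = results[i];
    });
  }
  return output;
}

// Stand-in that only records which `from` value each text was sent with.
const fakeTransliterate = async (values, options) =>
  values.map((v) => `${options ? options.from : "default"}:${v}`);

transliterateMixed(
  [
    { text: "a", lang: "zh" },
    { text: "b" },
    { text: "c", lang: "ja" },
    { text: "d", lang: "zh" },
  ],
  fakeTransliterate
).then((o) => console.log(o));
// [ 'zh:a', 'default:b', 'ja:c', 'zh:d' ]
```

With the real package, the second argument would presumably be require("translitro").default, and each item's lang would be "zh", "ja" or left unset.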
Development
To download external dependencies:
npm i
To run tests (using Jest):
npm test
npm run test:watch
Contribute
Got cool ideas? Have questions or feedback? Found a bug? Post an issue.
Added a feature? Fixed a bug? Post a PR.