@renamoo/lindera-wasm-segmenter
v0.2.1
A small library that lets you call the `tokenize` method of [Lindera](https://github.com/lindera-morphology/lindera) from JavaScript by exposing it as a wasm module. It is primarily for personal use, so it currently supports only basic usage.
Spec
segment(text:string)
Takes a single argument, the text to be segmented, and returns the segmented text as one string with `,` as the separator. (It seems it is not possible to return a `Vec` via wasm-bindgen for now, but I'm very new to Rust, so I may be wrong.)
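Since the result is a single comma-joined string, callers will usually want an array. A minimal sketch of that conversion follows; `toTokenArray` is a hypothetical helper, not part of this package, and it assumes an empty result means "no tokens". The sample input is the value documented for `segment("りんごと牛乳")`.

```javascript
// Hypothetical helper (not part of the package): turns the comma-joined
// string returned by segment() into an array of tokens.
function toTokenArray(segmented) {
  // "".split(",") yields [""], so treat an empty result as no tokens.
  // (Assumption: segment() returns an empty string for empty input.)
  if (segmented === "") return [];
  return segmented.split(",");
}

// Sample value taken from the documented output of segment("りんごと牛乳").
const tokenArray = toTokenArray("りんご,と,牛乳");
console.log(tokenArray.length); // 3
console.log(tokenArray[2]); // 牛乳
```

If the input text itself contains `,`, a plain split presumably cannot tell a comma token from a separator; `segment_with_details_json` may be the safer choice for such text.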
segment_with_details_json(text:string)
Takes a single argument, the text to be segmented, and returns the token details as a JSON-formatted string.
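The returned string has to be parsed before the details can be used. The sketch below picks a few fields out of each token; `summarizeTokens` is a hypothetical helper, and the positions used (`detail[0]` as part of speech, `detail[7]` as reading) are assumptions read off the sample output in this README, not a documented contract.

```javascript
// Hypothetical helper (not part of the package): parses the JSON string
// returned by segment_with_details_json() and keeps a few fields per token.
function summarizeTokens(detailJson) {
  return JSON.parse(detailJson).map(({ text, detail }) => ({
    text,
    partOfSpeech: detail[0], // assumed: first entry is the part of speech
    reading: detail[7], // assumed: eighth entry is the katakana reading
  }));
}

// Sample data copied from the README's example output for "りんごと".
const sampleJson = JSON.stringify([
  { text: "りんご", detail: ["名詞", "一般", "*", "*", "*", "*", "りんご", "リンゴ", "リンゴ"] },
  { text: "と", detail: ["助詞", "並立助詞", "*", "*", "*", "*", "と", "ト", "ト"] },
]);

for (const token of summarizeTokens(sampleJson)) {
  console.log(`${token.text}\t${token.partOfSpeech}\t${token.reading}`);
}
```

Tokens that are not in the dictionary may carry a shorter `detail` array, in which case `reading` would be `undefined`, so check for that before relying on it.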
Example
index.js
const js = import("./node_modules/@renamoo/lindera-wasm-segmenter/lindera_wasm_segmenter.js");
js.then(js => {
  const tokens = js.segment("りんごと牛乳"); // tokens = りんご,と,牛乳
  tokens.split(",").forEach(seg => console.log(seg));
  // Output:
  // りんご
  // と
  // 牛乳

  const detail = js.segment_with_details_json("りんごと");
  console.log(detail);
  // [
  //   {
  //     "text": "りんご",
  //     "detail": [
  //       "名詞",
  //       "一般",
  //       "*",
  //       "*",
  //       "*",
  //       "*",
  //       "りんご",
  //       "リンゴ",
  //       "リンゴ"
  //     ]
  //   },
  //   {
  //     "text": "と",
  //     "detail": [
  //       "助詞",
  //       "並立助詞",
  //       "*",
  //       "*",
  //       "*",
  //       "*",
  //       "と",
  //       "ト",
  //       "ト"
  //     ]
  //   }
  // ]
  console.log(JSON.parse(detail)[0].text); // りんご
});