levi-chinese
v0.1.3
Chinese support for Levi
Levi Chinese
Chinese text processing plugins for Levi.
Levi Chinese aims to facilitate Chinese support in Levi full-text search. It is under active development, but I am no expert in Chinese NLP, so any comments or PRs are appreciated.
npm install levi-chinese
Levi Chinese provides two text processing plugins, chinese.converter() and chinese.segmenter().
Mount them after the default plugins of Levi.
var levi = require('levi')
var chinese = require('levi-chinese')

var lv = levi('db')
  .use(levi.tokenizer())
  .use(levi.stemmer())
  .use(levi.stopword())
  .use(chinese.converter()) // chinese plugin: Traditional to Simplified
  .use(chinese.segmenter()) // chinese plugin: word segmentation

lv.pipeline('Lorem Ipsum is dummy text我是拖拉機學院手扶拖拉機專業的。', function (err, tokens) {
  if (err) throw err
  // tokens:
  // ['lorem', 'ipsum', 'dummi', 'text',
  //  '手扶拖拉机', '拖拉机', '学院', '专业']
})
chinese.converter()
Converts Traditional Chinese text tokens into Simplified Chinese, based on the dictionary from Tongwen.
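To give a rough idea of what dictionary-based conversion means, here is a minimal sketch. It is not the plugin's actual code: the `T2S` table is a tiny hand-written stand-in for the Tongwen dictionary, `toSimplified` is a hypothetical helper, and the character-by-character lookup is an assumption (the real plugin may also handle phrase-level mappings).

```javascript
// Toy mapping for illustration only; the real plugin relies on the Tongwen dictionary.
var T2S = { '機': '机', '學': '学', '專': '专', '業': '业' }

// Hypothetical helper: convert one token character by character.
// Characters without a dictionary entry are passed through unchanged.
function toSimplified (token) {
  return Array.from(token).map(function (ch) {
    return Object.prototype.hasOwnProperty.call(T2S, ch) ? T2S[ch] : ch
  }).join('')
}

console.log(toSimplified('拖拉機學院')) // 拖拉机学院
```

Non-Chinese tokens such as 'text' pass through untouched, which is why the converter can sit after the default English plugins.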
chinese.segmenter()
Chinese word segmentation using nodejieba. This requires native bindings, so it only works on Node.js.
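Since the segmenter is limited to Node.js, code that also runs elsewhere may want to decide at runtime which plugins to mount. The following is only a sketch of that idea: `isNode` and `chinesePluginNames` are hypothetical helpers, not part of levi-chinese.

```javascript
// Hypothetical helper (not part of levi-chinese): detect a Node.js runtime.
function isNode () {
  return typeof process !== 'undefined' &&
    process.versions != null &&
    process.versions.node != null
}

// Hypothetical helper: choose which Chinese plugins to mount.
// The names mirror the plugins documented above; the function itself is an assumption.
function chinesePluginNames (onNode) {
  var names = ['converter']
  if (onNode) names.push('segmenter') // segmenter needs nodejieba's native bindings
  return names
}

console.log(chinesePluginNames(isNode()))
```

Each returned name could then be mounted with something like lv.use(chinese[name]()), keeping the converter available everywhere and the segmenter on Node.js only.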
License
MIT