@ladjs/naivebayes
v0.1.0
Naive Bayes Classifier for JavaScript.
A ladjs naivebayes package forked from surmon-china/naivebayes
What can I use this for?
Naive-Bayes classifier for JavaScript.
naivebayes takes a document (a piece of text) and tells you which category that document belongs to.
You can use this to sort any text content into an arbitrary set of categories. For example:
- Is an email spam or not spam?
- Is a news article about technology, politics, or sports?
- Does a piece of text express positive or negative emotions?
Install
npm
npm install @ladjs/naivebayes
yarn
yarn add @ladjs/naivebayes
Usage
const NaiveBayes = require('@ladjs/naivebayes')
const classifier = new NaiveBayes()
// teach it positive phrases
classifier.learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive')
classifier.learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive')
// teach it a negative phrase
classifier.learn('terrible, cruddy thing. Damn. Sucks!!', 'negative')
// now ask it to categorize a document it has never seen before
classifier.categorize('awesome, cool, amazing!! Yay.')
// => 'positive'
// serialize the classifier's state as a JSON string.
const stateJson = classifier.toJson()
// load the classifier back from its JSON representation.
const revivedClassifier = NaiveBayes.fromJson(stateJson)
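// custom tokenizer example: use the segment package to split text into words
const NaiveBayes = require('@ladjs/naivebayes')
const Segment = require('segment')
const segment = new Segment()
segment.useDefault()
const classifier = new NaiveBayes({
  tokenizer(sentence) {
    // replace every character outside the allowed set with a space before segmenting
    const sanitized = sentence.replace(/[^(a-zA-Z\u4e00-\u9fa50-9_)+\s]/g, ' ')
    return segment.doSegment(sanitized, { simple: true })
  }
})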
API
Class
const classifier = new NaiveBayes([options])
Returns an instance of a Naive-Bayes Classifier.
Options
- tokenizer(text) (type: function) - Configure your own tokenizer.
- vocabularyLimit (type: number, default: 0) - Maximum word count; 0 (the default) means no limit.
- stopwords (type: boolean, default: false) - Whether to remove stopwords from the text.
Example:
const classifier = new NaiveBayes({
  tokenizer(text) {
    return text.split(' ')
  }
})
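Splitting on single spaces leaves punctuation and letter case in the tokens. As a rough sketch of a slightly more forgiving tokenizer, the function below lowercases the text and strips punctuation before splitting. The name simpleTokenizer is hypothetical, and the rule set (ASCII letters and digits only) is an assumption that will not suit every language.

```javascript
// Hypothetical tokenizer: lowercase the text, replace anything that is not an
// ASCII letter, digit, or whitespace with a space, then split on whitespace.
function simpleTokenizer(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, ' ')
    .split(/\s+/)
    .filter(Boolean) // drop empty strings left by leading or trailing spaces
}

console.log(simpleTokenizer('Amazing, awesome movie!! Yeah!!'))
// => [ 'amazing', 'awesome', 'movie', 'yeah' ]
```

Because the tokenizer option is a function that receives the text, a function like this could be passed as new NaiveBayes({ tokenizer: simpleTokenizer }).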
Learn
classifier.learn(text, category)
Teach your classifier what category the text belongs to. The more you teach your classifier, the more reliable it becomes. It will use what it has learned to identify new documents that it hasn't seen before.
Probabilities
classifier.probabilities(text)
Returns an array of { category, probability } objects with probability calculated for each category. Its judgement is based on what you have taught it with .learn().
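One way to work with an array of this shape is to rank it by probability. The sketch below uses made-up sample data rather than real classifier output, and it does not assume the returned array arrives in any particular order. The names sampleProbabilities and rankCategories are hypothetical.

```javascript
// Hypothetical sample data, shaped like the { category, probability } objects
// described above. The numbers are invented for illustration.
const sampleProbabilities = [
  { category: 'negative', probability: 0.12 },
  { category: 'positive', probability: 0.88 }
]

// Sort a copy by descending probability so the most likely category is first.
function rankCategories(probabilities) {
  return [...probabilities].sort((a, b) => b.probability - a.probability)
}

const ranked = rankCategories(sampleProbabilities)
console.log(ranked[0].category) // => 'positive'
```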
Categorize
classifier.categorize(text, [probability])
Returns the category it thinks text belongs to. Its judgement is based on what you have taught it with .learn().
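To give a feel for what learning and categorizing involve, here is a minimal, self-contained sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is not the source of this package: the class name TinyNaiveBayes, its tokenizer, and the smoothing choice are all assumptions made for illustration.

```javascript
// Illustrative sketch only; NOT the implementation used by @ladjs/naivebayes.
class TinyNaiveBayes {
  constructor() {
    // category -> { docs, tokens, freq: Map(token -> count) }
    this.categories = new Map()
    this.vocabulary = new Set()
    this.totalDocs = 0
  }

  // Assumed tokenizer: lowercase, split on anything that is not a-z or 0-9.
  tokenize(text) {
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
  }

  learn(text, category) {
    if (!this.categories.has(category)) {
      this.categories.set(category, { docs: 0, tokens: 0, freq: new Map() })
    }
    const stats = this.categories.get(category)
    stats.docs += 1
    this.totalDocs += 1
    for (const token of this.tokenize(text)) {
      this.vocabulary.add(token)
      stats.freq.set(token, (stats.freq.get(token) || 0) + 1)
      stats.tokens += 1
    }
  }

  categorize(text) {
    const tokens = this.tokenize(text)
    let best = null
    let bestScore = -Infinity
    for (const [category, stats] of this.categories) {
      // log P(category) + sum over tokens of log P(token | category)
      let score = Math.log(stats.docs / this.totalDocs)
      for (const token of tokens) {
        const count = stats.freq.get(token) || 0
        // add-one smoothing keeps unseen tokens from zeroing the score
        score += Math.log((count + 1) / (stats.tokens + this.vocabulary.size))
      }
      if (score > bestScore) {
        bestScore = score
        best = category
      }
    }
    return best
  }
}

const tiny = new TinyNaiveBayes()
tiny.learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive')
tiny.learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive')
tiny.learn('terrible, cruddy thing. Damn. Sucks!!', 'negative')
console.log(tiny.categorize('awesome, cool, amazing!! Yay.')) // => 'positive'
```

Scores are summed as logarithms rather than multiplied as raw probabilities, which avoids numeric underflow on longer documents.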
ToJson
classifier.toJson()
Returns the JSON representation of a classifier. This is the same as JSON.stringify(classifier.toJsonObject()).
ToJsonObject
classifier.toJsonObject()
Returns a JSON-friendly representation of the classifier as an object.
FromJson
const classifier = NaiveBayes.fromJson(jsonObject)
Returns a classifier instance from the JSON representation. Use this with the JSON representation obtained from classifier.toJson().
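A common reason to serialize a classifier is to keep its state between runs. The sketch below shows one possible way to write a JSON string to disk and read it back using Node's fs module. The helpers saveState and loadState and the file name are hypothetical, and a stand-in string takes the place of real classifier.toJson() output so the example runs without the package installed.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Write a JSON string (for example, the result of classifier.toJson()) to disk.
function saveState(stateJson, filePath) {
  fs.writeFileSync(filePath, stateJson, 'utf8')
}

// Read the JSON string back; it could then be handed to NaiveBayes.fromJson().
function loadState(filePath) {
  return fs.readFileSync(filePath, 'utf8')
}

// Stand-in string, NOT real classifier state.
const exampleState = JSON.stringify({ placeholder: true })
const stateFile = path.join(os.tmpdir(), 'naivebayes-state.json')

saveState(exampleState, stateFile)
console.log(loadState(stateFile) === exampleState) // => true
```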
Debug
To run naivebayes in debug mode, simply set DEBUG=naivebayes when running your script.
Contributors
| Name             | Website                    |
| ---------------- | -------------------------- |
| Surmon           | http://surmon.me/          |
| Shaun Warman     | https://shaunwarman.com/   |