Barfer
v1.2.4
This module provides a set of NLP tools, built on top of other modules, to detect:
- Sentiment (English and Spanish) using trigrams and bigrams
- Emoji sentiment
- Intention
- Topics
- Context
- Language detection (defaults to a whitelist of eng, spa, por, fra and ger)
Changes
* 1.2.4 - Improved stopwords and sentiment, general system speed improvement
* 1.2.0 - Added more default taggers
* 1.1.1 - Fixed some bugs
* 1.1.0 - Overhauled the core, removed a couple of modules, simplified logic, and optimized parsing
How to use
Start Barfer
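// Assumption: the package is installed as 'barfer' and exports the Barfer constructor.
const Barfer = require( 'barfer' );

const barfer = new Barfer( {
    lang: {
        whitelist: [ 'spa' ] // works best when focused on a single language for now
        // whitelist: [ 'eng', 'spa' ]
    }
} );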
Additional configuration options
const conf = {
    // process the input as Twitter data
    twitter: true,
    // convert text to ASCII-only characters
    latinize: true,
    // define a very important target/topic
    target: 'some name',
    // add interesting topics for the tagger
    interesting: [ 'some', 'other' ],
    // set the whitelist to an empty array if you want to bypass it
    lang: {
        whitelist: [ 'eng', 'spa', 'por', 'fra', 'ger' ]
    }
};
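The whitelist comment above implies that an empty array turns language filtering off. The following is a minimal sketch of that rule only. `isLanguageAllowed` and `DEFAULT_WHITELIST` are hypothetical names used for illustration; they are not part of Barfer's API, and Barfer's internal check may differ.

```javascript
// Hypothetical helper, not part of Barfer: illustrates the whitelist rule
// described in the configuration comments.
const DEFAULT_WHITELIST = [ 'eng', 'spa', 'por', 'fra', 'ger' ];

function isLanguageAllowed( lang, whitelist = DEFAULT_WHITELIST ) {
    // An empty whitelist means "accept every language".
    if ( whitelist.length === 0 ) {
        return true;
    }
    return whitelist.includes( lang );
}

console.log( isLanguageAllowed( 'spa' ) );                   // true
console.log( isLanguageAllowed( 'ita', [ 'eng', 'spa' ] ) ); // false
console.log( isLanguageAllowed( 'ita', [] ) );               // true
```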
Implement a term
// this term is a default in Barfer
barfer.addParameter( ( tokens ) => {
    // match a vehicle name at the start of the remaining text
    const match = /^(car|bus|metro|train|plane|boat|taxi|bike|bicycle)\b/igm.exec( tokens.rest() );
    if ( match !== null )
    {
        return {
            tag: 'vehicles',
            length: match[ 0 ].length,
            data: match[ 0 ].toLowerCase()
        };
    }
} );
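A custom term can follow the same shape as the default above. The sketch below assumes, as the example suggests, that `tokens.rest()` returns the text that has not been parsed yet. `colorTagger` and `fakeTokens` are hypothetical names, and `fakeTokens` is only a stand-in so the function can be tried without Barfer.

```javascript
// Hypothetical tagger for colour names, modelled on the 'vehicles' example.
// Assumption: tokens.rest() returns the not-yet-parsed remainder of the text.
function colorTagger( tokens ) {
    const match = /^(red|green|blue|yellow|black|white)\b/i.exec( tokens.rest() );
    if ( match === null ) {
        return undefined; // no tag at this position
    }
    return {
        tag: 'colors',
        length: match[ 0 ].length,
        data: match[ 0 ].toLowerCase()
    };
}

// Stand-in for the tokens object, for illustration only.
const fakeTokens = ( text ) => ( { rest: () => text } );

console.log( colorTagger( fakeTokens( 'Blue skies ahead' ) ) );
// { tag: 'colors', length: 4, data: 'blue' }
console.log( colorTagger( fakeTokens( 'skies are blue' ) ) );
// undefined
```

With a real instance, the function would be registered through `barfer.addParameter( colorTagger )`, the same call used for the default term.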
Study
// twit is the text to analyze
barfer.study( twit, ( err, data ) => {
    if ( err ) throw err;
    // data is full of rich data!
} );
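If promises are preferred over callbacks, the call can be wrapped. This is a sketch that assumes `study` follows the Node `( err, data )` callback convention shown above. `studyAsync` and `stubBarfer` are hypothetical names, and the stub only stands in for a real Barfer instance.

```javascript
// Wraps the callback-style study() call in a Promise.
// Assumption: the callback follows the Node convention ( err, data ).
function studyAsync( barfer, text ) {
    return new Promise( ( resolve, reject ) => {
        barfer.study( text, ( err, data ) => {
            if ( err ) {
                reject( err );
                return;
            }
            resolve( data );
        } );
    } );
}

// Stub standing in for a real Barfer instance, for illustration only.
const stubBarfer = {
    study: ( text, callback ) => callback( null, { str: text.toLowerCase() } )
};

studyAsync( stubBarfer, 'Hello World' )
    .then( ( data ) => console.log( data.str ) ); // hello world
```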
Output
{
str: 'rt @jonbershad: @realdonaldtrump fun. so you won\'t be giving us a date when you\'ll be discussing your massive conflicts of interest?',
lang: 'eng',
topics:
[ { count: 135,
length: 10,
stem: 'discuss',
text: 'discussing',
weight: 9.64,
action: true,
topic: true },
{ count: 134,
length: 6,
stem: 'give',
text: 'giving',
weight: 9.57,
stopword: true,
action: true,
topic: true },
{ count: 59,
length: 9,
stem: 'conflict',
text: 'conflicts',
weight: 4.21,
topic: true,
sentiment: -2,
negative: true },
{ count: 35,
length: 7,
stem: 'massiv',
text: 'massive',
weight: 2.5,
topic: true },
{ count: 34,
length: 6,
stem: 'youll',
text: 'youll',
weight: 2.42,
stopword: true,
topic: true },
{ count: 29,
length: 9,
stem: 'interest',
text: 'interest',
weight: 2.07,
stopword: true,
topic: true,
sentiment: 1,
positive: true },
[length]: 6 ],
tagger:
{
actions:
{
tag: 'actions',
words:
{
giving: { text: 'giving', count: 1, action: true },
discussing: { text: 'discussing', count: 1, action: true }
}
},
topics:
{
tag: 'topics',
words:
{
giving: { text: 'giving', data: { index: 0 } },
youll: { text: 'youll', data: { index: 0 } },
discussing: { text: 'discussing', data: { index: 0 } },
massive: { text: 'massive', data: { index: 0 } },
conflicts: { text: 'conflicts', data: { index: 1 } },
interest: { text: 'interest', data: { index: 2 } }
}
},
positive:
{
tag: 'positive',
words:
{
interest:
{
text: 'interest',
data: [ 'massive', 'conflicts', [length]: 2 ] }
}
},
negative:
{ tag: 'negative',
words:
{
conflicts:
{
text: 'conflicts',
data: [ 'massive', 'interest', [length]: 2 ] } }
}
},
rest: [ 'discussing', 'massive', [length]: 2 ],
sentiment:
{
polarity: -1,
positive: { score: 1, words: [ 'interest', [length]: 1 ] },
negative: { score: -2, words: [ 'conflicts', [length]: 1 ] } },
emojiSentiment:
{
polarity: 0,
positive: { score: 0, emoji: [ [length]: 0 ] },
negative: { score: 0, emoji: [ [length]: 0 ] }
},
twitter:
{
parsedAt: 1481743750671,
mentions: [ 'jonbershad', 'realdonaldtrump', [length]: 2 ],
hashtags: [ [length]: 0 ],
cashtags: [ [length]: 0 ],
replies: [ [length]: 0 ],
urls: [ [length]: 0 ]
},
wordMap:
{ '@jonbershad':
{ count: 5,
length: 12,
stem: '@jonbershad',
text: '@jonbershad',
weight: 0.35,
rest: true,
mention: true },
'@realdonaldtrump':
{ count: 5,
length: 16,
stem: '@realdonaldtrump',
text: '@realdonaldtrump',
weight: 0.35,
rest: true,
mention: true },
fun: { count: 1, length: 4, stem: 'fun', text: 'fun', weight: 0.07 },
you: { count: 1, length: 3, stem: 'you', text: 'you', weight: 0.07 },
wont:
{ count: 1,
length: 5,
stem: 'wont',
text: 'wont',
weight: 0.07,
stopword: true },
giving:
{ count: 134,
length: 6,
stem: 'give',
text: 'giving',
weight: 9.57,
stopword: true,
action: true,
topic: true },
date:
{ count: 1,
length: 4,
stem: 'date',
text: 'date',
weight: 0.07,
stopword: true },
when:
{ count: 1,
length: 4,
stem: 'when',
text: 'when',
weight: 0.07,
stopword: true },
youll:
{ count: 34,
length: 6,
stem: 'youll',
text: 'youll',
weight: 2.42,
stopword: true,
topic: true },
discussing:
{ count: 135,
length: 10,
stem: 'discuss',
text: 'discussing',
weight: 9.64,
action: true,
topic: true },
your:
{ count: 1,
length: 4,
stem: 'your',
text: 'your',
weight: 0.07,
stopword: true },
massive:
{ count: 35,
length: 7,
stem: 'massiv',
text: 'massive',
weight: 2.5,
topic: true },
conflicts:
{ count: 59,
length: 9,
stem: 'conflict',
text: 'conflicts',
weight: 4.21,
topic: true,
sentiment: -2,
negative: true },
interest:
{ count: 29,
length: 9,
stem: 'interest',
text: 'interest',
weight: 2.07,
stopword: true,
topic: true,
sentiment: 1,
positive: true }
}
}
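A result like the one above can be reduced to a short summary. The sketch below works on a trimmed copy of the `topics` and `sentiment` fields from that output. `topTopics` and `sentimentLabel` are hypothetical helpers, not part of Barfer.

```javascript
// Trimmed copy of the `topics` and `sentiment` fields from the output above.
const result = {
    topics: [
        { text: 'discussing', weight: 9.64 },
        { text: 'giving', weight: 9.57, stopword: true },
        { text: 'conflicts', weight: 4.21, sentiment: -2 },
        { text: 'massive', weight: 2.5 },
        { text: 'youll', weight: 2.42, stopword: true },
        { text: 'interest', weight: 2.07, stopword: true, sentiment: 1 }
    ],
    sentiment: { polarity: -1 }
};

// Hypothetical helper: top N non-stopword topics, heaviest first.
function topTopics( data, limit ) {
    return data.topics
        .filter( ( topic ) => !topic.stopword )
        .sort( ( a, b ) => b.weight - a.weight )
        .slice( 0, limit )
        .map( ( topic ) => topic.text );
}

// Hypothetical helper: map the numeric polarity to a label.
function sentimentLabel( data ) {
    const polarity = data.sentiment.polarity;
    if ( polarity > 0 ) return 'positive';
    if ( polarity < 0 ) return 'negative';
    return 'neutral';
}

console.log( topTopics( result, 2 ) );   // [ 'discussing', 'conflicts' ]
console.log( sentimentLabel( result ) ); // negative
```

Filtering on `stopword` is a choice made for this sketch; in the output above, topics flagged as stopwords still carry weights, so keeping them may be just as reasonable.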
Note
This is still a proof of concept, but I am working on improving it regularly.
See test/index.js for more examples of how to use Barfer.
Tests
Run VERBOSE=true npm test to run the tests and print all data output.
Run npm test to run the tests only.
License
See LICENSE for license info