public-eye
v0.3.3
annotator for plain texts using various dbpedia related services
Public-Eye
Many named entity disambiguation services, such as DBpedia Spotlight, are now available on the web. They all expose a solid REST API and they all disambiguate against DBpedia resources. Their output formats differ, though, and this is where Public-Eye comes in handy. Public-Eye is a tiny open source library that aims to harmonize the different annotation results and gives you automatic language detection thanks to the awesome languagedetect library.
// minimalistic example with spotlight
var publicEye = require('public-eye')();
var text = 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.';
publicEye.spotlight({
text: text
}, (err, response) => {
// ... response.Resources gives you a list of
// {
// ...
// Resources: [
// {
// "@URI": "http://dbpedia.org/resource/German_reunification",
// "@support": "1989",
// "@types": "",
// "@surfaceForm": "German reunification",
// "@offset": "449",
// "@similarityScore": "0.9999997861474641",
// "@percentageOfSecondRank": "1.5374345655254399E-7"
// }
// ]
// }
});
This tiny library gives you easy access to a number of named entity disambiguation services, including DBpedia Spotlight, Babelfy and TextRazor. We have just added a basic service for a local Stanford NER server via the ner node library. The public-eye "mapping service" translates each proprietary format to a common format, so you can annotate text using multiple services and handle their results uniformly.
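To make the idea of a common format concrete, here is a minimal sketch of what such a translation could look like for one Spotlight resource. The helper `toCommonEntity` and the output field names `uri`, `surfaceForm` and `score` are assumptions made for illustration; they are not taken from public-eye's actual mapping service. Only `startingPos` and `endingPos` are names mentioned in this README, and treating `endingPos` as exclusive is also an assumption.

```javascript
// Hypothetical helper: NOT part of public-eye's API.
// Converts one DBpedia Spotlight resource (shaped like the example above)
// into an entity with numeric character positions.
function toCommonEntity(resource) {
  var start = parseInt(resource['@offset'], 10);
  var surfaceForm = resource['@surfaceForm'];
  return {
    uri: resource['@URI'],
    surfaceForm: surfaceForm,
    startingPos: start,
    // Assumption: endingPos is exclusive (start + length of the surface form).
    endingPos: start + surfaceForm.length,
    score: parseFloat(resource['@similarityScore'])
  };
}

// The resource shown in the Spotlight example above.
var sample = {
  '@URI': 'http://dbpedia.org/resource/German_reunification',
  '@support': '1989',
  '@types': '',
  '@surfaceForm': 'German reunification',
  '@offset': '449',
  '@similarityScore': '0.9999997861474641',
  '@percentageOfSecondRank': '1.5374345655254399E-7'
};

var entity = toCommonEntity(sample);
console.log(entity.startingPos, entity.endingPos); // 449 469
```

Spotlight returns every value as a string, so the sketch parses offsets and scores into numbers before exposing them.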
Installation
npm install public-eye --save
Simple examples and configuration hints
More examples are provided in the /test folder.
The very first thing to do is to require the library and set the API keys provided by the different services:
var publicEye = require('public-eye')({
services: {
textrazor: {
apiKey: 'your-api-key'
},
babelfy: {
key: 'your-babelfy-api-key'
}
}
});
Once the library is available to your script, the easiest way to harmonize the output of different services on the same text is the series method:
publicEye.series({
services:[
'textrazor',
'babelfy'
],
text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45)'
}, function(err, response){
// ... response.entities
})
Usage for TextRazor entity disambiguation:
var publicEye = require('public-eye')({
services: {
textrazor: {
apiKey: 'your-api-key'
}
}
});
// ..
publicEye.textrazor({
text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.'
}, function(err, response){
// ...
// your callback here, response.entities is the list of entities with startingPos and endingPos
})
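Since the entities carry `startingPos` and `endingPos`, the annotated substrings can be recovered from the original text. The sketch below is only an illustration: `extractMentions` is a hypothetical helper, the sample entities are hand-written rather than real service output, and it assumes that positions are numeric character offsets with an exclusive `endingPos`.

```javascript
// Hypothetical helper, not part of public-eye.
// Assumes startingPos is inclusive and endingPos is exclusive.
function extractMentions(text, entities) {
  return entities
    .slice() // copy so the caller's array is not reordered
    .sort(function (a, b) { return a.startingPos - b.startingPos; })
    .map(function (entity) {
      return text.slice(entity.startingPos, entity.endingPos);
    });
}

var text = 'Berlin was the capital of the Kingdom of Prussia.';

// Hand-written sample entities, NOT real service output.
var entities = [
  { startingPos: 30, endingPos: 48 },
  { startingPos: 0, endingPos: 6 }
];

var mentions = extractMentions(text, entities);
console.log(mentions); // [ 'Berlin', 'Kingdom of Prussia' ]
```

If a service turns out to report an inclusive `endingPos`, the slice would need `entity.endingPos + 1` instead.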
Usage for Babelfy:
var publicEye = require('public-eye')({
services: {
babelfy: {
key: 'your-api-key'
}
}
});
// ..
publicEye.babelfy({
text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.'
}, function(err, response){
// ...
// your callback here, response is the list of entities with startingPos and endingPos
})
Usage for Stanford NER (cf. the ner node library documentation):
var publicEye = require('public-eye')({
services: {
stanfordNER: {
port: 9191,
host: 'localhost'
}
}
});
publicEye.stanfordNER({
text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.'
}, function(err, body){
console.log(body.entities);
// => { LOCATION:
// [ 'Berlin',
// 'Prussia',
// 'Weimar',
// 'Berlin',
// 'East Berlin',
// 'East Germany',
// 'West Berlin',
// 'Berlin',
// 'Germany' ],
// ORGANIZATION: [],
// DATE:
// [ '13th century',
// '1918',
// '1918',
// '1919',
// '1933',
// '1920s',
// '1961',
// '1990' ],
// MONEY: [],
// PERSON: [ 'Reich' ],
// PERCENT: [],
// TIME: [] }
});
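The output above lists every mention, including repeats such as 'Berlin'. As a small illustration of post-processing, the following sketch counts how often each mention occurs per entity type. `countByType` is a hypothetical helper, not part of public-eye or the ner library, and the sample input is a trimmed copy of the output shown above.

```javascript
// Hypothetical helper, not part of public-eye or the ner library.
// Takes an object shaped like { LOCATION: [...], DATE: [...], ... }
// and returns, per type, a map from mention to number of occurrences.
function countByType(entities) {
  var counts = {};
  Object.keys(entities).forEach(function (type) {
    // Object.create(null) avoids clashes with inherited keys like 'constructor'.
    counts[type] = Object.create(null);
    entities[type].forEach(function (mention) {
      counts[type][mention] = (counts[type][mention] || 0) + 1;
    });
  });
  return counts;
}

// Trimmed copy of the Stanford NER output shown above.
var sampleEntities = {
  LOCATION: ['Berlin', 'Prussia', 'Weimar', 'Berlin', 'East Berlin',
    'East Germany', 'West Berlin', 'Berlin', 'Germany'],
  PERSON: ['Reich'],
  MONEY: []
};

var counts = countByType(sampleEntities);
console.log(counts.LOCATION['Berlin']); // 3
```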
Usage for GeoNames search:
var publicEye = require('public-eye')({
services: {
geonames: {
username: 'your-username'
}
}
});
publicEye.geonames({
text: 'Osh' // a city in Kyrgyzstan
}, function(err, body){
// console.log(body.geonames)
// => [
// { adminCode1: '08',
// lng: '72.7985',
// geonameId: 1527534,
// toponymName: 'Osh',
// countryId: '1527747',
// fcl: 'P',
// population: 200000,
// countryCode: 'KG',
// name: 'Osh',
// fclName: 'city, village,...',
// countryName: 'Kyrgyzstan',
// fcodeName: 'seat of a first-order administrative division',
// adminName1: 'Osh',
// lat: '40.52828',
// fcode: 'PPLA'
// },
// ...
// ]
});
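A search can return several candidates, and in the output above the coordinates are strings. The sketch below picks the most populous candidate and converts its coordinates to numbers. `mostPopulous` is a hypothetical helper, not part of public-eye, and the second sample record is made up purely to exercise the comparison.

```javascript
// Hypothetical helper, not part of public-eye.
// Picks the candidate with the largest population and returns
// a compact record with numeric coordinates.
function mostPopulous(geonames) {
  if (!geonames || geonames.length === 0) {
    return null;
  }
  var best = geonames.reduce(function (a, b) {
    return (b.population || 0) > (a.population || 0) ? b : a;
  });
  return {
    name: best.name,
    countryCode: best.countryCode,
    lat: parseFloat(best.lat),
    lng: parseFloat(best.lng),
    population: best.population
  };
}

var candidates = [
  // Made-up record, for illustration only.
  { name: 'Example Village', countryCode: 'XX', lat: '0.5', lng: '1.5', population: 120 },
  // Fields copied from the first result shown above.
  { name: 'Osh', countryCode: 'KG', lat: '40.52828', lng: '72.7985', population: 200000 }
];

var place = mostPopulous(candidates);
console.log(place.name, place.lat, place.lng);
```

Population is only one possible ranking criterion; depending on the use case, filtering on `fcode` or `countryCode` first may be more appropriate.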
More services to come, stay tuned!