@datafire/geneea
v6.0.0
Published
DataFire integration for Geneea Natural Language Processing
Client library for Geneea Natural Language Processing
Installation and Usage
npm install --save @datafire/geneea
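let geneea = require('@datafire/geneea').create({
  user_key: ""  // your Geneea API key
});

// Call any action listed under Actions; getInfo takes no parameters.
geneea.getInfo(null).then(data => {
  console.log(data);
});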
Description
<h2>API operations</h2>
<p>
All API operations can perform analysis on supplied raw text or on text extracted from a given URL.
Optionally, one can supply additional information which can make the result more precise. An example
of such information would be the language of text or a particular text extractor for URL resources.
</p>
<p>The supported types of analyses are:</p>
<ul>
<li><strong>lemmatization</strong> ⟶
Finds out lemmata (basic forms) of all the words in the document.
</li>
<li><strong>correction</strong> ⟶
Performs correction (diacritization) on all the words in the document.
</li>
<li><strong>topic detection</strong> ⟶
Determines a topic of the document, e.g. finance or sports.
</li>
<li><strong>sentiment analysis</strong> ⟶
Determines a sentiment of the document, i.e. how positive or negative the document is.
</li>
<li><strong>named entity recognition</strong> ⟶
Finds named entities (like person, location, date, etc.) mentioned in the document.
</li>
</ul>
<h2>Encoding</h2>
<p>The supplied text is expected to be in UTF-8 encoding; this is especially important for non-English texts.</p>
<h2>Returned values</h2>
<p>The API calls always return objects in serialized JSON format in UTF-8 encoding.</p>
<p>
If any error occurs, the HTTP response code will be in the range <code>4xx</code> (client-side error) or
<code>5xx</code> (server-side error). In this situation, the body of the response will contain information
about the error in JSON format, with <code>exception</code> and <code>message</code> values.
</p>
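<p>As a rough illustration of the error format described above, the sketch below turns a status code and a raw response body into a readable message. <code>describeError</code> is a hypothetical helper and the sample exception name is made up; neither comes from the Geneea API or from <code>@datafire/geneea</code>.</p>

```javascript
// Sketch: summarize an error response using its `exception` and `message` values.
// `describeError` is a hypothetical helper, not part of @datafire/geneea.
function describeError(status, bodyText) {
  if (status < 400) return null; // not an error response
  const side = status < 500 ? 'client-side' : 'server-side';
  let exception = 'unknown';
  let message = bodyText;
  try {
    const body = JSON.parse(bodyText);
    if (body && typeof body === 'object') {
      if (body.exception) exception = body.exception;
      if (body.message) message = body.message;
    }
  } catch (e) {
    // The body was not JSON; keep the raw text as the message.
  }
  return `${status} (${side} error) ${exception}: ${message}`;
}

// The exception name below is invented for the example.
console.log(describeError(400, '{"exception":"ExampleException","message":"Missing text"}'));
```
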
<h2>URL limitations</h2>
<p>
All the requests are semantically <code>GET</code>. However, for longer texts, you may run into
URL length limits. Therefore, it is always possible to issue a <code>POST</code> request instead, with all
the parameters encoded as JSON in the request body.
</p>
<p>Example:</p>
<pre><code>
POST /s1/sentiment
Content-Type: application/json

{"text":"There is no harm in being sometimes wrong - especially if one is promptly found out."}
</code></pre>
<p>This is equivalent to <code>GET /s1/sentiment?text=There%20is%20no%20harm...</code></p>
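<p>The equivalence between the two request forms can be sketched as a small helper that prefers <code>GET</code> and falls back to <code>POST</code> for long texts. <code>buildRequest</code> and the <code>MAX_URL_LENGTH</code> threshold are assumptions made for illustration; the source does not state an exact URL length limit.</p>

```javascript
// Assumed threshold; the actual URL length limit is not documented here.
const MAX_URL_LENGTH = 2000;

// Hypothetical helper: describes the request as GET when the URL is short
// enough, otherwise as POST with the same parameters encoded as JSON.
function buildRequest(path, params) {
  const query = Object.keys(params)
    .map(key => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join('&');
  const url = `${path}?${query}`;
  if (url.length <= MAX_URL_LENGTH) {
    return { method: 'GET', url: url };
  }
  return {
    method: 'POST',
    url: path,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(params),
  };
}

console.log(buildRequest('/s1/sentiment', { text: 'There is no harm' }));
```
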
<h2>Request limitations</h2>
<p>
The API has other limitations concerning the size of the HTTP requests. The maximum allowed size of any
POST request body is <em>512 KiB</em>. For requests with a URL resource, the maximum number of
characters extracted from each such resource is <em>100,000</em>.
</p>
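<p>A client can check the body size before sending. The sketch below measures the serialized JSON in UTF-8 bytes; <code>fitsPostLimit</code> is a hypothetical helper, and treating the 512 KiB limit as inclusive is an assumption.</p>

```javascript
// 512 KiB, taken from the request limit described above.
const MAX_POST_BODY_BYTES = 512 * 1024;

// Hypothetical helper: reports the UTF-8 size of the JSON body and whether
// it stays within the limit (assumed to be inclusive).
function fitsPostLimit(params) {
  const body = JSON.stringify(params);
  const bytes = Buffer.byteLength(body, 'utf8');
  return { bytes: bytes, ok: bytes <= MAX_POST_BODY_BYTES };
}

console.log(fitsPostLimit({ text: 'Hello world!' }));
```

The size is counted in bytes rather than characters because non-ASCII text takes more than one byte per character in UTF-8.
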
<h2>Terms of Service</h2>
<p>
By using the API, you agree to our
<a href="https://www.geneea.com/terms.html" target="_blank">Terms of Service Agreement</a>.
</p>
<h2>More information</h2>
<p>
<a href="https://help.geneea.com/index.html" target="_blank">
The Interpretor Public Documentation
</a>
</p>
Actions
getInfo
geneea.getInfo(null, context)
Input
This action has no parameters
Output
correctionGet
Possible options: An optional parameter diacritize, with values yes, no, or auto, indicates whether text diacritization will be performed. The default value is auto.
geneea.correctionGet({}, context)
Input
- input object
  - id string: document ID
  - text string: raw document text
  - url string: document URL
  - extractor string (values: default, article, keep-everything): document extractor
  - language string: document language
  - returnTextInfo boolean
Output
correctionPost
Notes:
- JSON strings cannot contain raw newline characters; these have to be escaped as \n. (See also the Interpretor documentation.)
- Fields text and url are mutually exclusive.
Examples:
- {"text": "Hello world!"}
- {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}
Possible options:
- An optional parameter diacritize, with values yes, no, or auto, indicates whether text diacritization will be performed. The default value is auto.
geneea.correctionPost({}, context)
Input
- input object
  - body Request
Output
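
As a minimal sketch of preparing input for correctionPost, the helper below validates the diacritize value and wraps the text in a body object. `buildCorrectionInput` is hypothetical, and placing diacritize under the Request's `options` field is an assumption based on that field's description as key-value pairs; check the Interpretor documentation before relying on it.

```javascript
// Hypothetical helper: builds the input object for geneea.correctionPost.
// ASSUMPTION: `diacritize` is passed inside the Request's `options` object.
function buildCorrectionInput(text, diacritize) {
  const allowed = ['yes', 'no', 'auto'];
  const mode = diacritize === undefined ? 'auto' : diacritize;
  if (!allowed.includes(mode)) {
    throw new Error(`diacritize must be one of: ${allowed.join(', ')}`);
  }
  return { body: { text: text, options: { diacritize: mode } } };
}

const correctionInput = buildCorrectionInput('first line\nsecond line', 'yes');
// JSON.stringify escapes the newline as \n, so the serialized body is valid JSON.
console.log(JSON.stringify(correctionInput.body));
```

Serializing with JSON.stringify rather than concatenating strings by hand takes care of the newline escaping mentioned in the notes.
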
entitiesGet
geneea.entitiesGet({}, context)
Input
- input object
  - id string: document ID
  - text string: raw document text
  - url string: document URL
  - extractor string (values: default, article, keep-everything): document extractor
  - language string: document language
  - returnTextInfo boolean
Output
- output EntitiesResponse
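
To illustrate how an EntitiesResponse might be consumed, the sketch below groups entity names by their detected type. `groupEntitiesByType` is a hypothetical helper, and the sample response is invented data shaped after the Entity definition, not real API output.

```javascript
// Hypothetical helper: maps each entity type to the list of entity names.
function groupEntitiesByType(response) {
  const groups = new Map();
  for (const item of response.entities) {
    if (!groups.has(item.type)) groups.set(item.type, []);
    groups.get(item.type).push(item.entity);
  }
  return groups;
}

// Invented sample shaped like an EntitiesResponse.
const sampleEntities = {
  language: 'en',
  entities: [
    { entity: 'Prague', type: 'location', textOffset: 0, links: {} },
    { entity: 'Brno', type: 'location', textOffset: 11, links: {} },
  ],
};
console.log(groupEntitiesByType(sampleEntities));
```
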
entitiesPost
Notes:
- JSON strings cannot contain raw newline characters; these have to be escaped as \n. (See also the Interpretor documentation.)
- Fields text and url are mutually exclusive.
Examples:
- {"text": "Hello world!"}
- {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}
geneea.entitiesPost({}, context)
Input
- input object
  - body Request
Output
- output EntitiesResponse
lemmatizeGet
geneea.lemmatizeGet({}, context)
Input
- input object
  - id string: document ID
  - text string: raw document text
  - url string: document URL
  - extractor string (values: default, article, keep-everything): document extractor
  - language string: document language
  - returnTextInfo boolean
Output
- output LemmatizeResponse
lemmatizePost
Notes:
- JSON strings cannot contain raw newline characters; these have to be escaped as \n. (See also the Interpretor documentation.)
- Fields text and url are mutually exclusive.
Examples:
- {"text": "Hello world!"}
- {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}
geneea.lemmatizePost({}, context)
Input
- input object
  - body Request
Output
- output LemmatizeResponse
sentimentGet
geneea.sentimentGet({}, context)
Input
- input object
  - id string: document ID
  - text string: raw document text
  - url string: document URL
  - extractor string (values: default, article, keep-everything): document extractor
  - language string: document language
  - returnTextInfo boolean
Output
- output SentimentResponse
sentimentPost
Notes:
- JSON strings cannot contain raw newline characters; these have to be escaped as \n. (See also the Interpretor documentation.)
- Fields text and url are mutually exclusive.
Examples:
- {"text": "Hello world!"}
- {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}
geneea.sentimentPost({}, context)
Input
- input object
  - body Request
Output
- output SentimentResponse
topicGet
geneea.topicGet({}, context)
Input
- input object
  - id string: document ID
  - text string: raw document text
  - url string: document URL
  - extractor string (values: default, article, keep-everything): document extractor
  - language string: document language
  - returnTextInfo boolean
Output
- output TopicResponse
topicPost
Notes:
- JSON strings cannot contain raw newline characters; these have to be escaped as \n. (See also the Interpretor documentation.)
- Fields text and url are mutually exclusive.
Examples:
- {"text": "Hello world!"}
- {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}
geneea.topicPost({}, context)
Input
- input object
  - body Request
Output
- output TopicResponse
status
geneea.status(null, context)
Input
This action has no parameters
Output
- output string
Definitions
EntitiesResponse
- EntitiesResponse object: Response for the named-entity recognition
  - entities (required) array: Found named entities in the document
    - items Entity
  - id string: Unique identifier of the document
  - language (required) string: The used language of the document
  - text string: The raw text of the document which has been analysed
Entity
- Entity object: The named entity
  - entity (required) string: Disambiguated and standardized form of the entity
  - links (required) object: Disambiguation links for the entity, e.g. its DBpedia page
  - sentiment number: Detected sentiment of the entity (value from -1.0 to 1.0)
  - textOffset (required) integer: Character offset in the text (starting from 0)
  - type (required) string: Detected type of the entity
Entry«string,long»
- Entry«string,long» object
  - key integer
Information_about_a_user_account.
- Information_about_a_user_account. object: Information about a user account.
  - remainingQuotas array: Remaining quotas for the user account.
    - items Entry«string,long»
  - type string: Type (plan) of the user account.
Label
- Label object: The topic label
  - confidence (required) number: Confidence (probability) of this label
  - label (required) string: The value of this label
LemmatizeResponse
- LemmatizeResponse object: Response for the lemmatization
  - id string: Unique identifier of the document
  - language (required) string: The used language of the document
  - lemmatizedText (required) string: Lemmatized text of the document, individual tokens are separated by a space and sentences are separated by a new-line character
  - text string: The raw text of the document which has been analysed
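
Since lemmatizedText separates tokens with spaces and sentences with new-line characters, it can be split into a nested array. The following is a minimal sketch; `splitLemmas` is a hypothetical helper and the sample string is invented.

```javascript
// Hypothetical helper: returns one array of lemma tokens per sentence.
function splitLemmas(lemmatizedText) {
  return lemmatizedText
    .split('\n')
    .filter(line => line.trim() !== '') // drop empty lines
    .map(line => line.trim().split(' ').filter(token => token !== ''));
}

// Invented sample in the documented format.
console.log(splitLemmas('the cat sit\nit be happy'));
```
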
Request
- Request object: Request encapsulation for simple API version 1
  - extractor string (values: default, article, keep-everything): [optional] Text extractor to be used when analyzing an HTML document
  - id string: [optional] Unique identifier of the document
  - language string: [optional] The language of the document, auto-detection will be used if omitted
  - options object: [optional] Additional options for the internal modules (key-value pairs)
  - returnTextInfo boolean: [optional] Indicates whether to return the source text within the response object
  - text string: The raw text to be analyzed, mutually exclusive with the 'url' parameter
  - url string: URL of a document to be analysed, mutually exclusive with the 'text' parameter
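
The constraints in the Request definition can be checked on the client before a call is made. The sketch below is one possible way to do that; `validateRequest` is a hypothetical helper, and requiring at least one of text or url is an assumption (the definition only states that the two are mutually exclusive).

```javascript
// Extractor values listed in the Request definition.
const EXTRACTORS = ['default', 'article', 'keep-everything'];

// Hypothetical helper: returns a list of problems found in a Request object.
function validateRequest(request) {
  const errors = [];
  const hasText = typeof request.text === 'string' && request.text !== '';
  const hasUrl = typeof request.url === 'string' && request.url !== '';
  if (hasText && hasUrl) errors.push("'text' and 'url' are mutually exclusive");
  // ASSUMPTION: a request with neither field is treated as invalid.
  if (!hasText && !hasUrl) errors.push("one of 'text' or 'url' is needed");
  if (request.extractor !== undefined && !EXTRACTORS.includes(request.extractor)) {
    errors.push(`unknown extractor: ${request.extractor}`);
  }
  return errors;
}

console.log(validateRequest({ text: 'Hello world!', url: 'https://en.wikipedia.org/wiki/Pyrrhuloxia' }));
```
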
Response_for_the_text_correction
- Response_for_the_text_correction object: Response for the text correction
  - corrected boolean
  - correctedText (required) string: Corrected text of the document
  - diacritized boolean
  - id string: Unique identifier of the document
  - language (required) string: The used language of the document
  - text string: The raw text of the document which has been analysed
SentimentResponse
- SentimentResponse object: Response for the sentiment analysis
  - id string: Unique identifier of the document
  - language (required) string: The used language of the document
  - sentiment (required) number: Detected sentiment of the document (value from -1.0 to 1.0)
  - text string: The raw text of the document which has been analysed
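
The sentiment value ranges from -1.0 to 1.0, so an application may want to map it to a coarse label. The sketch below shows one way; `sentimentLabel` is a hypothetical helper and the 0.1 neutral band is an arbitrary assumption, not a threshold defined by Geneea.

```javascript
// Hypothetical helper: maps a sentiment score to a coarse label.
// ASSUMPTION: scores within +/- threshold count as neutral.
function sentimentLabel(value, threshold = 0.1) {
  if (!Number.isFinite(value) || value < -1 || value > 1) {
    throw new RangeError('sentiment must be a number from -1.0 to 1.0');
  }
  if (value > threshold) return 'positive';
  if (value < -threshold) return 'negative';
  return 'neutral';
}

console.log(sentimentLabel(0.6));
```
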
TopicResponse
- TopicResponse object: Response for the topic detection
  - confidence (required) number: Confidence for the detected topic
  - id string: Unique identifier of the document
  - labels (required) array: Probabilistic distribution over possible topic labels
    - items Label
  - language (required) string: The used language of the document
  - text string: The raw text of the document which has been analysed
  - topic (required) string: Detected topic of the document
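
Because labels holds a distribution over possible topics, it can be useful to list the most likely ones. The sketch below sorts the labels by confidence; `topLabels` is a hypothetical helper and the sample response is invented data shaped after the TopicResponse definition.

```javascript
// Hypothetical helper: returns the `count` most confident labels as strings.
function topLabels(response, count) {
  return response.labels
    .slice() // copy so the response object is not mutated
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, count)
    .map(item => `${item.label} (${(item.confidence * 100).toFixed(1)}%)`);
}

// Invented sample shaped like a TopicResponse.
const sampleTopics = {
  topic: 'finance',
  confidence: 0.7,
  language: 'en',
  labels: [
    { label: 'sports', confidence: 0.2 },
    { label: 'finance', confidence: 0.7 },
    { label: 'culture', confidence: 0.1 },
  ],
};
console.log(topLabels(sampleTopics, 2));
```
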