compromise-numbers
v1.4.0
plugin for nlp-compromise
const nlp = require('compromise')
nlp.extend(require('compromise-numbers'))
let doc = nlp('I’d like to request seventeen dollars for a push broom rebristling')
doc.numbers().debug()
// 17
API
- .numbers() - grab all written and numeric values
- .numbers().get() - retrieve the parsed number(s)
- .numbers().json() - overloaded output with number metadata
- .numbers().fractions() - things like 1/3rd
- .numbers().toText() - convert number to five or fifth
- .numbers().toNumber() - convert number to 5 or 5th
- .numbers().toOrdinal() - convert number to fifth or 5th
- .numbers().toCardinal() - convert number to five or 5
- .numbers().add(n) - increase number by n
- .numbers().subtract(n) - decrease number by n
- .numbers().increment() - increase number by 1
- .numbers().decrement() - decrease number by 1
- .numbers().isEqual(n) - return numbers with this value
- .numbers().greaterThan(min) - return numbers bigger than min
- .numbers().lessThan(max) - return numbers smaller than max
- .numbers().between(min, max) - return numbers between min and max
- .numbers().isOrdinal() - return only ordinal numbers
- .numbers().isCardinal() - return only cardinal numbers
- .numbers().toLocaleString() - add commas, or nicer formatting for numbers
- .numbers().normalize() - split apart numbers and units: 20mins -> 20 mins
- .money() - like $5.50 or '5 euros'
- .money().get() - retrieve the parsed amount(s) of money
- .money().json() - currency + number info
- .money().currency() - which currency the money is in
- .fractions() - like '2/3rds' or 'one out of five'
- .fractions().get() - simple numerator, denominator data
- .fractions().json() - json method overloaded with fractions data
- .fractions().toDecimal() - '2/3' -> '0.66'
- .fractions().normalize() - 'four out of 10' -> '4/10'
- .fractions().toText() - '4/10' -> 'four tenths'
- .fractions().toPercentage() - '4/10' -> '40%'
- .percentages() - like '2.5%'
- .percentages().get() - return the percentage number / 100
- .percentages().json() - json overloaded with percentage information
- .percentages().toFraction() - '80%' -> '8/10'
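The fraction conversions listed above come down to simple arithmetic on a numerator and a denominator. The following is a minimal sketch of that arithmetic in plain JavaScript. The helper names `parseFraction`, `toDecimal`, and `toPercentage` are hypothetical and are not the plugin's internals, and the plugin's own rounding may differ from what is shown here.

```javascript
// Hypothetical helpers for illustration only; not part of compromise-numbers.

// Parse a numeric fraction such as '4/10' or '2/3rds'.
// Returns null for anything else (written-out fractions are not handled here).
function parseFraction(str) {
  const m = str.trim().match(/^(\d+)\/(\d+)(?:sts?|nds?|rds?|ths?)?$/)
  if (!m) return null
  const numerator = Number(m[1])
  const denominator = Number(m[2])
  if (denominator === 0) return null // avoid dividing by zero
  return { numerator, denominator }
}

// Numerator divided by denominator, as a plain number.
function toDecimal(f) {
  return f.numerator / f.denominator
}

// Percentage string, rounded to the nearest whole percent (an assumption).
function toPercentage(f) {
  return `${Math.round((f.numerator * 100) / f.denominator)}%`
}

const f = parseFraction('4/10')
console.log(f, toDecimal(f), toPercentage(f))
```

Multiplying by 100 before dividing keeps the percentage exact for cases like 4/10, where dividing first could introduce floating-point noise.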
Opinions:
If a number is changed within a sentence, the plugin attempts to keep the sentence in agreement, adjusting both a leading determiner and the plurality of a following noun. This is done conservatively, but it may have subtle or unintended effects for some applications.
Money, fractions, and percentages will be returned and work fine in .numbers(), but can be isolated with .money(), .fractions(), and .percentages().
Fractions
.fractions() will parse things like '1/3', 'one out of three', and 'one third'.
It will not pluck the fraction from the end of a number, like 'six and one third'. In that case, 'one third' will still have a #Fraction tag.
Things can get pretty crazy - and there are some human-ambiguous fractions like 'five hundred thousandths'. In these cases it tries its best.
Attempts are also made to avoid conversational fractions, like 'half time show' or dates like '3rd quarter 2020'.
Money
- ambiguous currencies: many currency symbols are reused by different countries. We try to make some safe assumptions about this. compromise-numbers assumes a naked $ is USD, £ is GBP, ₩ is South Korean won, and 'kr' is Swedish krona. Configuring this should be possible in future versions.
- decimal currencies: nlp('five cents').money().get(0) will return 0.05 (like it should), but .numbers().get() will return 5. This is a tricky thing that we should solve, somehow.
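Until the currency assumptions are configurable, an application that needs different defaults can resolve symbols itself. This is a minimal sketch in plain JavaScript. The table and the `guessCurrency` function are hypothetical. They mirror the defaults described above using ISO 4217 codes but are not the plugin's internal data or API.

```javascript
// Hypothetical lookup mirroring the defaults described in this README.
// NOT the plugin's internal table; for illustration only.
const DEFAULT_CURRENCIES = new Map([
  ['$', 'USD'],
  ['£', 'GBP'],
  ['₩', 'KRW'], // South Korean won
  ['kr', 'SEK'], // Swedish krona
])

// Resolve a currency symbol to a currency code.
// `overrides` lets the caller replace a default, e.g. { '$': 'CAD' }.
// Override keys are expected in lowercase.
function guessCurrency(symbol, overrides = {}) {
  const table = new Map([...DEFAULT_CURRENCIES, ...Object.entries(overrides)])
  const key = symbol.trim().toLowerCase()
  return table.has(key) ? table.get(key) : null
}

console.log(guessCurrency('$')) // default assumption
console.log(guessCurrency('$', { '$': 'CAD' })) // caller-supplied override
console.log(guessCurrency('kr', { kr: 'NOK' }))
```

A `Map` is used instead of a plain object so that lookups for strings like 'constructor' cannot accidentally hit inherited properties.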
Years and Time
Times like 5pm are parsed and handled by compromise-dates and are not returned by .numbers().
In particular, #Year tags are applied to numbers in a delicate way.
Decimal separators
compromise-numbers uses the period as the decimal point and supports the comma as a thousands separator. Some European or Latin American number formats, such as comma decimals or space-separated thousands, do not parse properly.
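One possible workaround, when the input is known to use comma decimals, is to rewrite each number into the period-decimal form before handing the text to the parser. This is a minimal sketch in plain JavaScript. `fromCommaDecimal` is a hypothetical helper, not part of the plugin, and it assumes the caller already knows the number is in comma-decimal format, since a string like '1,234' is ambiguous on its own.

```javascript
// Hypothetical pre-processing helper; not part of compromise-numbers.
// Assumes `raw` is a single number written with a comma decimal and
// periods or spaces as thousands separators, e.g. '1.234,56' or '1 234,56'.
function fromCommaDecimal(raw) {
  // Drop thousands separators: periods and any whitespace
  // (\s also matches no-break spaces in JavaScript).
  const noGroups = raw.trim().replace(/[.\s]/g, '')
  // The first comma becomes the decimal point.
  const normalized = noGroups.replace(',', '.')
  // Reject anything that is not a plain number after conversion.
  return /^-?\d+(\.\d+)?$/.test(normalized) ? normalized : null
}

console.log(fromCommaDecimal('1.234,56'))
console.log(fromCommaDecimal('1 234,56'))
console.log(fromCommaDecimal('12,5'))
```

The helper returns a string rather than a number so the result can be substituted back into the surrounding text before parsing.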
Serial numbers
Attempts are made to exclude phone numbers, postal codes, and credit-card numbers from .numbers() results, but there may be numbers used in other ways that are not accounted for.
work in progress!
MIT