@chirrapp/sbd
v2.1.0
Split text into sentences using a vanilla sentence boundary detection (SBD) algorithm
Sentence boundary detection
The library is a fork of @Tessmore's sbd. Unlike the original, the fork focuses on a single use case and removes the extra options.
Split text into sentences with the vanilla strategy (i.e., one that works about 95% of the time).
- Splits text on periods, question marks, and exclamation marks.
- Skips (most) abbreviations (Mr., Mrs., PhD.).
- Skips numbers and currency.
- Skips URLs, websites, email addresses, and phone numbers.
- Counts an ellipsis and ?! as a single punctuation mark.
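To make the rules above concrete, here is a minimal illustrative sketch of a vanilla splitter. It is not the library's implementation: `naiveSentences` and the `ABBREVIATIONS` list are hypothetical names invented for this example, and the sketch ignores URLs, email addresses, and phone numbers entirely.

```javascript
// Hypothetical abbreviation list for this sketch only;
// the library's real list is not shown in this README.
const ABBREVIATIONS = new Set(["mr", "mrs", "ms", "dr", "sen", "jan", "phd"]);

// Naive splitter: a boundary is a run of ".", "?" or "!" followed by
// whitespace and an uppercase ASCII letter, unless the word before a
// single period is a known abbreviation.
function naiveSentences(text) {
  const result = [];
  const boundary = /([.?!]+)\s+/g; // a run like "..." or "?!" is one match
  let start = 0;
  let match;

  while ((match = boundary.exec(text)) !== null) {
    const punctuation = match[1];
    const next = text[boundary.lastIndex]; // first character after the whitespace
    const lastWord = text.slice(start, match.index).split(/\s+/).pop();

    const isAbbreviation =
      punctuation === "." && ABBREVIATIONS.has(lastWord.toLowerCase());
    const startsNewSentence = next !== undefined && /[A-Z]/.test(next);

    if (!isAbbreviation && startsNewSentence) {
      result.push(text.slice(start, match.index + punctuation.length));
      start = boundary.lastIndex;
    }
  }

  // Whatever is left after the last boundary is the final sentence.
  const rest = text.slice(start).trim();
  if (rest.length > 0) {
    result.push(rest);
  }
  return result;
}

console.log(
  naiveSentences(
    "Mr. Smith paid $3.50 for coffee. Was it worth it?! He thinks so... Maybe."
  )
);
// Prints four sentences:
//   "Mr. Smith paid $3.50 for coffee."
//   "Was it worth it?!"
//   "He thinks so..."
//   "Maybe."
```

The decimal point in `$3.50` is never treated as a boundary because it is not followed by whitespace, which is roughly how a number or currency value can be skipped. Sentences that start with a digit, a quote, or a lowercase letter are missed by this sketch, which is one reason a vanilla strategy is only right most of the time.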
Installation
The library is available as an npm package published to the GitHub Registry. To install @chirrapp/sbd, run:
npm install @chirrapp/sbd --save
# Or using Yarn:
yarn add @chirrapp/sbd
Usage
import { sentences } from "@chirrapp/sbd";
sentences(
"On Jan. 20, former Sen. Barack Obama became the 44th President of the U.S. Millions attended the Inauguration."
);
//=> [
//=> "On Jan. 20, former Sen. Barack Obama became the 44th President of the U.S.",
//=> "Millions attended the Inauguration.",
//=> ]