@b08/tokenize
v1.0.2
basic tokenize function
@b08/tokenize, seeded from @b08/library-seed, library type: feature
Basic tokenize function that splits a content string into known and unknown tokens. Longer known tokens are prioritized over shorter ones.
usage
The tokenize function returns an array of objects, each holding a token and its position (zero-based start index) in the content string.
import { tokenize } from "@b08/tokenize";
const result = tokenize("my content", ["con", " "]);
// [
// {token: "my", position: 0},
// {token: " ", position: 2},
// {token: "con", position: 3},
// {token: "tent", position: 6}
// ]
The tokenizePlain function returns a plain array of token strings, without positions.
import { tokenizePlain } from "@b08/tokenize";
const result = tokenizePlain("some content", ["ten", "t", " "]);
// [ "some", " ", "con", "ten", "t" ]
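To make the "longer tokens are prioritized" rule concrete, here is a minimal sketch of how such a tokenizer could work. It is not the package's actual source: the names sketchTokenize, sketchTokenizePlain and PositionedToken are hypothetical, and the approach (sort known tokens by length, scan left to right, collect unmatched characters into unknown tokens) is an assumption that happens to reproduce the two examples above.

```typescript
// Hypothetical shape of one result entry; mirrors the README output.
interface PositionedToken {
  token: string;
  position: number;
}

// Illustrative re-implementation, NOT the code shipped in @b08/tokenize.
function sketchTokenize(content: string, known: string[]): PositionedToken[] {
  // Try longer known tokens first, so "ten" wins over "t" at the same index.
  // filter() returns a copy, so the caller's array is not reordered.
  const byLength = known
    .filter(t => t.length > 0)
    .sort((a, b) => b.length - a.length);

  const result: PositionedToken[] = [];
  let unknownStart = 0; // start index of the pending unknown run
  let i = 0;

  while (i < content.length) {
    const match = byLength.find(t => content.startsWith(t, i));
    if (match === undefined) {
      i++; // character belongs to an unknown token
      continue;
    }
    if (i > unknownStart) {
      // Flush the unknown text collected before this known token.
      result.push({ token: content.slice(unknownStart, i), position: unknownStart });
    }
    result.push({ token: match, position: i });
    i += match.length;
    unknownStart = i;
  }

  if (unknownStart < content.length) {
    // Trailing unknown text after the last known token.
    result.push({ token: content.slice(unknownStart), position: unknownStart });
  }
  return result;
}

// Plain variant: same scan, positions dropped.
function sketchTokenizePlain(content: string, known: string[]): string[] {
  return sketchTokenize(content, known).map(t => t.token);
}

console.log(sketchTokenizePlain("some content", ["ten", "t", " "]));
// [ "some", " ", "con", "ten", "t" ]
```

Sorting by length before scanning is what makes "ten" match at index 8 of "some content" even though "t" also matches there. The real package may resolve ties or empty tokens differently.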