© 2024 – Pkg Stats / Ryan Hefner

markovian-nlp

v7.0.4

NLP tools to generate Markov sentences & models.

Downloads

64

Quick start

Because this is an isomorphic JavaScript package, clients, servers, and bundlers can each start using the library in several ways. Some of these methods do not require installation.

RunKit

RunKit provides one of the easiest ways to get started, with nothing to install.

CodePen

Declare imports in the JS section to get started:

import {
  ngramsDistribution,
  sentences,
} from 'https://unpkg.com/markovian-nlp@latest?module';
// sentences() returns an array of generated strings
const sentence = sentences({ document: 'oh me, oh my' });
console.log(sentence);
// example output: ['oh me oh me oh my']

Browsers

Insert the following element within the <head> tag of an HTML document:

<script src="https://unpkg.com/markovian-nlp@latest"></script>

After the script is loaded, the markovian browser global is exposed:

const sentence = markovian.sentences({ document: 'oh me, oh my' });
console.log(sentence);
// example output: ['oh me oh me oh my']

Node.js

With npm installed, run the following terminal command:

npm i markovian-nlp

Once installed, declare method imports at the top of each JavaScript file in which they will be used.

ES2015

Recommended

import {
  ngramsDistribution,
  sentences,
} from 'markovian-nlp';

CommonJS

const {
  ngramsDistribution,
  sentences,
} = require('markovian-nlp');

Usage

Markov text generation

Generate text sentences from a Markov process.

Potential applications: Natural language generation

Generate sentences

Optionally providing a seed generates deterministic sentences.

In this example, document is text from this source:

sentences({
  count: 3,
  document: 'That there is constant succession and flux of ideas in our minds...',
  seed: 1,
});

// output: [
//   'i would promote introduce a constant succession and hindering the path...',
//   'he that train they seem to be glad to be done as may be avoided of our thoughts...',
//   'this wandering of attention and yet for ought i know this wandering thoughts i would promote...',
// ]

View n-grams distribution

View the n-grams distribution of text.

Potential applications: Markov models

ngramsDistribution('birds have featured in culture and art since prehistoric times');

// output: {
//   and: { _end: 0, _start: 0, art: 1 },
//   art: { _end: 0, _start: 0, since: 1 },
//   birds: { _end: 0, _start: 1, have: 1 },
//   culture: { _end: 0, _start: 0, and: 1 },
//   featured: { _end: 0, _start: 0, in: 1 },
//   have: { _end: 0, _start: 0, featured: 1 },
//   in: { _end: 0, _start: 0, culture: 1 },
//   prehistoric: { _end: 0, _start: 0, times: 1 },
//   since: { _end: 0, _start: 0, prehistoric: 1 },
//   times: { _end: 1, _start: 0 },
// }

Each number represents the sum of occurrences.

startgram | endgram | bigrams
--------- | ------- | -------
"birds" | "times" | all remaining keys ("have featured", "featured in", etc.)
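
To make the shape of that output concrete, here is a minimal sketch of how such a distribution could be counted for a single sentence. It is not the library's implementation: sketchDistribution is a hypothetical helper, it assumes plain whitespace tokenization, and it does not sort keys or split text into multiple sentences.

```javascript
// Hypothetical helper, not part of markovian-nlp.
// Counts, for each word, how often it starts the text, ends the text,
// and is followed by each other word.
function sketchDistribution(text) {
  // Assumption: lowercase + whitespace split stands in for real tokenization.
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  // Null-prototype objects avoid clashes with names like "constructor".
  const distribution = Object.create(null);

  words.forEach((word, index) => {
    if (!distribution[word]) {
      distribution[word] = Object.assign(Object.create(null), { _end: 0, _start: 0 });
    }
    const entry = distribution[word];
    if (index === 0) entry._start += 1;
    if (index === words.length - 1) entry._end += 1;

    const next = words[index + 1];
    if (next !== undefined) entry[next] = (entry[next] || 0) + 1;
  });

  return distribution;
}

const birds = sketchDistribution(
  'birds have featured in culture and art since prehistoric times'
);
console.log(birds.birds, birds.times);
```

Under these assumptions, the entry for "birds" carries the start count and the entry for "times" carries the end count, mirroring the table above.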

API

ngramsDistribution(document || ngramsDistribution)

ngramsDistribution(Array(document || ngramsDistribution[, ...]))

Input

type | description
---- | -----------
String | document (corpus or text)
Object | ngramsDistribution (equivalent to identity, i.e.: this method's output)
Array[Strings...] | combine multiple document
Array[Objects...] | combine multiple ngramsDistribution
Array[Strings, Objects...] | combine multiple document and ngramsDistribution

Return value

type | description
---- | -----------
Object | distributions of unigrams to startgrams, endgrams, and following bigrams

// pseudocode signature representation (does not run)
ngramsDistribution(document) => ({
  ...unigrams: {
    ...{ ...bigram: bigramsDistribution },
    _end: endgramsDistribution,
    _start: startgramsDistribution,
  },
});
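
The array forms of the input combine several documents or distributions into one. The source does not spell out how counts are combined, so the sketch below assumes the simplest reading: counts are summed key by key. mergeDistributions is a hypothetical helper, not a library export.

```javascript
// Hypothetical helper, not part of markovian-nlp.
// Assumption: "combining" distributions means summing matching counts.
function mergeDistributions(distributions) {
  const merged = Object.create(null);
  for (const distribution of distributions) {
    for (const [word, counts] of Object.entries(distribution)) {
      if (!merged[word]) merged[word] = Object.create(null);
      for (const [key, count] of Object.entries(counts)) {
        merged[word][key] = (merged[word][key] || 0) + count;
      }
    }
  }
  return merged;
}

// Two small hand-written distributions in the documented shape.
const first = { oh: { _end: 0, _start: 1, me: 1 }, me: { _end: 1, _start: 0 } };
const second = { oh: { _end: 0, _start: 1, my: 1 }, my: { _end: 1, _start: 0 } };
console.log(mergeDistributions([first, second]));
```

If the library weights or normalizes inputs differently, the real output would differ from this sum.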

sentences({ distribution || document[, count][, seed] })

Input

user-defined parameter | type | optional | default value | implements | description
---------------------- | ---- | -------- | ------------- | ---------- | -----------
options.count | Number | true | 1 | | Number of sentences to output.
options.distribution | Object | required if options.document omitted | | | n-grams distribution used in place of text.
options.document | String | required if options.distribution omitted | | compromise(document) | Text used in place of n-grams distribution.
options.seed | Number | true | undefined | Chance(seed) | Leave undefined (default) for nondeterministic results, or specify seed for deterministic results.

Return value

type | description
---- | -----------
Array[Strings...] | generated sentences
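
To illustrate how a seed can make generation repeatable, the following sketch walks a distribution as a Markov chain, picking each next word in proportion to its count. It is an illustration only, not the library's algorithm: sketchSentence, weightedPick, and mulberry32 are hypothetical names, and a small mulberry32 generator stands in for the Chance(seed) implementation listed in the table.

```javascript
// Small seeded PRNG (mulberry32); stands in for Chance(seed) in this sketch.
function mulberry32(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Picks one value from [value, weight] pairs, proportionally to weight.
function weightedPick(pairs, random) {
  const usable = pairs.filter(([, weight]) => weight > 0);
  if (usable.length === 0) return undefined;
  const total = usable.reduce((sum, [, weight]) => sum + weight, 0);
  let threshold = random() * total;
  for (const [value, weight] of usable) {
    threshold -= weight;
    if (threshold < 0) return value;
  }
  return usable[usable.length - 1][0];
}

const END = Symbol('end');

// Hypothetical helper: walks the distribution from a start word until an
// end is chosen or maxWords is reached.
function sketchSentence(distribution, seed, maxWords = 50) {
  const random = mulberry32(seed);
  const starts = Object.keys(distribution).map((w) => [w, distribution[w]._start]);
  let word = weightedPick(starts, random);
  const words = [];
  while (word !== undefined && word !== END && words.length < maxWords) {
    words.push(word);
    const { _end, _start, ...following } = distribution[word] || {};
    word = weightedPick([...Object.entries(following), [END, _end]], random);
  }
  return words.join(' ');
}

// Distribution for 'oh me oh my', written by hand in the documented shape.
const ohMy = {
  oh: { _end: 0, _start: 1, me: 1, my: 1 },
  me: { _end: 0, _start: 0, oh: 1 },
  my: { _end: 1, _start: 0 },
};
console.log(sketchSentence(ohMy, 1) === sketchSentence(ohMy, 1)); // same seed, same walk
```

Because every random choice comes from the seeded generator, repeating a seed repeats the sentence, which is the behavior the seed option describes.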

Glossary

Learn more about computational linguistics and natural language processing (NLP) on Wikipedia.

The following terms are used in the API documentation:

term | description
---- | -----------
bigram | 2-gram sequence
deterministic | repeatable, non-random
endgram | final gram in a sequence
n-gram | contiguous gram (word) sequence
startgram | first gram in a sequence
unigram | 1-gram sequence