markov-strings

v3.0.1

A Markov string generator

Note: this is the readme for markov-strings 3.x.x. The docs for the older 2.x.x are published separately.


Markov-strings

A simplistic Markov chain text generator. Give it an array of strings, and it will output a randomly generated string.

This module was created for the Twitter bot @BelgicaNews.

Prerequisites

Built and tested with Node.js 12.

Installing

npm install --save markov-strings

Usage

const Markov = require('markov-strings').default
// or, with ES modules:
// import Markov from 'markov-strings'

const data = [/* insert a few hundred or a few thousand sentences here */]

// Build the Markov generator
const markov = new Markov({ stateSize: 2 })

// Add data for the generator
markov.addData(data)

const options = {
  maxTries: 20, // Give up if I don't have a sentence after 20 tries (default is 10)

  // If you want to get seeded results, you can provide an external PRNG.
  prng: Math.random, // Default value if left empty

  // You'll often need to manually filter raw results to get something that fits your needs.
  filter: (result) => {
    return result.string.split(' ').length >= 5 && // At least 5 words
           result.string.endsWith('.')             // End sentences with a dot.
  }
}

// Generate a sentence
const result = markov.generate(options)
console.log(result)
/*
{
  string: 'lorem ipsum dolor sit amet etc.',
  score: 42,
  tries: 5,
  refs: [ an array of objects ]
}
*/

Markov-strings is built in TypeScript, and exports several types to help you. Take a look at the source to see how it works.

API

new Markov([options])

Create a generator instance.

options

{
  stateSize: number
}

The stateSize is the number of words in each "link" of the generated sentence. A value of 1 tends to output gibberish. 2 is a sensible default for most cases. 3 or more can produce good sentences if your corpus is large enough to allow it.
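To make the idea of a "link" more concrete, here is a minimal sketch of how a sentence breaks down into word groups of a given size. The function name buildStates is hypothetical, and this is only an illustration of the concept, not the library's internal code.

```javascript
// Illustrative only: split a sentence into overlapping word groups ("states")
// of the given size. This is NOT how markov-strings is implemented internally.
function buildStates(sentence, stateSize) {
  const words = sentence.split(' ')
  const states = []
  for (let i = 0; i + stateSize <= words.length; i++) {
    states.push(words.slice(i, i + stateSize).join(' '))
  }
  return states
}

console.log(buildStates('the quick brown fox jumps', 1))
// [ 'the', 'quick', 'brown', 'fox', 'jumps' ]
console.log(buildStates('the quick brown fox jumps', 2))
// [ 'the quick', 'quick brown', 'brown fox', 'fox jumps' ]
```

With larger groups, fewer sentences in the corpus share the same group, which is why a higher stateSize needs a bigger corpus to produce varied output.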

.addData(data)

To function correctly, the Markov generator needs its internal data to be correctly structured. .addData(data) lets you add raw data, which is automatically formatted to fit the internal structure.

You can call .addData(data) as often as you need, with new data each time (!). Calling .addData() several times with the same data is not recommended, because it skews the random generation of results.
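If your data arrives in batches that may overlap, one possible way to avoid feeding the same sentence twice is to filter each batch before passing it to .addData(). The sketch below is an assumption about how you might do that; createDeduper is a hypothetical helper, not part of markov-strings.

```javascript
// Hypothetical helper: remembers every sentence it has let through so far
// and only returns the ones it has not seen before.
function createDeduper() {
  const seen = new Set()
  return (sentences) => sentences.filter((sentence) => {
    if (seen.has(sentence)) return false
    seen.add(sentence)
    return true
  })
}

const onlyNew = createDeduper()
const firstBatch = onlyNew(['lorem ipsum', 'dolor sit amet', 'lorem ipsum'])
const secondBatch = onlyNew(['dolor sit amet', 'consectetur adipiscing'])
console.log(firstBatch)  // [ 'lorem ipsum', 'dolor sit amet' ]
console.log(secondBatch) // [ 'consectetur adipiscing' ]
// Each filtered batch could then be passed to markov.addData(...)
```

Whether dropping repeated sentences is desirable depends on your corpus, so treat this as a starting point.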

data

string[] | Array<{ string: string }>

data is an array of strings (sentences), or an array of objects. If you wish to use objects, each one must have a string attribute. The bigger the array, the better and more varied the results.

Examples:

[ 'lorem ipsum', 'dolor sit amet' ]

or

[
  { string: 'lorem ipsum', attr: 'value' },
  { string: 'dolor sit amet', attr: 'other value' }
]

The additional data passed with objects will be returned in the refs array of the generated sentence.

.generate([options])

Returns an object of type MarkovResult:

{
  string: string, // The resulting sentence
  score: number,  // A relative "score" based on the number of possible permutations. Higher is "better", but the actual value depends on your corpus
  refs: Array<{ string: string }>, // The array of references used to build the sentence
  tries: number   // The number of tries it took to output this result
}

The refs array will contain all objects that have been used to build the sentence. This can be useful for fetching metadata or computing statistics.
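As a rough illustration of computing statistics from refs, the sketch below tallies a custom property across the references of a result. The helper countRefAttrs is hypothetical, attr is simply the illustrative property used in the data example earlier, and sampleResult is a hand-written object shaped like a MarkovResult rather than real generator output.

```javascript
// Hypothetical helper: count how often each `attr` value appears in result.refs.
function countRefAttrs(result) {
  const counts = {}
  for (const ref of result.refs) {
    counts[ref.attr] = (counts[ref.attr] || 0) + 1
  }
  return counts
}

// Hand-written stand-in for the object returned by markov.generate()
const sampleResult = {
  string: 'lorem ipsum dolor sit amet',
  score: 2,
  tries: 1,
  refs: [
    { string: 'lorem ipsum', attr: 'value' },
    { string: 'dolor sit amet', attr: 'other value' },
    { string: 'lorem ipsum dolor', attr: 'value' }
  ]
}

console.log(countRefAttrs(sampleResult))
// { value: 2, 'other value': 1 }
```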

Since .generate() can potentially take several seconds or more, a non-blocking variant, .generateAsync(), is also available.

options

{
  maxTries: number, // The maximum number of attempts before giving up (default is 10)
  prng: Math.random, // An external Pseudo Random Number Generator if you want to get seeded results
  filter: (result: MarkovResult) => boolean // A callback to filter results (see example above)
}
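For seeded results, the prng option needs a function that behaves like Math.random, returning a float in [0, 1) on each call. One possible choice is mulberry32, a small, widely circulated seeded generator. It is not shipped with markov-strings, and the seed value 42 below is arbitrary.

```javascript
// mulberry32: a compact seeded PRNG. Each call returns a float in [0, 1).
// Not part of markov-strings; shown as one possible drop-in for Math.random.
function mulberry32(seed) {
  let a = seed | 0
  return function () {
    a = (a + 0x6d2b79f5) | 0
    let t = Math.imul(a ^ (a >>> 15), 1 | a)
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Two generators created with the same seed yield the same sequence
const first = mulberry32(42)
const second = mulberry32(42)
console.log(first() === second()) // true

// Assumed usage with the options described above:
// const result = markov.generate({ prng: mulberry32(42) })
```

Create a fresh generator from the same seed whenever you want to replay the same sequence of random numbers.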

.export() and .import(data)

You can export and import the corpus built by the generator. The exported data is a serializable object. If you serialize it (for example to store it), it must be deserialized before being re-imported.
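A minimal sketch of that round trip, assuming the exported object survives JSON serialization. The helpers corpusToJson and corpusFromJson are hypothetical names and rely only on the documented .export() and .import(data) methods.

```javascript
// Hypothetical helper: serialize the exported corpus to a JSON string,
// e.g. to write it to a file or a database.
function corpusToJson(markov) {
  return JSON.stringify(markov.export())
}

// Hypothetical helper: deserialize the JSON string first, then hand the
// resulting object to .import(), as the readme requires.
function corpusFromJson(markov, json) {
  markov.import(JSON.parse(json))
  return markov
}

// Assumed usage:
// const json = corpusToJson(markov)
// const restored = corpusFromJson(new Markov({ stateSize: 2 }), json)
```

Restoring a stored corpus this way would avoid calling .addData() again with the full data set on every start.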

Example use-case

Changelog

3.0.0

Refactoring to facilitate iterative construction of the corpus (multiple .addData() instead of a one-time buildCorpus()), and export/import of corpus internal data.

2.1.0

  • Add an optional prng parameter at generation to use a specific Pseudo Random Number Generator

2.0.4

  • Dependencies update

2.0.0

  • Refactoring with breaking changes
  • The constructor and generator take two different options objects
  • Most of generator options are gone, except filter and maxTries
  • Tests have been rewritten with jest, in TypeScript

1.5.0

  • Code rewritten in TypeScript. You can now import MarkovGenerator from 'markov-strings'

1.4.0

  • New filter() method, thanks @flpvsk

1.3.4 - 1.3.5

  • Dependencies update

1.3.3

  • Updated README. Version bump for npm

1.3.2

  • Fixed an infinite loop bug
  • Performance improvement

1.3.1

  • Updated README example
  • Removed a useless line

1.3.0

  • New feature: the generator now accepts arrays of objects, and tells the user which objects were used to build a sentence
  • Fixed all unit tests
  • Added a changelog

Running the tests

npm test