speech-builder

v2.2.0

Published

3 years ago

Utility to generate SSML markup for different voice platforms

fgnass

ssml voice alexa

speech-builder

Utility to build SSML documents for different voice platforms.

Motivation / Approach

Since the major voice platforms all support slightly different subsets of SSML, documents built for one platform often cause errors when used somewhere else.

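To address this issue, speech-builder accepts a list of feature flags as a configuration option. Tags that the target platform does not support are either omitted or replaced with a plain-text fallback.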

Basic Usage

const { ssml } = require('speech-builder');

const s = ssml().add('Hello').emphasis('world');

console.log(s.toString());
// <speak>Hello<emphasis>world</emphasis></speak>

Advanced Options

Base URI

SSML allows authors to specify URLs as relative paths, which are resolved against the xml:base attribute. This is especially useful for audio files, which need to be encoded differently for each platform. On platforms that don't support xml:base, speech-builder omits the attribute and resolves the URLs itself.

ssml({
  features: 'alexa',
  base: 'https://example.com/audio/16k'
});
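To illustrate what resolving a relative audio path against a base involves, here is a minimal sketch using Node's built-in WHATWG URL class. The resolveAudio helper and the file name intro.mp3 are hypothetical, and whether speech-builder resolves paths in exactly this way is an assumption.

```javascript
// Hypothetical helper, not part of speech-builder: resolves a relative
// path against a base using standard (RFC 3986) URL resolution.
function resolveAudio(base, src) {
  return new URL(src, base).href;
}

// With a trailing slash, the last path segment of the base is kept.
console.log(resolveAudio('https://example.com/audio/16k/', 'intro.mp3'));
// prints: https://example.com/audio/16k/intro.mp3

// Without a trailing slash, standard resolution replaces the last segment.
console.log(resolveAudio('https://example.com/audio/16k', 'intro.mp3'));
// prints: https://example.com/audio/intro.mp3
```

Under standard resolution rules, a base without a trailing slash loses its last segment, so it may be worth checking which form your setup expects.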

Lexicon

Speech synthesizers sometimes fail to pronounce words correctly, especially when mixing multiple languages. Using .phoneme() helps but can get quite cumbersome to write. As an alternative, you can provide a lexicon:

ssml({
  features: 'alexa',
  lexicon: {
    "Passquote": "paskvoːtə",
    "Lloris Hugo": { 
      ipa: "lɔʹris yˌgo", 
      sub: "Lohrieß Ügo"
    }
  }
});
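The idea behind a lexicon can be sketched as a lookup that wraps known words in phoneme markup. The applyLexicon helper below is hypothetical and only handles the ipa field. How speech-builder matches words internally, and how it uses the sub field, is not shown here.

```javascript
// Hypothetical helper, not part of speech-builder's API.
// Escapes characters that have a special meaning in regular expressions.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wraps every lexicon entry found in `text` in a <phoneme> tag.
// Assumption: entries are either an IPA string or an object with an `ipa` field.
function applyLexicon(text, lexicon) {
  // Longest keys first, so multi-word entries win over shorter ones.
  const keys = Object.keys(lexicon).sort((a, b) => b.length - a.length);
  if (keys.length === 0) return text;
  const pattern = new RegExp(keys.map(escapeRegExp).join('|'), 'g');
  return text.replace(pattern, (word) => {
    const entry = lexicon[word];
    const ipa = typeof entry === 'string' ? entry : entry.ipa;
    return `<phoneme alphabet="ipa" ph="${ipa}">${word}</phoneme>`;
  });
}

console.log(applyLexicon('Die Passquote steigt', { Passquote: 'paskvoːtə' }));
```

The sketch does not escape XML characters in the surrounding text. A real implementation would need to do that before inserting markup.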

Plain Text Output

Besides SSML markup, speech-builder can also output a plain-text representation in which <p> tags are converted into line breaks. This is useful for adding basic formatting on visual surfaces.

ssml().p('Hello').p('world!').toPlainText();
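As a rough illustration of the conversion, the following sketch turns paragraph ends into line breaks and strips the remaining tags. The ssmlToPlainText function is a hypothetical stand-in that works on a markup string. The library's actual output (for example, its handling of trailing line breaks) may differ.

```javascript
// Hypothetical stand-in for the plain-text conversion, not the library's code.
function ssmlToPlainText(markup) {
  return markup
    .replace(/<\/p>\s*/g, '\n')   // each closing </p> becomes a line break
    .replace(/<[^>]+>/g, '')      // drop all remaining tags
    .replace(/&lt;/g, '<')        // undo XML escaping
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&')       // ampersand last, to avoid double unescaping
    .trim();
}

console.log(ssmlToPlainText('<speak><p>Hello</p><p>world!</p></speak>'));
// prints "Hello" and "world!" on separate lines
```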

API

lang

Adds an xml:lang attribute to the current node. If not supported, this is a no-op.

Parameters

lang string?

addText

Adds text. Characters with special meaning in XML are properly escaped.

Parameters

text (string | number)
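The escaping step can be pictured as follows. This is a minimal sketch with a hypothetical escapeXml helper. Which characters speech-builder actually escapes is an assumption here.

```javascript
// Hypothetical helper: escapes the five characters with special meaning in XML.
function escapeXml(text) {
  return String(text)            // numbers are accepted and converted to strings
    .replace(/&/g, '&amp;')      // ampersand first, so later entities stay intact
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

console.log(escapeXml('Tom & Jerry <3'));
// prints: Tom &amp; Jerry &lt;3
```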

addToken

Like addText but prepends a space (unless it's the first token or the previous one already ends with whitespace).

Parameters

text (string | number)
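The spacing rule can be sketched on a plain string buffer. The appendToken function is hypothetical and only mirrors the rule described above, not the library's internals.

```javascript
// Hypothetical sketch of the spacing rule, operating on a plain string buffer.
function appendToken(buffer, text) {
  const token = String(text);
  // No space for the first token or when the buffer already ends with whitespace.
  const needsSpace = buffer.length > 0 && !/\s$/.test(buffer);
  return buffer + (needsSpace ? ' ' : '') + token;
}

let out = '';
out = appendToken(out, 'Hello');
out = appendToken(out, 'world');
console.log(out);
// prints: Hello world
```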

add

Like addToken but also accepts a function or an array.

Parameters

content any

sub

Adds a <sub> tag. If substitutions are not supported, the alias is added as text instead.

Parameters

text string
alias string?
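The fallback behaviour might look roughly like this. The renderSub function and the shape of the features object are assumptions made for illustration. Returning the text unchanged when no alias is given is also an assumption.

```javascript
// Hypothetical sketch of a feature-gated tag with a plain-text fallback.
// `features` is an assumed shape: { sub: boolean }.
function renderSub(text, alias, features) {
  if (alias === undefined) return text;           // assumption: nothing to substitute
  return features.sub
    ? `<sub alias="${alias}">${text}</sub>`       // platform supports <sub>
    : alias;                                      // otherwise speak the alias directly
}

console.log(renderSub('W3C', 'World Wide Web Consortium', { sub: true }));
console.log(renderSub('W3C', 'World Wide Web Consortium', { sub: false }));
```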

phoneme

Adds a <phoneme> tag. When an object with notations in different alphabets is passed as ph, the first one that is supported will be used. For platforms without phoneme support, the special sub alphabet can be used to generate a <sub> tag as fallback.

Parameters

text string
ph (string? | {})
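The alphabet selection described above can be sketched as follows. The renderPhoneme function, the supported list, and the choice of ipa as the default alphabet for plain strings are all assumptions, not the library's confirmed behaviour.

```javascript
// Hypothetical sketch: pick the first supported alphabet, else fall back to <sub>.
// `supported` is an assumed list of alphabets the target platform understands.
function renderPhoneme(text, ph, supported) {
  // Assumption: a plain string is treated as IPA notation.
  const notations = typeof ph === 'string' ? { ipa: ph } : ph;
  const alphabet = Object.keys(notations)
    .find((name) => name !== 'sub' && supported.includes(name));
  if (alphabet) {
    return `<phoneme alphabet="${alphabet}" ph="${notations[alphabet]}">${text}</phoneme>`;
  }
  if (notations.sub !== undefined) {
    return `<sub alias="${notations.sub}">${text}</sub>`;
  }
  return text; // nothing usable: keep the text as-is
}

const ph = { ipa: 'lɔʹris yˌgo', sub: 'Lohrieß Ügo' };
console.log(renderPhoneme('Lloris Hugo', ph, ['ipa']));
console.log(renderPhoneme('Lloris Hugo', ph, []));
```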

break

Adds a <break> tag. If not supported, this is a no-op.

Parameters

time (string? | number)

audio

Adds an <audio> tag. If not supported, the alt text is added as plain text.

Parameters

src (string | {src: string, alt: string?})
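Since src accepts either a string or an object, a first step is to normalize the argument. The sketch below shows one way to do that together with the alt-text fallback. The renderAudio function and the features shape are hypothetical.

```javascript
// Hypothetical sketch: normalize the argument, then render or fall back.
// `features` is an assumed shape: { audio: boolean }.
function renderAudio(src, features) {
  const audio = typeof src === 'string' ? { src } : src;
  if (!features.audio) {
    return audio.alt || '';                       // no <audio> support: alt text only
  }
  return audio.alt
    ? `<audio src="${audio.src}">${audio.alt}</audio>`
    : `<audio src="${audio.src}"/>`;
}

console.log(renderAudio({ src: 'chime.mp3', alt: 'A chime sounds.' }, { audio: true }));
console.log(renderAudio({ src: 'chime.mp3', alt: 'A chime sounds.' }, { audio: false }));
```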

emphasis

Adds an <emphasis> tag. If not supported, the text is added as-is.

Parameters

content any
level string?

p

Adds a <p> tag. If not supported, the text is added as-is.

Parameters

content any

s

Adds an <s> tag. If not supported, the text is added as-is.

Parameters

content any

w

Adds a <w> tag. If not supported, the text is added as-is.

Parameters

role string
content any

effect

Adds an <*:effect> tag. If not supported, the text is added as-is. NOTE: The namespace can be configured via the effect feature setting.

Parameters

name string
content any

sayAs

Adds a <say-as> tag. If not supported, the text is added as-is.

Parameters

interpretAs string
text (string | number)
format string?
detail (string? | number)

prosody

Adds a <prosody> tag. If not supported, the text is added as-is.

Parameters

attrs Object
text any
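To make the attrs parameter more concrete, here is a minimal sketch that serializes an attribute object into a tag. The renderProsody function and the supported flag are hypothetical, and the attribute values are only examples of SSML prosody settings.

```javascript
// Hypothetical sketch: serialize an attribute object into a <prosody> tag.
function renderProsody(attrs, text, supported) {
  if (!supported) return text;                    // fallback: text as-is
  const attrString = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${value}"`)
    .join('');
  return `<prosody${attrString}>${text}</prosody>`;
}

console.log(renderProsody({ rate: 'slow', pitch: '+2st' }, 'Hello', true));
// prints: <prosody rate="slow" pitch="+2st">Hello</prosody>
```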

toString

Returns the serialized SSML document.

Returns string

toPlainText

Returns the document without any markup. Paragraphs are turned into line breaks.

Returns string

replace

Makes the builder duck-type as a string, which is needed to support the Jovo framework.

Parameters

pattern (string | RegExp)
replacement (string | Function)
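Duck-typing as a string generally means exposing the string methods a caller expects. The sketch below uses a hypothetical StringLike class to show the pattern. How speech-builder implements replace, and which methods Jovo actually calls, are assumptions.

```javascript
// Hypothetical illustration of the duck-typing pattern, not the library's code.
class StringLike {
  constructor(markup) {
    this.markup = markup;
  }
  toString() {
    return this.markup;
  }
  // Delegates to String.prototype.replace on the serialized form.
  replace(pattern, replacement) {
    return this.toString().replace(pattern, replacement);
  }
}

const doc = new StringLike('<speak>Hello world</speak>');
console.log(doc.replace(/<\/?speak>/g, ''));
// prints: Hello world
```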

Adding variations

const { ssml } = require('speech-builder');
const { random, chance } = require('variation');

ssml()
  .add(random('hello', 'ciao', 'hola', 'salut'))
  .add(chance(0.5, 'beautiful'))
  .add('world');
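The snippet above suggests that random picks one of its arguments and chance includes a value with the given probability. That reading is an assumption, as the variation package's API is not documented here. The stand-ins below (pickOne and maybe, both hypothetical names) sketch that behaviour with an injectable random source so it can be tested deterministically.

```javascript
// Hypothetical stand-ins, not the `variation` package's implementation.
// `rng` returns a number in [0, 1), like Math.random.
function pickOne(options, rng = Math.random) {
  return options[Math.floor(rng() * options.length)];
}

// Returns `value` with the given probability, otherwise an empty string.
function maybe(probability, value, rng = Math.random) {
  return rng() < probability ? value : '';
}

const parts = [
  pickOne(['hello', 'ciao', 'hola', 'salut']),
  maybe(0.5, 'beautiful'),
  'world',
];
// Drop empty parts before joining, so no double spaces appear.
console.log(parts.filter(Boolean).join(' '));
```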

License

MIT
