@lautmaler/platform-audiocodes
Jovo plugin library for AudioCodes VoiceAI Connect
This library allows you to build voice apps for the AudioCodes VoiceAI Connect platform with the Jovo Framework. The Audiocodes API documentation is available here.
Usage
For context, a typical conversation flow is documented here.
As a matter of implementation, Audiocodes-side requests are handled by AudiocodesRequest.ts, which maps them to JovoInput objects. Handlers are responsible for processing the input and generating Output objects. This library provides implementations for several Output types in src/output.
Finally, AudiocodesPlatformOutputTemplateConverterStrategy maps the output to an AudiocodesResponse, which is sent back to the Audiocodes platform as JSON.
Your Jovo app needs to use a DBPlugin to store Jovo's dialogue state and session data between requests. The Audiocodes protocol does not allow sending this data within the requests and responses, as Jovo would normally do.
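For example, during local development you could use Jovo's FileDb integration; any Jovo DB plugin works. A minimal sketch (the file path is just an example):

// app.dev.ts: persist Jovo's dialogue state and session data between Audiocodes requests
import { FileDb } from '@jovotech/db-filedb';
import { app } from './app';

app.configure({
  plugins: [
    // File-based development database; swap in a production DB plugin for deployment
    new FileDb({
      pathToFile: '../db/db.json',
    }),
  ],
});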
Supported Output types
src/output contains all supported Audiocodes output implementations:
- MessageActivityOutput: a message type output, used to send a voice prompt to the user
- SessionConfigOutput: configures bot delay options. doc
- EndConversationOutput: ends the call. doc
- CallTransferOutput: transfers the call to another number. doc
- AbortPromptOutput: aborts currently playing prompts. doc
- PlayUrlActivityOutput: plays an audio file to the caller. doc
Using Output types
An intent handler that returns a bot response would look like this:
import { BaseComponent, Intents } from '@jovotech/framework';
// MessageActivityOutput is provided by this library (see src/output)
import { MessageActivityOutput } from '@lautmaler/platform-audiocodes';
class Component extends BaseComponent {
@Intents('HelpIntent')
helpHandler() {
return this.$send(MessageActivityOutput, {
message: '<speak> I can demo you some features.</speak>',
});
}
}
The message attribute of the MessageActivityOutput config is an SSML-enabled string which is sent to the user as a voice prompt.
Below is a more complex example where the call is transferred to a human operator. Two activities are sent to the user: a voice prompt using MessageActivityOutput and a call transfer request using CallTransferOutput. The latter takes extra activityParams:
class Component extends BaseComponent {
@Intents(['HumanOperatorIntent'])
humanOperator() {
this.$send(MessageActivityOutput, {
message: 'Ok. I will transfer you to a human.',
});
this.$send(CallTransferOutput, {
activityParams: {
transferTarget: 'sip:some-sip-target-number',
handoverReason: 'User requests to talk to a human',
},
});
return this.$redirect(CallTransferComponent);
}
}
Session Parameter Plugin
The AudiocodesPlatform automatically uses the AudiocodesSessionPlugin.
The AudiocodesSessionPlugin ensures that a sensible set of session parameters gets set upon starting a call (the notions of session parameters and call settings are used interchangeably here).
It also provides the $updateCallSettings method on the Jovo object, which lets you update session parameters explicitly.
At the start of the call the plugin sends a set of session parameters to the Audiocodes backend.
The platform config expects you to hand in a subset of mandatory session parameters under sessionConfig (they have no plausible defaults, so you have to decide). You can also set non-mandatory session parameters under the sessionConfig property.
import { App } from '@jovotech/framework';
import { AudiocodesPlatform } from '@lautmaler/platform-audiocodes';

const app = new App({
plugins: [
new AudiocodesPlatform({
fallbackLocale: 'de-DE',
sessionConfig: {
language: 'de-DE',
voiceName: 'de-DE-elkeNeural',
alternativeLanguages: [
{ language: 'en-US', voiceName: 'en-US-GuyNeural' },
{ language: 'de-DE', voiceName: 'de-DE-KatjaNeural' },
],
},
}),
],
});
The given sessionConfig overwrites values of the default config, or adds them if not present.
The default config defines the following parameters:
{
"botNoInputSpeech": "Einen Moment Bitte.",
"userNoInputTimeoutMS": 8000,
"userNoInputRetries": 3,
"userNoInputGiveUpTimeoutMS": 60000,
"enableSpeechInput": true,
"bargeIn": true,
"languageDetectionActivate": true,
"botNoInputRetries": 2,
"sendEventsToBot": [
"VoiceAiNotificationEvent.DIALOUT_INITIATED",
"VoiceAiNotificationEvent.NO_USER_INPUT",
"VoiceAiNotificationEvent.CONVERSATION_END"
],
"notifyParamChange": ["language"],
"sendDTMF": true,
"sendDTMFAsText": false,
"bargeInOnDTMF": true,
"dtmfCollect": true,
"dtmfCollectSubmitDigit": "#",
"dtmfCollectInterDigitTimeoutMS": 2000,
"dtmfCollectMaxDigits": 0
}
The available parameters are described below.
Please only use the this.$updateCallSettings(callSettings) method to update session parameters. The underlying mechanism merges the current session parameters with the update and
- syncs them with the data in this.$session at the end of the Jovo request cycle
- adds a session config output as the first element of the jovo.$output array

This way the new parameters are already in effect when the rest of the bot's outputs reach the Audiocodes backend.
Here is an example of updating sessionParams in a handler:
class Component extends BaseComponent {
@Handle()
welcome() {
return this.$updateCallSettings({
language: 'de-DE',
voiceName: 'new-voice',
});
}
}
Configurable session and activity parameters
Certain session and activity parameters are configurable by controlling attribute values of Output classes (see the Supported Output types section above for a list of supported output types).
A configuration change can be made for the entire call duration via the this.$updateCallSettings(callSettings) method (within one of your handlers), which manipulates this.$session.sessionParams and takes care of updating the Audiocodes backend.
You can also change the configuration just for the next turn of the dialogue by setting activityParams of the output classes you are using in the previous turn for the bot response.
activityParams and sessionParams
CallSettingsParams is the data holder for such common attributes and can be used to set activityParams and/or sessionParams. Here is the exhaustive list of these attributes:
- language and voiceName (depending on the speech provider) can be set for each message individually. See API doc.
- sttSpeechContexts: provides hints for improving the accuracy of speech recognition. See API doc.
- sendEventsToBot: if specified, sends notification events to the bot. See API doc. By default no events are sent to the bot.
- enableSpeechInput: enables the activation of the speech-to-text service after the bot message playback is finished. Defaults to true.
- bargeIn: allows the user to interrupt while the bot plays a message. Defaults to true.
- languageDetectionActivate: activates language detection. Defaults to true.
- alternativeLanguages: defines a list of up to three alternative languages (in addition to the current language) that will be used to detect the spoken language. By default English and German are configured.
DTMF attributes. If sendDTMF is not set, the defaults below are used. Several attributes have defaults in place, so they can be omitted if the default value is desired.
- sendDTMF: enables collecting DTMF. Defaults to true.
- sendDTMFAsText: boolean. Defaults to false.
- bargeInOnDTMF: boolean. Defaults to true.
- dtmfCollect: if false, each DTMF digit is sent in a separate event to the bot. Defaults to true.
- dtmfCollectInterDigitTimeoutMS: how long to wait before sending all digits to the bot. Defaults to 2000.
- dtmfCollectMaxDigits: maximum number of digits collected. Defaults to 0 (no limit).
- dtmfCollectSubmitDigit: defines a special DTMF "submit" digit; when it is received from the user, VoiceAI sends all the collected digits to the bot. Defaults to #. To disable, set it to an empty string.
Bot delay attributes
- botNoInputTimeoutMS: timeout before a prompt is played to the user in case the bot takes more time to process the input. Default is not set.
- botNoInputSpeech: textual prompt to play to the user when no input is received from the bot.
- botNoInputRetries: maximum number of recurring timeouts for bot responses. Default is not set.
- botNoInputGiveUpTimeoutMS: timeout for the bot response before the call is disconnected. Default is not set.
- botNoInputUrl: audio file to play when the bot does not respond.
- resumeRecognitionTimeoutMS: when barge-in is disabled, speech input is not expected before the bot's response has finished playback. If no reply from the bot arrives within this configured timeout (in milliseconds), VoiceAI Connect expects speech input from the user and speech-to-text recognition is re-activated.
User delay attributes
- userNoInputTimeoutMS: maximum time in ms to wait for user input. Defaults to 2000.
- userNoInputGiveUpTimeoutMS: maximum time in ms to wait for user input, after which the call is terminated. Defaults to 15000.
- userNoInputRetries: maximum number of retries to wait for user input. Defaults to 1 (i.e. two attempts in total).
Play URL attributes
Attributes configurable via PlayUrlActivityOutput:
- playUrlUrl: URL where the audio file is located.
- playUrlMediaFormat: format of the pre-recorded audio file.
- playUrlAltText: text to display in the transcript page of the user interface while the pre-recorded audio is played.
- playUrlFailureHandling: behaviour in case the audio fails to play; disconnect, notify-bot and ignore are supported. Use InputType.PLAY_URL_FAILED to handle the event.
- playUrlCaching: boolean. If true, the audio file is cached on the VoiceAI Connect server. Defaults to true.
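A sketch of playing a pre-recorded file. This assumes the playUrl* attributes are passed directly as options of PlayUrlActivityOutput, analogous to the CallTransferOutput example below; the intent name, URL and media format are placeholders:

class Component extends BaseComponent {
  @Intents('PlayGreetingIntent')
  playGreeting() {
    return this.$send(PlayUrlActivityOutput, {
      playUrlUrl: 'https://example.com/greeting.wav', // placeholder URL
      playUrlMediaFormat: 'wav/lpcm16', // placeholder format, see the Audiocodes doc for supported values
      playUrlAltText: 'Greeting audio',
      // notify-bot lets you react to playback failures via InputType.PLAY_URL_FAILED
      playUrlFailureHandling: 'notify-bot',
    });
  }
}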
Example setting of activityParams when sending a message activity:
class Component extends BaseComponent {
@Intents(['HumanOperatorIntent'])
humanOperator() {
this.$send(MessageActivityOutput, {
message: 'Frag wie das Wetter ist.',
activityParams: {
sttSpeechContexts: [
{
phrases: ['Wetter'],
boost: 20,
},
{
phrases: ['$ADDRESSNUM'],
},
],
sendEventsToBot: ['noUserInput'],
},
});
}
}
Some activities allow setting activityParams specific only to them. These are:
- handoverReason for call transfers. API doc
- hangupReason when ending a conversation. API doc

There are ActivityParam structures defined for each such activity. They all inherit from CallSettingsParams, which allows (re)setting all other common attributes as well.
handoverReason example:
this.$send(CallTransferOutput, {
transferTarget: '<sip:target>',
transferNotifications: true,
handoverReason: 'bot call handover',
transferSipHeaders: [],
});
hangupReason example:
class Component extends BaseComponent {
@Intents('EndConversationIntent')
endConversation() {
return this.$send(EndConversationOutput, {
hangupReason: 'call successful',
hangupSipHeaders: [],
});
}
}
Configuring session params
All sessionParams, like the bot delay and DTMF parameters, should be configured using the SessionConfigOutput.sessionParams interface. The config can be set once and applies for the entire session.
Example:
class Component extends BaseComponent {
@Handle()
welcome() {
this.$updateCallSettings({
botNoInputTimeoutMS: 1000, // timeout before a prompt is played to the user.
botNoInputSpeech: 'Please wait for bot input.', // textual prompt to play to the user when no input is received from the bot.
botNoInputGiveUpTimeoutMS: 10000, // timeout response before the call is disconnected.
botNoInputRetries: 2, // maximum number of recurring timeouts for bot response.
botNoInputUrl: 'https://audiofile-to-play-when-bot-does-not-respond.url',
resumeRecognitionTimeoutMS: 1000,
});
return this.$send(MessageActivityOutput, {
message: 'What do you want to try?',
});
}
}
sendEventsToBot: session-wide configuration for sending notification events to the bot. Similar to the eventParams attribute.
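As a sketch, the same sendEventsToBot list could be changed for the rest of the session via $updateCallSettings (event names are taken from the default config shown above; whether you need them depends on your bot):

class Component extends BaseComponent {
  @Handle()
  enableSpeechHypothesis() {
    // Re-declare the full list of events the Audiocodes backend should send to the bot
    return this.$updateCallSettings({
      sendEventsToBot: [
        VoiceAiNotificationEvent.NO_USER_INPUT,
        VoiceAiNotificationEvent.CONVERSATION_END,
        VoiceAiNotificationEvent.SPEECH_HYPOTHESIS,
      ],
    });
  }
}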
Handling inputs
The Jovo Audiocodes platform maps the different incoming events and notifications from the Audiocodes backend to an InputType. This way you can constrain handlers to trigger on certain events by using InputType.
Events from the Audiocodes backend can have additional payloads which reside in the parameters or the value property. These properties are mapped into the this.$input object directly and therefore have the same structure as documented in the Audiocodes documentation.
This means you can infer from the InputType which property is populated and what structure resides under parameters or value.
DTMF input
When configured, the bot can respond to the user's DTMF input. The sessionParams.sendDTMF boolean flag (true by default) enables or disables DTMF collection. By default the session is configured to collect all digits the user inputs; if input pauses for more than 2 seconds, the collected digits are sent to the bot.
To handle the user's DTMF input, a Jovo component has to handle the InputType.DTMF_EVENT event:
class Component extends BaseComponent {
@Handle({ types: [InputType.DTMF_EVENT] })
handleDTMF() {
const userDtmf = this.jovo.$input.value; //dtmf input can be configured to deliver string or number
if (userDtmf === '9') {
this.$send(MessageActivityOutput, {
message: '<speak> Sorry to see you leave.</speak>',
});
return this.endConversation();
} else if (userDtmf === '11') {
this.$send(MessageActivityOutput, {
message: '<speak> You typed 11. Win!</speak>',
});
}
}
}
Handling Audiocodes notifications
Audiocodes will send notifications when configured using the sendEventsToBot session or activity param.
These events are mapped by the BotNotificationEvent enum to Jovo InputTypes:
- InputType.LAUNCH maps the start event
- InputType.TEXT maps the message event
- InputType.DIALOUT_INITIATED maps the dialoutInitiated event
- InputType.NO_USER_INPUT maps the noUserInput event
- InputType.SPEECH_HYPOTHESIS maps the speechHypothesis event
- InputType.CONVERSATION_END maps the conversationEnd event
- InputType.CALL_TRANSFER_FAILED maps the transferStatus event with status value failed
- InputType.CALL_TRANSFER_ANSWERED maps the transferStatus event with status value answered
- InputType.PLAY_URL_FAILED maps the playUrlFailed event
notifyParamChange: configures whether to receive notifications when a call parameter changes. This attribute takes an array of parameters for which VoiceAI Connect notifies the bot whenever their values change. Doc
Usage example:
class Component extends BaseComponent {
@Handle({
types: [InputType.NO_USER_INPUT],
})
handleBotNotificationEvent() {
return this.$send(MessageActivityOutput, { message: 'Please state your request.' });
}
}
Check the documentation for a detailed description of each notification.
Configuration
AudiocodesPlatform can be configured with the following attributes:
- fallbackLanguage: the language to use when the requested language is not supported by the Audiocodes platform. Defaults to an empty string.
- fallbackTransferPhoneNumber: the fallback phone number to transfer the call to if the intended transfer phone number fails. Needs to be preconfigured in the Audiocodes service. Defaults to an empty string.
- supportedLanguages: a list of locale + voiceName pairs preconfigured in the Audiocodes account. Defaults to an empty list.
- sessionParams: default session params to use for the entire session. The defaults are:
sessionParams: {
botNoInputTimeoutMS: 6000,
botNoInputGiveUpTimeoutMS: 30000,
botNoInputRetries: 2,
sendEventsToBot: [
VoiceAiNotificationEvent.DIALOUT_INITIATED,
VoiceAiNotificationEvent.NO_USER_INPUT,
VoiceAiNotificationEvent.CONVERSATION_END,
VoiceAiNotificationEvent.SPEECH_HYPOTHESIS,
],
userNoInputGiveUpTimeoutMS: 60000,
notifyParamChange: ['language'],
}
When instantiating the AudiocodesPlatform, the above defaults can be overridden by passing a sessionParams object to the constructor.
An example where the attributes that default to empty values are set, and sessionParams.sendEventsToBot is overridden with an empty array (no events are sent to the bot):
app.configure({
plugins: [
// Add Jovo plugins here
new AudiocodesPlatform({
fallbackLocale: 'en-US',
transferPhoneNumber: 'sip:+12345@sip-provider',
supportedLanguages: [
{ locale: 'en-US', voiceName: 'en-US-GuyNeural' },
{ locale: 'de-DE', voiceName: 'de-DE-KatjaNeural' },
],
sessionParams: {
sendEventsToBot: [],
},
plugins: [...somePlugins],
}),
],
});
Language handling
The conversation language can be set explicitly by passing a locale in the language attribute of an activity. The locale format is <language code>-<country code>, e.g. en-US or de-DE.
If the desired language is not supported by the Audiocodes configuration, the AudiocodesPlatform config attribute fallbackLanguage is used.
In order to set a desired language for an entire user session, set the locale in the AudiocodesResponse.sessionParams.language attribute.
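A sketch combining both options (the intent name is a placeholder; the locales and voice names must be among your configured supportedLanguages):

class Component extends BaseComponent {
  @Intents('SwitchToEnglishIntent')
  switchToEnglish() {
    // Session-wide: switch language and voice for the rest of the call
    this.$updateCallSettings({
      language: 'en-US',
      voiceName: 'en-US-GuyNeural',
    });
    // Per-activity: this single prompt explicitly carries the new language
    return this.$send(MessageActivityOutput, {
      message: 'Switching to English now.',
      activityParams: {
        language: 'en-US',
      },
    });
  }
}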
Experimental features
Audiocodes requests handled by the Jovo service are mapped by AudiocodesEventActivityRequest. Audiocodes sends context-specific attributes with the request, which are mapped by AudiocodesEventActivityRequest.parameters. Extra attributes can be sent as well, depending on the admin configuration of Audiocodes.
For demonstration purposes, an additional request attribute called EventRequestParameters.sipUserToUser is also implemented. This maps the User-to-User SIP header which is sent by the Audiocodes platform during call initiation.
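A sketch of reading this header at call start, assuming (per the input mapping described above) that the request parameters are available under this.$input.parameters:

class Component extends BaseComponent {
  @Handle({ types: [InputType.LAUNCH] })
  start() {
    // Assumption: sipUserToUser is exposed alongside the other mapped request parameters
    const userToUser = (this.$input as any).parameters?.sipUserToUser;
    return this.$send(MessageActivityOutput, {
      message: userToUser
        ? '<speak>Welcome back.</speak>'
        : '<speak>Welcome.</speak>',
    });
  }
}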