jaxcore-deepspeech-plugin
v0.0.6
Published
Mozilla DeepSpeech speech recognition plugin for Jaxcore
Downloads
13
Readme
Jaxcore DeepSpeech Plugin
Jaxcore is an open source cybernetic control system. This plugin connects Mozilla DeepSpeech to Jaxcore to enable speech recognition support and voice control of any device or service that is connected.
Related projects:
- Jaxcore a cybernetic control library that manages services and devices, and connects them together using adapters
- BumbleBee-Hotword provides microphone recording support and hotword detection
- Jaxcore Say provides text-to-speech (speech synthesis) support
- Jaxcore Desktop Server - desktop application server for Windows/MacOS/Linux that runs Jaxcore and makes all services and adapters available through a simple UI
- Jaxcore Browser Extension a web browser extension that allows the DeepSpeech plugin to connect to and control web pages
Together, these tools provide JavaScript developers an easy way write "Alexa-like" interactive voice assistants, smart-home controls, and create science-fiction like voice-controlled web applications and games. Run everything privately on your local computer without any 3rd party cloud computing services required.
Install
npm install jaxcore-deepspeech-plugin
To install from source and try the examples:
git clone https://github.com/jaxcore/deepspeech-plugin
cd deepspeech-plugin
npm install
Download DeepSpeech Pre-Trained english model:
All the examples require the DeepSpeech english model to be at the root of the project.
# enter project directory
cd deepspeech-plugin
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/deepspeech-0.6.0-models.tar.gz
tar xfvz deepspeech-0.6.0-models.tar.gz
rm deepspeech-0.6.0-models.tar.gz
If you have previously download the models a softlink can be made:
ln -s /path/to/deepspeech/models
Examples
The examples provided will demonstrate the capabilities and limitations of the system, and provide a good place to start when writing your own "voice apps".
NodeJS Examples
These examples run directly in NodeJS:
- Microphone example - basic example of recording a microphone and streaming to DeepSpeech
- Wake Word example - uses hotword detection to activate/deactivate DeepSpeech
- Knock, Knock Jokes example - an interactive voice chatbot that tells knock, knock jokes
Jaxcore Control Examples
These are more advanced NodeJS examples which use Jaxcore to control other devices and network services:
- Voice Assistant Toolbox - a collection of tools needed to create a voice assistant of your own, includes hotword detection, text-to-speech, and speech-to-text all working at the same time
- Mouse Control Example - uses voice commands to control the mouse (eg. mouse up 100, left click, scroll down...)
- Kodi Control Example - uses voice commands to control Kodi Media Center navigation and playback (eg. play, pause, select, back, up, down, page up, page down...)
- Number Typer - numbers and symbols that you speak will be typed on the keyboard
Web Examples
These use a ReactJS client to stream microphone audio from the browser to a NodeJS server running DeepSpeech:
- Web Basic example simple example of recording and speech recognition
- Web Hotword example advanced example using hotword detection, audio visualization, and voice activated menus
Electron Example
- Electron example runs DeepSpeech and BumbleBee inside an ElectronJS desktop application
Jaxcore Browser Extension Examples
These require running the Jaxcore Desktop Server and web browser extension. This method allows developers to write voice-enabled web applications using only client-side JavaScript. The Jaxcore application provides the speech recognition support from outside the browser.
- coming soon
API
This DeepSpeech plugin does not provide any audio recording functionality of it's own. The purpose of this library is to use VAD (voice activity detection) to stream audio data to an instance of DeepSpeech running in a background thread (fork) in the best way possible.
It is recommended to use BumbleBee Hotword or the NodeJS version of BumbleBee to provide record the microphone audio. These libraries have been tweaked specifically to work with DeepSpeech and has Porcupine hotword detection built-in for wake-word support.
The examples above demonstrate different ways to run BumbleBee to record and stream microphone audio into DeepSpeech.
For NodeJS, this is a basic way:
const Jaxcore = require('jaxcore');
const jaxcore = new Jaxcore();
jaxcore.addPlugin(require('jaxcore-deepspeech-plugin'));
const BumbleBee = require('bumblebee-hotword-node');
const bumblebee = new BumbleBee();
bumblebee.addHotword('bumblebee');
const MODEL_PATH = process.env.DEEPSPEECH_MODEL || __dirname + '/../../deepspeech-0.6.0-models'; // path to deepspeech model
jaxcore.startService('deepspeech', {
modelName: 'english',
modelPath: MODEL_PATH,
silencThreshold: 200, // delay for this long before processing the audio
vadMode: 'VERY_AGGRESSIVE', // 'AGGRESSIVE' or 'VERY_AGGRESSIVE' is recommended
}, function(err, deepspeech) {
// receive the speech recognition results
deepspeech.on('recognize', (text, stats) => {
console.log('recognize:', text, stats);
});
// bumblebee emits a "data" event for every 8192 bytes of audio it records from the microphone
bumblebee.on('data', function(data) {
// stream the data to the deepspeech plugin
deepspeech.streamData(data);
});
// bumblebee start the microphone
bumblebee.start();
});
The audio data streamed to DeepSpeech using deepspeech.streamData(data);
Does not specifically have to be from a microphone using BumbleBee, the data can be any PCM integer 16 bit 16khz
stream from any source.
To receive microphone audio from the browser through a websocket server, see the Web Basic example.
API Methods
These methods are used to receive audio data from the browser or from an ElectronJS window:
Stream an audio buffer to the deepspeech plugin:
deepspeech.streamData(data);
End the stream:
deepspeech.streamEnd();
End the stream and ignore deepspeech results;
deepspeech.streamReset();
Events
"recognize"
Receives the speech recognition results from DeepSpeech:
deepspeech.on('recognize', (text, stats) => {
console('recognize', text, stats);
});
License
Change Log
0.0.6:
- refactored VAD logic out of the DeepSpeech process, this improves accuracy during short pauses between words
- added the Number Typer keyboard example
- update the voice assistant and mouse examples to the newest Jaxcore API