chrome-based-voice-recognition-api

v2.0.3

Published

2 years ago

An API used to hijack chrome webkitSpeechRecognition for non-web applications.

Downloads

0High
0Medium
0Low

erwidman

chrome voice-recognition

An API used to hijack chrome webkitSpeechRecognition for non-web applications.

Description

This API intends to create a binding to Chrome SpeechRecognition API for native node processes outside web environments. The use of Chrome's voice recognition as opposed to a paid cloud service offers an inexpensive, easy to install, fast alternative for small projects that don't justify the hassle of setting up a cloud service.

Caveats

The original concept involved a headless chrome instance, however due to a number of issues - it does not seem reasonable to accomplish headless as it will require hacking around the chromium source code. In the meantime, the API simply opens a small chrome window.

In order for the API to operate one of the follow must be a valid environment variable pointing to the chrome executable:

chrome
chromium-browser
Google\ Chrome
google-chrome

Usage

const recog = require('chrome-based-voice-recognition-api');

recog.launchSpeechRecognition(12000)
.then((data)=>{
	let client = data.client;
	recog.initSpeechRecognition({
		lang : 'en-us',
		interimResults : false,
		maxAlternatives : 1,
		infiniteListen : true
	},client);
	recog.addFilter("^lights on$",client);
	client.on('recognition',(data)=>{
		//todo
	});
	client.on('startedListen',()=>{
		//todo
	});
	client.on('stopedListen',()=>{
		//todo
	})
	client.on('recognitionError',(err)=>{
		//todo
	})
	recog.startListen(client);
});

API

####launchSpeechRecogition(port)

#####Description Spins up the websocket server and chrome instance.

#####Params port (number) : The desired port of page server

Note : port + 1 will be used for websocket server

#####Returns An object with two properties -

client : an instace of simple-websocket handling IPC
serv : an instance of simple-websocket/server

####initSpeechRecognition(options,client)

#####Description Instantiates webkit speech recog in chrome.

#####Params options : An object with settings to be applied to speech object.

All are optional.

lang (string): The language to be recognized
interimResults (boolean) : Indication that you wish to recieve low confidence results.
maxAlternatives (number) : The max number of alternatives sent from recog.
infiniteListen (boolean) : If true, will continue listening after a hit.