npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

react-hook-speech-to-text

v0.8.0

Published

Simple cross-browser speech to text using react hooks.

Downloads

9,086

Readme

React Hook to transcribe microphone input into text.

Demo

This hook supports cross browser speech to text if the crossBrowser: true prop is passed.

By default, the SpeechRecognition API is used which is currently only supported by Chrome browsers.

The SpeechRecognition API does not require any additional setup or API keys, everything works out of the box.

Why use this package?

  • Written in TypeScript with full type support
  • 0 dependencies
  • Simple APIs
  • Cross-browser optional support
  • Small bundle size 11.9kB (4.7kB gzipped)

Install

npm i react-hook-speech-to-text
yarn add react-hook-speech-to-text

Live Demo

https://trusting-perlman-d246f0.netlify.app/

As of 02/02/20 cross-browser speech to text tested and working on Chrome, firefox, Safari for Mac, iOS 13 and 14 Safari. Have not tested on Android but should work on chromium based android browsers.

Unfortunately does not work on firefox or chrome on iOS due to apple only supporting getUserMedia on safari (thanks apple 💩)

Cross Browser Support

If cross-browser support is needed, the crossBrowser: true prop must be passed. A Google Cloud Speech-to-Text API key is needed.

This hook makes use of a customized version of recorder.js for recording audio, down-sampling the audio sampleRate to <= 48000hz, and converting that audio to WAV format.

The hook then converts the WAV audio blob returned from recorder.js and converts it into a base64 string using the FileReader API. This is all needed in order to POST audio data to the Google Cloud Speech-to-Text REST API and get transcribed text returned all on the front-end.

Also used is hark.js for detecting start and stopped speech events for browsers that don't support the SpeechRecognition API. If the SpeechRecognition API is available, SpeechRecognition API handles start and stop speech events automatically.

Hook usage

import React from 'react';
import useSpeechToText from 'react-hook-speech-to-text';

export default function AnyComponent() {
  const {
    error,
    interimResult,
    isRecording,
    results,
    startSpeechToText,
    stopSpeechToText,
  } = useSpeechToText({
    continuous: true,
    useLegacyResults: false
  });

  if (error) return <p>Web Speech API is not available in this browser 🤷‍</p>;

  return (
    <div>
      <h1>Recording: {isRecording.toString()}</h1>
      <button onClick={isRecording ? stopSpeechToText : startSpeechToText}>
        {isRecording ? 'Stop Recording' : 'Start Recording'}
      </button>
      <ul>
        {results.map((result) => (
          <li key={result.timestamp}>{result.transcript}</li>
        ))}
        {interimResult && <li>{interimResult}</li>}
      </ul>
    </div>
  );
}

Cross-browser mode

Same code as above with crossBrowser: true and googleApiKey props passed

  const {
    error,
    isRecording,
    results,
    startSpeechToText,
    stopSpeechToText,
  } = useSpeechToText({
    continuous: true,
    crossBrowser: true,
    googleApiKey: YOUR_GOOGLE_CLOUD_API_KEY_HERE,
    useLegacyResults: false
});

Arguments

Arguments passed to useSpeechToText invocation

timeout

Sets amount in milliseconds to stop recording microphone input after last detected speech event.

Note: timeout is only applied when crossBrowser: true arg is passed and using browser that does not support SpeechRecognition API, else Chrome's Speech Recognition automatically stops recording after a certain amount of time.

  • Type: number
  • Required: false
  • Default: 10000

continuous

Sets whether or not to keep recording microphone input after speech results are returned

  • Type: boolean
  • Required: false
  • Default: undefined

onStartSpeaking

Callback function invoked on speech detection event. Only invoked on browsers that do not support SpeechRecognition API

  • Type: () => any
  • Required: false
  • Default: undefined

onStopSpeaking

Callback function invoked on speech stopped event. Only invoked on browsers that do not support SpeechRecognition API

  • Type: () => any
  • Required: false
  • Default: undefined

crossBrowser

Boolean to enable speech to text functionality on browsers that do not support the SpeechRecognition API. Firefox, Safari, iOS Safari etc.

Note: googleApiKey is required if using crossBrowser mode.

  • Type: boolean
  • Required: false
  • Default: false

googleApiKey

API key used for Google Cloud Speech to text API for cross browser speech to text functionality.

  • Type: string
  • Required: true if crossBrowser or useOnlyGoogleCloud
  • Default: undefined

googleCloudRecognitionConfig

Thanks to https://github.com/iwgx for the suggestion https://github.com/Riley-Brown/react-speech-to-text/issues/2

Optional config object to have more control over the google cloud speech settings. These options will only be passed if using crossBrowser: true on browsers not supporting SpeechRecognition API, or on all browsers if passing useOnlyGoogleCloud: true. All options can be found here https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig

  • Type: GoogleCloudRecognitionConfig
  • Required: false
  • Default: undefined

ex:

const { results } = useSpeechToText({
  crossBrowser: true,
  googleApiKey: process.env.REACT_APP_API_KEY,
  googleCloudRecognitionConfig: {
    languageCode: 'en-US'
  }
});

speechRecognitionProperties

Optional object of properties to have more control over Google Chrome's SpeechRecognition API. These properties only apply on browsers that support Google Chrome's SpeechRecognition API. All properties can be found here https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition#properties

  • Type: SpeechRecognitionProperties
  • Required: false
  • Default: { interimResults: true }

ex:

const { results, interimResult } = useSpeechToText({
  speechRecognitionProperties: {
    lang: 'en-US',
    interimResults: true // Allows for displaying real-time speech results
  }
});

useOnlyGoogleCloud

Boolean to force use of google cloud engine speech-to-text even on browsers that support built in SpeechRecognition API. By default, this package will try to use SpeechRecognition where available since it is free for Chrome and will provide faster results than the cross-browser google cloud engine speech-to-text method. Overwrite this functionality if you want to only use google cloud for more consistent functionality.

Note: if setting to true, you must pass a valid google cloud speech to text API key or no results will be returned.

  • Type: boolean
  • Required: false
  • Default: false

useLegacyResults

Boolean to return legacy array of string results or the new array of object results. Recommended to pass useLegacyResults: false as the legacy array of strings results will be removed in a future version.

  • Type: boolean
  • Required: false
  • Default: false

Returned Values

Values returns by the useSpeechToText() hook

ex: const { results } = useSpeechToText()

results

Transcribed text from speech on successful speech-to-text transcription.

  • Type: string[] | ResultType[]
  • Default: []
type ResultType = {
  speechBlob?: Blob;
  timestamp: number;
  transcript: string;
};

Important: When passing useLegacyResults: false, results will return the new ResultType[] array of objects. It is recommended to opt into the new results ASAP as the legacy results will be completely removed in a future version.

Note: speechBlob will only be returned when using google cloud speech to text API as Chrome's SpeechRecognition API does not provide access to the captured microphone data

setResults

React setState function to manually set the results state array

Thanks to https://github.com/marharyta for the suggestion https://github.com/Riley-Brown/react-speech-to-text/issues/12

  • Type: React.Dispatch<React.SetStateAction<ResultType[]>>

interimResult

Real-time speech result only for SpeechRecognition web API if opting-in using speechRecognitionProperties: { interimResults: true } config prop. Will update using the partial results returned from the web API and gets set back to undefined on final speech result.

  • Type: string | undefined
  • Default: undefined

startSpeechToText

Function to start microphone input recording. Will prompt user with microphone access permission if not previously granted.

  • Type: () => Promise<void>

stopSpeechToText

Function to stop microphone input recording.

  • Type: () => void

isRecording

Boolean to detect if an active recording is occurring.

  • Type: boolean
  • Default: false

error

Error string if feature is not supported on current browser

  • Type: string
  • Default: ''

Running project locally

npm run start

Building project locally

Code will be output to the /dist folder using rollup bundler

npm run build

Contributing

Feel free to open an issue on the repo with any bugs or feature requests, or fork and create a PR with fixes, features or improvements.