web-vad

v0.0.6

Web Voice Activity Detection (VAD)

An adaptation of @ricky0123's vad library that narrows the API to accept only a media stream, addresses some TypeScript issues, and trims the codebase where possible. The primary purpose of this adaptation is to support real-time voice agents, such as those provided by Pipecat.

Getting started

npm install onnxruntime-web web-vad

Copy Silero model somewhere accessible

Ensure silero_vad.onnx (included in this repo) is hosted somewhere your app can fetch it (e.g. a public / static path).

Ensure audio worker is available globally

For security reasons, browsers do not allow audio worklets to be imported as regular modules; they have to be loaded by URL. Either import the worklet URL with your framework-specific syntax (e.g. import AudioWorkletURL from "web-vad/dist/worklet.js?worker&url";) or include it manually in a <script> declaration (at a higher order).

Example project

A barebones example is included in this repo:

cd test-site
yarn
yarn run build # Copies onnx wasm to dist directory
yarn run dev

Then navigate to the URL shown in your terminal.

Usage

import { VAD } from "web-vad";
import AudioWorkletURL from "web-vad/dist/worklet.js?worker&url";


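// Get a mic or other audio track (any audio MediaStreamTrack works);
// as an example, this requests the default microphone
const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });
const [localAudioTrack] = micStream.getAudioTracks();
const stream = new MediaStream([localAudioTrack]);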

const vad = new VAD({
    workletURL: AudioWorkletURL,
    modelUrl: "path-to-silero.onnx",
    stream,
    onSpeechStart: () => {
        console.log("speaking start");
    },
    onVADMisfire: () => {
        console.log("misfire");
    },
    onSpeechEnd: () => {
        console.log("speaking end");
    },
});

// Initialize and load models
await vad.init();

// Start when ready
vad.start();

console.log(vad.state); 
// > VADState.listening
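
The three callbacks above can be folded into a single "is the user speaking?" flag, which is often what a real-time voice agent needs. The following is a minimal sketch and not part of web-vad: createSpeechTracker and its return shape are hypothetical names, and it assumes a misfire means a detected speech start that was discarded before onSpeechEnd could fire.

```typescript
// Hypothetical helper (not part of web-vad): tracks speaking state
// from the three callback names used in the usage example.
type SpeechCallbacks = {
  onSpeechStart: () => void;
  onVADMisfire: () => void;
  onSpeechEnd: () => void;
};

interface SpeechTracker {
  callbacks: SpeechCallbacks;
  isSpeaking: () => boolean;
  completedSegments: () => number;
}

function createSpeechTracker(
  onChange?: (speaking: boolean) => void
): SpeechTracker {
  let speaking = false;
  let segments = 0;

  // Only notify listeners when the flag actually flips.
  const set = (next: boolean): void => {
    if (speaking !== next) {
      speaking = next;
      onChange?.(next);
    }
  };

  return {
    callbacks: {
      onSpeechStart: () => set(true),
      // Assumption: a misfire cancels the pending segment without counting it.
      onVADMisfire: () => set(false),
      onSpeechEnd: () => {
        if (speaking) segments += 1;
        set(false);
      },
    },
    isSpeaking: () => speaking,
    completedSegments: () => segments,
  };
}
```

The returned callbacks could then be spread into the constructor options, e.g. new VAD({ workletURL: AudioWorkletURL, modelUrl: "path-to-silero.onnx", stream, ...tracker.callbacks }).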

Next / Vite support

Web VAD uses the WASM files shipped with the ONNX runtime (onnxruntime-web). While these can be loaded at runtime, it is recommended to copy them into your build / deployment output. Here is an example vite.config.js that copies these files across at build time:

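// vite.config.js
import path from "node:path";
import { defineConfig } from "vite";
import { viteStaticCopy } from "vite-plugin-static-copy";

export default defineConfig({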
  assetsInclude: ["**/*.onnx"],
  server: {
    headers: {
      "Cross-Origin-Embedder-Policy": "require-corp",
      "Cross-Origin-Opener-Policy": "same-origin",
    },
  },
  resolve: {
    alias: {
      "@": path.resolve(__dirname, "./src"),
    },
  },
  plugins: [
    viteStaticCopy({
      targets: [
        {
          src: "node_modules/onnxruntime-web/dist/*.wasm",
          dest: "./",
        },
      ],
    }),
  ],
});
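
The two server.headers entries in the config above switch the Vite server into cross-origin isolated mode, which browsers require before exposing SharedArrayBuffer; multi-threaded WASM builds rely on it, which is presumably why the headers are included here. As a hedged sketch, the check below reports whether a page ended up isolated. canUseThreadedWasm and IsolationScope are hypothetical names, not part of web-vad or onnxruntime-web. Because server.headers configures Vite's own server, a production host would need to send the same headers itself.

```typescript
// Minimal shape of the global scope that this check needs.
type IsolationScope = { crossOriginIsolated?: boolean };

// Hypothetical helper: true only when the page is cross-origin isolated
// and SharedArrayBuffer is actually available in this environment.
function canUseThreadedWasm(scope: IsolationScope): boolean {
  return (
    scope.crossOriginIsolated === true &&
    typeof SharedArrayBuffer !== "undefined"
  );
}

// In a browser this would typically be called as:
//   canUseThreadedWasm(self)
```

If the check returns false in the browser, the COOP / COEP headers are the first thing to verify in the network tab.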

Precaching models

Both the Silero ONNX model and the ONNX runtime WASM files are quite large (~10 MB). The VAD class exposes a static method for precaching them:

import {VAD} from "web-vad";

async function run() {
  console.log("Precaching models");
  await VAD.precacheModels("/silero-vad.onnx");
  console.log("Download complete!");
  
  //...start()
}
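
If several parts of an app might trigger precaching (for example a landing page and a call screen), it can help to share one in-flight download instead of starting it twice. This is a minimal sketch, assuming VAD.precacheModels returns a promise as the await above suggests; once is a hypothetical helper and not part of web-vad.

```typescript
// Hypothetical helper: wraps an async task so that concurrent and
// repeated callers share a single run. A failed run is forgotten,
// so the next call retries.
function once<T>(task: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;

  return () => {
    if (!pending) {
      pending = task().catch((err: unknown) => {
        pending = undefined; // allow a retry after a failure
        throw err;
      });
    }
    return pending;
  };
}

// Intended usage (assumes the VAD import from the example above):
//   const precache = once(() => VAD.precacheModels("/silero-vad.onnx"));
//   await precache(); // safe to call from multiple places
```

Whether web-vad already deduplicates downloads internally is not stated in this README, so the wrapper may be redundant; it is harmless either way.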

References

[1] Silero Team. (2021). Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier. GitHub repository, https://github.com/snakers4/silero-vad.

[2] Ricky Samore. Original code, https://github.com/ricky0123/vad.