echogarden

v2.0.3

An easy-to-use speech toolset. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.

Echogarden

Echogarden is an easy-to-use speech toolset that includes a variety of speech processing tools.

  • Easy to install, run, and update
  • Runs on Windows (x64, ARM64), macOS (x64, ARM64) and Linux (x64, ARM64)
  • Written in TypeScript, for the Node.js runtime
  • Doesn't require Python, Docker, or other system-level dependencies
  • Doesn't depend on platform-specific binaries for essential functionality. Engines are either ported via WebAssembly, imported using the ONNX runtime, or written in pure JavaScript

Features

  • Text-to-speech using the VITS neural architecture, and 15 other offline and online engines, including cloud services by Google, Microsoft, Amazon, OpenAI and Elevenlabs
  • Speech-to-text using a built-in JavaScript/ONNX port of the OpenAI Whisper speech recognition architecture, whisper.cpp, and several other engines, including cloud services by Google, Microsoft, Amazon and OpenAI
  • Speech-to-transcript alignment using several variants of dynamic time warping (DTW, DTW-RA), including support for multi-pass (hierarchical) processing, or via guided decoding using Whisper recognition models. Supports 100+ languages
  • Speech-to-text translation translates speech in any of the 98 languages supported by Whisper into English, with near word-level timing for the translated transcript
  • Speech-to-translated-transcript alignment synchronizes spoken audio in one language with a provided English-translated transcript, using the Whisper engine
  • Speech-to-transcript-and-translation alignment synchronizes spoken audio in one language with a translation in one of a variety of other languages, given both a transcript and its translation
  • Text-to-text translation translates text between various languages. Supports the cloud-based Google Translate engine
  • Language detection identifies the language of a given audio or text. Includes Whisper or Silero engines for spoken audio, and TinyLD or FastText for text
  • Voice activity detection attempts to identify segments of audio where voice is active or inactive. Includes WebRTC VAD, Silero VAD, RNNoise-based VAD and a built-in Adaptive Gate algorithm
  • Speech denoising attenuates background noise from spoken audio. Includes the RNNoise and NSNet2 engines
  • Source separation isolates voice from any music or background ambience. Supports the MDX-NET deep learning architecture
  • Word-level timestamps for all recognition, synthesis, alignment and translation outputs
  • Advanced subtitle generation, accounting for sentence and phrase boundaries
  • For the VITS and eSpeak-NG synthesis engines, includes enhancements to improve TTS pronunciation accuracy: adds text normalization (e.g. idiomatic date and currency pronunciation), heteronym disambiguation (based on a rule-based model) and user-customizable pronunciation lexicons
  • Internal package system that auto-downloads and installs voices, models and other resources, as needed
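The alignment feature above is based on dynamic time warping. As a rough illustration of the general technique, here is a minimal sketch of textbook DTW over two plain number sequences. It is not Echogarden's implementation: the function name `dtw` is hypothetical, and the real engine works on audio features and uses variants (DTW-RA, multi-pass) that are not shown here.

```typescript
// Textbook dynamic time warping between two numeric sequences.
// Illustrative only: this is NOT Echogarden's code, and the name `dtw` is hypothetical.
function dtw(a: number[], b: number[]): { cost: number; path: [number, number][] } {
  const n = a.length;
  const m = b.length;
  if (n === 0 || m === 0) {
    throw new Error("Both sequences must be non-empty");
  }

  // acc[i][j] = minimal accumulated cost of aligning a[0..i] with b[0..j]
  const acc: number[][] = Array.from({ length: n }, () => new Array<number>(m).fill(Infinity));

  for (let i = 0; i < n; i++) {
    for (let j = 0; j < m; j++) {
      const local = Math.abs(a[i] - b[j]);
      if (i === 0 && j === 0) {
        acc[i][j] = local;
        continue;
      }
      const up = i > 0 ? acc[i - 1][j] : Infinity;
      const left = j > 0 ? acc[i][j - 1] : Infinity;
      const diag = i > 0 && j > 0 ? acc[i - 1][j - 1] : Infinity;
      acc[i][j] = local + Math.min(up, left, diag);
    }
  }

  // Backtrack from the last cell to recover the warping path.
  const path: [number, number][] = [];
  let i = n - 1;
  let j = m - 1;
  path.push([i, j]);
  while (i > 0 || j > 0) {
    const up = i > 0 ? acc[i - 1][j] : Infinity;
    const left = j > 0 ? acc[i][j - 1] : Infinity;
    const diag = i > 0 && j > 0 ? acc[i - 1][j - 1] : Infinity;
    if (diag <= up && diag <= left) {
      i--;
      j--;
    } else if (up <= left) {
      i--;
    } else {
      j--;
    }
    path.push([i, j]);
  }
  path.reverse();

  return { cost: acc[n - 1][m - 1], path };
}
```

Each `[i, j]` pair in the returned path says that element `i` of the first sequence is matched with element `j` of the second. In speech alignment the two sequences would be feature frames of the recorded audio and of a reference rendering of the transcript, rather than plain numbers.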

Installation

Ensure you have Node.js v18.16.0 or later installed, then run:

npm install echogarden -g
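If you want to verify the Node.js requirement from a script, the following is a minimal sketch of a version check. The helper names `parseVersion` and `isVersionAtLeast` are made up for this example and are not part of Echogarden; only the v18.16.0 minimum comes from the instructions above.

```typescript
// Minimum Node.js version stated in the installation instructions.
const REQUIRED_NODE_VERSION = "18.16.0";

// Parse "18.16.0" or "v18.16.0" into numeric parts (hypothetical helper).
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => parseInt(part, 10) || 0);
}

// Return true when `current` is the same as or newer than `required` (hypothetical helper).
function isVersionAtLeast(current: string, required: string): boolean {
  const cur = parseVersion(current);
  const req = parseVersion(required);
  for (let i = 0; i < 3; i++) {
    const c = cur[i] ?? 0;
    const r = req[i] ?? 0;
    if (c !== r) {
      return c > r;
    }
  }
  return true;
}

// process.versions.node holds the running Node.js version without the leading "v".
if (!isVersionAtLeast(process.versions.node, REQUIRED_NODE_VERSION)) {
  console.error(`Node.js ${REQUIRED_NODE_VERSION} or later is required, found ${process.versions.node}`);
}
```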

Updating to the latest version

npm update echogarden -g

Using the command-line interface

A small sample of command lines:

echogarden speak "Hello World!"
echogarden speak-file story.txt
echogarden transcribe speech.mp3
echogarden align speech.opus transcript.txt
echogarden isolate speech.wav

See the Command-line interface guide for more details on the operations supported, and the configuration options reference for a comprehensive list of all options supported.

Note: as of v2.0.0, a newly developed audio playback library is integrated into the CLI. If you have trouble hearing sound, or the sound is distorted, please report it as an issue. You can switch back to the older SoX-based player by adding --player=sox to the command line. On macOS, make sure SoX is available on your PATH by installing it with a system package manager such as Homebrew (brew install sox).
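If you want to drive the command lines above from a Node.js script, one option is to spawn the CLI as a child process. The sketch below assumes `echogarden` is installed globally and on your PATH. The helpers `buildEchogardenArgs` and `runEchogarden` are hypothetical names for this example, not part of the package, and the only option taken from this document is --player=sox.

```typescript
import { spawn } from "node:child_process";

// Build an argument list such as ["speak", "Hello World!", "--player=sox"].
// Option names are passed through unchanged; check the options reference
// before relying on any option other than --player=sox.
function buildEchogardenArgs(
  operation: string,
  inputs: string[],
  options: Record<string, string> = {}
): string[] {
  const optionArgs = Object.entries(options).map(([name, value]) => `--${name}=${value}`);
  return [operation, ...inputs, ...optionArgs];
}

// Run the globally installed `echogarden` executable and resolve with its exit code.
// Assumption: the executable can be spawned directly. On Windows, npm installs
// global commands as .cmd shims, which may require the `shell: true` spawn option.
function runEchogarden(args: string[]): Promise<number> {
  return new Promise((resolve, reject) => {
    const child = spawn("echogarden", args, { stdio: "inherit" });
    child.on("error", reject); // for example, the executable was not found on PATH
    child.on("close", (code) => resolve(code ?? -1));
  });
}
```

For example, `runEchogarden(buildEchogardenArgs("speak", ["Hello World!"], { player: "sox" }))` would run the equivalent of `echogarden speak "Hello World!" --player=sox`. Passing the arguments as an array avoids having to quote text that contains spaces.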

Using the Node.js API

If you are a developer, you can also import the package as a module. The API operations and options closely mirror the CLI.

Documentation

Credits

This project consolidates and builds upon the efforts of many different individuals and companies, and also contributes a number of original works.

Developed by Rotem Dan (IPA: /ˈʁɒːtem ˈdän/).

License

GNU General Public License v3

Licenses for components, models and other dependencies are detailed on this page.