whisper-node-ts
v0.0.16
Node bindings for OpenAI's Whisper. Optimized for CPU.
whisper-node-ts
Node.js bindings for OpenAI's Whisper.
Based on whisper-node.
Features
- Output transcripts to JSON (also .txt .srt .vtt)
- Optimized for CPU (including Apple Silicon ARM)
- Timestamp precision down to a single word
Installation
- Add dependency to project
npm install whisper-node-ts
- Download whisper model of choice
npx whisper-node-ts download
Usage
import whisper from "whisper-node-ts";
const transcript = await whisper("example/sample.wav");
console.log(transcript); // output: [ {start,end,speech} ]
Output (JSON)
[
{
start: "00:00:14.310", // time stamp begin
end: "00:00:16.480", // time stamp end
speech: "howdy" // transcription
}
];
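The timestamps in the output are strings, so any arithmetic on them needs a conversion step first. The following is a minimal sketch of that step, assuming the `HH:MM:SS.mmm` format shown in the sample output. The `TranscriptLine` interface and both helper functions are hypothetical names introduced for illustration; they are not exported by whisper-node-ts.

```typescript
// Assumed shape of one transcript entry, based on the sample output above.
// This interface is illustrative and not part of the package.
interface TranscriptLine {
  start: string; // "HH:MM:SS.mmm"
  end: string; // "HH:MM:SS.mmm"
  speech: string;
}

// Convert an "HH:MM:SS.mmm" timestamp string into seconds.
function timestampToSeconds(timestamp: string): number {
  const parts = timestamp.split(":");
  if (parts.length !== 3) {
    throw new Error(`Unexpected timestamp format: ${timestamp}`);
  }
  const hours = Number(parts[0]);
  const minutes = Number(parts[1]);
  const seconds = Number(parts[2]);
  if (Number.isNaN(hours) || Number.isNaN(minutes) || Number.isNaN(seconds)) {
    throw new Error(`Unexpected timestamp format: ${timestamp}`);
  }
  return hours * 3600 + minutes * 60 + seconds;
}

// Length of one transcript entry in seconds.
function durationSeconds(line: TranscriptLine): number {
  return timestampToSeconds(line.end) - timestampToSeconds(line.start);
}

// The entry from the sample output above.
const sample: TranscriptLine = {
  start: "00:00:14.310",
  end: "00:00:16.480",
  speech: "howdy",
};

console.log(durationSeconds(sample).toFixed(2)); // "2.17"
```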
Usage with Additional Options
import whisper from 'whisper-node-ts';
const filePath = "example/sample.wav"; // required
const options = {
modelName: "tiny.en", // default
modelPath: "/custom/path/to/model.bin", // use model in a custom directory
whisperOptions: {
gen_file_txt: false, // outputs .txt file
gen_file_subtitle: false, // outputs .srt file
gen_file_vtt: false, // outputs .vtt file
timestamp_size: 10, // amount of dialogue per timestamp pair
word_timestamps: true // timestamp for every word
}
}
const transcript = await whisper(filePath, options);
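With `word_timestamps: true`, the transcript is assumed here to contain one entry per word. If longer lines are needed downstream, the entries can be regrouped after transcription. The sketch below shows one way to do that. The `TranscriptLine` interface, the `groupWords` helper, and the sample data are all made up for illustration; none of them come from whisper-node-ts.

```typescript
// Assumed shape of one transcript entry (illustrative, not exported by the package).
interface TranscriptLine {
  start: string;
  end: string;
  speech: string;
}

// Merge consecutive entries into chunks of at most `wordsPerChunk` entries.
// Each chunk takes the start of its first entry and the end of its last entry.
function groupWords(lines: TranscriptLine[], wordsPerChunk: number): TranscriptLine[] {
  if (!Number.isInteger(wordsPerChunk) || wordsPerChunk < 1) {
    throw new RangeError("wordsPerChunk must be a positive integer");
  }
  const chunks: TranscriptLine[] = [];
  for (let i = 0; i < lines.length; i += wordsPerChunk) {
    const group = lines.slice(i, i + wordsPerChunk);
    const first = group[0];
    const last = group[group.length - 1];
    if (!first || !last) continue; // defensive; slices here are never empty
    chunks.push({
      start: first.start,
      end: last.end,
      speech: group.map((line) => line.speech.trim()).join(" "),
    });
  }
  return chunks;
}

// Made-up word-level entries standing in for a real transcript.
const words: TranscriptLine[] = [
  { start: "00:00:14.310", end: "00:00:14.700", speech: "howdy" },
  { start: "00:00:14.700", end: "00:00:15.100", speech: "there" },
  { start: "00:00:15.100", end: "00:00:16.480", speech: "partner" },
];

console.log(groupWords(words, 2));
```

Regrouping after the fact keeps the word-level data available for other uses, such as highlighting individual words during playback.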
Roadmap
- [x] Support projects not using TypeScript
- [x] Allow custom directory for storing models
- [ ] Config files as alternative to model download cli
- [ ] Remove the path, shelljs, and prompt-sync packages for browser, React Native (Expo), and WebAssembly compatibility
- [ ] Use fluent-ffmpeg to support more audio formats
- [ ] Pyannote diarization for speaker names
- [ ] Implement WhisperX as optional alternative model for diarization and higher precision timestamps (as alternative to C++ version)
Modifying whisper-node-ts
npm run dev
- runs nodemon and tsc on '/src/test.ts'
npm run build
- runs tsc, outputs to '/dist' and gives sh permission to 'dist/download.js'