offmute

v0.0.4

Published

An experiment in meeting transcription and diarization with just an LLM.

Downloads

326

Readme

npx offmute 🎙️

Intelligent meeting transcription and analysis using Google's Gemini models

Features • Quick Start • Installation • Usage • Advanced • How It Works

🚀 Features

  • 🎯 Transcription & Diarization: Convert audio/video content to text while identifying different speakers
  • 🎭 Smart Speaker Identification: Attempts to identify speakers by name and role when possible
  • 📊 Meeting Reports: Generates structured reports with key points, action items, and participant profiles
  • 🎬 Video Analysis: Extracts and analyzes visual information from video meetings, and understands when demos are being displayed
  • Multiple Processing Tiers: From budget-friendly to premium processing options
  • 🔄 Robust Processing: Handles long meetings with automatic chunking and proper cleanup
  • 📁 Flexible Output: Markdown-formatted transcripts and reports with optional intermediate outputs

🏃 Quick Start

# Set your Gemini API key
export GEMINI_API_KEY=your_key_here

# Run on a meeting recording
npx offmute path/to/your/meeting.mp4

📦 Installation

As a CLI Tool

npx offmute <Meeting_Location> [options]

As a Package

npm install offmute

Get Help

npx offmute --help

bunx (or bun) runs faster if you have it installed!

💻 Usage

Command Line Interface

npx offmute <input-file> [options]

Options:

  • -t, --tier <tier>: Processing tier (first, business, economy, budget) [default: "business"]
  • -a, --all: Save all intermediate outputs
  • -sc, --screenshot-count <number>: Number of screenshots to extract [default: 4]
  • -ac, --audio-chunk-minutes <number>: Length of audio chunks in minutes [default: 10]
  • -r, --report: Generate a structured meeting report
  • -rd, --reports-dir <path>: Custom directory for report output

Processing Tiers

  • First Tier (first): Pro models for all operations
  • Business Tier (business): Pro for description, Flash for transcription
  • Economy Tier (economy): Flash models for all operations
  • Budget Tier (budget): Flash for description, 8B for transcription
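
The tier list can be read as a lookup table from tier name to model choices. The sketch below is an illustration under assumptions, not offmute's internal code: the Flash and 8B model identifiers are guesses (this README only spells out "gemini-1.5-pro"), `modelsForTier` is a hypothetical helper, and it assumes "description" covers the screenshot, audio, and merge models from the module options.

```javascript
// Assumed model identifiers; the README only names "gemini-1.5-pro" explicitly.
const PRO = "gemini-1.5-pro";
const FLASH = "gemini-1.5-flash"; // assumption
const FLASH_8B = "gemini-1.5-flash-8b"; // assumption

// Hypothetical lookup table mirroring the tier list.
const TIERS = {
  first: { description: PRO, transcription: PRO },
  business: { description: PRO, transcription: FLASH },
  economy: { description: FLASH, transcription: FLASH },
  budget: { description: FLASH, transcription: FLASH_8B },
};

// Returns option objects shaped like the ones in the module example.
function modelsForTier(tier = "business") {
  if (!Object.prototype.hasOwnProperty.call(TIERS, tier)) {
    throw new Error(`Unknown tier: ${tier}`);
  }
  const models = TIERS[tier];
  return {
    descriptionOptions: {
      screenshotModel: models.description,
      audioModel: models.description,
      mergeModel: models.description,
    },
    transcriptionOptions: { transcriptionModel: models.transcription },
  };
}

console.log(modelsForTier("budget").transcriptionOptions.transcriptionModel);
// gemini-1.5-flash-8b
```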

As a Module

import {
  generateDescription,
  generateTranscription,
  generateReport,
} from "offmute";

// Path to the recording to process (placeholder)
const inputFile = "path/to/your/meeting.mp4";

// Generate description and transcription
const description = await generateDescription(inputFile, {
  screenshotModel: "gemini-1.5-pro",
  audioModel: "gemini-1.5-pro",
  mergeModel: "gemini-1.5-pro",
  showProgress: true,
});

const transcription = await generateTranscription(inputFile, description, {
  transcriptionModel: "gemini-1.5-pro",
  showProgress: true,
});

// Generate a structured report
const report = await generateReport(
  description.finalDescription,
  transcription.chunkTranscriptions.join("\n\n"),
  {
    model: "gemini-1.5-pro",
    reportName: "meeting_summary",
    showProgress: true,
  }
);

🔧 Advanced Usage

Intermediate Outputs

When run with the -a flag, offmute saves intermediate processing files:

input_file_intermediates/
├── screenshots/          # Video screenshots
├── audio/               # Processed audio chunks
├── transcription/       # Per-chunk transcriptions
└── report/             # Report generation data

Custom Chunk Sizes

Adjust processing for different content types:

# Longer chunks for presentations
npx offmute presentation.mp4 -ac 20

# More screenshots for visual-heavy content
npx offmute workshop.mp4 -sc 8
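
To get a feel for what the chunk length does, here is a rough sketch of how a recording might be split into fixed-length segments. It is illustrative only: `chunkRanges` is a hypothetical function, and offmute's real chunking may differ (for example, by overlapping chunks).

```javascript
// Illustrative only: splits a duration into [start, end) ranges in minutes.
function chunkRanges(totalMinutes, chunkMinutes = 10) {
  if (!(totalMinutes > 0) || !(chunkMinutes > 0)) {
    throw new RangeError("durations must be positive");
  }
  const ranges = [];
  for (let start = 0; start < totalMinutes; start += chunkMinutes) {
    // The last chunk is shortened so it never runs past the end.
    ranges.push([start, Math.min(start + chunkMinutes, totalMinutes)]);
  }
  return ranges;
}

console.log(chunkRanges(45, 10).length); // 5
console.log(chunkRanges(45, 20)); // three ranges: 0-20, 20-40, 40-45
```

Under this model, a 45-minute recording yields five chunks at the default length and three at `-ac 20`, so longer chunks mean fewer model calls.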

⚙️ How It Works

offmute uses a multi-stage pipeline:

  1. Content Analysis

    • Extracts screenshots from videos at key moments
    • Chunks audio into processable segments
    • Generates initial descriptions of visual and audio content
  2. Transcription & Diarization

    • Processes audio chunks with context awareness
    • Identifies and labels speakers
    • Maintains conversation flow across chunks
  3. Report Generation (Spreadfill)

    • Uses a unique "Spreadfill" technique:
      1. Generates report structure with section headings
      2. Fills each section independently using full context
      3. Ensures coherent narrative while maintaining detailed coverage
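
One way to picture "maintains conversation flow across chunks" is to hand each chunk the tail of the previous transcript. This is a guess at the idea rather than offmute's actual implementation: `transcribeWithContext`, `transcribeChunk`, and `contextChars` are all hypothetical names, and the transcriber here is a stub so the sketch runs without an API key.

```javascript
// Hypothetical sketch: `transcribeChunk` stands in for a model call and is not offmute's API.
async function transcribeWithContext(chunks, transcribeChunk, contextChars = 500) {
  const transcripts = [];
  let previousTail = "";
  for (const chunk of chunks) {
    // Sequential on purpose: each call needs the end of the previous transcript.
    const text = await transcribeChunk(chunk, previousTail);
    transcripts.push(text);
    previousTail = text.slice(-contextChars);
  }
  return transcripts;
}

// Stub transcriber: marks chunks that received context from the previous one.
const fakeTranscribe = async (chunk, context) =>
  `${context ? "(continues) " : ""}Speaker 1: contents of ${chunk}`;

transcribeWithContext(["chunk-0.mp3", "chunk-1.mp3"], fakeTranscribe).then((out) =>
  console.log(out.join("\n"))
);
```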

Spreadfill Technique

The Spreadfill approach helps maintain consistency while allowing detailed analysis:

// 1. Generate structure
const structure = await generateHeadings(description, transcript);

// 2. Fill sections independently
const sections = await Promise.all(
  structure.sections.map((section) => generateSection(section, fullContext))
);

// 3. Combine into coherent report
const report = combineResults(sections);
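
The snippet above leaves its helpers undefined, so here is a self-contained sketch of the same three steps. Treat it as an assumption-laden illustration: `spreadfill`, the prompts, and the `llm` parameter (any async function from prompt to text) are hypothetical, and a stub model is used so it runs offline.

```javascript
// Hypothetical sketch of the three Spreadfill steps.
// `llm` is any async (prompt) => string function; it is not part of offmute's API.
async function spreadfill(llm, description, transcript) {
  const fullContext = `${description}\n\n${transcript}`;

  // 1. Generate structure: ask for one heading per line.
  const outline = await llm(`List report section headings, one per line:\n${fullContext}`);
  const headings = outline
    .split("\n")
    .map((heading) => heading.trim())
    .filter(Boolean);

  // 2. Fill sections independently, each with the full context.
  const bodies = await Promise.all(
    headings.map((heading) => llm(`Write the "${heading}" section using:\n${fullContext}`))
  );

  // 3. Combine in outline order (Promise.all preserves input order).
  return headings.map((heading, i) => `## ${heading}\n\n${bodies[i]}`).join("\n\n");
}

// Stub model so the sketch runs offline.
const stubLlm = async (prompt) =>
  prompt.startsWith("List") ? "Summary\nAction Items" : `Body for: ${prompt.split("\n")[0]}`;

spreadfill(stubLlm, "Weekly sync", "Speaker 1: hello").then(console.log);
```

Because each section is generated in its own call, sections can run in parallel and each one sees the full description and transcript.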

🛠️ Requirements

  • Node.js 14 or later
  • ffmpeg installed on your system
  • Google Gemini API key

Contributing

You can start in TODOs.md to help with things I'm thinking about, or you can steel yourself and check out PROBLEMS.md.

Created by Hrishi Olickel • Support offmute by starring our GitHub repository