@sourcesync-sdk/whisper-web
v0.1.2
Overview
The @sourcesync-sdk/whisper-web package adds audio fingerprinting and matching to web applications. It lets developers capture audio in real time, generate audio fingerprints, and match them against a database.
Installation
To install the package, run the following command in your project directory:
npm install sourcesync-sdk
Or install a submodule as its own dependency:
npm install @sourcesync-sdk/whisper-web @sourcesync-sdk/app
Basic Usage
Here's a quick guide to get started with the Whisper Web SDK:
- Import and initialize the SDK:
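// These paths match the umbrella package install. Assumption: if you installed the
// scoped packages instead, the paths are presumably '@sourcesync-sdk/whisper-web'
// and '@sourcesync-sdk/app'.
import { createWhisperFactory } from 'sourcesync-sdk/whisper-web';
import { initializeApp } from 'sourcesync-sdk/app';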
const app = await initializeApp({
  appKey: 'your-app-key',
  // app config
});
const whisperFactory = await createWhisperFactory(app, {
  // set default options for Whisper instances
  defaultOpts: {
    apiUrl: 'YOUR_API_URL',
    apiToken: 'YOUR_API_TOKEN',
    apiKey: 'YOUR_API_KEY', // optional
  },
});
- Create a Whisper instance:
const whisper = whisperFactory.create();
await whisper.init();
- Set up fingerprinting:
await whisper.setCallback((fingerprint) => {
  // Handle the generated fingerprint
  console.log('Fingerprint generated:', fingerprint);
  // Request a match
  whisper.requestMatch(fingerprint)
    .then((matchResult) => {
      console.log('Match result:', matchResult);
    })
    .catch((err) => {
      // Handle errors; log them at minimum so failures are not silently dropped
      console.error('Match request failed:', err);
    });
});
MatchResponse Object
The MatchResponse object contains the following properties:
interface MatchResponse {
  // device id from the whisper platform, if registered
  wdeviceid: number
  // session id from the whisper platform, if registered
  wsessionid: number
  // pass-through JSON string sent with the match request and returned in the
  // match response; use it for timestamps, custom device ids, or anything else
  // you need to get back
  requestJson: string
  // the matches, either grouped by type or as a flat array
  matches: MatchGroups | MatchItem[]
}
type MatchGroups = Record<string, MatchItem[]>
interface MatchItem {
  // reference content id as registered on the whisper platform
  wrefid: number
  // confidence score, 0 - 100, higher is better
  confidence: number
  // time of the match within the unknown (captured) audio, based on the
  // incoming fingerprint timestamp
  unknowntime: string
  // time within the reference content that matched
  referencetime: string
  // JSON data that was registered with the content on the whisper platform,
  // such as channel ids, custom content ids, title, or description
  refJson: string
  title?: string
}
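Because `matches` can arrive either as a flat array or grouped by type, consumer code usually has to normalize it before use. The following is a minimal sketch, not part of the SDK: `flattenMatches`, `bestMatch`, and `parseRefJson` are hypothetical helper names, and the types are trimmed local copies of the interfaces above so the snippet stands alone.

```typescript
// Local copies of the documented shapes, reduced to the fields used here.
interface MatchItem {
  wrefid: number;
  confidence: number; // 0 - 100, higher is better
  refJson: string;
  title?: string;
}
type MatchGroups = Record<string, MatchItem[]>;

// Hypothetical helper: normalize either form of `matches` to a flat array.
function flattenMatches(matches: MatchGroups | MatchItem[]): MatchItem[] {
  if (Array.isArray(matches)) return matches;
  let flat: MatchItem[] = [];
  for (const group of Object.keys(matches)) {
    flat = flat.concat(matches[group]);
  }
  return flat;
}

// Hypothetical helper: highest-confidence match at or above a threshold,
// or undefined when nothing qualifies.
function bestMatch(
  matches: MatchGroups | MatchItem[],
  minConfidence = 0,
): MatchItem | undefined {
  let best: MatchItem | undefined;
  for (const item of flattenMatches(matches)) {
    if (item.confidence < minConfidence) continue;
    if (best === undefined || item.confidence > best.confidence) best = item;
  }
  return best;
}

// refJson is a JSON string; parse it defensively and return undefined
// when it is not valid JSON.
function parseRefJson(item: MatchItem): unknown {
  try {
    return JSON.parse(item.refJson);
  } catch {
    return undefined;
  }
}
```

Inside the `requestMatch` handler this could be used as `bestMatch(matchResult.matches, 60)`, where the threshold of 60 is an arbitrary illustration rather than a recommended value.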
- Set up audio capture:
// attach the microphone
await whisper.attachMicrophone();
// or attach the video element
await whisper.attachVideoElement(videoElement);
// or attach the audio element
await whisper.attachAudioElement(audioElement);
- Start and stop audio capture:
// Start capturing
await whisper.start();
// Stop capturing
await whisper.stop();
Limitations and Requirements
Browser Audio/Video Access
For cross-origin media resources to be accessible for audio capture, the server hosting the media must include the following header in its response:
Access-Control-Allow-Origin: *
Note: The wildcard (*) allows access from all origins. For production environments, consider specifying allowed origins explicitly for enhanced security.
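On the client side, a cross-origin media element generally also has to opt in to CORS. Web Audio, for example, outputs silence for cross-origin media that was loaded without it. A minimal sketch, assuming the SDK reads audio from the element passed to `attachVideoElement` or `attachAudioElement`; `prepareCrossOriginMedia` is a hypothetical helper and `MediaLike` is a stand-in type, neither is part of the SDK.

```typescript
// Structural type so the sketch also runs outside a browser;
// HTMLVideoElement and HTMLAudioElement both satisfy it.
interface MediaLike {
  crossOrigin: string | null;
  src: string;
}

// Hypothetical helper: load media with CORS enabled.
// crossOrigin is set before src so the request is made with CORS from the start.
function prepareCrossOriginMedia<T extends MediaLike>(element: T, url: string): T {
  element.crossOrigin = 'anonymous';
  element.src = url;
  return element;
}
```

In a page this might be called as `prepareCrossOriginMedia(videoElement, mediaUrl)` before `whisper.attachVideoElement(videoElement)`, where `mediaUrl` is a placeholder for your own media location.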
WebAssembly (WASM) Requirements
When using the WASM build, specific headers are required to enable SharedArrayBuffer, which the WASM runtime needs in order to support pthreads via Web Workers.
Client Site (page embedding the WASM code)
The service must return these headers in the page response:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
WASM Host (e.g., CDN)
The service hosting the WASM code must return these headers in its responses, including responses to OPTIONS (preflight) requests:
Access-Control-Allow-Origin: *
Cross-Origin-Resource-Policy: cross-origin
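When debugging a deployment, it can help to compare actual response headers against the lists above. The sketch below is illustrative only: `REQUIRED_PAGE_HEADERS`, `REQUIRED_WASM_HOST_HEADERS`, and `missingHeaders` are hypothetical names, and the expected values are copied from this section.

```typescript
type HeaderMap = Record<string, string>;

// Headers expected on the page that embeds the WASM code.
const REQUIRED_PAGE_HEADERS: HeaderMap = {
  'cross-origin-opener-policy': 'same-origin',
  'cross-origin-embedder-policy': 'require-corp',
};

// Headers expected on responses from the WASM host (e.g., a CDN).
const REQUIRED_WASM_HOST_HEADERS: HeaderMap = {
  'access-control-allow-origin': '*',
  'cross-origin-resource-policy': 'cross-origin',
};

// Returns the names of required headers that are absent or carry a
// different value. Header names are compared case-insensitively.
function missingHeaders(actual: HeaderMap, required: HeaderMap): string[] {
  const normalized: HeaderMap = {};
  for (const name of Object.keys(actual)) {
    normalized[name.toLowerCase()] = actual[name].trim();
  }
  return Object.keys(required).filter(
    (name) => normalized[name] !== required[name],
  );
}
```

The comparison is exact, so a host that returns an explicit origin instead of `*` in Access-Control-Allow-Origin would be reported here even though that configuration may be intentional.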
Important Notes
- These requirements may not cover all scenarios, especially when hosting components on different domains.
- Enabling these headers may break existing website functionality.
- Without loading the WASM module, fingerprint generation in the browser is not possible.
Explanation of Headers
Cross-Origin-Opener-Policy: same-origin
- Isolates the browsing context to same-origin documents
- Enhances security by preventing cross-origin popup manipulation
- Reduces the risk of side-channel attacks
Cross-Origin-Embedder-Policy: require-corp
- Ensures cross-origin resources must explicitly grant permission to be loaded
- Enables the use of SharedArrayBuffer, crucial for multi-threaded WebAssembly
- Prevents unauthorized data access from other origins
These headers enable "cross-origin isolation", which is important for WebAssembly because it:
- Allows use of high-precision timers and SharedArrayBuffer, improving performance
- Creates a more isolated environment for WebAssembly code execution
- Enables advanced WebAssembly features like threads in most browsers
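Browsers report the effect of these headers through the standard `crossOriginIsolated` global, so a page can verify isolation before it tries to load the WASM build. A minimal sketch; `canUseThreadedWasm` and `IsolationGlobals` are hypothetical names, and whether the SDK performs a similar check internally is not stated in this README.

```typescript
// The two globals this check reads; both are standard in browsers.
interface IsolationGlobals {
  crossOriginIsolated?: boolean;
  SharedArrayBuffer?: unknown;
}

// Hypothetical helper: true only when the context is cross-origin isolated
// and the SharedArrayBuffer constructor is available.
function canUseThreadedWasm(
  scope: IsolationGlobals = globalThis as unknown as IsolationGlobals,
): boolean {
  return (
    scope.crossOriginIsolated === true &&
    typeof scope.SharedArrayBuffer === 'function'
  );
}
```

If the function returns false in a browser, the COOP and COEP headers on the page response are the first thing to inspect.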
Trade-offs
- May break integrations with some third-party services relying on cross-origin access
- Can complicate embedding your content in other sites or vice versa
For full WebAssembly functionality, ensure all resources (including .wasm files) are either served from the same origin or sent with appropriate CORS or Cross-Origin-Resource-Policy headers.
API Reference
WhisperFactory
- create(): Creates a new Whisper instance
WhisperWeb Instance
- init(): Initializes the Whisper instance
- attachMicrophone(): Attaches the microphone for audio capture
- attachVideoElement(videoElement): Attaches a video element for audio capture
- attachAudioElement(audioElement): Attaches an audio element for audio capture
- setCallback(callback): Sets the callback for fingerprint generation
- start(): Starts audio capture and fingerprinting
- stop(): Stops audio capture
- requestMatch(fingerprint): Requests a match for the given fingerprint
Best Practices
- Ensure proper cleanup by calling stop() when audio capture is no longer needed
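One way to make that cleanup hard to forget is to wrap capture in a helper that always calls `stop()`, even when the work in between throws. This is a sketch under assumptions: `withCapture` is a hypothetical helper, and `CaptureSession` is a structural stand-in for the `start()` and `stop()` methods listed in the API reference.

```typescript
// Structural stand-in for a Whisper instance; only the two methods used here.
interface CaptureSession {
  start(): Promise<unknown>;
  stop(): Promise<unknown>;
}

// Hypothetical helper: start capture, run `work`, and always stop afterwards.
async function withCapture<T>(
  session: CaptureSession,
  work: () => Promise<T>,
): Promise<T> {
  await session.start();
  try {
    return await work();
  } finally {
    // Runs on success and on error, so capture does not stay active by accident.
    await session.stop();
  }
}
```

With a real instance this might read `await withCapture(whisper, () => listenUntilUserCancels())`, where `listenUntilUserCancels` is a placeholder for your own logic.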
License
This SDK is distributed under the Apache License, Version 2.0. The Apache 2.0 License applies to the SDK only and not any other component of the SourceSync Platform.