@jackdbd/eleventy-plugin-text-to-speech
v3.2.0
Eleventy plugin that uses text-to-speech to generate audio assets for your website, then injects audio players in your HTML.
- Installation
- About
- Docs
- Preliminary Operations
- Usage
- Configuration
- Troubleshooting
- Dependencies
- Credits
- License
Installation
npm install @jackdbd/eleventy-plugin-text-to-speech
Note: this library was tested on Node.js >=18. It might work on older Node.js versions, but that is untested.
About
Eleventy plugin that uses text-to-speech to generate audio assets for your website, then injects audio players in your HTML.
To synthesize text into speech you can use:
To host the generated audio assets you can use:
- Cloud Storage
- Cloudflare R2
- Filesystem (self host your audio assets)
:warning: The Cloud Text-to-Speech API has a limit of 5000 characters per synthesis request.
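One way to stay under a per-request limit like this is to split long texts before synthesizing them. The sketch below is only an illustration: `splitText` is a hypothetical helper, not part of this plugin's API, and it counts JavaScript string length (UTF-16 code units), which is not necessarily how the API measures its limit.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Splits text into chunks of at most `maxLength` characters, preferring to
// break after a sentence end, then at a space, then with a hard cut.
function splitText(text, maxLength = 5000) {
  const chunks = []
  let rest = text.trim()
  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength)
    // Last sentence boundary inside the first `maxLength` characters.
    let cut = Math.max(
      head.lastIndexOf('. '),
      head.lastIndexOf('! '),
      head.lastIndexOf('? ')
    )
    if (cut !== -1) {
      cut += 1 // keep the punctuation mark in the current chunk
    } else {
      cut = head.lastIndexOf(' ')
    }
    // No boundary found: fall back to a hard cut at the limit.
    if (cut <= 0) cut = maxLength
    chunks.push(rest.slice(0, cut).trim())
    rest = rest.slice(cut).trim()
  }
  if (rest.length > 0) chunks.push(rest)
  return chunks
}

const parts = splitText('Hello world. This is a test.', 15)
// parts is ['Hello world.', 'This is a test.']
```

Each chunk could then be synthesized separately. A hard cut can split a word (or a surrogate pair), so a real implementation may want a smarter fallback.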
See also:
Docs
:open_book: API Docs
This project uses API Extractor and api-documenter markdown to generate a bunch of markdown files and a .d.ts rollup file containing all type definitions consolidated into a single file. I don't find this .d.ts rollup file particularly useful. On the other hand, the markdown files that api-documenter generates are quite handy when reviewing the public API of this project. See Generating API docs if you want to know more.
Preliminary Operations
Enable the Text-to-Speech API
Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:
gcloud services enable texttospeech.googleapis.com
Set up authentication via a service account
This plugin uses the official Node.js client library for the Text-to-Speech API. In order to authenticate to any Google Cloud API you will need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.
First, create a service account that can use the Text-to-Speech API. You can also reuse an existing service account if you want. This service account should have the necessary IAM permissions to create/delete objects in a Cloud Storage bucket. You can grant the service account the Storage Object Admin predefined IAM role.
gcloud iam service-accounts create sa-text-to-speech-user \
--display-name "Text-to-Speech user SA"
Second, download the JSON key of this service account and store it somewhere safe. Do not track this file in git.
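Before wiring the key into your build, it can help to check that the file you downloaded really is a service account key. The sketch below is a minimal check, assuming the usual fields of a Google Cloud service account JSON key (`type`, `project_id`, `private_key`, `client_email`). `parseServiceAccountKey` is a hypothetical helper and not part of this plugin.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Parses the text of a service account JSON key and checks a few fields.
// Returns only non-secret information, so the result is safe to log.
function parseServiceAccountKey(jsonText) {
  const key = JSON.parse(jsonText)
  if (key === null || typeof key !== 'object') {
    throw new Error('Service account key must be a JSON object')
  }
  const required = ['type', 'project_id', 'private_key', 'client_email']
  const missing = required.filter(
    (field) => typeof key[field] !== 'string' || key[field] === ''
  )
  if (missing.length > 0) {
    throw new Error(`Service account key is missing: ${missing.join(', ')}`)
  }
  if (key.type !== 'service_account') {
    throw new Error(`Expected type "service_account", got "${key.type}"`)
  }
  return { projectId: key.project_id, clientEmail: key.client_email }
}
```

In a real setup you would read the text from a file whose path lives in an environment variable of your choosing, and list that file in .gitignore.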
Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)
Create a Cloud Storage bucket in your desired location. Enable uniform bucket-level access and use the nearline storage class.
gsutil mb \
-p $GCP_PROJECT_ID \
-l $CLOUD_STORAGE_LOCATION \
-c nearline \
-b on \
gs://bkt-eleventy-plugin-text-to-speech-audio-files
If you want, you can check that uniform bucket-level access is enabled using this command:
gsutil uniformbucketlevelaccess get \
gs://bkt-eleventy-plugin-text-to-speech-audio-files
Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):
gsutil iam ch allUsers:objectViewer \
gs://bkt-eleventy-plugin-text-to-speech-audio-files
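Once the objects are publicly readable, each one can be fetched from `https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME`. As a rough sketch, the hypothetical helper below (not part of this plugin) builds such a URL and percent-encodes each segment of the object name. The object name in the example is made up.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Builds the public URL of an object stored in a Cloud Storage bucket.
function publicObjectUrl(bucketName, objectName) {
  // Encode each path segment, but keep the slashes that separate them.
  const encoded = objectName.split('/').map(encodeURIComponent).join('/')
  return `https://storage.googleapis.com/${bucketName}/${encoded}`
}

const url = publicObjectUrl(
  'bkt-eleventy-plugin-text-to-speech-audio-files',
  'posts/hello world.mp3'
)
// url is 'https://storage.googleapis.com/bkt-eleventy-plugin-text-to-speech-audio-files/posts/hello%20world.mp3'
```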
Usage
Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the environment variable CF_PAGES_URL.
Self-hosting the generated audio assets
If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:
import { textToSpeechPlugin } from '@jackdbd/eleventy-plugin-text-to-speech'

export default function (eleventyConfig) {
  // some eleventy configuration...
  eleventyConfig.addPlugin(textToSpeechPlugin, {
    // TODO: add config with process.env.CF_PAGES_URL here
  })
  // some more eleventy configuration...
}
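When audio assets are self-hosted, their URLs depend on where the site is deployed. The sketch below shows one way to derive an absolute href from CF_PAGES_URL, falling back to a local dev server address when the variable is not set. `audioHref` and the asset path are hypothetical and only illustrate the idea, they are not options of this plugin.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Resolves the path of a self-hosted audio asset against the site's base URL.
function audioHref(baseUrl, assetPath) {
  // A trailing slash makes the URL resolution keep any path in the base.
  const base = baseUrl.endsWith('/') ? baseUrl : `${baseUrl}/`
  // Strip leading slashes so the asset resolves under the base path.
  return new URL(assetPath.replace(/^\/+/, ''), base).href
}

// CF_PAGES_URL is set by Cloudflare Pages; the fallback is an assumption
// for local development.
const baseUrl = process.env.CF_PAGES_URL ?? 'http://localhost:8080'
const href = audioHref(baseUrl, '/assets/audio/my-post.mp3')
```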
Hosting the generated audio assets on Cloud Storage
If you want to host the audio assets on a Cloud Storage bucket and configure the rules that determine which texts are converted into speech, you could register the plugin using something like this:
import { textToSpeechPlugin } from '@jackdbd/eleventy-plugin-text-to-speech'

export default function (eleventyConfig) {
  // some eleventy configuration...
  eleventyConfig.addPlugin(textToSpeechPlugin, {
    // TODO: add config with Cloud Storage bucket here
  })
  // some more eleventy configuration...
}
Multiple hosts
If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:
- Self-host some audio assets, and host on a Cloud Storage bucket some other assets.
- Host all audio assets on Cloud Storage, but host some on one bucket, and some others on a different bucket.
Have a look at the Eleventy configuration of the demo-site in this monorepo.
Configuration
Plugin options
| Key | Default | Description |
|---|---|---|
| collectionName | undefined | Name of the 11ty collection defined by this plugin |
| rules | undefined | Rules that determine which texts to convert into speech |
| transformName | undefined | Name of the 11ty transform defined by this plugin |
Rule
| Key | Default | Description |
|---|---|---|
| audioInnerHTML | undefined | Function that returns some HTML from the list of hrefs where the generated audio assets are hosted. |
| cssSelectors | undefined | CSS selectors to find matches in an HTML document |
| hosting | undefined | Client that provides hosting capabilities |
| regex | undefined | RegExp to find matches in the output path |
| synthesis | undefined | Client that provides Text-to-Speech capabilities |
| xPathExpressions | undefined | XPath expressions to find matches in an HTML document |
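To make two of these options more concrete, here is a rough sketch of what an audioInnerHTML function and a regex could look like. The signature `(hrefs) => string` is inferred from the description above and is an assumption, as are the output paths and the fallback message. Check the API docs for the exact types.

```javascript
// Escape characters that would break out of a double-quoted HTML attribute.
function escapeAttr(value) {
  return value.replace(/&/g, '&amp;').replace(/"/g, '&quot;')
}

// Assumed signature: (hrefs: string[]) => string
// Returns one <source> element per href, plus a fallback message.
function audioInnerHTML(hrefs) {
  const mimeTypes = { mp3: 'audio/mpeg', ogg: 'audio/ogg', wav: 'audio/wav' }
  const sources = hrefs.map((href) => {
    // Guess the MIME type from the file extension, ignoring any query string.
    const ext = href.split('?')[0].split('.').pop().toLowerCase()
    const type = mimeTypes[ext]
    const src = escapeAttr(href)
    return type
      ? `<source src="${src}" type="${type}">`
      : `<source src="${src}">`
  })
  const fallback = '<p>Your browser does not support the audio element.</p>'
  return [...sources, fallback].join('\n')
}

// A regex that matches only the HTML pages generated for posts.
// The output paths below are made up for illustration.
const regex = /\/posts\/.*\.html$/
const outputPaths = [
  './_site/index.html',
  './_site/posts/hello/index.html',
  './_site/feed.xml'
]
const matches = outputPaths.filter((outputPath) => regex.test(outputPath))
// matches is ['./_site/posts/hello/index.html']
```

The regex has no `g` flag on purpose: a global RegExp keeps state between `test` calls, which gives surprising results when the same instance is reused across pages.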
Troubleshooting
This plugin uses the debug library for logging.
You can control what's logged using the DEBUG environment variable.
For example, if you set your environment variables in a .envrc file, you can do:
# print all logging statements
export DEBUG=11ty-plugin:*
Dependencies
| Package | Version |
|---|---|
| @jackdbd/zod-schemas | ^2.2.0 |
| html-to-text | ^9.0.5 |
| jsdom | ^24.0.0 |
| specificity | ^1.0.0 |
| zod | ^3.23.0 |
| zod-validation-error | ^3.1.0 |
⚠️ Peer Dependencies
This package defines 6 peer dependencies.
| Peer | Version range |
|---|---|
| @11ty/eleventy | >=2.0.0 or 3.0.0-alpha.6 |
| @aws-sdk/client-s3 | >=3.0.0 |
| @aws-sdk/lib-storage | >=3.0.0 |
| @google-cloud/storage | >=7.0.0 |
| @google-cloud/text-to-speech | >=5.0.0 |
| debug | >=4.0.0 |
Credits
I got the idea for this plugin while reading the code of eleventy-plugin-text-to-speech, a plugin of the same name by Larry Hudson. Larry's plugin uses the Microsoft Azure Speech SDK.
License
© 2022 - 2024 Giacomo Debidda // MIT License