slate-transcript-editor-fork
v0.175.0
A React component, built on the Slate editor, that makes correcting automated transcriptions of audio and video easier and faster.
Slate transcript editor
Work in progress
Building on the success and lessons learned from @bbc/react-transcript-editor.
Mostly to be used in the context of autoEdit 3 (digital paper edit), and other projects.
Criteria/Principles
- Easy to reason around
- Can handle transcript and media over 1 hour without loss in performance
- Only essential features for correction of timed text
- Adapters to and from other STT services are external, except for the dpe (digital paper edit) adapter.
- Leverages existing libraries, such as bootstrap and react-bootstrap, to focus on the difficult problems rather than re-inventing the wheel or fiddling around with CSS.
See project board for more details of ongoing work.
See draftJs vs slateJs in doc/notes for some considerations that inspired this version.
Setup
git clone git@github.com:pietrop/slate-transcript-editor.git
cd slate-transcript-editor
npm install
Usage
Usage - dev
npm run storybook
or
npm start
Visit http://localhost:6006/
Usage - prod
npm install slate-transcript-editor
import SlateTranscriptEditor from 'slate-transcript-editor';
// you need to import bootstrap separately
import 'bootstrap-css-only';
<SlateTranscriptEditor
mediaUrl={DEMO_MEDIA_URL_KATE}
transcriptData={DEMO_TRANSCRIPT_KATE}
handleSaveEditor={handleSaveEditor} // optional - function to handle when user clicks save btn in the UI
/>
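A save handler for the snippet above could be sketched as follows. This README does not spell out the callback's arguments, so the sketch assumes the component calls the handler with the edited transcript as its only argument. `makeSaveHandler` and `store` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: builds a function suitable for the handleSaveEditor prop.
// Assumption: the component calls the handler with the transcript data
// as its single argument.
function makeSaveHandler(store) {
  return function handleSaveEditor(data) {
    // Serialise so the stored copy is not affected by later edits.
    store.lastSaved = JSON.stringify(data);
    store.savedAt = new Date().toISOString();
    return store.lastSaved;
  };
}

// A plain object stands in for wherever you persist the transcript.
const store = {};
const handleSaveEditor = makeSaveHandler(store);
```

The resulting function can then be passed as `handleSaveEditor={handleSaveEditor}`. In a real app the body would more likely write to a server or to local storage.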
For more options, see the table below.
See the storybook *.stories.js files in src/components/ for more examples.
| Attributes | Description | required | type |
| :-------------------- | :---------- | :------: | :------: |
| transcriptData | Transcript json | yes | Json |
| mediaUrl | string url to media file - audio or video | yes | String |
| handleSaveEditor | function to handle when user clicks save btn in the UI | no | Function |
| handleAutoSaveChanges | returns content of transcription after there's a change. Auto save has considerable performance lag on longer files; it is suggested not to use it for files over 45 min/1 hour. | no | Function |
| autoSaveContentType | specifies the file format for data returned by handleAutoSaveChanges and handleSaveEditor. Defaults to digitalpaperedit, which runs alignment before export. The other option is slate, without alignment. | no | String |
| isEditable | set to true if you want to be able to edit the text | no | Boolean |
| showTimecodes | set to true if you want to show timecodes in the transcript at paragraph level | no | Boolean |
| showSpeakers | set to true if you want to show speaker labels in the transcript at paragraph level | no | Boolean |
| title | defaults to empty String, also used in file names for exported files. | no | String |
| showTitle | Whether to display the provided title | no | String |
| mediaType | can be audio or video. If not provided, it defaults to video, but the component also uses the url file type to determine and adjust the player (.wav, .mp3, .m4a, .flac, .aiff are recognised as audio) | no | String |
see storybook for example code
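Two rows of the table lend themselves to a small illustration: the audio file extensions listed for mediaType, and the advice to avoid auto save on long files. The sketch below is not the component's own logic. `guessMediaType`, `buildEditorProps`, `durationSeconds` and `onAutoSave` are hypothetical names, and it assumes that an audio file extension takes precedence over the fallback media type.

```javascript
// Extensions the table lists as recognised audio formats.
const AUDIO_EXTENSIONS = ['.wav', '.mp3', '.m4a', '.flac', '.aiff'];
// The 45 minute guideline from the handleAutoSaveChanges row.
const AUTO_SAVE_MAX_SECONDS = 45 * 60;

// Hypothetical helper: picks 'audio' for known audio extensions,
// otherwise falls back to the given media type ('video' by default).
function guessMediaType(mediaUrl, fallback = 'video') {
  // Drop any query string or fragment before checking the extension.
  const path = mediaUrl.split(/[?#]/)[0].toLowerCase();
  const isAudioFile = AUDIO_EXTENSIONS.some((ext) => path.endsWith(ext));
  return isAudioFile ? 'audio' : fallback;
}

// Hypothetical helper: assembles props using only attribute names from the table.
function buildEditorProps({ mediaUrl, transcriptData, durationSeconds, onAutoSave }) {
  const props = {
    mediaUrl,
    transcriptData,
    mediaType: guessMediaType(mediaUrl),
    isEditable: true,
    showTimecodes: true,
    showSpeakers: true,
    autoSaveContentType: 'digitalpaperedit',
  };
  // Only wire up auto save for shorter media; an unknown duration leaves it off.
  if (durationSeconds <= AUTO_SAVE_MAX_SECONDS) {
    props.handleAutoSaveChanges = onAutoSave;
  }
  return props;
}
```

The returned object could be spread onto the component, for example `<SlateTranscriptEditor {...buildEditorProps(options)} />`.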
System Architecture
Uses align-diarized-text for restoring timecodes. This lib combines stt-align-node with alignment-from-stt to restore timecodes and preserve speaker labels.
The component runs align-diarized-text when exporting formats that require timecodes, eg dpe json, or docx and txt with timecodes. It is also used for the 'realignment'/sync UI btn.
If you export or save as slate json, at the moment it doesn't run alignment. The function that performs the alignment is also exported by the module, so that you can perform this computationally intensive alignment elsewhere if needed, eg server side.
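The rule described above, namely which exports trigger alignment, can be summarised in a short sketch. This is an illustration only and not code from the module. `needsAlignment` is a hypothetical name, and `'docx'` and `'txt'` are assumed labels for those export formats, while `'digitalpaperedit'` and `'slate'` are the autoSaveContentType values from the attributes table.

```javascript
// Hypothetical helper: reports whether an export would run the
// computationally intensive alignment step.
function needsAlignment(format, { withTimecodes = true } = {}) {
  if (format === 'slate') return false; // slate json is saved without alignment
  if (format === 'digitalpaperedit') return true; // dpe json requires timecodes
  if (format === 'docx' || format === 'txt') return withTimecodes;
  throw new Error(`Unknown export format: ${format}`);
}
```

A check like this could help decide when to hand the work to a server rather than running it in the browser.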
CSS
The project uses bootstrap and react-bootstrap, so you'll need to include your own stylesheet in your React app.
npm install bootstrap-css-only
eg bootstrap-css-only is convenient because it doesn't ship with jQuery, which is not a dependency of react-bootstrap
and then import in your app
import 'bootstrap-css-only';
Alternatively, this gives you the extra flexibility to write your own stylesheet, overriding the bootstrap classes (see bootstrap and react-bootstrap on theming for more info).
Documentation
There's a docs folder in this repository.
- docs/notes contains dev draft notes on various aspects of the project. These are generally converted into either ADRs or guides when ready.
- docs/guides contains walkthroughs / how-tos.
- docs/adr contains Architecture Decision Records.
The docs folder syncs with gitbook to make the documentation more pleasant to browse at autoedit.gitbook.io/slate-transcript-editor-docs/ - Work in progress
Development env
- npm 6.13.6
- node 12
- storybook
If you have nvm you can run nvm use to change to the node version for this repo.
Linting
This repo uses prettier for linting. If you are using Visual Studio Code you can add the Prettier - Code formatter extension, and configure the editor to do things like format on save.
You can also run the linting via npm scripts
npm run lint
and there's also a pre-commit hook that runs it too.
Build
build module
Following storybook Distribute UI across an organization guide.
build storybook
npm run build-storybook
Tests
TBC
Deployment
Deployment module
To publish module to npm
npm run publish:public
and for a test run use
npm run publish:dry:run
Deployment storybook
To publish storybook to github pages
npm run deploy:ghpages