h264-interp-utils
v1.1.1
H.264 data streams and metadata are tricky to interpret. This Javascript package helps.
H.264 bitstreams are tricky to handle. This JavaScript package helps.
It handles the creation and parsing of H.264's codec-private data.
This codec-private data is stored in 'avcC' atoms in MPEG-4 streams, and in Tracks/Track/CodecPrivate elements in some Matroska streams (aka webm or EBML files).
It is sometimes necessary to re-create this codec-private data from elements in a
compressed video bitstream.
It handles the parsing of sequences of H.264 Network Abstraction Layer Units (NALUs), formatted either in packet-transport or streaming Annex B format.
It offers functions for reading H.264's variable-length Exponential Golomb codes from its bitstream. With those functions it handles the parsing of Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) NALUs.
Install
Install with npm:
$ npm install --save h264-interp-utils
Installation with other package managers works similarly.
Why this package
The original reason for developing this package was to allow the reconstruction of 'avcC' atom data from MediaRecorder-emitted data.
When using MediaRecorder with a MIME type like video/webm; codecs="avc1.42C01E", it generates a data stream without placing codec-private data in Tracks/Track/CodecPrivate elements.
But the experimental WebCodecs browser API requires that data to be passed to it in a config.description element. Hence the need to reconstruct it.
MediaRecorder-emitted video streams repeat the H.264 SPS and PPS NALUs at the beginning of the data for each intraframe.
In Matroska parlance, these are keyframes. In H.264 parlance they are I frames.
Each intraframe in simple low-latency MediaRecorder-emitted video streams also happens to be an Instantaneous Decoder Refresh (IDR) frame; decoding can begin at that point in the video stream without reference to any previous data.
The AvcC class in this package reconstructs the codec-private data from MediaRecorder's intraframe data stream, by interpreting the SPS and PPS NALUs in that data stream.
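Finding the SPS and PPS NALUs starts with the one-byte NALU header, whose low five bits carry nal_unit_type. The following is a minimal standalone sketch of that classification. The helper names nalUnitType and describeNalu are hypothetical and are not part of this package's API.

```javascript
// Illustrative only: classify a NALU by its first (header) byte.
// nalUnitType and describeNalu are hypothetical helpers, not package functions.
function nalUnitType (nalu) {
  // The low five bits of the header byte carry nal_unit_type.
  return nalu[0] & 0x1f
}

const NAL_TYPE_NAMES = { 1: 'non-IDR slice', 5: 'IDR slice', 7: 'SPS', 8: 'PPS' }

function describeNalu (nalu) {
  const type = nalUnitType(nalu)
  return NAL_TYPE_NAMES[type] || `other (${type})`
}

console.log(describeNalu(Uint8Array.of(0x67, 0x42, 0xc0, 0x1e))) // SPS
console.log(describeNalu(Uint8Array.of(0x68, 0xce, 0x38, 0x80))) // PPS
console.log(describeNalu(Uint8Array.of(0x65, 0x88)))             // IDR slice
```

A keyframe payload from MediaRecorder would typically yield SPS, PPS, and IDR slice NALUs in that order.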
Usage
Start by including the module in your program.
const H264Util = require('h264-interp-utils')
Bitstream
The Bitstream object allows its user to retrieve data bit-by-bit from arrays of data. It's used to parse NALUs, and supports the H.264 variable-length exponential Golomb coding for signed and unsigned integers.
You give the constructor an array containing a single NALU, without any leading NALU delimiter.
const bitstream = new H264Util.Bitstream(array)
const aBit = bitstream.u_1()
const nextBit = bitstream.u_1()
const twoBits = bitstream.u_2()
const fiveBits = bitstream.u(5)
const aByte = bitstream.u_8()
/* variable-length exponential-Golomb coded integers */
const unsignedInt = bitstream.ue_v()
const signedInt = bitstream.se_v()
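To show what exponential-Golomb decoding involves, here is a small standalone sketch. TinyBitReader is a hypothetical class written for this illustration; it is not the package's Bitstream and skips emulation prevention handling.

```javascript
// Illustrative bit reader; not this package's Bitstream implementation.
class TinyBitReader {
  constructor (bytes) {
    this.bytes = bytes
    this.pos = 0 // current position, in bits
  }

  bit () {
    if (this.pos >= this.bytes.length * 8) throw new RangeError('out of bits')
    const byte = this.bytes[this.pos >> 3]
    const value = (byte >> (7 - (this.pos & 7))) & 1
    this.pos++
    return value
  }

  bits (n) {
    let value = 0
    for (let i = 0; i < n; i++) value = value * 2 + this.bit()
    return value
  }

  // ue(v): count leading zero bits, skip the 1 bit, then read that many suffix bits.
  ue () {
    let zeros = 0
    while (this.bit() === 0) zeros++
    return 2 ** zeros - 1 + this.bits(zeros)
  }

  // se(v): odd code numbers map to positive values, even ones to negative values.
  se () {
    const k = this.ue()
    return k % 2 === 1 ? (k + 1) / 2 : 0 - k / 2
  }
}

// 0xa6 is binary 1 010 011 0: the codes for 0, 1, and 2, plus one spare bit.
const reader = new TinyBitReader(Uint8Array.of(0xa6))
console.log(reader.ue(), reader.ue(), reader.ue()) // 0 1 2
```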
You may retrieve the number of remaining bits in your array with const bitCount = bitstream.remaining.
Likewise, you may retrieve the number of bits already consumed with const bitsUsed = bitstream.consumed.
For debugging convenience, the bitstream.peek16 getter shows, in a text string, the next 16 bits.
Bitstream's constructor removes emulation prevention bytes from the array.
You may retrieve the stream, with emulation prevention bytes, with
const originalStream = bitstream.stream
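For readers unfamiliar with emulation prevention: an H.264 encoder inserts a 03 byte after any two consecutive 00 bytes inside a NALU, so the payload can never imitate a start code. The sketch below shows the removal step on its own. stripEmulationPrevention is a hypothetical name, and this is not necessarily how the package's constructor does it.

```javascript
// Illustrative only: drop each 0x03 byte that follows two 0x00 bytes.
function stripEmulationPrevention (nalu) {
  const out = []
  let zeros = 0 // count of consecutive zero bytes just emitted
  for (const byte of nalu) {
    if (zeros >= 2 && byte === 0x03) {
      zeros = 0 // skip the emulation prevention byte itself
      continue
    }
    out.push(byte)
    zeros = byte === 0 ? zeros + 1 : 0
  }
  return Uint8Array.from(out)
}

const cleaned = stripEmulationPrevention(Uint8Array.of(0x67, 0x00, 0x00, 0x03, 0x01))
console.log(Array.from(cleaned)) // [ 103, 0, 0, 1 ]
```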
You may write bits into the stream:
const dest = new H264Util.Bitstream(4096)
dest.put_u_1(1) /* one bit */
dest.put_u_1(0) /* another bit */
const threebits = 6
dest.put_u(threebits, 3) /* three bits from a number */
const val = 7
let bitcount = dest.put_ue_v(val) /* exponential Golomb coded value, unsigned */
bitcount = dest.put_se_v(-val) /* exponential Golomb coded value, signed */
bitcount = dest.put_complete() /* done writing to Bitstream, tie it off */
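As a rough picture of what the exponential-Golomb writers produce, this standalone sketch returns each code as a string of bits. encodeUe and encodeSe are hypothetical helpers for illustration and do not write into a Bitstream.

```javascript
// Illustrative only: exponential-Golomb codes rendered as '0'/'1' strings.
function encodeUe (value) {
  // Write value + 1 in binary, preceded by (length - 1) zero bits.
  const binary = (value + 1).toString(2)
  return '0'.repeat(binary.length - 1) + binary
}

function encodeSe (value) {
  // Positive v maps to code number 2v - 1; zero or negative v maps to -2v.
  const codeNum = value > 0 ? 2 * value - 1 : -2 * value
  return encodeUe(codeNum)
}

console.log(encodeUe(7))  // 0001000
console.log(encodeSe(-7)) // 0001111
console.log(encodeSe(7))  // 0001110
```

Both codes for 7 and -7 are seven bits long; small values get short codes, which is why the format suits header fields that are usually near zero.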
You may copy bits from some other stream into the stream with copyBits():
const source = new H264Util.Bitstream(nalu)
/* empty bitstream, same size as source */
const dest = new H264Util.Bitstream(source.remaining)
const startAtBit = 0
const copyBitCount = source.remaining
dest.copyBits(source, startAtBit, copyBitCount)
copyBits() is written to be reasonably efficient even for randomly bit-aligned copies.
NALUStream
NALUStream accepts a buffer containing a sequence of NALUs. They may be in
- packet format, separated by 4, 3, or 2-byte NALU lengths
- AnnexB stream format, separated by four-byte 00 00 00 01 or three-byte 00 00 01 delimiters.
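To make the Annex B layout concrete, the sketch below splits a buffer at its start codes. splitAnnexB is a hypothetical standalone function that assumes a Uint8Array input; it is not how NALUStream is necessarily implemented.

```javascript
// Illustrative only: split an Annex B buffer (Uint8Array) into NALU payloads.
function splitAnnexB (buffer) {
  const starts = [] // [delimiterStart, payloadStart] pairs
  for (let i = 0; i + 2 < buffer.length; i++) {
    if (buffer[i] === 0 && buffer[i + 1] === 0 && buffer[i + 2] === 1) {
      // A zero byte just before 00 00 01 makes this the four-byte delimiter.
      const delimiterStart = i > 0 && buffer[i - 1] === 0 ? i - 1 : i
      starts.push([delimiterStart, i + 3])
      i += 2 // resume scanning after the delimiter
    }
  }
  return starts.map(([, payloadStart], n) => {
    const end = n + 1 < starts.length ? starts[n + 1][0] : buffer.length
    return buffer.subarray(payloadStart, end)
  })
}

const annexB = Uint8Array.of(
  0, 0, 0, 1, 0x67, 0x42, // four-byte delimiter, then an SPS fragment
  0, 0, 1, 0x68, 0xce,    // three-byte delimiter, then a PPS fragment
  0, 0, 0, 1, 0x65, 0x88  // four-byte delimiter, then an IDR slice fragment
)
console.log(splitAnnexB(annexB).length) // 3
```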
NALUStream's constructor takes both an array and an options object. The options object may contain any of these properties:
- type, if present, has the value 'packet' or 'annexB'. Use it to declare the format of your sequence of NALUs. If you omit type, NALUStream attempts to determine the format by examining the first few NALUs. Declare the type yourself whenever possible rather than relying on that guess.
- boxSize, if present, can have values 4, 3, or 2 for 'packet' streams, and 4 or 3 for 'annexB' streams. If you omit boxSize, NALUStream attempts to determine the boxSize by examining the first few NALUs. Declare the boxSize yourself whenever possible rather than relying on that guess.
- boxSizeMinusOne can be provided in place of boxSize for compatibility with 'avcC' atoms.
- strict, if true, makes NALUStream throw more errors when it detects anomalous data.
Constructing a NALUStream might look like this. It's wise to catch errors thrown by the constructor.
let nalus
try {
  nalus = new H264Util.NALUStream(array, {type:'annexB', boxSize: 4, strict: true})
} catch (error) {
  console.error(error)
}
You may iterate over the NALUs in a NALUStream like this:
for (const nalu of nalus) {
  /* handle each NALU */
}
or this, if you want both the NALUs and their raw counterparts with the leading delimiters still present:
for (const n of nalus.nalus()) {
  const { nalu, rawNalu } = n
  /* handle each nalu */
}
Somewhat less efficiently you can iterate like this:
const naluArray = nalus.packets
for (let i = 0; i < naluArray.length; i++) {
  /* handle each naluArray[i] */
}
Some decoders (for example the VideoDecoder in WebCodecs) require their NALUs in packet format. You can convert a NALUStream to packet format like this. Notice that it changes the contents of the array passed in the constructor.
decoder.decode(nalus.convertToPacket())
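Packet format simply prefixes each NALU with its length as a big-endian integer. The sketch below shows that layout by building a new array from a list of NALUs. toPacketFormat is a hypothetical helper, and unlike convertToPacket() it does not modify its input in place.

```javascript
// Illustrative only: concatenate NALUs, each prefixed by a big-endian length.
function toPacketFormat (nalus, boxSize = 4) {
  const total = nalus.reduce((sum, nalu) => sum + boxSize + nalu.length, 0)
  const out = new Uint8Array(total)
  let offset = 0
  for (const nalu of nalus) {
    // Fill the length box from its last byte back to its first.
    let remaining = nalu.length
    for (let i = boxSize - 1; i >= 0; i--) {
      out[offset + i] = remaining & 0xff
      remaining = Math.floor(remaining / 256)
    }
    if (remaining !== 0) throw new RangeError('NALU too long for boxSize')
    out.set(nalu, offset + boxSize)
    offset += boxSize + nalu.length
  }
  return out
}

const packets = toPacketFormat([Uint8Array.of(0x67, 0x42), Uint8Array.of(0x68)])
console.log(Array.from(packets)) // [ 0, 0, 0, 2, 103, 66, 0, 0, 0, 1, 104 ]
```

Because a four-byte length occupies the same space as a four-byte Annex B delimiter, a conversion between those two forms can be done in place, which is consistent with the note above about the constructor's array being changed.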
NALUStream objects have type, boxSize, and boxSizeMinusOne properties. If you use the constructor to guess what sort of array you gave it, you can retrieve its guesses with those properties.
NALUStream objects have the packetCount property indicating how many NALUs are in the array.
SPS
SPS accepts a Sequence Parameter Set NALU, and offers properties describing it. To construct an SPS object, give it an array containing a NALU. (It throws an error when you give it a NALU that's not an SPS, or that's garbled in a way that makes it impossible to decode.)
const sps = new H264Util.SPS(nalu)
Some of its useful properties are:
- MIME: the MIME type of the video stream, a value like 'avc1.640029'.
- profileName: a human-readable value like 'BASELINE' or 'EXTENDED' indicating the codec profile.
- profile_idc: the profile indicator. 66 means baseline, 77 means main, and 88 means extended.
- profile_compatibility: the constraint flags.
- level_idc: the level indicator for the codec level.
- picWidth, picHeight: the width and height of the pictures in the video stream.
- cropRect: a rectangle object with x, y, width, height. In the cases where the pictures in the video stream have margins without imagery in them, the cropRect defines the useful area. Because H.264 streams have sizes that are multiples of 16x16 macroblocks, it can be necessary to crop the pictures when rendering them.
It has what can only be described as a mess of properties defined by the H.264 standard and needed by the H.264 decoder to make sense of the stream.
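The 'avc1.' value can be read straight off the three bytes that follow the SPS header: profile_idc, the constraint flags, and level_idc, each printed as two hex digits. Here is a minimal standalone sketch; codecStringFromSps is a hypothetical helper and not the package's MIME getter.

```javascript
// Illustrative only: derive an 'avc1.PPCCLL' string from a raw SPS NALU.
function codecStringFromSps (sps) {
  if ((sps[0] & 0x1f) !== 7) throw new TypeError('not an SPS NALU')
  const hex = (byte) => byte.toString(16).padStart(2, '0').toUpperCase()
  // Bytes 1 to 3 hold profile_idc, the constraint flags, and level_idc.
  return 'avc1.' + hex(sps[1]) + hex(sps[2]) + hex(sps[3])
}

console.log(codecStringFromSps(Uint8Array.of(0x67, 0x42, 0xc0, 0x1e))) // avc1.42C01E
console.log(codecStringFromSps(Uint8Array.of(0x67, 0x64, 0x00, 0x29))) // avc1.640029
```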
PPS
PPS accepts a Picture Parameter Set NALU, and offers properties describing it. To construct a PPS object, give it an array containing a NALU. (It throws an error when you give it a NALU that's not a PPS, or that's garbled in a way that makes it impossible to decode.)
const pps = new H264Util.PPS(nalu)
Most PPS properties describe the format of the pictures in the video stream in a format useful to the H.264 decoder.
Two of its more useful properties are:
- entropy_coding_mode_flag: 0 for CAVLC Huffman-style entropy coding, and 1 for CABAC Arps-style arithmetic coding.
- entropyCodingMode: 'CAVLC' or 'CABAC', a human-readable description of the entropy coding.
It has what can only be described as a mess of properties defined by the H.264 standard and needed by the H.264 decoder to make sense of the stream.
AvcC
H.264 defines a set of codec-private data describing the data stream. Not all video data streams have a distinct set of codec-private data: it's optional in Matroska / webm / .mkv video streams.
It contains, embedded in it, one or more SPS and PPS elements. It can be reconstructed by parsing an SPS and including a PPS.
The AvcC class parses and reconstructs the codec-private data.
A typical use case is, given arrays containing SPS and PPS NALUs, create an avcC object.
const avcCObject = new H264Util.AvcC({pps:ppsArray, sps:spsArray})
const mime = avcCObject.MIME
/* this is the binary array to put into the `'avcC'` atom. */
const codecPrivateDataArray = avcCObject.avcC
Another typical use case is, given a key frame payload from a Matroska SimpleBlock, create the codec-private data.
const avcCObject = new H264Util.AvcC({bitstream: payload})
const mime = avcCObject.MIME
/* this is the binary array to put into the `'avcC'` atom. */
const codecPrivateDataArray = avcCObject.avcC
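For orientation, the sketch below lays out the bytes of a simple 'avcC' record holding one SPS and one PPS. buildAvcC is a hypothetical function, not the AvcC class. It assumes a baseline, main, or extended profile; the high profiles (profile_idc 100 and up) append chroma and bit-depth fields that this sketch omits.

```javascript
// Illustrative only: assemble a minimal avcC record from one SPS and one PPS.
function buildAvcC (sps, pps, lengthSizeMinusOne = 3) {
  const out = new Uint8Array(11 + sps.length + pps.length)
  let o = 0
  out[o++] = 1                         // configurationVersion
  out[o++] = sps[1]                    // profile_idc
  out[o++] = sps[2]                    // constraint flags (profile_compatibility)
  out[o++] = sps[3]                    // level_idc
  out[o++] = 0xfc | lengthSizeMinusOne // six reserved 1 bits, then boxSize - 1
  out[o++] = 0xe0 | 1                  // three reserved 1 bits, then SPS count of 1
  out[o++] = sps.length >> 8           // SPS length, big-endian, two bytes
  out[o++] = sps.length & 0xff
  out.set(sps, o)
  o += sps.length
  out[o++] = 1                         // PPS count
  out[o++] = pps.length >> 8           // PPS length, big-endian, two bytes
  out[o++] = pps.length & 0xff
  out.set(pps, o)
  return out
}

const record = buildAvcC(Uint8Array.of(0x67, 0x42, 0xc0, 0x1e), Uint8Array.of(0x68, 0xce))
console.log(record.length) // 17
```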
Slice
Slice accepts I-frame (type 5) and P-frame (type 1) NALUs, and offers a few properties from decoding them.
To construct a Slice object, give it an array containing a NALU, and optionally an AvcC object created from the same data stream.
(It throws an error when you give it a NALU that's not type 1 or 5.)
const slice = new H264Util.Slice(nalu, avcC)
Some of its useful properties are:
- first_mb_in_slice: the number of the first macroblock in the present slice. This is zero for the first slice of a new frame.
- slice_type: the type of this slice. 0, 5: P slice. 1, 6: B slice. 2, 7: I slice. 3, 8: SP slice. 4, 9: SI slice.
- frame_num: the number of this frame in sequence after the most recent I-frame. This is only available if you provide an avcC object.
- pic_parameter_set_id: the index of the PPS describing this slice.
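Since each slice type appears under two numbers (n and n + 5), turning slice_type into a name is a modulo-5 lookup. A minimal sketch, with sliceTypeName as a hypothetical helper that is not part of this package:

```javascript
// Illustrative only: map a slice_type value (0..9) to its name.
const SLICE_TYPE_NAMES = ['P', 'B', 'I', 'SP', 'SI']

function sliceTypeName (sliceType) {
  if (!Number.isInteger(sliceType) || sliceType < 0 || sliceType > 9) {
    throw new RangeError('slice_type must be an integer from 0 to 9')
  }
  return SLICE_TYPE_NAMES[sliceType % 5]
}

console.log(sliceTypeName(7)) // I
console.log(sliceTypeName(5)) // P
```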
It is sometimes necessary to change a slice's pic_parameter_set_id. (A bug in Chrome's encoder makes it use multiple different values.)
const fixedNalu = slice.setPPSId(0)
const fixedSlice = new H264Util.Slice(fixedNalu, avcC)
const ppsId = fixedSlice.pic_parameter_set_id /* will be 0 */
Still to do
- Rework Bitstream and NALUStream to handle JavaScript streams, not just static arrays of data.
Credits
Alex Izvorski for his C++ h264bitstream code.
Of course, the legions of people who created H.264.