Signed-off-by: Marth64 <marth64@proxyid.net>
Raw Captions With Time (RCWT) is a format native to ccextractor, a commonly
used open source tool for processing 608/708 closed caption (CC) sources.
It can be used to archive the original, raw CC bitstream and to produce
a source file file for later CC processing or conversion. As a result,
it also allows for interopability with ccextractor for processing CC data
extracted via ffmpeg. The format is simple to parse and can be used
to retain all lines and variants of CC.
A free specification of RCWT can be found here:
https://github.com/CCExtractor/ccextractor/blob/master/docs/BINARY_FILE_FORMAT.TXT
This muxer implements the specification as of 01/05/2024, which has
been stable and unchanged for 10 years as of this writing.
This muxer will have some nuances from the way that ccextractor muxes RCWT.
No compatibility issues when processing the output with ccextractor
have been observed as a result of this so far, but mileage may vary
and outputs will not be a bit-exact match.
Specifically, the differences are:
(1) This muxer will identify as "FF" as the writing program identifier, so
as to be honest about the output's origin.
(2) ffmpeg's MPEG-1/2, H264, HEVC, etc. decoders extract closed captioning
data differently than ccextractor from embedded SEI/user data.
For example, DVD captioning bytes will be translated to ATSC A53 format.
This allows ffmpeg to handle 608/708 in a consistant way downstream.
This is a lossless conversion and the meaningful data is retained.
(3) This muxer will not alter the extracted data except to remove invalid
packets in between valid CC blocks. On the other hand, ccextractor
will by default remove mid-stream padding, and add padding at the end
of the stream (in order to convey the end time of the source video).