My Linux Video Toolchain
Ok, not perfect. But here are some notes about my way so far…
Concept
You’ll want to compress the video and audio streams and wrap them in a flexible container that can also hold additional information such as chapter markers and titles, subtitles, cover images, and archival tags. For now, my choice is to encode the video with the H.264/AVC codec and the audio with AAC. Ideally, I would like to add subtitles as text in SubRip’s SRT format. All of this gets wrapped up in a Matroska container, which holds the meta information as well.
Backup
Say you have a DVD with some family videos from your last birthday and you want to back up its contents to your hard drive. The first step is to copy everything as is, which may take a lot of disk space (up to 8 GB). I had good results with this simple tool:
dvdcpy -s skip -m -o My_Birthday /dev/sr0
After some time, you’ll find the DVD directory structure copied in your present working directory. With Fedora this handy gem comes in the
Compression
Next comes the crucial part: compressing the raw data into something usable. After trying different options, I now rely on HandBrake for this, which in turn builds on the x264 library. HandBrake saves the streams directly into a Matroska container and allows editing chapter titles and adding extra subtitles from external files. It is fast and comes with reasonable default options. Usually I stick with the Regular/Normal preset, just dropping superfluous tracks and downmixing the audio to AAC stereo at 128 kbit/s. I let HandBrake figure out the video quality by setting the target size to 700 MB, which in most cases results in slightly degraded but more than watchable video at around 1000 kbps. As I’m only interested in SD video quality, I usually choose a resolution based on a width of 624 px. You might want to consult this detailed video conversion table.
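To see how a target size translates into a video bitrate, here is a rough sketch of the arithmetic. The function name video_kbps is my own, and it assumes the size is given in MiB, ignores container overhead, and rounds down, so treat the result as a ballpark figure rather than what HandBrake computes internally.

```shell
# video_kbps SIZE_MIB DURATION_S AUDIO_KBPS
# Prints the video bitrate (kbit/s) left over once the audio track is
# accounted for. Assumptions: size in MiB, no container overhead,
# integer division (rounds down). Needs 64-bit shell arithmetic.
video_kbps() {
    size_mib=$1
    duration_s=$2
    audio_kbps=$3
    total_bits=$((size_mib * 1024 * 1024 * 8))
    total_kbps=$((total_bits / duration_s / 1000))
    echo $((total_kbps - audio_kbps))
}
```

For a 90-minute video, video_kbps 700 5400 128 prints 959, which matches the "around 1000 kbps" above. If your HandBrake version no longer offers a target size, a value like this can be passed to HandBrakeCLI with --vb (and the audio bitrate with --ab).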
Subtitles
I would prefer to store subtitle data efficiently as text, i.e. in SubRip’s SRT format (more advanced XML-based formats seem to lack acceptance). HandBrake does not attempt OCR (which is excusable); it extracts subtitles as bitmaps in VobSub format. The following command from the handy MKVToolNix package extracts the bitmaps and sync information from the MKV container into two files (*.vob and *.idx):
mkvextract tracks My_Birthday.mkv 3:subtitles.vob
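The track number 3 above is specific to my file. As a small sketch for finding the right ID in yours, the helper below (subtitle_track_ids is a name I made up) filters the output of mkvmerge --identify, assuming it prints English lines of the form "Track ID N: subtitles (...)".

```shell
# Reads the output of `mkvmerge --identify FILE` on stdin and prints the
# ID of every subtitle track, one per line.
# Assumption: English output with lines like "Track ID 3: subtitles (VobSub)".
subtitle_track_ids() {
    sed -n 's/^Track ID \([0-9][0-9]*\): subtitles.*/\1/p'
}

# Usage:
#   mkvmerge --identify My_Birthday.mkv | subtitle_track_ids
```

Use the printed ID in place of the 3 in the mkvextract call.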
These subtitles need to be scanned and OCR’ed; the best way I have found to do this is Avidemux. Simply choose “Tools > OCR (VobSub -> SRT)” and follow the instructions, but be prepared to invest some time helping with character recognition. Don’t forget to check the spelling in the resulting SRT file.
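While proofreading by hand it is easy to damage a timing line by accident. The following is only a quick sanity check and no substitute for a spell check: srt_bad_timings (a hypothetical helper of mine) lists every line that contains an arrow but does not look like a regular SRT timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm.

```shell
# srt_bad_timings FILE
# Prints "LINENO:content" for each line containing "-->" that is not a
# well-formed SRT timing line. Prints nothing if all timing lines look fine.
# Caveat: subtitle text that itself contains "-->" is reported as well.
srt_bad_timings() {
    grep -n -e '-->' "$1" |
        grep -Ev '^[0-9]+:[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}[[:space:]]*$' ||
        true
}

# Usage:
#   srt_bad_timings subtitles.srt
```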
Re-Muxing
HandBrake leaves us with a usable Matroska container; however, we’d like to replace the subtitles and add some data for archiving purposes: at least a cover image and descriptive tags. The MKVToolNix package mentioned above provides the excellent mkvmerge utility, which also comes with a graphical interface. With mkvmerge we remove the VobSub tracks and add our new SRT file as a subtitle track. You might also want to attach a file cover.jpg to the container. It will show up as the container’s thumbnail preview in a recent file browser. Moreover, we’d like to add some descriptive tags as suggested by the Matroska specification. However, I haven’t yet been able to find any player support for this feature. Anyway, for the tag file, you may use this template. Then merge the XML file into the container with mkvmerge (under the “global” tab for global tags).
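If you prefer the command line over the graphical interface, the same steps can be sketched roughly as follows. The function names, the MKVMERGE override (handy for a dry run with echo), the language code eng, and the two tags chosen are all my own assumptions; the XML is a minimal example of the tag file layout mkvmerge reads, not the template mentioned above.

```shell
# write_tags FILE TITLE DATE
# Writes a minimal Matroska tag file. TITLE and DATE must not contain
# XML special characters such as & or <.
write_tags() {
    cat > "$1" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>TITLE</Name>
      <String>$2</String>
    </Simple>
    <Simple>
      <Name>DATE_RECORDED</Name>
      <String>$3</String>
    </Simple>
  </Tag>
</Tags>
EOF
}

# remux IN.mkv SUBS.srt COVER.jpg TAGS.xml OUT.mkv
# Drops the existing subtitle tracks from IN.mkv, adds the SRT file as a
# new subtitle track, attaches the cover and merges the global tags.
# Set MKVMERGE=echo to print the arguments instead of running mkvmerge.
remux() {
    "${MKVMERGE:-mkvmerge}" -o "$5" \
        --no-subtitles "$1" \
        --language 0:eng "$2" \
        --attachment-mime-type image/jpeg --attach-file "$3" \
        --global-tags "$4"
}
```

A typical run would be write_tags tags.xml "My Birthday" followed by a date of your choice, then remux My_Birthday.mkv subtitles.srt cover.jpg tags.xml My_Birthday_final.mkv. The output file name here is just a placeholder.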