Extracting subtitles with FFmpeg

FFmpeg is the Swiss Army knife of video editors and motion artists worldwide. Let’s take a look at one of its lesser-known (and kinda unexpected) features.

FFmpeg can easily extract embedded subtitles from videos. This command will grab the default subtitle track and export it as an SRT file:

ffmpeg -i input_file out.srt

What if we want to get a different subtitle? First we need to figure out the track number for that subtitle by running ffmpeg -i input_file. You will get output that looks something like this:

Stream #0:2(eng): Subtitle: subrip (default)
  title           : English-SRT
Stream #0:3(eng): Subtitle: hdmv_pgs_subtitle
  title           : English-PGS
Stream #0:4(chi): Subtitle: hdmv_pgs_subtitle
  title           : Chinese-PGS

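Notice the tracks are numbered #0:2, #0:3, etc. This is the value we want to pass to the -map option to select the proper subtitle. The option has to come before the output file it applies to. Note that this conversion only works when the selected track is text-based (like subrip); FFmpeg cannot turn image-based tracks such as hdmv_pgs_subtitle into SRT.

ffmpeg -i input_file -map 0:3 out.srt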
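
On files with many streams, the listing can get long. As a rough sketch, a small filter can pull out just the subtitle tracks and their codecs. The function name list_subtitle_streams is made up for this example, and the pattern assumes stream lines shaped like the sample output above:

```shell
# Hypothetical helper: reads an FFmpeg stream listing on stdin and
# prints "<track> <codec>" for every subtitle stream it finds.
# Assumes lines shaped like: Stream #0:2(eng): Subtitle: subrip (default)
list_subtitle_streams() {
  sed -n 's/^ *Stream #\([0-9]*:[0-9]*\)[^:]*: Subtitle: \([a-z_0-9]*\).*/\1 \2/p'
}

# Usage (ffmpeg prints the listing on stderr, hence the 2>&1):
#   ffmpeg -i input_file 2>&1 | list_subtitle_streams
```

The first column is the value to hand to -map, and the second tells you whether the track is text-based or image-based.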

Removing HTML from subtitles

There are a number of applications and online services that can strip HTML tags from subtitles (like HTML Stripper), but you can also solve this quickly with good old sed:

sed -e 's/<[^>]*>//g' subs.srt

You can make an alias for this in your favorite shell so you don’t have to remember or copy/paste it all the time. Keep in mind that the results of sed won’t be as good as those of a dedicated HTML stripper, since this is just a simple regex. That said, it’s usually more than enough for subtitles.
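
One way to set that up is a small shell function rather than an alias, since a function can take file names as arguments or read from a pipe. This is only a sketch, and the name strip_sub_tags is made up for the example:

```shell
# Hypothetical helper wrapping the sed one-liner from above.
# With file arguments it reads those files; with none it reads stdin.
# The result goes to standard output, so the input file is left untouched.
strip_sub_tags() {
  sed -e 's/<[^>]*>//g' "$@"
}

# Usage:
#   strip_sub_tags subs.srt > clean.srt
```

Because the pattern needs an opening < to match, the --> arrows in SRT timing lines are left alone.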

Converting to other formats

While FFmpeg can get this done for you with a limited number of formats (depending on how it was compiled), there is a better alternative. SubtitleEdit is a handy open source application that can convert between 200+ subtitle formats (and do a lot more, of course).
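
To get an idea of what a particular FFmpeg build can write, you can look at its encoder list. The sketch below filters the output of ffmpeg -encoders down to the subtitle encoders. The function name is made up for this example, and it assumes the usual layout where subtitle encoders are flagged with a leading S and the legend line has = as its second field:

```shell
# Hypothetical helper: reads "ffmpeg -encoders" output on stdin and
# prints the name of each subtitle encoder, one per line.
# Skips the legend line " S..... = Subtitle" by ignoring rows whose name is "=".
list_subtitle_encoders() {
  awk '$1 ~ /^S/ && $2 != "=" { print $2 }'
}

# Usage:
#   ffmpeg -hide_banner -encoders | list_subtitle_encoders
```

If the encoder you need shows up in that list, a conversion between two text formats can be as simple as ffmpeg -i subs.srt subs.ass.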

If you want to bring those subtitles into Blender, check out the SubsImport addon.