Extract subtitles from MP4
January 21, 2022 9:58 AM
A client has sent me a 20-minute mp4 video with subtitles they would like proofread. For "reasons" they are unable to send me a text file, but I feel sure there must be some way to extract the subtitles.
I've had a look at Wondershare and VLC, but I don't see any way to do it. Perhaps there isn't, but I thought if anyone knew, they'd be on MetaFilter! I'm using a Mac.
I was about to suggest ffmpeg. These tips may be helpful as you grapple with ffmpeg in case it is your first time.
posted by brainwane at 10:04 AM on January 21, 2022 [3 favorites]
Sorry, just realized that my two questions had opposite answers. If the subtitles can be turned off in VLC, they're not hardcoded (they're embedded), and the ffmpeg solution should work.
posted by supercres at 11:42 AM on January 21, 2022
Subtitle Edit is fairly easy to use. It can also "read" hardcoded subs.
I don't know if it takes your specific video file format but I guess mp4 shouldn't be a problem.
Just start the program, drag your file into the upper left window and it'll do its thing without much trouble.
posted by Kosmob0t at 12:46 PM on January 21, 2022 [4 favorites]
second Subtitle Edit. if the subtitles are text-based, you'll see them there.
if not, SE offers OCR for image-based subs, but it's dubious at best in my experience.
if you want to use ffmpeg, this ought to do the trick:
ffmpeg -i somefile.mp4 subtitle.srt
if there are any text-based subtitles it'll convert them to .srt (an easy-to-read text format) and output them. otherwise it'll throw an error.
posted by neckro23 at 2:59 PM on January 21, 2022 [1 favorite]
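For anyone who wants the ffmpeg route spelled out a little more, here is a minimal sketch. It assumes ffmpeg is installed and on your PATH, and that the subtitles you want are the first subtitle stream in the file. The function names `srt_name_for` and `extract_subs` are made up for this example and are not part of ffmpeg.

```shell
#!/bin/sh
# Hypothetical helper: derive an output name from the input name,
# e.g. "somefile.mp4" -> "somefile.srt".
srt_name_for() {
    printf '%s.srt\n' "${1%.*}"
}

# Hypothetical helper: pull the first subtitle stream (0:s:0) out of the
# file and write it as SubRip (.srt) text. ffmpeg is expected to fail here
# if the file has no subtitle stream, or if the stream is image-based and
# cannot be converted to text.
extract_subs() {
    src=$1
    out=$(srt_name_for "$src")
    ffmpeg -i "$src" -map 0:s:0 -c:s srt "$out"
}

# Usage (file name is a placeholder):
#   extract_subs somefile.mp4
```

Selecting the stream explicitly with `-map 0:s:0` is a design choice for this sketch; the shorter command above leaves the choice of stream to ffmpeg.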
oh and to be clear by "text-based subtitles" I mean subs that are embedded into the video file as a stream of text data. the other possibilities are image-based subtitles (.idx/.sub, usually from DVD or Blu-Ray) or subs that are just burned into the video.
I suspect they're trying to pull the last one on you though.
posted by neckro23 at 3:03 PM on January 21, 2022 [1 favorite]
If the subtitles are 'burned' into the actual video.... those are called "hard subs". If they are in a text-like file in the media container (just like there's a video stream, possibly multiple audio streams, possibly multiple subtitle streams)... those are called "soft subs".
posted by zengargoyle at 3:21 PM on January 21, 2022 [1 favorite]
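To check which kind of subs a file has before trying to extract anything, something like the following may help. It is a rough sketch that assumes ffprobe (it ships with ffmpeg) is installed. The function names `list_sub_streams` and `sub_kind` are invented for this example, and the codec list in `sub_kind` covers only some common cases.

```shell
#!/bin/sh
# Hypothetical helper: print one line per subtitle stream in the container,
# giving the stream index and codec name.
list_sub_streams() {
    ffprobe -v error -select_streams s \
        -show_entries stream=index,codec_name \
        -of csv=p=0 "$1"
}

# Hypothetical helper: classify a subtitle codec name as text-based or
# image-based. Anything not listed is reported as "unknown".
sub_kind() {
    case $1 in
        mov_text|subrip|ass|ssa|webvtt|text) echo text ;;
        dvd_subtitle|hdmv_pgs_subtitle|dvb_subtitle) echo image ;;
        *) echo unknown ;;
    esac
}

# Usage (file name is a placeholder):
#   list_sub_streams somefile.mp4
#   sub_kind mov_text
```

If `list_sub_streams` prints nothing, the container has no subtitle streams at all, which points to hard subs burned into the picture.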
If the subs are not extractable and you have to retype them yourself, don't be tempted to think it'll just take a few extra minutes of work and therefore not be worth charging more for. It can take longer than you'd think and there should definitely be a transcription surcharge.
posted by trig at 3:54 PM on January 21, 2022 [1 favorite]
posted by supercres at 10:01 AM on January 21, 2022 [4 favorites]