Extracting subtitles from recorded Australian TV?
January 10, 2011 7:17 PM

Can I extract subtitles from TV shows that I record on a PVR into an *.srt file? I'm recording TV in Australia; my PVR records into files called *.ts, which I believe might be MPEG-2, and which also contain the subtitles that are broadcast. I've been compressing the recordings into avi files with Freemake, but then I lose the subtitles. Is there an easy-to-use program that will take a *.ts file and output an avi and an srt file? I've tried Subrip, but it didn't recognise the *.ts file. Thanks!
posted by surenoproblem to Computers & Internet (4 answers total)
 
Hang on, on a re-read, perhaps I'm totally wrong. What do you mean when you say you 'lose the subtitles'? Are they just illegible, or do they actually disappear?
posted by pompomtom at 8:34 PM on January 10, 2011


Many apologies.

This thread seems to suggest that a util called ProjectX can be used for it, and that the subtitles are indeed embedded in the stream, but separate from the picture.
posted by pompomtom at 8:40 PM on January 10, 2011


...and if you follow this howto, stopping before the end, you should have a .srt file.
posted by pompomtom at 8:50 PM on January 10, 2011


Best answer: Can be done (I do it all the time). The trick is that here in Aus, subtitles (apart from SBS's 'burned' English-language subs on foreign content) are transmitted as text in a private stream (named after the old teletext page used for subtitles - 801) in the MPEG-2 transport stream.

Most (but not all) Digital PVRs record the whole transport stream to disk. ProjectX can be set up to extract the streams individually (video, audio, teletext) from the recorded file. Here's a quick example of a report from a transport stream recorded on my old Topfield…

+> Input File 0: '/Users/nameredacted/Desktop/Review With Myles Bar_00.12.22.rec' (2,056,701,952 bytes)
-> Filetype is TS (generic PES Container)
-> demux
-> Service ID 0x0241
-> PMT 0x0100 refers to these usable streams:
Video:
PID: 0x0200(#1)
Audio:
PID: 0x028A(#2){eng}
Teletext:
PID: 0x0240(#3)(eng_s801 )
Subpict.:
n/a


(Please don't tell the ABC I recorded that and plan to burn it to DVD for my personal collection ;-).

So, the process is:
  1. In ProjectX, under "Pre-settings -> Subtitles", check one of the Unicode output options (e.g. UTF-8 or UTF-16); under "teletext pages to decode", choose 801; and choose your subtitle export format (SRT is the only one I've found reliable across all channels)
  2. Demux your recording into video (MPEG-2), audio (usually MP2 for SD), and subtitles (the format chosen above)
  3. Re-encode & mux the audio & video into whatever format and container you want
  4. Rename the subtitle file to suit & drop it alongside the re-encoded/remuxed video.
That's the gist of it anyway. If you edit the video file at all (e.g. cut ads, trim the top / tail, add black to the beginning/end, etc), you'll have to redo the timings in the subtitle file. I don't know of a good video editor that also handles subtitle editing - so I normally don't bother with subtitles on stuff I've recorded from the commercial channels. I've got a PHP script I knocked up to do subtitle checking (dupes, shorts, overlaps) and retiming (based on what I cut off the beginning/end of the demuxed video / audio, with an allowance for 1 second of black I edit in at either end) for the ABC channels.
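
For anyone wanting to try the retiming and checking step themselves, here's a rough sketch in Python. It is not the PHP script mentioned above (that isn't shown here), just an illustration of the same idea under some assumptions: the subtitle file is plain SRT, `cut_ms` is however much was trimmed off the start of the video, and `lead_in_ms` is the 1 second of black added at the front. The function names and the 500 ms "too short" threshold are made up for the example.

```python
import re

# SRT timing line, e.g. "00:01:02,500 --> 00:01:05,000"
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def fmt(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def parse_srt(text):
    """Return a list of (start_ms, end_ms, caption_text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                caption = "\n".join(lines[i + 1:])
                cues.append((to_ms(*g[:4]), to_ms(*g[4:]), caption))
                break
    return cues


def retime(cues, cut_ms, lead_in_ms=1000):
    """Shift cues to match a video trimmed by cut_ms with lead_in_ms of black added.

    Cues that end inside the removed section are dropped; a cue straddling
    the cut is clamped to start where the retained video starts.
    """
    shift = lead_in_ms - cut_ms
    out = []
    for start, end, caption in cues:
        start, end = start + shift, end + shift
        if end <= lead_in_ms:
            continue
        out.append((max(start, lead_in_ms), end, caption))
    return out


def check(cues, min_ms=500):
    """Flag shorts, overlaps and consecutive dupes as (cue_index, problem)."""
    problems = []
    for i, (start, end, caption) in enumerate(cues):
        if end - start < min_ms:  # min_ms is an arbitrary assumed threshold
            problems.append((i, "short"))
        if i > 0:
            _, prev_end, prev_caption = cues[i - 1]
            if start < prev_end:
                problems.append((i, "overlap"))
            if caption == prev_caption:
                problems.append((i, "dupe"))
    return problems


def to_srt(cues):
    """Serialise cues back to SRT text, renumbering from 1."""
    blocks = [
        f"{n}\n{fmt(s)} --> {fmt(e)}\n{c}" for n, (s, e, c) in enumerate(cues, 1)
    ]
    return "\n\n".join(blocks) + "\n"
```

Typical use would be something like `to_srt(retime(parse_srt(text), cut_ms=5000))`, then running `check()` over the result before saving. This only covers trimming the top and tail; cutting ads out of the middle would need a separate shift for each segment.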
posted by Pinback at 9:10 PM on January 10, 2011

