accessibility and web captioning timing formats
November 20, 2013 8:24 AM

I'm managing a project right now, acting as the go-between for two vendors, and I'm having some trouble communicating the requirements for a particular captioning format. I'm hoping someone more experienced than I in the MeFi community can identify the name of the standard being used here so I can point to a clear reference for it.

Basically, I'm dealing with vendor A, who builds customer-facing web materials for us, and vendor B, a new vendor who I'm working with to get web captions prepared for our video materials. Vendor A recently switched to the Brightcove Accessibility Player to enable captioning support in multiple languages and add iOS device compatibility. As part of this switch, their requirements for caption files have changed and they now require that captions be offset with 'ticks' rather than hh:mm:ss, which is what we've traditionally supplied them with. Here are a few examples on Pastebin... format 1 is the old hh:mm:ss format, and format 2 is the new 'ticks' format.

So, I need to get vendor B to supply me with captions that match the new format. My question is: what is this format called? I feel like I'm swimming in abbreviations like DFXP, SMPTE-TT, TTML, etc., but nothing that nails down this particular format that I need to deliver, and I want to be able to tell vendor B exactly what I need and also hopefully point them to a reference for the standard.

(I'd also welcome any suggestions for software or scripts that can convert between these standards, whatever they are called, as we have a lot of old caption files that will eventually need to be converted.)
posted by Kosh to Computers & Internet (2 answers total)
 
Your new format appears to be a subset of TTML.

Other than adding the IDs, it looks like all you'll have to do to convert from your old format to the new one will be doing the math to convert your 'start' and 'duration' attributes into 'start' and 'end' attributes, and to sum up your hours and minutes and seconds into just seconds. (I'm not sure why the timing is different, e.g. "00:00:21.72" is converting to "21.5s" but I'm going to assume that's an artifact of the samples rather than the offsets and precision actually being different. You may want to check with Vendor A how many digits after the decimal point they'll support...)

If your old format is a well-known standard (it may be, but I can't name it) then you may be able to find a prewritten conversion script to transform between the two, but it's a pretty simple conversion; I think most people would just write a one-off perl script or the like.
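
For what it's worth, the arithmetic described above could be sketched roughly like this in Python. This is only an illustration: the function names are made up, it assumes the digits after the decimal point are fractional seconds rather than video frames, and it assumes the player accepts offsets written as seconds with an "s" suffix, as in the "21.5s" sample. It does not touch the XML itself or the id attributes.

```python
from decimal import Decimal


def hms_to_seconds(stamp):
    """Sum an 'hh:mm:ss.ff' timestamp into seconds.

    Assumption: the part after the decimal point is fractional
    seconds, not a frame count.
    """
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + Decimal(seconds)


def to_offsets(start, duration):
    """Turn an old-style start/duration pair into start/end offsets."""
    begin = hms_to_seconds(start)
    end = begin + hms_to_seconds(duration)
    # normalize() drops trailing zeros; the 'f' format avoids exponent notation.
    return f"{begin.normalize():f}s", f"{end.normalize():f}s"


print(to_offsets("00:00:21.72", "00:00:03.50"))  # ('21.72s', '25.22s')
```

Decimal is used instead of float so that values like 21.72 survive the round trip without picking up binary rounding noise. How many decimal places Vendor A's player actually accepts is still something to confirm with them.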
posted by ook at 8:51 AM on November 20, 2013


You're right to be confused. DFXP is the old name for TTML. They both look like variants on old SMIL SGML data formats.

It doesn't look impossibly hard to convert. Are the decimals in each format decimals, or video frames? How does the new format handle marking times beyond seconds? Do you know if the id attributes need to be unique for all files, or just the current one?
posted by scruss at 8:51 AM on November 20, 2013

