Help me to hack Google Translate
April 14, 2011 4:50 PM

How can I get Google Translate to translate only parts of a Word document?

I want to translate the subtitles of a movie, and I'd like to use Google Translate to speed up the process. The thing is, I have to keep the formatting and codes of the original text, which was made in Microsoft Word and looks like this:

____________________________________

1
00:01:06,015 --> 00:01:07,346

Ordine.

2
00:01:07,517 --> 00:01:09,542

Ordine in aula.

3
00:01:09,853 --> 00:01:12,651
La corte sospende per 15 minuti.


4
00:01:16,659 --> 00:01:18,092

Quel Martin, perfetto come assolto.

5
00:01:18,261 --> 00:01:20,729

0:Quel giudice deve fare attenzione
1:alla gente di Loder.

6
00:01:20,897 --> 00:01:22,364

Sono dei cattivi ragazzi per giocarci insieme.

7
00:01:22,532 --> 00:01:25,262

0:Ehi, signora, scommetto 5 a 2
1:che il Giudice Shaw lascerà andare Martin.

8
00:01:25,435 --> 00:01:26,993

0:- Smamma.
1:- Oh, aspetti un attimo.

{{NOTE: IGNORE PUNCT}}
9
00:01:27,170 --> 00:01:28,728

Lasciamo perdere.

10
00:01:28,905 --> 00:01:31,669

0:Scommetto 10 a 1 che il giudice
1:Shaw lascerà andare Martin libero.

{{NOTE: BURNT TEXT}}

11
00:01:31,841 --> 00:01:35,743

0:- Fai 100 a 1 e ci sto.
1:- Oh, sei un professionista?

12
00:01:35,912 --> 00:01:37,277

Ragazzo, questa è una cosa interessante.

13
00:01:37,447 --> 00:01:40,814

0:Qui c'è un tipo che si muove nel fisco cittadino,
1:arraffa per sè 350.000 bigliettoni...

{{NOTE: ITALICS}}

14
00:01:40,984 --> 00:01:43,782

0:... e non c'è un giudice in cittá
1:che gli dia quello che si merita.

____________________________________

I have to keep everything except the dialogue text itself. That means everything that is red in my document (shown in bold above) and the numbers at the beginning of the lines, which indicate line 0 and line 1.

The formatting (the colour of the text) is important, and so are the codes written as words, which Google Translate will try to translate.

Is there a way to do that?

Thanks!
posted by TheGoodBlood to Computers & Internet (7 answers total)
 
It might be possible using some combination of regex and diffs, but I'm afraid I have no clue beyond that.

Although, given that you'll have to rewrite pretty much all of the Google output anyway, why not machine translate it without preserving the formatting and just keep it side by side with your working translation as you write it?
posted by turkeyphant at 5:22 PM on April 14, 2011


You could pretty much automate a first draft with Perl and a command-line Google Translate script.
posted by rhizome at 5:44 PM on April 14, 2011 [1 favorite]
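
A minimal sketch of what such a first-draft script might look like, written in Python rather than Perl. It assumes the subtitles have been saved from Word as plain text (which loses the red colour), and it does not call Google Translate at all: `fake_translate` is a hypothetical stand-in that you would swap for whatever translation script you actually use. The function name `translate_subtitles` and the patterns are likewise illustrative assumptions based on the sample in the question.

```python
import re
from typing import Callable, List

# Lines that must survive untouched: cue numbers, timestamps, {{NOTE: ...}} codes.
CUE_NUMBER = re.compile(r"^\d+$")
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")
NOTE_CODE = re.compile(r"^\{\{.*\}\}$")
# Optional "0:" / "1:" line marker in front of the dialogue text.
LINE_MARKER = re.compile(r"^([01]:)(.*)$")


def translate_subtitles(lines: List[str], translate: Callable[[str], str]) -> List[str]:
    """Return a copy of `lines` where only dialogue text went through `translate`."""
    result = []
    for line in lines:
        stripped = line.strip()
        if (not stripped or CUE_NUMBER.match(stripped)
                or TIMESTAMP.match(stripped) or NOTE_CODE.match(stripped)):
            result.append(line)  # protected line: copy verbatim
            continue
        marker = LINE_MARKER.match(stripped)
        if marker:
            # Keep the "0:" / "1:" prefix, translate only what follows it.
            result.append(marker.group(1) + translate(marker.group(2)))
        else:
            result.append(translate(stripped))
    return result


def fake_translate(text: str) -> str:
    """Hypothetical stand-in for a real translation call; it only tags the text."""
    return "[EN] " + text


sample = [
    "7",
    "00:01:22,532 --> 00:01:25,262",
    "",
    "0:Ehi, signora, scommetto 5 a 2",
    "{{NOTE: ITALICS}}",
    "Lasciamo perdere.",
]
draft = translate_subtitles(sample, fake_translate)
print("\n".join(draft))
```

One known limitation of this sketch: a dialogue line that consists only of digits would be mistaken for a cue number and left untranslated.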


Response by poster: Thanks, guys. I guess I should have mentioned that I am completely illiterate when it comes to programming and code. turkeyphant, I guess I'll do that.
posted by TheGoodBlood at 7:12 PM on April 14, 2011


Google Translate will ignore numbers, so copy and translate the whole thing.

To restore the formatting, use Find/Replace.
For example, you can search for any digit (with ^#) and replace it with the matched text (^&) plus formatting (or even a style).

Using Find/Replace to apply lots of formatting will probably bloat the document size with lots of formatting codes but a few find/replace cycles should get you pretty close to what you need.
posted by Lanark at 2:43 AM on April 15, 2011
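
For readers who think in scripts rather than Word dialogs, the find-and-wrap idea above can be sketched in Python. Here `\d+` plays roughly the role of Word's `^#` (which matches a single digit), and `\g<0>` plays the role of `^&` (re-insert whatever was found). The `[red]...[/red]` markers and the function name `mark_numbers` are made-up placeholders for "apply formatting"; they are not anything Word understands.

```python
import re


def mark_numbers(text: str) -> str:
    """Wrap every run of digits in a placeholder 'formatting' marker."""
    # r"\d+" finds each whole number; r"\g<0>" puts the matched text back,
    # so the number itself is unchanged and only the wrapper is added.
    return re.sub(r"\d+", r"[red]\g<0>[/red]", text)


print(mark_numbers("0:Scommetto 10 a 1 che il giudice"))
```

Note that the numbers inside the dialogue (10 and 1) get wrapped along with the leading line marker. A plain search for digits in Word has the same side effect, so some manual clean-up of the dialogue text should be expected.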


Response by poster: Lanark, thanks! How can I use Find/Replace to get the font in red again?
posted by TheGoodBlood at 6:28 AM on April 15, 2011


In the Find/Replace dialogue box, click in the "Replace with" box and select Format | Font... (click More first if the Format button is hidden).
posted by Lanark at 10:21 AM on April 15, 2011


Response by poster: Yeah, that wouldn't work. Google Translate makes all my text black, which makes Find/Replace not useful, unless there's something I'm not thinking of.
posted by TheGoodBlood at 11:27 AM on April 16, 2011

