Why does "primo piano" = "Association of Bologna"?
August 15, 2007 6:34 AM

Why does Babelfish translate "primo piano" in Italian as "Association of Bologna" in English?

Check out these presumably auto-translated pages, for example. You can also confirm it for yourself at BabelFish.

I found some Italian pages noting the issue with apparent amusement, but my Italian isn't good enough to read all the comments there... maybe they explain it?

Obviously "it's just a bug" is one possibility, probably the most likely, but if anyone can offer a plausible alternative explanation I'd love to hear it. (Could it be like those fake words in dictionaries apparently used to identify bootleggers?)
posted by No-sword to Writing & Language (4 answers total)
 
My Italian's not great, but the author of the page you linked to with "Italian pages" seems to think it's an easter egg, and I tend to agree. While it could be a bug, of course, I think it's unlikely given that, as the author of that page points out, the individual words "primo" and "piano" translate fine. It seems far more likely that someone responsible for the underlying database or translation code thought that was funny.
posted by cerebus19 at 6:51 AM on August 15, 2007


The comments point out that Google's translations do the same thing, so it's a problem with the Systran base, not Babelfish-specific.
They also address the problem of computer translators determining whether words are proper names or common nouns, and suggest that maybe "primo piano" is a proper name for some association in Bologna. This is refuted by one of the other commenters, who points out that if that were the case, one of the first search results for "primo piano" would turn up that association -- which it doesn't.
[Also, "primi piani" (first floors, plural) translates the same as "primo piano" -- Association of Bologna].
posted by katemonster at 9:18 AM on August 15, 2007


I'm inclined to suspect it's an artifact of the particular pair of parallel corpora they might have used to automatically generate the translation base. I don't know whether Systran uses parallel corpora, but it is fairly likely. Normally they would have large amounts of text in both English and Italian; by aligning the sentences and phrases, they would infer which stretches of the two texts are equivalent, and build a dictionary from that. But suppose there were a little glitch in the alignment, and further, that there were only one occurrence each of "primo piano" and "association of bologna" -- so maybe in Italian it said:


Associazione di Bologna
Primo Piano
132 via Tessitura
01125 Bologna

and supposing that in English the translator left out the 'first floor' part, giving the address as:

Association of Bologna
132 via Tessitura
01125 Bologna

Then the sentence aligner gets confused and, in the absence of any other evidence of "primo piano" or "associazione di bologna", makes its best guess; then the lexicon is generated with those two as equivalents; and poof, there goes your first floor.

That's the sort of thing that might account for it.
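
To make the guess concrete, here is a toy sketch in Python of how a glitch like that could end up in a lexicon. It is purely illustrative and not how Systran (or anyone) actually builds a dictionary: the function names align_blocks and build_lexicon are made up, the aligner just treats identical lines as anchors and lumps everything between them into one block, and the "lexicon" simply keeps the most frequent pairing for each source phrase.

```python
from collections import Counter, defaultdict


def align_blocks(src_lines, tgt_lines):
    """Hypothetical toy aligner: identical lines act as anchors.

    Lines that appear verbatim on both sides (street, postcode) are paired
    one-to-one. Anything left over between two anchors is lumped into a
    single many-to-many block, which is where the confusion creeps in.
    """
    pairs = []
    src_buf, tgt_buf = [], []
    j = 0  # position of the next unconsumed target line
    for s in src_lines:
        if s in tgt_lines[j:]:
            k = tgt_lines.index(s, j)
            tgt_buf.extend(tgt_lines[j:k])
            if src_buf or tgt_buf:
                pairs.append((src_buf, tgt_buf))
            pairs.append(([s], [s]))
            src_buf, tgt_buf = [], []
            j = k + 1
        else:
            src_buf.append(s)
    tgt_buf.extend(tgt_lines[j:])
    if src_buf or tgt_buf:
        pairs.append((src_buf, tgt_buf))
    return pairs


def build_lexicon(aligned_pairs):
    """Keep, for each source phrase, the target phrase it co-occurs with most."""
    counts = defaultdict(Counter)
    for src_block, tgt_block in aligned_pairs:
        for s in src_block:
            for t in tgt_block:
                counts[s.lower()][t.lower()] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


italian = ["Associazione di Bologna", "Primo Piano",
           "132 via Tessitura", "01125 Bologna"]
english = ["Association of Bologna",
           "132 via Tessitura", "01125 Bologna"]

# With this address as the only evidence, both unmatched Italian lines
# get paired with the single unmatched English line.
aligned = align_blocks(italian, english)
lexicon = build_lexicon(aligned)
print(lexicon["primo piano"])  # association of bologna

# A couple of clean examples elsewhere in the corpus would outvote the glitch.
more_evidence = aligned + [(["Primo Piano"], ["First Floor"])] * 2
better_lexicon = build_lexicon(more_evidence)
print(better_lexicon["primo piano"])  # first floor
```

The second half is the point of the "only one occurrence" condition: the bad pairing only survives if nothing else in the corpus contradicts it.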
posted by xueexueg at 10:37 AM on August 15, 2007


Thanks all!
posted by No-sword at 3:48 PM on August 15, 2007

