Strange language in website
October 7, 2008 1:16 PM

What language is the text on this page? If it is an anagram (and it probably is, judging from the URL), what language is it derived from? Strange language
posted by keijo to Writing & Language (9 answers total) 2 users marked this as a favorite
Fairly certain that image is from the Voynich Manuscript, which is now widely held to be a sophisticated forgery and not in any particular language. Much more here. Their conclusion is "The Voynich manuscript is a relatively recent invention, containing a phoney text, forged with a numerical filling system."
posted by jessamyn at 1:24 PM on October 7, 2008

I think it's anagrammatic English nonsense.

Poking around on the same site, you can find this:

which is a 9 MB file of all the phrases in the page you linked to, with their anagrams, which in turn are nonsense.
posted by i_am_joe's_spleen at 1:25 PM on October 7, 2008

Also did you see this page in that directory?
posted by jessamyn at 1:26 PM on October 7, 2008

All starting to make sense. Thank you.
posted by keijo at 1:27 PM on October 7, 2008

It appears to be an agglomeration of pseudolinguistic elements that pivot from the query. That it drops in references to the Voynich manuscript might be taken as a clue to what it means.
posted by holgate at 1:38 PM on October 7, 2008

which is a 9 MB file of all the phrases in the page you linked to, with their anagrams, which in turn are nonsense.

Actually it's a record of all the short phrase "translations" performed over some period of time by people trying to figure it out. If you wait for it to load, your recent phrase will be at/near the bottom.

All starting to make sense.

posted by beagle at 1:39 PM on October 7, 2008

I think it's nonsense too, but not because of the text in PARATRANSLATE.html, which, as beagle points out, is just what people have typed into the "translate short phrase" box and what comes back (which are anagrams). I think PARATRANSLATE.html only contains phrases from these documents because that's what people are cutting and pasting into the translate box.

The pages seem to be generated on the fly by the PARALINGUA.cgi script. Here are some clues:
- Searching in the "search" box always returns a page with your term in the title. Likewise, the article title in the wiki-like links can be modified by hand, and an article comes up for any title.

Search for "Sarah Palin is not very smart"
Article title in URL modified by hand
posted by DevilsAdvocate at 3:26 PM on October 7, 2008

I don't think I deserve Best Answer here, although I'll take Best Leap To Conclusions.
posted by i_am_joe's_spleen at 3:28 PM on October 7, 2008

Some more findings from playing around:

Searching for a string gives a different page than setting the article title to that string.

Appending one or more instances of the letter "a" to your search term or article title returns the same article, with the exception that the search/title term is also modified where it appears in the article.

Searching repeatedly for the same term gives the same page except for the page title. (Not the headline within the article, but the actual HTML title, which appears in the title bar of the browser.)

Strangely, three or more consecutive occurrences of "a" in your search term or article title are reduced when they appear in the article, with every full three a's becoming two. E.g., baaacd becomes baacd (3 becomes 2), raaaat becomes raaat (4 becomes 3), daaaaaaaaa becomes daaaaaa (9 becomes 6). In other words, consecutive a's are reduced to 2/3 the original number of a's, rounded up. However, terms that appear the same in the article do not generate the same page: searching on baaacd generates a different page than searching on baacd, even though they appear the same in the page text.

Consecutive e's or o's are reduced in the same way as consecutive a's. (Although only the same letter repeated, not mixed letters.) Consecutive i's are reduced to 1/3 their original number, rounded up. Consecutive u's are reduced to a single u if the number of consecutive u's is not divisible by 3, and eliminated entirely if it is. Consecutive y's are reduced to 1/2 their original number, rounded up. Repetitions of other letters do not seem to be affected.
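
The reduction rules described above can be written down compactly. What follows is only a sketch of the observed behaviour, not the site's actual code: the names reduced_length and reduce_runs are made up for illustration, and it assumes lowercase input and that each run of one repeated vowel is handled independently of its neighbours.

```python
import re


def reduced_length(letter: str, n: int) -> int:
    """Length of a run of n identical letters after the observed reduction."""
    if letter in "aeo":
        return (2 * n + 2) // 3      # 2/3 of the run, rounded up
    if letter == "i":
        return (n + 2) // 3          # 1/3 of the run, rounded up
    if letter == "u":
        return 0 if n % 3 == 0 else 1  # dropped if divisible by 3, else one u
    if letter == "y":
        return (n + 1) // 2          # 1/2 of the run, rounded up
    return n                         # other letters appear unaffected


def reduce_runs(term: str) -> str:
    """Apply the reduction to every run of a single repeated vowel or y."""
    def shrink(match: re.Match) -> str:
        run = match.group(0)
        return run[0] * reduced_length(run[0], len(run))

    # ([aeiouy])\1* matches one vowel followed by repeats of that same vowel,
    # so mixed runs such as "ae" are treated as two separate runs of length 1.
    return re.sub(r"([aeiouy])\1*", shrink, term)


print(reduce_runs("baaacd"))      # baacd
print(reduce_runs("raaaat"))      # raaat
print(reduce_runs("daaaaaaaaa"))  # daaaaaa
```

The three printed examples are the ones quoted above; the rules for i, u and y are taken from the description rather than from specific examples, so treat those branches as less certain.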

The HTML source does not appear to give any clues except for one thing, an HTML comment that looks like this for searches:

<!-- SEARCH=g article=G SEARCH1=G USEED=6 -->

and this for when you specify the "article" in the URL:

<!-- ARTICLE=g 6 G -->

(In both of the above cases the term was "g".) The number where "6" appears above changes depending on the search term. When searching a single letter, the number appears to range from 0 (for a) to 25 (for z). For longer words, the number is apparently generated by taking the values for each letter and multiplying by successive powers of 4, increasing towards the right. E.g., for "cats" the number 1458 appears, and 1458 = 2 + 0*4 + 19*4^2 + 18*4^3.
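
As a check on the arithmetic, here is a minimal sketch of that calculation. It is a reconstruction from the observed numbers only, not the script's real code: the name page_number is invented, and it assumes that case does not matter and that ignored non-letter characters do not use up a power of 4.

```python
def page_number(term: str) -> int:
    """Sum of letter values (a=0 .. z=25) times successive powers of 4."""
    total = 0
    position = 0  # exponent of 4 for the next letter
    for ch in term.lower():
        if "a" <= ch <= "z":  # assumption: anything else is skipped entirely
            total += (ord(ch) - ord("a")) * 4 ** position
            position += 1
    return total


print(page_number("g"))     # 6
print(page_number("cats"))  # 1458
```

Python integers do not overflow, so for very long terms this returns an exact integer where the site would show a floating-point value.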

You can even see when this algorithm goes beyond the integer range on my second linked page in my previous comment: "supercalifragilisticexpialidocious" produces, literally, "1.77257608893777e+21".

Non-letter characters appear to be ignored.

I suspect the number thus generated is used as a seed for a pseudo-random number generator used to produce the page: two words that generate the same number ("blob" and "nice" both result in 333) return the same page, except for where the search/title term appears within the page. (This also explains why appending a's does not change the page except for the term itself, as appending a's does not alter the number generated by this algorithm).
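
To illustrate the seeding idea (and only the idea), here is a small sketch. Everything in it is assumed: page_number is my reconstruction of the number, fake_page and the syllable list are invented, and Python's random module stands in for whatever generator the script really uses.

```python
import random


def page_number(term: str) -> int:
    """Reconstructed number: letter values a=0..z=25 times powers of 4."""
    letters = [c for c in term.lower() if "a" <= c <= "z"]
    return sum((ord(c) - ord("a")) * 4 ** i for i, c in enumerate(letters))


def fake_page(term: str, words: int = 8) -> str:
    """Hypothetical page: the body depends only on the seed, not the term."""
    rng = random.Random(page_number(term))
    syllables = ["ka", "lo", "mi", "ren", "tu", "vas", "el", "on"]
    body = " ".join(
        "".join(rng.choice(syllables) for _ in range(rng.randint(1, 3)))
        for _ in range(words)
    )
    return f"{term}: {body}"


print(page_number("blob"), page_number("nice"))  # 333 333
print(fake_page("blob"))
print(fake_page("nice"))     # same body as "blob", different term
print(fake_page("blobaaa"))  # trailing a's add 0, so the seed is unchanged
```

Two terms with the same number seed the generator identically, so the generated text matches and only the echoed term differs, which is the behaviour seen on the site.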

I'd be curious to see the algorithm which generates these pages--clearly the words aren't just some random assortment of letters, as vowels and consonants are mixed in the right way to appear like actual words. Perhaps some kind of Markov chain at the letter level? (If so, is it based on an actual language? Which one?)
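
For anyone unfamiliar with the idea, this is roughly what a letter-level Markov chain looks like. It is a guess at the general technique, not the site's method: the function names, the order of 2, and the tiny English word list are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Assumed training sample; the real source language (if any) is unknown.
SAMPLE_WORDS = ["manuscript", "language", "anagram", "nonsense",
                "translate", "article", "sentence", "generate"]


def train(words, order=2):
    """Map each run of `order` letters to the letters seen following it."""
    model = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"  # ^ marks the start, $ the end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model


def generate(model, rng, order=2, max_len=12):
    """Walk the chain from the start marker until $ or max_len letters."""
    context = "^" * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = context[1:] + nxt
    return "".join(out)


model = train(SAMPLE_WORDS)
rng = random.Random(333)  # seeded, so the output is repeatable
print([generate(model, rng) for _ in range(5)])
```

Because each letter is chosen from those that actually followed the previous two letters in the sample, the output keeps a word-like mix of vowels and consonants even though the words themselves are meaningless.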
posted by DevilsAdvocate at 4:35 PM on October 7, 2008 [2 favorites]
