<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Change Cyrillic to Arabic, in Kazak</title>
	<link>http://ask.metafilter.com/60819/Change-Cyrillic-to-Arabic-in-Kazak/</link>
	<description>Comments on Ask MetaFilter post Change Cyrillic to Arabic, in Kazak</description>
	<pubDate>Wed, 18 Apr 2007 00:23:45 -0800</pubDate>
	<lastBuildDate>Wed, 18 Apr 2007 00:23:45 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Change Cyrillic to Arabic, in Kazak</title>
		<link>http://ask.metafilter.com/60819/Change-Cyrillic-to-Arabic-in-Kazak</link>	
		<description>I use Mellel for writing in Kazak on OS X. I have files in Cyrillic which I need to transliterate into the Kazak Arabic alphabet. It isn&apos;t a straight letter-for-letter deal, though. How do I correctly find and replace so that, for example, &#1199;&#1085; becomes &#1652;&#1735;&#1606;, not &#1735;&#1606;? (In other words, &quot;if the word has &#1241;, &#1257;, &#1199;, or &#1110;, but no &#1075;, &#1082;, or &#1077;, add &#1652; to the front of the word&quot;.) &lt;br /&gt;&lt;br /&gt; In Kazak Arabic, the letters &#1575;&#1548; &#1608;&#1548; &#1735;&#1548; &#1609; (a, o, u, i) can represent both the hard (&#1072;, &#1086;, &#1201;, &#1099;) and soft (&#1241;, &#1257;, &#1199;, &#1110;) vowels of the Cyrillic alphabet. If there is a &#1075;, &#1082;, &#1077; (g, k, e) it is understood that the vowels are soft since hard vowels don&apos;t appear with those 3 consonants (only in rare cases). This is why a straightforward letter-to-letter find and replace won&apos;t work!</description>
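<!-- Editorial sketch, not part of the original thread. A minimal Python illustration (Python rather than a Mellel find and replace) of why the letter-for-letter approach described in the question loses information. The vowel pairs are taken from the question and the single consonant entry from its example word; the names CYRILLIC_TO_ARABIC and naive_transliterate are hypothetical, and the table is deliberately incomplete.

```python
# Each Arabic vowel letter stands for both a hard and a soft Cyrillic vowel,
# as listed in the question. Only one consonant is included, for the example.
CYRILLIC_TO_ARABIC = {
    "\u0430": "\u0627", "\u04d9": "\u0627",  # hard a, soft a -> ARABIC ALEF
    "\u043e": "\u0648", "\u04e9": "\u0648",  # hard o, soft o -> ARABIC WAW
    "\u04b1": "\u06c7", "\u04af": "\u06c7",  # hard u, soft u -> ARABIC U
    "\u044b": "\u0649", "\u0456": "\u0649",  # hard i, soft i -> ARABIC ALEF MAKSURA
    "\u043d": "\u0646",                      # Cyrillic n -> ARABIC NOON
}


def naive_transliterate(word: str) -> str:
    """Replace each known letter one for one; leave unknown letters untouched."""
    return "".join(CYRILLIC_TO_ARABIC.get(ch, ch) for ch in word)


soft_word = "\u04af\u043d"  # the example word from the question (soft u + n)
hard_word = "\u04b1\u043d"  # same spelling with the hard u instead

# Both words collapse to the same Arabic letters, so the soft reading is lost.
same_output = naive_transliterate(soft_word) == naive_transliterate(hard_word)
```

Because the two spellings collapse to one output, something extra (the high hamza, U+0674) has to mark the soft reading whenever the word has no other hint.
-->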
		<guid isPermaLink="false">post:ask.metafilter.com,2007:site.60819</guid>
		<pubDate>Wed, 18 Apr 2007 00:11:01 -0800</pubDate>
		<dc:creator>steppe</dc:creator>
		
			<category>Kazak</category>
		
	</item> <item>
		<title>By: Blazecock Pileon</title>
		<link>http://ask.metafilter.com/60819/Change-Cyrillic-to-Arabic-in-Kazak#915955</link>	
		<description>Mellel supports &lt;a href=&quot;http://www.regular-expressions.info/tutorial.html&quot;&gt;regular expressions&lt;/a&gt;, and if it supports Unicode regex, then you could look into this &lt;a href=&quot;http://www.regular-expressions.info/unicode.html&quot;&gt;Unicode regular expressions overview&lt;/a&gt; and &lt;a href=&quot;http://www.regular-expressions.info/conditional.html&quot;&gt;regex conditional overview&lt;/a&gt; to find out how to test for a particular combination of characters and replace it with the characters you want.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.60819-915955</guid>
		<pubDate>Wed, 18 Apr 2007 00:23:45 -0800</pubDate>
		<dc:creator>Blazecock Pileon</dc:creator>
	</item><item>
		<title>By: Aidan Kehoe</title>
		<link>http://ask.metafilter.com/60819/Change-Cyrillic-to-Arabic-in-Kazak#915991</link>	
		<description>&lt;blockquote&gt;&lt;i&gt;If there is a &#1075;, &#1082;, &#1077; (g, k, e) it is understood that the vowels are soft since hard vowels don&apos;t appear with those 3 consonants (only in rare cases).&lt;/i&gt;&lt;/blockquote&gt; Do you mean, &lt;i&gt;if the word starts with &#1241;, &#1257;, &#1199;, or &#1110; and the second letter is not one of &#1075;, &#1082;, &#1077;, add &#1652; to the start of the word?&lt;/i&gt; If so, you can run the following Perl program on your text file: &lt;blockquote&gt;&lt;code&gt;perl -CSD -e &apos;while (&amp;lt;&amp;gt;) { s/\b([\x{04d9}\x{04e9}\x{04af}\x{0456}][^\x{0433}\x{043a}\x{0435}])/\x{0674}$1/g; print; }&apos; &amp;lt; &lt;i&gt;original-file-name&lt;/i&gt; &amp;gt; &lt;i&gt;modified-file-name&lt;/i&gt;&lt;/code&gt;&lt;/blockquote&gt; which will add the hamza where appropriate, and you can then do the global replace. The -CSD switch makes Perl read and write the text as UTF-8. Note that any line breaks were added by MetaFilter or your browser, so you shouldn&apos;t type them in Terminal, and you&apos;ll need to change &lt;i&gt;original-file-name&lt;/i&gt; and &lt;i&gt;modified-file-name&lt;/i&gt; to match the files on your system.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.60819-915991</guid>
		<pubDate>Wed, 18 Apr 2007 02:40:44 -0800</pubDate>
		<dc:creator>Aidan Kehoe</dc:creator>
	</item><item>
		<title>By: steppe</title>
		<link>http://ask.metafilter.com/60819/Change-Cyrillic-to-Arabic-in-Kazak#932541</link>	
		<description>Thanks for your feedback. The complexity of the answers, at least to my eye, is what led me to post here!&lt;br&gt;
&lt;br&gt;
One clarification: IF any of those &quot;soft&quot; vowels (&#1241;, &#1257;, &#1199;, &#1110;) appears anywhere in the Cyrillic word, even once, and none of the three letters (&#1075;, &#1082;, &#1077;) appears anywhere in the word, THEN the Arabic word needs exactly one hamza at the very front (not over or near every vowel in the word, only once, at the very front). If the word contains even one of those three letters, that alone tells the reader that the vowels are soft (&#1241;, &#1257;, &#1199;, &#1110;), not hard (&#1072;, &#1086;, &#1201;, &#1099;), so no hamza is needed. Otherwise the hamza tells the reader the vowels are soft.&lt;br&gt;
&lt;br&gt;
What would that perl program look like now?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.60819-932541</guid>
		<pubDate>Fri, 04 May 2007 10:20:45 -0800</pubDate>
		<dc:creator>steppe</dc:creator>
	</item><item>
		<title>By: Aidan Kehoe</title>
		<link>http://ask.metafilter.com/60819/Change-Cyrillic-to-Arabic-in-Kazak#1092256</link>	
		<description>&lt;blockquote&gt;&lt;i&gt;What would that perl program look like now?&lt;/i&gt;&lt;/blockquote&gt; Like this: &lt;blockquote&gt;&lt;code&gt;perl -CSD -e &apos;while (&amp;lt;&amp;gt;) { s/\b(?=\w*[\x{04d9}\x{04e9}\x{04af}\x{0456}])(?!\w*[\x{0433}\x{043a}\x{0435}])(\w+)/\x{0674}$1/g; print; }&apos; &amp;lt; &lt;i&gt;original-file-name&lt;/i&gt; &amp;gt; &lt;i&gt;modified-file-name&lt;/i&gt;&lt;/code&gt;&lt;/blockquote&gt; The two lookaheads check the whole word: it must contain one of &#1241;, &#1257;, &#1199;, &#1110; and none of &#1075;, &#1082;, &#1077; before a single &#1652; is added at the front. Not tested, sorry.</description>
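<!-- Editorial sketch, not part of the original thread. The whole-word rule from the follow-up question (one hamza at the front of any word that contains a soft vowel and none of the three soft-marking letters) can also be written out step by step. This is a Python sketch rather than the thread's Perl one-liner; the names SOFT_VOWELS, SOFT_MARKERS and add_hamza are hypothetical, and lowercasing each word before the check is an added assumption, since the thread does not mention capital letters.

```python
import re

SOFT_VOWELS = "\u04d9\u04e9\u04af\u0456"   # the four soft Cyrillic vowels
SOFT_MARKERS = "\u0433\u043a\u0435"        # Cyrillic g, k, e
HIGH_HAMZA = "\u0674"                      # ARABIC LETTER HIGH HAMZA

# In Python 3, \w matches Unicode letters by default for str patterns.
WORD = re.compile(r"\w+")


def add_hamza(text: str) -> str:
    """Prefix one high hamza to each word that needs the soft-vowel mark."""

    def mark(match: re.Match) -> str:
        word = match.group(0)
        lowered = word.lower()  # assumption: capitals follow the same rule
        has_soft = any(ch in SOFT_VOWELS for ch in lowered)
        has_marker = any(ch in SOFT_MARKERS for ch in lowered)
        if has_soft and not has_marker:
            return HIGH_HAMZA + word
        return word

    return WORD.sub(mark, text)
```

Running add_hamza over the Cyrillic text first, and only then doing the letter-for-letter replacement, keeps the hamza decision separate from the transliteration table.
-->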
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.60819-1092256</guid>
		<pubDate>Tue, 09 Oct 2007 07:36:24 -0800</pubDate>
		<dc:creator>Aidan Kehoe</dc:creator>
	</item>
	</channel>
</rss>
