<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Japanese PHP/Mysql development tips?</title>
	<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips/</link>
	<description>Comments on Ask MetaFilter post Japanese PHP/Mysql development tips?</description>
	<pubDate>Mon, 28 Aug 2006 06:46:05 -0800</pubDate>
	<lastBuildDate>Mon, 28 Aug 2006 06:46:05 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Japanese PHP/Mysql development tips?</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips</link>	
		<description>I&apos;m about to create a PHP/MySQL CMS and web site for a client entirely in Japanese. What do I need to know? &lt;br /&gt;&lt;br /&gt; The site is essentially a Japanese version of an existing ecommerce site with the shopping removed (it&apos;s a product catalog without any ecom). Due to circumstances beyond my control, I can&apos;t reuse the same code the ecom site uses. &lt;br&gt;
&lt;br&gt;
I do not speak Japanese, but the client will be providing a doc with English and Japanese for everything on the site, and will be entering the product info themselves using the CMS. &lt;br&gt;
&lt;br&gt;
I usually roll my own CMS for sites of this size, but I&apos;m considering some templating systems or maybe one of those systems that enforces MVC that the kids are so crazy about nowadays (Symfony looks interesting).  I don&apos;t know that that makes a difference with my question, but maybe there&apos;s something I&apos;m not considering.&lt;br&gt;
&lt;br&gt;
I&apos;ve been doing some reading, and I&apos;m pretty overwhelmed by all the character set discussion. I hadn&apos;t really expected there to be more than half a dozen options. Performance is not really a concern since this site is going to be small and low-traffic; the primary concerns are ease of development and that it works consistently. &lt;br&gt;
&lt;br&gt;
So, long preamble done, my questions:&lt;br&gt;
1. From what I&apos;ve found so far, it sounds like UTF-8 is the character set I should go for.  Is this correct? Should I look into other encodings? &lt;br&gt;
&lt;br&gt;
2. MySQL. According to the docs, if I&apos;m using MySQL 4.1 or greater, I can simply set a field to UTF-8 encoding like so: ALTER TABLE myTable MODIFY myColumn VARCHAR(255) CHARACTER SET utf8;. Is there anything else I need to do on the MySQL end?&lt;br&gt;
&lt;br&gt;
3. PHP &amp;amp; HTML. I&apos;m less clear on how to get the data from a form field into UTF-8 and send it off to MySQL. On a whim, I did a test and noticed that by default IE and Firefox each submit the data in a different encoding (the data from FF ended up in the database looking like this -- &amp;amp; #12506; &amp;amp; #12540; (w/o spaces), and IE&apos;s looked like this --  &#227;&#402;&#169;). Presumably I need to set the headers? Is there something I need to put in the FORM tag (does it need to be multipart?)? When dealing with the submitted data, can I safely just grab the $_REQUEST value and send it off to the database, or is there some transformation I need to do? Similarly, is there anything I need to do with data I have retrieved from the database before displaying it?&lt;br&gt;
&lt;br&gt;
Thanks in advance for any advice.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2006:site.45376</guid>
		<pubDate>Mon, 28 Aug 2006 06:27:52 -0800</pubDate>
		<dc:creator>malphigian</dc:creator>
		
			<category>php</category>
		
			<category>mysql</category>
		
			<category>japanese</category>
		
			<category>translation</category>
		
	</item> <item>
		<title>By: scottreynen</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#693912</link>	
		<description>1) Yes, whenever possible, use UTF-8.&lt;br&gt;
2) While it&apos;s best to set the fields to UTF-8, it&apos;s not particularly important because UTF-8 can be temporarily stored in any ASCII-compatible character set (e.g. the MySQL default) without data loss. As long as you treat it like UTF-8 on the presentation end, how you store it just needs to be ASCII-compatible.&lt;br&gt;
3) Setting your HTML charset in a HEAD META tag to UTF-8 will ensure it&apos;s sent from browser to server as UTF-8. You don&apos;t need anything special in form elements. After it gets to the server, PHP treats everything as ASCII, which, as I said above, is a safe way to handle (though not display or transform) UTF-8.&lt;br&gt;
4) If you&apos;re doing any manipulation on Japanese text in UTF-8, you may find &lt;a href=&quot;http://www.randomchaos.com/documents/?source=php_and_unicode&quot;&gt;an article I wrote on converting UTF-8 to arrays of Unicode code points and back&lt;/a&gt; useful.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-693912</guid>
		<pubDate>Mon, 28 Aug 2006 06:46:05 -0800</pubDate>
		<dc:creator>scottreynen</dc:creator>
	</item><item>
		<title>By: sbutler</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#693913</link>	
		<description>As to (3), the browser should submit the form with the same encoding that was used for the page. Are you sending your pages with UTF-8 encoding?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-693913</guid>
		<pubDate>Mon, 28 Aug 2006 06:46:30 -0800</pubDate>
		<dc:creator>sbutler</dc:creator>
	</item><item>
		<title>By: cillit bang</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#693932</link>	
		<description>It&apos;s best to have PHP just treat UTF-8 as ASCII, and not try to use any built-in Unicode support (which it doesn&apos;t really have). You might want to extend this policy to MySQL as well. Read &lt;a href=&quot;http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF&quot;&gt;this article&lt;/a&gt; and keep tabs on which encoding is which yourself.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-693932</guid>
		<pubDate>Mon, 28 Aug 2006 07:05:24 -0800</pubDate>
		<dc:creator>cillit bang</dc:creator>
	</item><item>
		<title>By: Bugbread</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#693938</link>	
		<description>Just joining the chorus:&lt;br&gt;
&lt;br&gt;
If your page is rendered in UTF-8, info goes in and comes out just fine.&lt;br&gt;
&lt;br&gt;
If you don&apos;t set the charset correctly in the MySQL table, the information will &lt;i&gt;still&lt;/i&gt; go in and come out just fine, &lt;i&gt;but&lt;/i&gt; if you look at the data in the table directly (as opposed to via the browser), it will be garbled. From the end-user point of view, everything will be working fine, but from the point of view of someone poking around the MySQL tables, it&apos;ll be unintelligible.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-693938</guid>
		<pubDate>Mon, 28 Aug 2006 07:10:39 -0800</pubDate>
		<dc:creator>Bugbread</dc:creator>
	</item><item>
		<title>By: adamrice</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#694203</link>	
		<description>I agree with what has been said before.&lt;br&gt;
&lt;br&gt;
Do you plan on implementing search? Japanese does not separate words with spaces (moreover, there&apos;s a &quot;Japanese space&quot; that occupies a different code point than the ASCII space), and conventional search algorithms that work in word-chunks will not work.&lt;br&gt;
&lt;br&gt;
The easy solution is to find the search string anywhere in the target string. The fancy solution is to hook into something like &lt;a href=&quot;namazu&quot;&gt;namazu&lt;/a&gt;, which detects word boundaries for you.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-694203</guid>
		<pubDate>Mon, 28 Aug 2006 10:46:14 -0800</pubDate>
		<dc:creator>adamrice</dc:creator>
	</item><item>
		<title>By: Bugbread</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#694363</link>	
		<description>Adamrice: Your link for namazu is borked.  Perhaps you meant www.namazu.org?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-694363</guid>
		<pubDate>Mon, 28 Aug 2006 12:34:15 -0800</pubDate>
		<dc:creator>Bugbread</dc:creator>
	</item><item>
		<title>By: adamrice</title>
		<link>http://ask.metafilter.com/45376/Japanese-PHPMysql-development-tips#694421</link>	
		<description>urp, yeah, thanks for catching that.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.45376-694421</guid>
		<pubDate>Mon, 28 Aug 2006 13:18:05 -0800</pubDate>
		<dc:creator>adamrice</dc:creator>
	</item>
	</channel>
</rss>
