How many Chinese characters are shared between Mandarin and Japanese?
February 21, 2011 7:23 AM

How many Chinese characters are shared 1:1 between Mandarin and Japanese?

I've been studying the kanji for quite a few months now, and having a great time of it. But what I noticed was that many of the Kanji were retained as they were in Chinese, while some were simplified (were any invented in Japan?), and this got me thinking: just how many kanji are present that exist 1:1 in Mandarin and Japanese?

For example:
转 - present in Chinese, not in Japanese
车/車 - simplified in Chinese, not in Japanese
冰/氷 - simplified in Japanese, not in Chinese
团/団 - simplified differently in either language


木 - present exactly the same in both Chinese and Japanese

Has anyone composed a study of exactly how many characters are shared like in the final example?
posted by Senza Volto to Writing & Language (12 answers total) 4 users marked this as a favorite
I don't have an answer, but it also depends which "Chinese" you are looking at. Mainland China uses simplified characters; Taiwan and Hong Kong do not. The characters used in Mainland China were officially simplified in the 1950s and 1960s to what they are today. The kanji come from the unsimplified characters, which are still in use outside of Mainland China. Also, some characters were not simplified and are still used in their original form (in Mainland China).
posted by bearette at 7:27 AM on February 21, 2011

"were any invented in Japan?" Yes, some were. Here is one, which is read as "tochi": 杤. It is different from the tochi in Tochigi (栃木), which I think (?) is used in Chinese, albeit very rarely.

There are famous examples of other Japanese inventions, which do not come to mind right now, but I am sure someone else will come along and add them.

I do not at all mean this in a snarky way, but I bet you would find interesting the Wikipedia articles on the simplified characters and on kanji, as well as the citations in those articles.

The WWWJDIC is another good place to learn about characters. 
posted by vincele at 7:52 AM on February 21, 2011

First, Mandarin refers to the official spoken dialect, so it doesn't really make sense when talking about written characters. As bearette said, Simplified Chinese is only used by the mainland and was instituted after the Civil War, so naturally kanji wouldn't have those simplifications. There are really only a few hundred simplified characters, though there are also some simplified radicals, which generalize to more characters.

There's also the simplification done to kanji in Japan post World War II, but I'm not as familiar with that. It seems there also were some kanji invented in Japan.

Keep in mind that kanji and their corresponding Chinese characters can also diverge in meaning.

As for the actual answer to your question, I'm not finding any concrete answers but I would look at Wikipedia and the sci.lang.japan FAQ for further guidance.
posted by kmz at 7:56 AM on February 21, 2011

Leaving aside the question of simplified characters versus non-simplified, there is a set of characters that is definitively Japanese, made in Japan, and not shared by any other kanji-based language: the kokuji (国字).

According to the Japanese description on Wikipedia, there are generally two types of kokuji: those made in Japan in years long past, and those created relatively recently, to respond to the influx of new ideas and products from modern Western cultures.

Some of the first category are:


And some of the second category are:

膵(スイ)・腺(セン)・腟 (チツ、本来はシツ)・瓩(キログラム)・鞄(かばん)

I'm sure that this list is not definitive; there are probably more resources on the Internet.

This doesn't answer your question on 1:1 correspondences, of course, but it does give you a set of characters that can absolutely be eliminated from your search.
posted by Gordion Knott at 7:57 AM on February 21, 2011

There are a few hundred 国字, kanji invented in Japan. Most are very obscure.

I wouldn't get too worked up over different character simplifications between Japan and the PRC. The same characters still exist in their pre-simplified form. And there are so many variants of characters (e.g. 高 and 髙), even without taking simplifications into account, that it would be hard to get an accurate count of how many distinct characters are used in either language.
posted by adamrice at 7:57 AM on February 21, 2011

Response by poster: I see, thank you all for your answers enlightening me about the kokuji.

By the way, I'm sorry I mentioned Mandarin, I meant to say Simplified Mandarin, as written in Mainland China.

I just find it strange that no one has done this before. It came to my mind when I realised that there are so many characters I've learned that I can use to decipher Chinese sentences too (not read them obviously).
Clearly, for the foreign learner, studying the characters in one language serves as a stepping stone to learning the other language and a lot of people have done this (learning Japanese after Chinese, or Chinese after Japanese).
That is why I find it surprising that I can't seem to find any ready statistics about the number of directly shared characters between the two languages.
posted by Senza Volto at 9:07 AM on February 21, 2011

This does not at all answer your question but I always thought it pretty cool that the board game go is written:

圍棋 in traditional Chinese;
围棋 in simplified Chinese;
囲碁 in Japanese.

So that's three different ways to write the first character and two ways to write the second. Neither character is that obscure.
posted by pewpew at 9:38 AM on February 21, 2011

I don't think it's possible to get your question answered, for the simple reason that it's impossible to count the total number of Chinese characters that exist.

You can count the number of characters that China has simplified and Japan has not (车/車); that Japan has simplified and China has not (冰/氷); and that both China and Japan have simplified, whether identically or differently (湾/湾, 团/団), because these are all officially defined sets: both mainland China and Japan have a finite number of characters that they have officially simplified, so you can count those and the overlap between them.

But the remainder (characters shared between both, such as 北), which is what you are looking for, is essentially all remaining characters in use in Chinese and Japanese, which is not something that we can count. Are characters only used in Japanese placenames "in use" in Japanese? What about a Chinese character only found in Buddhist sutras? If you limit your question to, say, the jōyō kanji of Japanese, then you could certainly get a statistic. But if not, then it's essentially impossible to answer your question because it requires answering the question of "how many Chinese characters exist."

(Not to say we couldn't get a useful estimate, but it still requires arbitrarily estimating how many Chinese characters are in use in Chinese and in Japanese.)
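The categories described above can be sketched as a small classifier. This is only an illustration, assuming you already have a table of (traditional form, mainland form, Japanese form) triples; the function name `classify` and the tiny `EXAMPLES` table are hypothetical, and the traditional forms 灣 and 團 are supplied here because the thread only shows the simplified pairs. The 冰/氷 row follows the thread in treating 冰 as the unsimplified form.

```python
from collections import Counter


def classify(traditional, chinese, japanese):
    """Place one character into one of the five categories discussed above."""
    if chinese == traditional and japanese == traditional:
        return "shared, unsimplified"
    if japanese == traditional:
        return "simplified in China only"
    if chinese == traditional:
        return "simplified in Japan only"
    if chinese == japanese:
        return "simplified identically"
    return "simplified differently"


# Hypothetical sample table: (traditional, mainland China, Japan).
# A real count would need a complete, officially sourced table.
EXAMPLES = [
    ("車", "车", "車"),
    ("冰", "冰", "氷"),
    ("灣", "湾", "湾"),
    ("團", "团", "団"),
    ("北", "北", "北"),
]

if __name__ == "__main__":
    tally = Counter(classify(*row) for row in EXAMPLES)
    for category, count in tally.items():
        print(category, count)
```

With complete official tables in place of the five sample rows, the tally would give the finite counts described above; the open-ended remainder is still only as large as the tables you choose to feed in.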
posted by andrewesque at 1:02 PM on February 21, 2011

I just wanted to point out that 转, or 轉, is simplified in Japanese as 転. In all the countries that use characters, the sets used today are the result of a long, sometimes convoluted, process of selection, deletion, and adaptation. I think this question would be a good one for a reference librarian or, if you are affiliated with an institution that has one, a librarian who specializes in East Asian studies.
posted by mustard seeds at 2:57 PM on February 21, 2011

Also note that even some characters that have not been "officially" simplified in either country and are now assigned the same Unicode codepoint are, nevertheless, arguably different in that if they are not displayed in the correct language's font, they look wrong. (The details tend to be minor, like stroke angles and so on, but are still significant to native speakers/readers.) Google "Han unification" for more on this; it's a big (the main?) reason for resistance to Unicode in East Asia.

So, yes, I agree that you're going to have to define your terms for this question to make sense. One way to automatically find an answer would be to choose a Japan-specific text encoding J and a China-specific text encoding C and then count the cases where a character in J and a character in C map to the same Unicode codepoint, using your encoding-conversion tool of choice. This completely ignores the issue I described in my first paragraph, and is limited by the specifics of your chosen character sets and encoding tool, but it's an experiment you can do without being a Han specialist.
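A minimal sketch of that experiment in Python, assuming Shift_JIS as J and GB2312 as C (both choices are assumptions, not the only reasonable ones) and using Python's built-in codecs as the conversion tool. The helper name `han_chars` is made up for this example, and only the main CJK Unified Ideographs block (U+4E00 to U+9FFF) is counted as a "character".

```python
def han_chars(encoding):
    """Collect every CJK ideograph reachable as a two-byte sequence in `encoding`."""
    found = set()
    for lead in range(0x81, 0xFF):
        for trail in range(0x40, 0xFF):
            try:
                text = bytes([lead, trail]).decode(encoding)
            except UnicodeDecodeError:
                continue  # not a valid two-byte sequence in this encoding
            # Keep only single characters in the main CJK ideograph block;
            # this drops kana, punctuation, and pairs of single-byte characters.
            if len(text) == 1 and 0x4E00 <= ord(text) <= 0x9FFF:
                found.add(text)
    return found


def shared_chars(japanese_encoding="shift_jis", chinese_encoding="gb2312"):
    """Characters that both encodings map to the same Unicode codepoint."""
    return han_chars(japanese_encoding) & han_chars(chinese_encoding)


if __name__ == "__main__":
    j = han_chars("shift_jis")
    c = han_chars("gb2312")
    both = j & c
    print("J:", len(j), "C:", len(c), "shared:", len(both))
    print("木 shared?", "木" in both, "| 车 shared?", "车" in both)
```

The totals depend entirely on which encodings you pick, and, as noted, Han unification means a shared codepoint is not a guarantee that the glyphs look the same in both countries.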
posted by No-sword at 3:52 PM on February 21, 2011

Best answer: Oh, or, to up the meaningfulness quotient slightly, you could use subsets of C and J corresponding to whatever "official" lists you want to choose. I suppose for Japan this would be the Joyo list, and no doubt the Chinese government maintains something similar. This will give you even less coverage of the cobwebbed corners of the issue (proper nouns and so on), but, in exchange, your results will be more closely linked to everyday usage.
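As a rough sketch of the official-list version, assuming you have obtained the two lists yourself (they are not bundled here): the function name `compare_lists` is hypothetical, and the two short sample strings contain only characters mentioned in this thread, standing in for the real lists.

```python
def compare_lists(known, target):
    """Split `target` into characters already in `known` and characters that are new."""
    known_set = set(known)
    target_set = set(target)
    return {
        "shared": target_set & known_set,
        "new": target_set - known_set,
    }


if __name__ == "__main__":
    # Stand-in samples only; substitute the full Joyo list and whichever
    # mainland Chinese list you decide to treat as "official".
    japanese_sample = "木北車氷団転湾"
    chinese_sample = "木北车冰团转湾"
    result = compare_lists(japanese_sample, chinese_sample)
    print("shared:", "".join(sorted(result["shared"])))
    print("new:", "".join(sorted(result["new"])))
```

Note that a plain set difference treats 车 as entirely new to someone who knows 車, so it probably overstates the real learning burden; grouping variants together would need a mapping table on top of this.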
posted by No-sword at 3:59 PM on February 21, 2011

Response by poster: Sorry for the late response, and thank you all for your responses.

I should have mentioned that this question was intended to find out how many new characters would have to be learned should I decide to study Mandarin later. Therefore, the "sets" of characters on either language would be an official list, as No-sword pointed out.

I think I might have a go at figuring this out sometime.
posted by Senza Volto at 5:22 AM on February 23, 2011
