Unicode replacement characters
September 1, 2008 5:52 AM

Is there a name for the domino-like box characters with 4 characters inside that display when you don't have support for a language script?

There are a few examples on this page that show up for me. Bengal, Bhutan and Khmer, for example, show up as boxes with 4 characters inside. The first two characters are often the same for all the boxes within each language example.
In case it is system dependent - Windows XP, Firefox 3.0.1, English(US).
posted by srboisvert to Computers & Internet (7 answers total) 4 users marked this as a favorite
 
The Unicode spec seems to refer to these instances as "interpretable but unreadable characters." Excerpt:
An implementation may receive a code point that is assigned to a character in the Unicode character encoding, but be unable to render it because it lacks a font for the code point or is otherwise incapable of rendering it appropriately.
The spec leaves it up to the software to determine how it wants to render that.
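As a rough illustration of what "leaving it up to the software" can look like, here is a minimal Python sketch. It assumes the four characters inside each box are the hexadecimal digits of the code point, which is how some browsers draw their fallback boxes; nothing in the spec excerpt above requires that. The names `hex_box_label`, `render`, and the `supported` set are hypothetical stand-ins for a real font lookup.

```python
def hex_box_label(ch):
    """Return the 4+ hex digits a fallback box might show for one character."""
    return f"{ord(ch):04X}"


def render(text, supported):
    """Keep characters the (hypothetical) font covers; box the rest as [XXXX]."""
    return "".join(
        ch if ch in supported else f"[{hex_box_label(ch)}]"
        for ch in text
    )


# Latin "A" followed by two Bengali letters, with a font that only covers "A".
sample = "A\u0995\u0996"
print(render(sample, supported={"A"}))  # A[0995][0996]
```

Under that assumption, the shared leading characters in the question make sense: Unicode assigns each script a contiguous block of code points (Bengali starts at U+0980, Khmer at U+1780), so neighbouring letters of one script usually begin with the same hex digits.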
posted by letourneau at 9:26 AM on September 1, 2008


Argh, excuse me, that should be "interpretable but unrenderable characters" in the first sentence of my previous comment.
posted by letourneau at 9:27 AM on September 1, 2008


Here's a document from Microsoft regarding OpenType fonts that refers to the "I don't know how to render this character" glyph as the ".notdef glyph" (which term apparently dates back to the TrueType specs). OK, I'll stop stacking up comments now.
posted by letourneau at 9:32 AM on September 1, 2008


Yeah, I looked on the Unicode pages and didn't see anything like this, so this appears to be an MS-specific thing. Your samples show different glyphs on my Mac, no "dominos".
posted by troy at 9:47 AM on September 1, 2008


"Undisplayable character glyph" is the best I can come up with.

It seems like there's no other standard name for it, since the rules don't specify exactly how to represent the fact that there's an undisplayable character. And nobody seems to have given it a colloquial name yet (like R-ball for ®). It's left up to the user agent (in this case, your web browser) to handle.

There is a suggestion within the specs that describes what you're seeing. Here's the relevant bit:
"Note, however, that every character in [UNICODE] has a glyph associated with it, and that the glyphs for undisplayable characters are enclosed in a dashed square as an indication that the actual character is undisplayable."

See RFC 3536 - Terminology Used in Internationalization in the IETF at the end of Section 5.

I'm looking at your sample page in FF3 on Linux, btw, but with the standard MS fonts.
posted by dammitjim at 9:59 AM on September 1, 2008


I propose naming it a cartouche, since we already use this word for "glyphs inside a box."
posted by SPrintF at 11:49 AM on September 1, 2008


Response by poster: I don't know why I didn't do this right away but here is an image of them.
posted by srboisvert at 4:26 AM on September 3, 2008

