Unicode replacement characters
September 1, 2008 5:52 AM
Is there a name for the domino-like box characters, with 4 characters inside, that display when you don't have support for a language's script?
There are a few examples on this page that show up for me. Bengal, Bhutan and Khmer, for example, show up as boxes with 4 characters inside. The first two characters are often the same for all the boxes within each language example.
In case it is system dependent - Windows XP, Firefox 3.0.1, English(US).
Argh, excuse me, that should be "interpretable but unrenderable characters" in the first sentence of my previous comment.
posted by letourneau at 9:27 AM on September 1, 2008
Here's a document from Microsoft regarding OpenType fonts that refers to the "I don't know how to render this character" glyph as the ".notdef glyph" (which term apparently dates back to the TrueType specs). OK, I'll stop stacking up comments now.
posted by letourneau at 9:32 AM on September 1, 2008
Yeah, I looked through the Unicode pages and didn't see anything like this, so it appears to be an MS-specific thing. Your samples show different glyphs on my Mac, no "dominos".
posted by troy at 9:47 AM on September 1, 2008
"Undisplayable character glyph" is the best I can come up with.
It seems like there's no other standard name for it, since the rules don't specify exactly how to represent the fact that there's an undisplayable character. And nobody seems to have given it a colloquial name yet (like R-ball for ®). It's left up to the user agent (in this case, your web browser) to handle.
There is a suggestion within the specs that describes what you're seeing. Here's the relevant bit:
"Note, however, that every character in [UNICODE] has a glyph associated with it, and that the glyphs for undisplayable characters are enclosed in a dashed square as an indication that the actual character is undisplayable."
See RFC 3536 - Terminology Used in Internationalization in the IETF at the end of Section 5.
I'm looking at your sample page in FF3 on Linux, btw, but with the standard MS fonts.
posted by dammitjim at 9:59 AM on September 1, 2008
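For what it's worth, here is a minimal sketch of why the first two characters in each box might match within a language. It assumes the four characters are the hexadecimal digits of the character's Unicode code point, which is how some browsers draw their fallback box; nothing in this thread confirms that is what the screenshot shows. The function name `hex_box_label` and the sample strings are made up for illustration.

```python
def hex_box_label(ch: str) -> str:
    """Return the hex digits a fallback box might show for one character.

    Assumption: the box shows the code point in uppercase hex, padded to
    at least four digits. Characters above U+FFFF would need more digits.
    """
    return f"{ord(ch):04X}"


# Hypothetical sample words, written as escapes so the source stays ASCII.
samples = {
    "Bengali": "\u09ac\u09be\u0982\u09b2\u09be",
    "Khmer": "\u1781\u17d2\u1798\u17c2\u179a",
}

for script, text in samples.items():
    labels = [hex_box_label(c) for c in text]
    # Characters from one script sit in one Unicode block, so the
    # leading digits repeat (09.. for Bengali, 17.. for Khmer).
    print(script, labels)
```

If that assumption holds, the shared leading digits simply reflect that each script occupies a contiguous range of code points.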
I propose naming it a cartouche, since we already use this word for "glyphs inside a box."
posted by SPrintF at 11:49 AM on September 1, 2008
Response by poster: I don't know why I didn't do this right away but here is an image of them.
posted by srboisvert at 4:26 AM on September 3, 2008
This thread is closed to new comments.
posted by letourneau at 9:26 AM on September 1, 2008