What is ΓÇª ?
December 8, 2020 4:30 PM

I'm guessing some odd Unicode character. But I'm on mobile and can't 'view source' or otherwise investigate. Might just be me and my phone. But I keep seeing it and it's driving me batshit.

Can be found here:

https://www.ctvnews.ca/mobile/politics/trump-signs-order-to-put-americans-at-head-of-vaccine-line-vows-to-work-with-world-1.5221941

"Until the epidemic is stamped out in the darkest corners of Bangladesh, it is not over for everybody ΓÇª so the argument can be made that it is not an advantageous position for anybody to be advocating for one population to get it before another."

And at least one more time in that article. Seen it elsewhere and it is searchable.
posted by shoesfullofdust to Computers & Internet (17 answers total)
Since it's three different characters, I'd assume that some step in the pipeline that displays the article isn't correctly converting a character from its original encoding to Unicode, and is generating those three characters instead.

From context, I'd suspect it's supposed to be an em-dash — although I don't know the exact mechanism that's causing it.
posted by sagc at 4:37 PM on December 8, 2020


Yes, it's supposed to be an em-dash; the Windsor Star has it properly.
posted by JZig at 4:39 PM on December 8, 2020


It looks the same to me on a Mac desktop, in both Safari and Chrome. When I view source, it shows the same characters. So maybe it's some kind of copy-and-paste or processing issue that predates the page code itself.
posted by past unusual at 4:40 PM on December 8, 2020


It's funny: if you paste that into Google you find it in a lot of places, but there's also this question on Stack Overflow from someone having a Python problem.

TLDR: I'd guess that a Python program on the server somewhere is mangling a special character, like sagc says.
posted by JoeZydeco at 4:40 PM on December 8, 2020


The same article is elsewhere; Yahoo and others show "..." instead for the Bangladesh quote.

ca yahoo finance
posted by TheAdamist at 4:41 PM on December 8, 2020


The bytes in question are ce93 c387 c2aa
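
As a quick check, those hex bytes can be decoded as UTF-8 to see which characters they spell. This is only a sketch in Python; the variable names are made up for illustration and nothing here says how the site itself handles the text.

```python
# Hex bytes as quoted above (bytes.fromhex skips the spaces).
raw = bytes.fromhex("ce93 c387 c2aa")

# Interpret the six bytes as UTF-8: each pair is one two-byte character.
decoded = raw.decode("utf-8")

print(decoded)                                  # ΓÇª
print([f"U+{ord(ch):04X}" for ch in decoded])   # ['U+0393', 'U+00C7', 'U+00AA']
```

So the page really does contain three separate, validly encoded characters, which fits the idea that the damage happened before the HTML was written out.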
posted by scruss at 4:43 PM on December 8, 2020


sagc - for an answer containing an em-dash

scruss - for the bytes in question

JoeZydeco - for the Python bit and being Zydeco
posted by shoesfullofdust at 4:51 PM on December 8, 2020


Thank you all! I can sleep tonight!
posted by shoesfullofdust at 4:53 PM on December 8, 2020


The intended character is horizontal ellipsis (…), Unicode code point 2026 (hexadecimal).

In the UTF-8 encoding, the byte sequence for that code point is (226, 128, 166) (decimal).

If this byte sequence is interpreted in the now-archaic "code page 437", each byte stands for one character: 226 is GREEK CAPITAL LETTER GAMMA “Γ”, 128 is LATIN CAPITAL LETTER C WITH CEDILLA “Ç”, and 166 is FEMININE ORDINAL INDICATOR “ª”.
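
The steps above can be reproduced in a few lines of Python. This is a minimal sketch of the mechanism as described here, using Python's built-in "cp437" codec; the variable names are hypothetical, and it is not a claim about what software the news site actually runs.

```python
import unicodedata

intended = "\u2026"  # HORIZONTAL ELLIPSIS

# Step 1: encode the intended character as UTF-8.
utf8_bytes = intended.encode("utf-8")
print(list(utf8_bytes))  # [226, 128, 166]

# Step 2: misread those same bytes as code page 437, one character per byte.
garbled = utf8_bytes.decode("cp437")
for ch in garbled:
    print(ch, unicodedata.name(ch))

# Undoing the mistake: re-encode as cp437, then decode as UTF-8.
repaired = garbled.encode("cp437").decode("utf-8")
print(repaired == intended)  # True
```

The round trip only works while every garbled character still maps back to a single cp437 byte; once other transformations pile on top, the original can be harder to recover.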

Then some further part of the system turns that into this UTF-8 byte sequence in the HTML document: "…" The "character entities" are then rendered by your browser into the characters they name, and you get what you get.

Word of the day: Mojibake

I don't know that Python has anything in particular to do with it :)
posted by the antecedent of that pronoun at 4:53 PM on December 8, 2020


And JZig for showing the true way!
posted by shoesfullofdust at 4:55 PM on December 8, 2020


whoa!

the antecedent of that pronoun, you're gonna keep me up all night.

worrying.
posted by shoesfullofdust at 4:59 PM on December 8, 2020


or, I'll just order a custom "Mojibake" t-shirt and forget that I asked this question.
posted by shoesfullofdust at 5:03 PM on December 8, 2020


I thought it might be a simple question with a simple answer.

"Your assumptions were wrong. Good night!"
posted by shoesfullofdust at 5:11 PM on December 8, 2020


Laissez les bon temps rouler, mes amis!
posted by JoeZydeco at 5:44 PM on December 8, 2020


Thank you, shoesfullofdust, for asking this question! I concur that Mojibake is a great word discovery. Did a search for mojibake t-shirts and it led to Ojibwe t-shirts. So yes, a custom tee it will have to be.
posted by a humble nudibranch at 8:19 PM on December 8, 2020


> Laissez les bon temps rouler, mes amis!
Plus ΓÇª change...
posted by Syllepsis at 8:20 PM on December 8, 2020




This thread is closed to new comments.