How to fix an Atom feed?
October 26, 2005 1:58 AM
I'm told that my weblog's Atom feed is broken: according to feedvalidator.org, this is because my feed 'appears to be encoded as "utf-8", but my server is reporting "US-ASCII".' Should I persuade my server to report a 'utf-8' encoding, and, if so, how?
Here's the feed in question. I'm running MT 3.2: it's only since I upgraded to this version of MT that the site has been trying to encode in utf-8. According to the feedvalidator site, I should 'either ensure that the charset parameter of the HTTP Content-Type header matches the encoding declaration,' or 'ensure that the server makes no claims about the encoding.' How do I do this? I have limited (CPanelX) access to my webserver and a decidedly faint and incomplete grasp of the technicalities involved.
Assuming you haven't modified your server since posting your question, the HTTP headers for your feed don't specify any particular character encoding. The Atom feed you're serving does specify utf-8 encoding, as nmiell suggested above. So any reasonable software is going to have the information it needs to know your encoding. I.e., no major problem that I see.
It would be marginally better to convince Apache to serve a character encoding (charset) in the HTTP headers. But be sure it's the right one! I don't know how to convince Apache to do that, sorry.
posted by Nelson at 3:03 AM on October 26, 2005
Response by poster: Thanks nmiell, Nelson; I've changed the atom feed template to specify UTF-8. Feedvalidator listed some other problems with atom.xml, which I'm working on fixing now - I guess it could have been one of these other issues that led to the feed's being unreadable.
posted by misteraitch at 3:10 AM on October 26, 2005
Response by poster: OK: it looks like I was thrown by the validator displaying the utf-8 message first: the feed passes for valid after I fixed some of the other issues with invalid characters in entry titles, etc.
posted by misteraitch at 4:00 AM on October 26, 2005
Best answer: "Assuming you haven't modified your server since posting your question, the HTTP headers in your feed don't specify any particular character encoding"
Not true. It's currently served as "text/xml". Not mentioning an encoding automatically means the feed is "US-ASCII", not that no encoding is specified.
Three ways to solve this:
1) Send "Content-Type: application/xml" or "application/atom+xml". This results in the feedreader/validator looking inside the document at the <?xml ?> declaration.
2) Send "Content-Type: text/xml; charset=utf-8"
3) Escape all non-ASCII characters so that the feed is valid US-ASCII. This requires replacing every character with a code of 128 or above with a numeric character reference such as &#233; (for é).
posted by cillit bang at 4:42 AM on October 26, 2005
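The rules in the answer above can be sketched in Python. This is only an illustration of the logic as cillit bang describes it, not what feedvalidator.org actually runs; the function name `effective_charset` and the regular expression used to read the XML declaration are assumptions made for this example.

```python
import re
from email.message import Message

# Hypothetical helper: guess which encoding a feed reader would assume,
# given the Content-Type header value and the raw bytes of the feed.
def effective_charset(content_type, body):
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lowercased, or None if absent
    if charset:
        # Option 2: an explicit charset in the header wins.
        return charset
    if msg.get_content_type() == "text/xml":
        # text/xml with no charset is treated as US-ASCII.
        return "us-ascii"
    # Option 1: application/xml or application/atom+xml, so look
    # inside the document at the <?xml ?> declaration.
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # XML's default when no encoding is declared


feed = b'<?xml version="1.0" encoding="utf-8"?><feed/>'
print(effective_charset("text/xml", feed))                 # us-ascii
print(effective_charset("text/xml; charset=utf-8", feed))  # utf-8
print(effective_charset("application/atom+xml", feed))     # utf-8
```

The first call reproduces the mismatch the validator complained about: the document says utf-8, but a bare "text/xml" header implies US-ASCII.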
Response by poster: cillit bang, forgive my ignorance, but in your first two solutions, where would I put those declarations? In my confused muddlings after the MT 3.2 upgrade, I had resorted in effect to your solution #3 to get stuff to display OK, but it seems this caused some Atom validity issues when I included escaped characters in weblog entry titles...
posted by misteraitch at 5:25 AM on October 26, 2005
The headers are sent before the actual document. I don't use Movable Type so I don't know how to do this, but you can't change them by editing the template.
If you go with option 3, you need to remove the "utf-8" from the <?xml ?> declaration.
posted by cillit bang at 5:42 AM on October 26, 2005
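For anyone trying option 3, the escaping step might look like the following Python sketch. The function name `escape_non_ascii` is hypothetical, and this is not how Movable Type does it. It relies on Python's built-in `xmlcharrefreplace` error handler, which turns each character outside ASCII into a numeric character reference.

```python
# Hypothetical helper for option 3: make text safe to serve as US-ASCII
# by replacing every character with a code of 128 or above with a
# numeric character reference.
def escape_non_ascii(text):
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(escape_non_ascii("café"))  # caf&#233;
```

Numeric references are used here because named HTML entities such as &eacute; are not predefined in XML; only &amp;, &lt;, &gt;, &apos; and &quot; are.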
Best answer: If your web server runs Apache, you can put the following line into a file called ".htaccess" in the directory containing your feed (assumes your web server is configured to allow this):
AddCharset UTF-8 .xml
See the W3C internationalization FAQ for details.
posted by mbrubeck at 6:50 AM on October 26, 2005
Response by poster: That did the trick, mbrubeck, many thanks. My thanks also to cillit bang, for your answers.
posted by misteraitch at 11:43 AM on October 26, 2005
This thread is closed to new comments.
The easiest way to fix this is to modify Movable Type so that it outputs
<?xml version="1.0" encoding="utf-8"?>
as the first line of atom.xml.
posted by nmiell at 2:38 AM on October 26, 2005