How can I get clean HTML from MS Word to paste into Thunderbird?
October 12, 2009 7:53 PM

Sometimes, when I'm sending long emails, I like to compose in my word processor - MS-word (2003 on one machine, 2007 on another), and paste into my email client - Thunderbird.

But when I do, there are all sorts of little (and not so little) formatting hiccups. I gather the problem is mostly that Word uses all sorts of crazy non-standard stuff in its html output.

Is there a way to fix this problem? A way to make Word produce cleaner HTML? A tool that will turn Word HTML into something friendlier? A Word-paste-cleanup plugin for Thunderbird?

(And: I know I could compose with Open Office or Google Docs. But mostly I like Word, and would like to keep using it if I can...)
posted by ManInSuit to Computers & Internet (21 answers total)
Save your file as a text file.
posted by hapax_legomenon at 8:10 PM on October 12, 2009

Do you have Dreamweaver? Various versions will clean up a document saved as HTML via Word pretty well. DW CS4 uses Commands | Clean Up Word HTML.

Do you always need HTML formatting? I find that almost everything I want to say works as plain text, and that right-clicking in Thunderbird and choosing Paste Without Formatting works beautifully.
posted by maudlin at 8:12 PM on October 12, 2009

Whenever I have formatting problems, I just copy and paste into Notepad/TextEdit, then copy and paste to the new application. It removes all formatting and gives you nice clean text.

Actually when I'm writing a long e-mail or web post, I'll usually compose it in Notepad/Textedit in case something happens with the webpage.
posted by jander03 at 8:12 PM on October 12, 2009

Oh, to be clear:

The long emails I'm talking about: I'd like them to include some formatting (some bold, some italics, etc) so I'd rather not compose or save in plain text. I want clean html-formatted email, not clean plain-text email. I know there are a lot of reasons to send emails in plain-text, but sometimes I want formatting.
posted by ManInSuit at 8:18 PM on October 12, 2009

Maudlin- the DW clean-up word html sounds like what I'm looking for. But I don't have DW running.

I wonder- is there an online tool that does that? That might be a way to do this...
posted by ManInSuit at 8:21 PM on October 12, 2009
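
For anyone wondering what a "clean up Word HTML" step roughly involves, here is a minimal sketch in Python. It is not what Dreamweaver or any particular online tool does; it only assumes the usual Word cruft (conditional comments, style/xml blocks, namespaced tags such as o:p, and class/style/lang attributes). The function name clean_word_html is made up for illustration, and the regex approach is crude: it can also hit attribute-like text that appears outside of tags.

```python
import re

def clean_word_html(html: str) -> str:
    """Strip common Word-specific markup, keeping basic tags like <b>, <i>, <p>.

    Hypothetical helper: a rough regex sketch, not a full HTML parser.
    """
    # Remove conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Remove any remaining ordinary comments
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Remove <style> and <xml> blocks together with their contents
    html = re.sub(r"<(style|xml)\b.*?</\1\s*>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Remove standalone <meta> and <link> tags
    html = re.sub(r"<(meta|link)\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove namespaced tags like <o:p> and </o:p>, keeping any inner text
    html = re.sub(r"</?\w+:\w+\b[^>]*>", "", html)
    # Drop class, style and lang attributes (quoted or unquoted values)
    html = re.sub(r"""\s+(class|style|lang)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""",
                  "", html, flags=re.IGNORECASE)
    # Remove spans that are now attribute-free, keeping their text
    html = re.sub(r"</?span\s*>", "", html, flags=re.IGNORECASE)
    return html.strip()


sample = ('<p class=MsoNormal style="margin:0in"><b><span lang=EN-US '
          "style='mso-bidi-font-weight:bold'>Hello</span></b> world<o:p></o:p></p>")
print(clean_word_html(sample))
```

Bold and italic tags survive, which is the point here: the goal is formatted email without the Word-specific extras.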

Microsoft make (or made) a plugin/filter for Word which saves clean HTML (rather than the weird HTML Word's regular HTML save produces.) It's on my computer at work and does a good job. You should be able to find it if you dig around ... sorry, too many hours on the machine already for me to search just now.
posted by anadem at 8:25 PM on October 12, 2009 [1 favorite]

anadem - That is exactly the sort of thing I was looking for before I posted here! I did not find it. Can anyone point me in the direction of this mythical plugin?
posted by ManInSuit at 8:28 PM on October 12, 2009

If all you want is some nice, simple formatting (bold and italics, maybe some images and links), you can compose directly in Thunderbird and add HTML formatting from the buttons and dropdown lists, saving as you go.

Is there something about Thunderbird's HTML or composing in the window that doesn't work for you? I can successfully create all kinds of formatting, including images and links, directly in Thunderbird. I'm not being argumentative about your premises -- you may really prefer Word for perfectly valid reasons -- but if you want rock-solid HTML that's easy to use, Thunderbird already has it. Sorry if I'm missing something.

On preview: the last time I remember using the Microsoft official Word HTML plug-in was for Word 2000. This third party tool (which I haven't tried) is designed for 2003. You may find more for 2003 and 2007 with Google searches like microsoft word 2007 clean html.
posted by maudlin at 8:37 PM on October 12, 2009 [1 favorite]

One trick I've seen done is to mail it as an attachment to your gmail address, then view the attachment as HTML. Copy from there or the view source window.
posted by Pinback at 8:52 PM on October 12, 2009

Maudlin (and others): Maybe I'm being stubborn in wanting to compose in MS-word. Maybe I can explain myself:

For shorter emails, thunderbird is great. But for some longer stuff (eg: in-line newsletters I send to a neighbourhood group I run), I'm a lot happier composing in word. These emails are a few pages long, and often include a lot of formatting. Word is a full-blown word processor in ways that the thunderbird compose window isn't. I use the grammar checker, I like the spell-check better, I find it easier to save drafts, etc.
posted by ManInSuit at 9:07 PM on October 12, 2009

Pinback - I just tried your suggestion. At first glance, that seems to deal with a lot of the bigger problems. Cool. I'll try some more... (One thing it doesn't do as I'd like: It turns single space paragraphs into double-space. I have a feeling there may not be any way around that...)
posted by ManInSuit at 9:12 PM on October 12, 2009

OK, here's one more online filter you can try. Worked beautifully with Word 2003 and should work with 2007.
posted by maudlin at 9:17 PM on October 12, 2009

Well, there's demoronizer — which is more of a 2002-era tool for this job. I don't know if it'll do anything desirable to modern MS HTML.
posted by hattifattener at 9:28 PM on October 12, 2009
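
As a rough illustration of that kind of fix-up (not demoronizer's actual code), here is a small Python sketch that swaps Word's typographic characters for plain ASCII. The mapping table and the function name straighten are assumptions chosen for the example; a real tool covers many more cases.

```python
# Hypothetical mapping: typographic characters Word tends to insert,
# paired with plain-ASCII stand-ins.
SMART_CHARS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}

def straighten(text: str) -> str:
    """Replace each character in SMART_CHARS with its ASCII stand-in."""
    return text.translate(str.maketrans(SMART_CHARS))


print(straighten("\u201cHi\u201d \u2014 it\u2019s done\u2026"))
```

This only touches characters; it does nothing about Word's HTML markup itself.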

Save as RTF.
posted by pompomtom at 10:28 PM on October 12, 2009

Use MS Outlook? You can use word as its default e-mail editor.
posted by wongcorgi at 3:46 AM on October 13, 2009

2nding save as rtf.
posted by Obscure Reference at 4:05 AM on October 13, 2009

My suggestion, and what I usually end up doing: copy in Word, paste into Notepad. Copy in Notepad, paste into Thunderbird. Then apply bold/italic formatting there. If it helps, you can add some sort of signifier (BOLDME) so that you can find, delete and bold.
posted by CharlesV42 at 4:58 AM on October 13, 2009
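
The signifier idea can be automated so the bolding doesn't have to be done by hand. A minimal Python sketch, assuming you mark bold as *like this* and italics as _like this_ in the plain text (those markers, and the function name markers_to_html, are made up for the example rather than anything Thunderbird or Word understands):

```python
import html
import re

def markers_to_html(text: str) -> str:
    """Turn plain text with *bold* and _italic_ markers into simple HTML.

    Hypothetical helper: blank lines separate paragraphs, single line
    breaks inside a paragraph become <br>.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    out = []
    for p in paragraphs:
        # Escape &, < and > so stray characters don't become markup
        p = html.escape(p, quote=False)
        # *text* becomes bold, _text_ becomes italic
        p = re.sub(r"\*(.+?)\*", r"<b>\1</b>", p)
        p = re.sub(r"_(.+?)_", r"<i>\1</i>", p)
        out.append("<p>" + p.replace("\n", "<br>") + "</p>")
    return "\n".join(out)


print(markers_to_html("Hello *world*\n\n_Bye_ & thanks"))
```

The catch is that any literal asterisk or underscore pair in the text gets treated as a marker too, so pick signifiers that won't show up in normal writing.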

wongcorgi: I used to compose in Word and paste into Outlook. It worked great! I've since moved from Outlook to Thunderbird. I like Thunderbird better for lots of reasons, but I miss the "paste-from-Word" that worked so well in Outlook.

CharlesV42 - what you do is pretty much what I've been doing for a while. But I figure there's got to be a better way. Re-formatting all my text by hand is time-consuming and error-prone, and the kind of work I can't help but think I shouldn't have to do in a world where there are computers.

pompomtom, Obscure Reference: I tried saving as RTF. Unless I'm missing something, this doesn't really eliminate the weird Word HTML cruft when I paste the text into Thunderbird.

Maudlin - Both those online tools look promising!! I'll go play around with them a bit.
posted by ManInSuit at 5:44 AM on October 13, 2009

These emails are a few pages long, and often include a lot of formatting.

This suggests to me that you shouldn't be sending them as emails, but rather as attachments. Why can't you send a brief email that says "here's the info you wanted - see attachment"?
posted by chrisamiller at 7:38 AM on October 13, 2009

Chrisamiller - I appreciate that there are times when it makes more sense to send an attachment, and when that's the case, I send an attachment.

But for some emails, I want to send something that's a few pages long (maybe 1000 words or so) and that is in the message body. There are a few reasons for this: it's more quickly accessible to readers, easier to scan without opening external reader software, doesn't create compatibility problems for people who don't have the appropriate reader software installed, etc, etc.
posted by ManInSuit at 9:35 AM on October 13, 2009

I use Textism.
posted by elle.jeezy at 10:52 AM on October 13, 2009

