Pasting Crappy Microsoft Word Code Into Movable Type
February 20, 2008 12:00 PM

How can I clean up crappy Microsoft Word code in Movable Type 4?

Microsoft Word includes crap code like this when pasted into Movable Type:

[p class="MsoNormal"][font face="Arial" size="2"][span style="font-size: 10pt; font-family: Arial;"]
(I replaced angled brackets with square ones for display purposes.)

Is there a plugin or something I can use to remove all the gunk?
posted by kirkaracha to Computers & Internet (12 answers total)
 
Try pasting into Notepad first, then cut and paste again from Notepad into MT.
posted by jeffamaphone at 12:04 PM on February 20, 2008


Notepad, yes. This is not a failing of MT, it's an evil of MS. It happens in WordPress, too.
posted by DarlingBri at 12:06 PM on February 20, 2008


I've heard Dreamweaver is good about stripping MS Word docs.
posted by bprater at 12:20 PM on February 20, 2008


Kinda previously. HTMLTidy will strip out a large chunk of that gunk for you too.
posted by unixrat at 12:20 PM on February 20, 2008
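
For anyone who would rather script the cleanup, here is a minimal Python sketch of the kind of stripping discussed above. It is not HTMLTidy and not a Movable Type plugin; it only uses the standard library's html.parser. The names WordCleaner and clean_word_html, and the lists of tags and attributes to drop, are assumptions chosen to match the sample markup in the question, so adjust them to whatever your pasted content actually contains.

```python
from html import escape
from html.parser import HTMLParser

# Assumed list: wrapper tags whose markup is dropped while their text is kept.
DROP_TAGS = {"font", "span"}
# Assumed list: presentational attributes removed from every remaining tag.
DROP_ATTRS = {"class", "style", "face", "size", "lang"}


class WordCleaner(HTMLParser):
    """Hypothetical helper: re-emits HTML without Word's presentational markup."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _skip(self, tag):
        # Drop wrapper tags and Office-namespaced tags such as <o:p>.
        return tag in DROP_TAGS or tag.startswith("o:")

    def handle_starttag(self, tag, attrs):
        if self._skip(tag):
            return
        kept = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name not in DROP_ATTRS and value is not None
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if not self._skip(tag):
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        # Comments are dropped entirely.
        pass


def clean_word_html(markup):
    parser = WordCleaner()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


sample = (
    '<p class="MsoNormal"><font face="Arial" size="2">'
    '<span style="font-size: 10pt; font-family: Arial;">Hello</span>'
    "</font></p>"
)
print(clean_word_html(sample))
```

Run on the sample from the question (with real angle brackets), this leaves a bare paragraph around the text. Links and other attributes not on the drop list pass through unchanged.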


Notepad works fine, but if you'll be doing it a lot try this.
posted by zazerr at 12:23 PM on February 20, 2008


If you're using MT 3.x, you can use the NaughtyWordChars plugin. It will strip away all that nastiness. I am also fairly sure that this feature was rolled into MT4.
posted by capndesign at 1:44 PM on February 20, 2008


>Microsoft Word includes crap code like this when pasted into Movable Type

How? If you cut and paste from Microsoft Word into a text box in a browser (like the one I'm typing this in), you should just get plain text.

Which Word are you using? Which browser? How are you cutting and pasting?

And, capn, the plugin doesn't remove HTML from word text, surely? Just characters from Microsoft's non-standard charset.
posted by AmbroseChapel at 2:13 PM on February 20, 2008
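
The character problem mentioned here is separate from the HTML problem, and it can be sketched in a few lines of Python. This is not how the NaughtyWordChars plugin is implemented (the thread doesn't say); it is only an illustration of swapping Word's "smart" punctuation for plain ASCII. The mapping and the function name plain_punctuation are assumptions, and the choice of replacements is a matter of taste.

```python
# Assumed mapping: common Word "smart" characters and plain ASCII stand-ins.
SMART_CHARS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(SMART_CHARS)


def plain_punctuation(text):
    """Hypothetical helper: replace smart punctuation with ASCII equivalents."""
    return text.translate(_TABLE)


print(plain_punctuation("\u201cHello\u201d \u2013 it\u2019s done\u2026"))
```

Text that contains none of the mapped characters comes back unchanged, so it is safe to run on already-clean input.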


Response by poster: The problem comes from Writing In Microsoft Word, Publishing In MT4.

I'm not the one copying-and-pasting, so I don't know the Word version or the browser. I suspect it's a combination of Word and Movable Type's Rich Text formatting.
posted by kirkaracha at 2:25 PM on February 20, 2008


Well, if you're using MT 4.1, you could try installing the FCKeditor plugin and making it the default editor for all authors. I have yet to try it myself, but supposedly it does a much better job of outputting semantic code.
posted by capndesign at 3:21 PM on February 20, 2008


I know it doesn't seem to make sense, but cutting and pasting "just the text" from a Word doc brings along a lot of crappy weird Microsoft-specific formatting that renders hideously in HTML. It happens in every blogging platform I've tried (TypePad, WordPress, Blogger), as well as in MediaWiki. Notepad as an intermediary is a quick and dirty solution for now. The newest version of Word (2007), though, does have a "publish as a blog post" option that seems to work nicely, at least in the trial version. So, hope for the future!
posted by donnagirl at 8:35 PM on February 20, 2008


I know I'm kind of derailing, but donnagirl, which version of Word? Which browser? I'm totally scratching my head over this. You say

>cutting and pasting "just the text" from a Word doc brings along a lot of crappy weird Microsoft-specific formatting that renders hideously in html.

but here I am, doing it, and it doesn't happen to me.

kirkaracha's link mentions Word 97 and I'm testing using Word 2003. Maybe that's it?
posted by AmbroseChapel at 6:08 PM on February 21, 2008


I apologise -- I now get it.

This doesn't happen when you paste into a normal browser text box. It happens when you paste into a WYSIWYG editor in a browser text box.

Useful link: Don't let Microsoft Ruin Your Blog
posted by AmbroseChapel at 6:17 PM on February 21, 2008

