How do I strip .docx of its styling but retain semantic markup?
March 13, 2009 8:25 AM
How can I automatically strip .doc and .docx files of their styling but retain any semantic markup, with the aim of putting them online?
I'm creating a wiki-like app to allow the members of an organization to update some help files stored online. I've found that what they will typically try to do is copy text from a Word document straight into the text area, which strips it of all of its styling and semantic markup.
Therefore, I tried using TinyMCE, which turns the textarea into a rich text editor. Unfortunately, this results in the Word styling overriding the site styling.
I think what I need can best be illustrated by an example: if a Word document contains a paragraph, I need that paragraph wrapped in <p> tags and stripped of any styling so that I can then style it as I like using CSS. I need the same for headings, lists, images, tables, etc.
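To make that concrete, here is a minimal sketch in Python of the kind of filtering I have in mind, assuming the pasted content reaches the server as HTML (as a rich text editor would submit it). The tag whitelist and the names SemanticFilter and strip_styling are made up for illustration; it uses only the standard library's html.parser and has not been hardened against malicious input.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tags treated as "semantic" for this illustration.
ALLOWED_TAGS = {
    "p", "h1", "h2", "h3", "h4", "h5", "h6",
    "ul", "ol", "li", "table", "thead", "tbody", "tr", "th", "td",
    "img", "a", "strong", "em", "br",
}
# Attributes that carry meaning rather than presentation (also an assumption).
ALLOWED_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}
VOID_TAGS = {"img", "br"}            # tags that never get a closing tag
SKIP_CONTENT = {"style", "script"}   # elements whose contents are discarded


class SemanticFilter(HTMLParser):
    """Re-emit only whitelisted tags, dropping style/class and other attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags (span, font, o:p, ...) vanish; their text is kept
        keep = ALLOWED_ATTRS.get(tag, set())
        parts = []
        for name, value in attrs:
            if name in keep:
                parts.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        self.out.append("<{}{}>".format(tag, "".join(parts)))

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def strip_styling(html):
    """Return html with only whitelisted tags and attributes left in place."""
    parser = SemanticFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


pasted = '<p class="MsoNormal" style="color:red"><span style="font-size:12pt">Hello</span> world</p>'
print(strip_styling(pasted))  # prints <p>Hello world</p>
```

The result could then be styled entirely by the site's own CSS. Headings, lists, and tables only survive if the editor actually emits them as h1, ul, table, and so on; anything Word expresses purely through styling would be lost.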
posted by Fluffy654 to computers & internet (6 answers total) 4 users marked this as a favorite
posted by bricoleur at 8:37 AM on March 13, 2009