I hate computers, etc.
January 16, 2010 8:59 AM

How can I fix weird HTML to copy it into a text document?

My Mac is full of problems, but the most annoying one is the extra spacing that seems to come along with copying text from a website. The pasted text comes out inconsistently spaced: part of it single-spaced, part double-spaced, etc.
I've tried opening it in Bean, NeoOffice, and TextEdit, and changing the settings to "single-spacing" in each (the only thing I know to do), but nothing is working here. For what it's worth, this is the website that's causing me problems. Today's problem, that is.

—Also, how can I mass-remove links? Currently, I have to click on each one and go to "Edit Link" and remove the URL.
—Is there anything better than NeoOffice? It tends to open documents with more intuitive formatting sense than Bean, otherwise I hate it.
I also need columns to set up this index I'm trying to make (see link above), which I would like to keep on one page, but that isn't going well either. I kind of just want Word 98 again when everything was simple and easy.
posted by lhude sing cuccu to Computers & Internet (8 answers total) 1 user marked this as a favorite
 
Well, I'm not too clear on either your exact problem or desired end result (that page copies and pastes cleanly into TextEdit for me), but if you don't mind losing formatting like bolding and italicizing, you can hit Shift-Command-T in TextEdit to convert to plain text. That will kill all extraneous line breaks and links.
posted by Mr. Anthropomorphism at 9:14 AM on January 16, 2010 [3 favorites]


Shift+Control+Option+Command and the V key (as opposed to just Command+V) pastes text without carrying over any formatting. Don't know if it works with NeoOffice.
posted by qwip at 9:18 AM on January 16, 2010


Ah, good call qwip, but the command is Shift+Option+Command+V, no Control.
posted by Mr. Anthropomorphism at 9:24 AM on January 16, 2010


Response by poster: For some reason, all those commands make the "you can't do that!" sound; otherwise they would be amazing alternatives.
posted by lhude sing cuccu at 9:40 AM on January 16, 2010


Are you using Firefox? If so, try the Copy Plain Text extension.
posted by ocherdraco at 9:51 AM on January 16, 2010


HTML collapses most runs of whitespace when it renders (with rare exceptions I won’t get into). Hence the source could contain a lot of blank lines and you wouldn’t know till you pasted it into a place where they become visible.

How much text do you need to copy and how often?

BBEdit can remove linebreaks and markup in one fell swoop each.
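
If you'd rather script it, here is a rough Python sketch of the same two steps (drop the markup, then flatten the line breaks). It is not how BBEdit does it; `TextOnly` and `strip_markup` are names made up for this example, and it assumes you have the page's HTML source as a string. It uses only the standard library's `html.parser`.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text content and drops every tag, including <a href=...> links."""

    def __init__(self):
        # convert_charrefs=True turns entities like &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are ignored
        self.parts.append(data)


def strip_markup(html: str) -> str:
    """Return the text of an HTML fragment with tags and line breaks removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    # Collapse every run of whitespace (spaces, tabs, line breaks) to one space
    return " ".join(text.split())
```

Note that this flattens everything onto one line, paragraph breaks included, and it would also keep the contents of any `<script>` or `<style>` blocks, so it is only suited to simple fragments.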
posted by joeclark at 10:12 AM on January 16, 2010


If any of the editors you've got can do search and replace on carriage returns / paragraph marks, you could replace two carriage returns with a single one and run that several times to eliminate all of the double spacing.
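
For what it's worth, the same replace-until-done idea looks like this as a small Python sketch. `collapse_blank_lines` is a made-up name, and it assumes the text is already plain text in a string. The loop does the "run that several times" part for you.

```python
def collapse_blank_lines(text: str) -> str:
    """Replace doubled line breaks with single ones until none are left."""
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n first
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One pass only halves a long run of breaks, so repeat until stable
    while "\n\n" in text:
        text = text.replace("\n\n", "\n")
    return text
```

This removes every blank line, including the ones that separate paragraphs, and it will not catch "blank" lines that actually contain spaces or tabs.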
posted by XMLicious at 11:08 AM on January 16, 2010


Best answer: If you really just want to convert to plain text, Mr. Anthropomorphism's solution should be perfect. Are you sure you're running the command from within TextEdit and not your browser, and that you're pressing Command (⌘) and not Control? You can also use the menu: Format → Make Plain Text. You can even go to Format → Text → Spacing to just change the spacing, but that's a real pain.
posted by serathen at 11:10 AM on January 16, 2010

