

How do you cut and paste HTML?
December 23, 2008 12:55 PM

How can I cut content out of a website and paste it cleanly into a Word document?

I've tried Googling this and am getting nothing.

I'm trying to create a Word document (for eventual conversion to PDF) of all the web-based press clippings for a musician. I would ideally like these clippings to be in the format they appeared in when they were published on the web. My end goal is a PDF of all these nicely laid-out press clippings, interviews and reviews with graphics, photos, etc. intact.

Is there a shortcut for this, some kind of print-screen equivalent for grabbing entire web pages? Or is the HTML just too messy to be cut and pasted that simply?
posted by Bobby Bittman to Technology (13 answers total) 9 users marked this as a favorite
 
If you need it to look exactly the same, and you don't want to use screenshots, you could print each web page to PDF, then use something that talks PDF to cut out and keep just the bits you want.
posted by devnull at 12:58 PM on December 23, 2008


I use PrimoPDF for things like this. It installs itself as a virtual printer so that you just hit print, then select PrimoPDF, and it outputs a PDF to your desktop.
posted by sanka at 1:07 PM on December 23, 2008


Since hitting print and then saving as a PDF often mangles the layout (the print function in web browsers tries to interpret pages in the way that prints best on paper), I like this service. It inserts a small link to the service at the bottom of each PDF page, but it does a good job and it is free.
posted by nnevvinn at 1:13 PM on December 23, 2008


You can screencap entire web page lengths with one of these two Firefox add-ons: Screengrab and Pearl Crescent Page Saver. With either extension, you can choose whether you want to save as JPG or PNG. Try both and see which one you like better. I have both installed since sometimes one will work on a page while the other doesn't.
posted by macguffin at 1:13 PM on December 23, 2008


Firefox Scrapbook.
posted by gregoreo at 1:24 PM on December 23, 2008


Content but not formatting? Paste it into a text editor first, then cut and paste from there into Word. Pure ASCII goodness.
posted by Area Control at 2:14 PM on December 23, 2008
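
For anyone who has many clippings and would rather script the "content but not formatting" route than paste through a text editor each time, here is a minimal sketch using only Python's standard library. The names `TextExtractor` and `html_to_text` are made up for illustration, and the list of block-level tags is an assumption you would likely need to extend for real pages. It drops all layout and images, so it does not help if you want the clippings to look as published.

```python
from html.parser import HTMLParser

# Tags treated as line breaks (an assumed, incomplete list).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose contents are not visible text.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps visible text, discards markup."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Ignore text that sits inside <script> or <style>.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.parts).splitlines()
    # Collapse runs of whitespace and drop empty lines.
    cleaned = (" ".join(line.split()) for line in lines)
    return "\n".join(line for line in cleaned if line)


if __name__ == "__main__":
    sample = "<h1>Review</h1><p>A <b>great</b> show.</p><script>x=1</script>"
    print(html_to_text(sample))
```

The result is plain text you can paste straight into Word, with one line per heading or paragraph.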


Pearl Crescent did the job perfectly. Thanks a million macguffin, and thank you everyone for taking the time to answer.
posted by Bobby Bittman at 2:28 PM on December 23, 2008


Saving as JPG or PNG will rasterize the text and such: you'll just have a big bitmap, and lose the information about what the text was and so on. Which might be fine for your purpose, I don't know; I'm just pointing out that it's a pretty lossy operation.
posted by hattifattener at 2:32 PM on December 23, 2008


I sometimes copy into Notepad, then from Notepad to Word. There's also a setting to choose paste defaults; in Word 2007 it's available from the popover menu that appears when you paste. I set mine to keep text only.
posted by theora55 at 2:34 PM on December 23, 2008


I think the OP wanted the formatting. But theora55, there's an extension, Copy Plain Text, that will do what you're describing. I used to use Notepad also, and I'm glad to not have to do that anymore.
posted by cashman at 8:02 PM on December 23, 2008


Use a Macintosh and print each page to PDF. Don’t even bother trying this on Windows. Surely you must know someone with a Mac if the subject is a musician.
posted by joeclark at 6:23 AM on December 24, 2008


+1 for screenshots.
You can use the free tool Cropper (http://www.codeplex.com/cropper), which allows for easy screen capture.
posted by bbyboi at 2:09 PM on December 24, 2008


Screenshots will not leave you with selectable, resizable text. Printing to PDF (under ordinary circumstances) will.

Look, just do this on a Mac. It’ll take you mere minutes.
posted by joeclark at 8:02 PM on December 24, 2008

