How to handle this punctuation problem when using copy/paste
February 11, 2010 11:45 AM

Copy and paste issues with quote and apostrophe marks. [see within]

When I copy and paste material into my site, I get this
Editor&apos;s Note

which clearly is supposed to be "editor's note"

I know it is a problem in transferring from one place to another but if the paste includes a lot of material, it is a large amount of work to sort out the apostrophes and quote marks to render them as they ought to be.

I am using a Mac with Firefox browser.

Is there any solution that would make this easier?
posted by Postroad to Computers & Internet (16 answers total)
 
Perhaps you could detail both from where you are copying text and to where you are trying to paste it. You say "into my site" but I'm not sure what that means. An HTML editor? A comment field? Notepad?
posted by jckll at 11:49 AM on February 11, 2010


In addition to jckll's questions - could you also clarify whether the &apos; is what is rendering in your browser on the site or simply how it appears in the text field (or html source if you're copy+pasting directly into that)
posted by missmagenta at 11:52 AM on February 11, 2010


Copy and paste the whole document into notepad first, then copy the notepad text and paste that into your site. I think that should fix it.
posted by weapons-grade pandemonium at 11:54 AM on February 11, 2010


Educated guess, since we need more info as noted:

Your editor is encoding the quotes, because quotes are very problematic to programs (because quotes often represent the end of a string). The problem is that it's not getting decoded on the way back out. But I digress...

Try "escaping" the quote, which may take some trial-and-error:

editor''s note
editor'''s note
editor\'s note

One of those may work...
posted by mkultra at 12:10 PM on February 11, 2010


You could install the Auto Copy Firefox addon, which (among other things), will allow you to make the default copy action 'copy as plain text'. When copying text from the browser to be pasted anywhere else (including into another browser window/tab), 'copy as plain text' should eliminate the problem you mention.
posted by namewithoutwords at 12:24 PM on February 11, 2010


Forgive me for not being sufficiently clear about all this. I am over 80 and as you can well guess have no tech background, or if I have, I can not remember it!
I copy from a site (an article), and the proper punctuation is at that site. I usually next use Press This for quick posting to my site (Movable Type has Press This in addition to, of course, its regular template; Press This is a shortcut to the template, and thus to my blog).

The "change" in punctuation then appears on my site, though it was not on the original.
That one suggestion to use AutoCopy seems not an option. It does not work in FF3.6 and may be in any case for Microsoft users.
thanks
posted by Postroad at 12:43 PM on February 11, 2010


You might have to just replace the quote mark by hand, I'm afraid. That's what I do when I see funky characters after cutting and pasting.

Put the copied text into Word or Textpad or Notepad or some other text editor, and you can do Find and Replace (apple+F, then paste in the funky character, then choose an apostrophe to replace it with). Not sure if that works in Movable Type, though -- someone more technical than me should be able to help.
posted by vickyverky at 12:57 PM on February 11, 2010


You can also use "Paste Special"-->Text Only into Word as an intermediate step. Then copy and paste to where you want to use it.
posted by trox at 1:01 PM on February 11, 2010


The characters on a Web page can be stored as the actual characters themselves, or, for many kinds of character, as so-called entities that describe the characters. These begin with ampersand, maybe followed by a #, then have a number or something resembling a word, then finish up with a semicolon. Your browser (like Firefox) then just shows you the real intended character, not the code for it that the entity represents.

For whatever reason, the system that published the text you are copying used an entity for each apostrophe. When you copy that text, for some reason you are copying the underlying entity and not the character it represents. You are copying ampersand-apos-semicolon and not just '.

You don’t really need to know any more than that. You can just use search-and-replace in whatever program you choose (e.g., Notepad, Word) to turn ampersand-apos-semicolon into a real apostrophe.

If this keeps happening day after day, then you would look at more automated solutions and/or learn more about the problem. Incidentally, you didn’t do anything wrong.
posted by joeclark at 1:18 PM on February 11, 2010 [1 favorite]
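
For anyone who would rather script the search-and-replace described above, here is a minimal sketch in Python. It assumes the pasted text contains standard HTML entities such as &apos;, &#39; or &quot;. The function name decode_entities and the sample string are made up for illustration; html.unescape is part of Python's standard library.

```python
import html


def decode_entities(text: str) -> str:
    """Turn HTML entities (named or numeric) back into the characters they stand for."""
    return html.unescape(text)


# Hypothetical sample of what a bad paste might look like.
pasted = "Editor&apos;s Note: &quot;copy and paste&quot; issues"
print(decode_entities(pasted))
```

This handles every entity in one pass, so there is no need to run a separate find-and-replace for each kind of punctuation mark.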


Here's what I suspect is happening.

When you copy from another site, the apostrophe or quotation mark is being copied as &apos; and &quot; -- those are entity codes, and I won't tell you what those are, except that it's not surprising you get those instead of apostrophes and quotation marks.

Now, when you paste that into your site, what SHOULD happen is those &apos; and &quot; instances should go into your site just fine, and when you view the page in a web browser you should see those rendered as apostrophes and quotation marks.

However, what i think IS happening is that when you paste into your site, your site is changing the &amp character to &, and so the rest of it is just being displayed as text. That would give you " and & on your site.

The solution is probably to copy and paste using your operating system's controls, not the Press This command, since Press This apparently has a bug dealing with entity codes.
posted by davejay at 1:35 PM on February 11, 2010


Ack! ...your site is changing the & character to &amp;...
posted by davejay at 1:36 PM on February 11, 2010
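
The double-encoding davejay suspects can be reproduced in a few lines of Python. This is only a sketch of the general mechanism, not a description of what Press This actually does internally. The helper names are made up, and note that Python's html.escape writes the apostrophe as &#x27; rather than &apos;, though the effect is the same.

```python
import html


def escape_once(text: str) -> str:
    """Escape text for HTML, as a publishing tool would normally do one time."""
    return html.escape(text, quote=True)


def escape_twice(text: str) -> str:
    """Escape text that was already escaped: the suspected bug."""
    return html.escape(escape_once(text), quote=True)


original = "Editor's Note"
print(escape_once(original))   # the apostrophe becomes an entity
print(escape_twice(original))  # the entity's own & becomes &amp;

# A browser decodes only one layer, so the twice-escaped text shows the
# entity code on the page instead of the apostrophe.
print(html.unescape(escape_twice(original)))
```

Decoding the twice-escaped string once gives back the entity code rather than the apostrophe, which matches the symptom in the question.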


I'm not familiar with the Movable Type feature you are using, but there are a couple of workarounds you can try.

1. (In Firefox) select what you want to copy, right click, and go "show selection source". Paste this into the HTML entry for your website rather than rich text. This will copy the content exactly as you see it on the original site.

2. Use mass find/replace to replace all instances of, e.g. &apos; with ', etc. You should be able to record a Macro in Word (or similar) to do this all automatically with the press of a button.
posted by turkeyphant at 1:37 PM on February 11, 2010


You could try this: copy all your text into Textedit, then choose Format>Make Plain Text, then copy the text into your site.
posted by schmichael at 1:40 PM on February 11, 2010


On your Mac, copy whatever text you want to paste into Movable Type... and paste it into TextEdit instead. In the Format menu, select "Make Plain Text." Then, copy THAT and paste it into Movable Type.

Here's what's going on:

The problem you're seeing is that certain characters don't exist in html and others have more important roles than simply being text. This is why & is often translated to &amp;

&amp; is code for &.

Any character that isn't a letter or a number can end up getting screwed up.

The biggest issue is those damn cutesy curly quotes word processing apps love to use. They look better than standard " type quotes, but do you see how there isn't a keyboard key for them? Or, I should say, do you see how the same keyboard key is used for open and close quotes? Since there's no html equivalent for open and close quotes, a browser just turns them into garbage. This happens with curly apostrophes too.

Plain text is the solution. This isn't just true in MovableType or Blogger or whatever. You'd have the same results if you pasted those unknown characters into an html page.

I know, you're probably thinking "but those characters aren't unknown! They look right in the text I'm pasting." Sometimes, the changes made by word processors can't even be seen unless you know what you're looking for. The worst offender is the ellipsis. Some douchebag (probably at microsoft) saw those three dots ... and thought "Wouldn't it be better if, when a user types three periods in a row, we converted those into one character with three dots instead of three characters with one dot?"

No. It wouldn't be better. It would actually be much worse because web browsers won't know what to do when they find a character with three dots. It becomes gibberish. And that gibberish is what you're seeing.

Convert to plain text.

Cheers!
posted by 2oh1 at 1:57 PM on February 11, 2010
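
If converting to plain text by hand gets tedious, the curly quotes and the single-character ellipsis mentioned above can also be straightened with a small script. This is a rough sketch in Python. The mapping table is an assumption about which characters are worth replacing, and the names SMART_TO_PLAIN and to_plain_punctuation are made up for illustration.

```python
# Map "smart" punctuation to plain keyboard equivalents.
# The choice of characters here is illustrative, not exhaustive.
SMART_TO_PLAIN = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / curly apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2026": "...",  # one-character ellipsis
}

# str.maketrans accepts single-character keys with string values of any length.
_TABLE = str.maketrans(SMART_TO_PLAIN)


def to_plain_punctuation(text: str) -> str:
    """Replace curly quotes and the ellipsis character with plain ASCII."""
    return text.translate(_TABLE)


sample = "\u201cEditor\u2019s Note\u201d\u2026"
print(to_plain_punctuation(sample))
```

Characters that are not in the table pass through unchanged, so ordinary text is left alone.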


Postroad, I used to have that problem. It happens frequently when you copy "curly quotes" from Word documents and paste them into a blog post. It happened in some other instances, too, and drove me crazy.

If AutoCopy doesn't work for you, try the Copy Plain Text Firefox add on - then you simply right click and select "copy plain text." That has solved the problem for me.

weapons-grade pandemonium's suggestion worked for me, too.
posted by madamjujujive at 2:04 PM on February 11, 2010


The hack that's most easily solved this for me is to paste into Notepad FIRST then copy from Notepad and paste into the blog editor.
posted by eleanna at 2:21 PM on February 11, 2010

