Need some help with fixing a del.icio.us bookmarklet ...
March 7, 2006 11:05 AM

Help me untangle a few weirdities in my fine, freshly hand-rolled Cuban cigar ... er ... Javascript bookmarklet.

I need some assistance doing some slight tweaks to a JavaScript bookmarklet that I hand-rolled from a bookmarklet, a Greasemonkey script, and some dumb luck.

The Javascript bookmarklet code, when given some whitespace for readability, is:

javascript:h=location.href;

t=document.title;

e = "" + (window.getSelection ? window.getSelection() : document.getSelection ? document.getSelection() : document.selection.createRange().text);

e = e.replace(/\"/g, "'");

e = e.replace(/\xa0/g, " ");

e = e.replace(/\xa9/g, "\(c\)");

e = e.replace(/\xae/g, "\(r\)");

e = e.replace(/\xb7/g, "*");

e = e.replace(/\u2018/g, "'");

e = e.replace(/\u2019/g, "'");

e = e.replace(/\u201c/g, "'");

e = e.replace(/\u201d/g, "'");

e = e.replace(/\u8220/g, "'");

e = e.replace(/\u8221/g, "'");

e = e.replace(/\u2026/g, "...");

e = e.replace(/\u2002/g, " ");

e = e.replace(/\u2003/g, " ");

e = e.replace(/\u2009/g, " ");

e = e.replace(/\u2012/g, "--");

e = e.replace(/\u2013/g, "--");

e = e.replace(/\u2014/g, "--");

e = e.replace(/\u2015/g, "--");

e = e.replace(/\u2122/g, "\(tm\)");

if (e!=null) location="http://del.icio.us/WCityMike?jump=close&url=" + encodeURIComponent(h) + "&title=" + encodeURIComponent(t) + "&extended=\"" + encodeURIComponent(e).replace(/ /g, "+") + "\"";

void 0

(Of course, when it becomes a bookmarklet, I replace spaces with percentage-sign-20 and quotes with percentage-sign-22.)

If that doesn't come across properly, forgive the self-link, but the script can also be found here.

The purpose of this bookmarklet is to take the selected text, do some swapouts of fancy stuff for plain stuff, and toss it into del.icio.us. It also is supposed to swap out double quotation marks in the selected text for single quotation marks, because it then surrounds the selected text with quotation marks for the excerpt. So, for example, if I selected the text:

"Crash" won last night's Oscar ceremony in a surprise upset.

The bookmarklet should take the text into del.icio.us as:

"'Crash' won last night's Oscar ceremony in a surprise upset."
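The intended behaviour can be sketched as a small function. This is only an illustration: the name quoteExcerpt is made up here and is not part of the bookmarklet, and it handles only straight and curly double quotes.

```javascript
// Hypothetical helper (not from the original bookmarklet): turn any double
// quotes inside the excerpt into single quotes, then wrap the whole excerpt
// in double quotes.
function quoteExcerpt(text) {
  // Matches the straight double quote plus the curly pair U+201C and U+201D.
  var inner = text.replace(/["\u201c\u201d]/g, "'");
  return '"' + inner + '"';
}

console.log(quoteExcerpt('"Crash" won last night\'s Oscar ceremony in a surprise upset.'));
```

Doing the quote swap before adding the outer quotes keeps the wrapper from being converted as well.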

I basically mashed together this bookmarklet because I use del.icio.us more as a linklog publishing system than as an online bookmark manager, and I wanted to automate the process of filling out the description field.

I'm looking for two tweaks in particular.

First, I'd like it to automatically remove any line feeds or carriage returns in the selected text. I use a variant of this bookmarklet with the Mac OS X application Cocoalicious, and line feeds or carriage returns (I'm not sure which is which) get preserved and carried into the description field. (I do not think this is Cocoalicious' fault, as the JavaScript search-and-replace stuff that goes on happens before the data's passed off to the cocoalicious: URL, which when triggered prompts Cocoalicious to open with the appropriate data.) I'd prefer no carriage returns get carried over. How do I do that?

Secondly, I've found that on some webpages - perhaps 35-45% of them -- it does not catch double quotation marks, and I'm not sure what I'm missing. An example of text it doesn't match is the phrase:

You'll never miss a song even in the bathroom with Atech's "iLounge hybrid toilet paper dispenser/iPod dock".

on this webpage. (Keep in mind the quotes in the above phrase may have been fancy-fied by MetaFilter itself; if you're helping, you may want to go to the original webpage to see the exact characters I'm speaking of.)

I used a program called UnicodeChecker, which, amongst its features, will supposedly tell you the Unicode value of characters in pasted text. I used that to look at the apostrophes on this page, and if I'm reading my code right, the bookmarklet should catch these quotation marks. It's not.

Finally, I'm obviously not a coder. If you have any tweaks that will optimize the code yet preserve its understandability for a newbie like me, or if you know of any other good resources or examples to include in the "dumbification" process (i.e., turn this code into this plain text), please feel free to mention them.

If there's a particularly recalcitrant code that is not going through Metafilter's posting box well, feel free to e-mail it to me. My address is in my profile.

Much obliged in advance!

Thank you!
posted by WCityMike to Computers & Internet (19 answers total)
 
This sounds like what the del.icio.us Firefox extension does (in terms of handling selected text etc). Have you tried that and it does not meet your needs?
posted by misterbrandt at 12:46 PM on March 7, 2006


Response by poster: Misterbrandt, I'm a Mac OS X user, and Firefox is, in some ways, not quite there yet on Mac OS X, at least IMHO. I tried it for a while. And are you really quite certain that it encompasses the description in quotation marks, as my bookmarklet does, and "dumbifies" the text inside? I don't believe it does.
posted by WCityMike at 12:53 PM on March 7, 2006


WCityMike, when I saw "GreaseMonkey" in your description, I assumed you were on Firefox. My bad.

I just selected your text above (note that the carriage returns are included in the selection):
phrase:

You'll never miss a song even in the bathroom with Atech's "iLounge hybrid toilet paper dispenser/iPod dock".

on
and the delicious extension does this to it:
phrase: You'll never miss a song even in the bathroom with Atech's "iLounge hybrid toilet paper dispenser/iPod dock". on
If the extension looks like its logic does what you want, you might crack it open (for personal use only, of course :) ).

If that extension (or Firefox in general) does not work for you, I have another thought (not directly related to solving your problem, but it will make it easier...): Rather than try to compress your code into a one-line bookmarklet, you can use the bookmarklet to load an external javascript file, and in that file you can have all your code nicely formatted and easily maintainable. I first saw this in the Slayeroffice favelets, e.g.:

javascript:s=document.body.appendChild(document.createElement('script'));s.id='sst';s.language='javascript';void(s.src='http://slayeroffice.com/tools/js_tree/js_tree.js');
posted by misterbrandt at 1:08 PM on March 7, 2006


Well as to the first part.

e = e.replace( /\r\n/," "); // Assuming a space is wanted
posted by bitdamaged at 1:30 PM on March 7, 2006


Response by poster: Bitdamaged, as for the carriage return, that didn't work as is. However, when I separated it out into:

e = e.replace(/\r/g, " ");
e = e.replace(/\n/g, " ");

it did. Thanks.
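A possible one-line version of the same fix, offered as a sketch: the pattern /\r\n/ only matches a carriage return immediately followed by a line feed, and without the g flag it replaces only the first match. A character class covers either character on its own. The function name stripLineBreaks is made up for illustration.

```javascript
// Hypothetical helper: collapse every run of carriage returns and/or
// line feeds into a single space.
function stripLineBreaks(text) {
  // [\r\n]+ matches one or more CR or LF characters in any order;
  // the g flag applies the replacement to every run, not just the first.
  return text.replace(/[\r\n]+/g, " ");
}

console.log(stripLineBreaks("first line\r\nsecond line\rthird line\nfourth line"));
```

In the bookmarklet itself this would amount to the single statement e = e.replace(/[\r\n]+/g, " ");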
posted by WCityMike at 2:04 PM on March 7, 2006


Response by poster: So ... anyone have any input as to the slightly trickier problem, how to tackle the weird non-compliant apostrophes? Am I missing some s00per-s1kr1t way of apostrophes?
posted by WCityMike at 2:05 PM on March 7, 2006


OK first of all, why do you need to do this? You just don't like smart quotes, or they actually cause a problem with Delicious in some way?

Secondly, if it's not replacing some kinds of quotes, then we probably just need to figure out which. Some sites might be using older smart quotes in the form ’ or similar.

Thirdly, the best way to simplify code like this, where a long list of identical operations has to be done repetitively, is to use an array and a function, as in

things_to_replace = new Array('foo','bar');
function replacethings() {
for(i=0;i<things_to_replace.length;i++){
//do your replacement
}
}


Although for a bookmarklet that might be more hassle than it's worth.

Fourthly, Firefox not quite there yet on Mac OS X? I think you should try it again. You're using Safari?

Fifthly, bitdamaged's script is fine except on Mac you may find you just want \r, not \r\n.
posted by AmbroseChapel at 2:05 PM on March 7, 2006


Response by poster: AmbroseChapel, thanks for the heads-up. I think the code you demonstrate there is a little above my head -- sorry. As for Firefox, believe me, I gave it a very good college try for an extended period of time. It just didn't end up working for me. I may try again at some point in the future, especially since the Intel Mac build of it rocks in how quickly it is.

As for the recalcitrant quotes, it appears that it's hand-coded into the document in the form of ampersand-pound-8220 and ampersand-pound-8221 HTML entities. I thought the "\u8220" and "\u8221" would thus be catching those, no?
posted by WCityMike at 2:12 PM on March 7, 2006


I think the code you demonstrate there is a little above my head -- sorry.

Well it's not finished code, anyway. It's just a demonstration, the point of which is that rather than have 20 or 21 or 22 actions, (replace thing 1; replace thing 2; replace thing 3) you have a list of things and one action, replace.

Think of it as a shopping list. You go to the shop with a list that said "bread, eggs, beer" or whatever. You don't write "buy bread, buy eggs, buy beer" because the "buy" is implicit.

In my code the array is the "shopping list" -- in your case, the replacement list, and the function is the "buy" part.
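A fuller sketch of the shopping-list idea might look like the following. The names replacements and dumbify are made up for illustration, and the list holds only a sample of the substitutions from the bookmarklet in the question.

```javascript
// The "shopping list": each entry pairs a pattern with its plain-text
// stand-in. Only a sample of the bookmarklet's substitutions is shown.
var replacements = [
  [/\xa0/g, " "],             // non-breaking space
  [/\xa9/g, "(c)"],           // copyright sign
  [/[\u2018\u2019]/g, "'"],   // curly single quotes
  [/["\u201c\u201d]/g, "'"],  // straight and curly double quotes
  [/\u2026/g, "..."],         // ellipsis
  [/[\u2012-\u2015]/g, "--"]  // figure dash through horizontal bar
];

// The "buy" step: one loop applies every replacement in turn.
function dumbify(text) {
  for (var i = 0; i < replacements.length; i++) {
    text = text.replace(replacements[i][0], replacements[i][1]);
  }
  return text;
}

console.log(dumbify("\u201cHi\u201d \u2014 ok\u2026"));
```

Adding a new substitution then means adding one line to the list rather than another replace statement.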
posted by AmbroseChapel at 2:32 PM on March 7, 2006


As for the recalcitrant quotes, it appears that it's hand-coded into the document in the form of ampersand-pound-8220 and ampersand-pound-8221 HTML entities. I thought the "\u8220" and "\u8221" would thus be catching those, no?

Show me the money! I mean, the page. And, no.

Again, it's a bit technical, but there are different ways of encoding pages and different ways of encoding special characters. Your script is going to have a lot of work to do to catch all possible variations.
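One detail that may be worth checking, offered as a sketch and not as a confirmed diagnosis: in a JavaScript regular expression, \u is followed by four hexadecimal digits, while the numbers in the ampersand-pound-8220 and ampersand-pound-8221 entities are decimal. Decimal 8220 is hexadecimal 201C, so \u8220 names an unrelated character. The selected text also contains the already-decoded quote character, not the entity as typed in the page source.

```javascript
// Build the characters from the decimal numbers used in the HTML entities.
var leftQuote = String.fromCharCode(8220);
var rightQuote = String.fromCharCode(8221);

console.log((8220).toString(16));       // "201c"
console.log(leftQuote === "\u201c");    // true
console.log(/\u8220/.test(leftQuote));  // false: \u8220 is a different character
console.log(/\u201d/.test(rightQuote)); // true
```

Since the bookmarklet already has lines for \u201c and \u201d, the \u8220 and \u8221 lines look redundant, and this alone probably does not explain why the quotes go uncaught in Safari.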
posted by AmbroseChapel at 2:34 PM on March 7, 2006


I am probably being less and less helpful, but I dug around in the guts of the delicious FF extension a bit, and this is what I think is doing the cleaning of the selected text:
if (!charlen)
charlen = 4096;
if (charlen < searchStr.length) {
var pattern = new RegExp("^(?:\\s*.){0," + charlen + "}");
pattern.test(searchStr);
searchStr = RegExp.lastMatch;
}

searchStr = searchStr.replace(/^\s+/, "");
searchStr = searchStr.replace(/\s+$/, "");
searchStr = searchStr.replace(/\s+/g, " ");
posted by misterbrandt at 2:52 PM on March 7, 2006


Response by poster: AmbroseChapel -- the page is referenced in the question, complete with link and the specific text. =-)

Misterbrandt, is the code you excerpt from the Firefox extension the code that handles line feeds?
posted by WCityMike at 3:00 PM on March 7, 2006


That code doesn't handle linefeeds, it handles all whitespace characters. It strips any from the start of the string and any from the end of the string, and then replaces any string of one or more in the middle of the text with one space.

Do linebreaks count as whitespace? I don't think they can be made to, and you'll have to do them separately. I'll look that up though.
posted by AmbroseChapel at 3:16 PM on March 7, 2006


OK, I'm wrong, I just tested it in Firefox and it seems that it should replace all whitespace characters, including linebreaks, with normal spaces. Who knew?
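A small sketch of that behaviour, wrapping the three replace calls from the excerpt above in a function. The name normalizeWhitespace is made up for illustration.

```javascript
// Hypothetical wrapper around the three replace calls quoted from the
// extension: trim both ends, then collapse interior whitespace.
function normalizeWhitespace(text) {
  return text
    .replace(/^\s+/, "")    // strip leading whitespace
    .replace(/\s+$/, "")    // strip trailing whitespace
    .replace(/\s+/g, " ");  // \s includes \r and \n, so line breaks collapse too
}

console.log(normalizeWhitespace("  phrase:\r\n\r\nYou'll never miss a song\n\non  "));
```

Because \s covers carriage returns and line feeds, this handles the line-break problem without a separate replacement.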

And the answer to the other question seems to be, just add yet another line each for &#8220; and &#8221; to your script. Later, rinse, repeat every time it finds a character it can't handle.

Did you tell us yet why we need to fix up these quotes?
posted by AmbroseChapel at 3:25 PM on March 7, 2006


Ahem. "Lather, rinse repeat" of course.
posted by AmbroseChapel at 3:26 PM on March 7, 2006


Response by poster: AmbroseChapel, neither e=e.replace(/&#8220;/g, "'") or e=e.replace(/\&#8220;/g, "'") works — I just tried it. Your advice, then? And, as explained, I'm already doing the Unicode catch in the code by using "\u8220" and "\u8221" earlier in the code.

As for why, yes, I did tell you — in the original question. You really didn't read that big long-ass question, did you? ;-) But I can repeat it again and explain.

I'm using del.icio.us as a linklog publishing system. As such, when I post an excerpt, I want to differentiate it as a quote from the webpage, as a way of differentiating it from my own commentary. I do so by surrounding it with double quotation marks. However, it's standard punctuation rules to then convert any double-quotation marks in the quote to single quotation marks — which is what I'm trying to do.
posted by WCityMike at 4:01 PM on March 7, 2006


neither e=e.replace(/&#8220;/g, "'") or e=e.replace(/\&#8220;/g, "'") works — I just tried it.

Just tried it and it works for me. The first one, that is.

Maybe you should be checking for typos, missing semicolons and so on? I hate to say it, but Firefox has great JavaScript debugging built in.

Perhaps you should use it, just for testing anyway, and look at what's coming up in the JavaScript console (Tools menu)?
posted by AmbroseChapel at 6:55 PM on March 7, 2006


Response by poster: Following your advice, works as desired in Firefox, but, interestingly enough, not in Safari!

Nothing useful from the Javascript console. Nothing at all, in fact.
posted by WCityMike at 7:59 PM on March 7, 2006


Response by poster: Safari's Javascript console didn't really yield anything after the Javascript was run.
posted by WCityMike at 8:03 PM on March 7, 2006

