Creating a custom search engine from a set of bookmarks?
December 12, 2006 7:17 AM
Trying to turn a set of bookmarks (del.icio.us) into a custom search engine.
I want to be able to not just search my bookmarks, but search the content of my bookmarks - and not just all of my bookmarks, but sets of my bookmarks defined by tags.
Rollyo - doesn't allow more than a few urls
I think Google Co-op might be the solution, but I can't figure out how to import the bookmarks into the engine except manually. Once they are in, the search engine is highly customizable, but it would be best if it could be done on the fly.
Is there another option? Maybe this cannot be easily done with the del.icio.us export options.
I use blinklist, so I'm not sure if this would work for del.icio.us, but it may be worth a try if it supports creating RSS feeds out of your tagged links.
I subscribe to some of my blinklist feeds as live bookmarks in firefox. I can easily search the contents (actual pages) of all of my bookmark feeds using the FF extension All-In-One Sidebar.
So what I normally do is get my feeds by tag on blinklist (unix, java, etc.), subscribe to those feeds as Live Bookmarks, and search as needed.
Convoluted? Yes .. but it works for me.
posted by duckus at 9:39 AM on December 12, 2006
On additional consideration, I'm pretty sure this could be done as a Greasemonkey script that would inject a customized Google search field into a delicious page.
posted by adamrice at 11:41 AM on December 12, 2006
DevonAgent for mac can do this with a custom plugin. I've uploaded a version of my plugin (1KB) but you will need to edit some of the fields in a text editor or Apple's Property List Editor. Just replace "username" with your del.icio.us username. If you read the documentation, it shouldn't be too hard.
Another option would be to use wget with appropriate options to avoid triggering yahoo's spam/slam protection. For example:
wget -r -H -l 2 -w 3 --tries=3 -R "*Cbra*" \
--exclude-domains "del.icio.us" \
http://del.icio.us/Cbrachyrhynchos/mac
posted by KirkJobSluder at 11:59 AM on December 12, 2006
The logic behind that wget commandline includes some things not documented in the man page:
-r -H -l 2
recursively go two links deep off the starting URL, to other sites as needed.
-w 3 --tries=3
wait 3 seconds between requests and give up after 3 tries.
-R "*Cbra*"
just a short version of my username to avoid spidering to other tags.
--exclude-domains "del.icio.us"
probably makes -R redundant. This prevents wget from spidering back to del.icio.us for the login page.
url
the url for your tags page, although you might want to use "http://del.icio.us/username/tag?setcount=100&page="{1,2} (in bash, the {1,2} expands into one URL per page, so pages 1 and 2 are both fetched).
posted by KirkJobSluder at 12:22 PM on December 12, 2006
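For anyone who wants to try the wget approach on more than two pages of a tag, here is a minimal shell sketch of the same idea. It is only an illustration: the function names tag_page_urls and mirror_tag, the DRY_RUN variable, and the placeholder username and tag are assumptions and not part of the original command. The wget flags and the setcount/page URL parameters are copied from the comment above, with -R left out because --exclude-domains probably makes it redundant. A loop stands in for the {1,2} brace expansion so the page count can vary.

```shell
#!/bin/sh
# Sketch only: function names, DRY_RUN, and the example arguments are
# hypothetical. The wget flags come from the comment above.

# Print one tag-page URL per line for pages 1..N (N defaults to 2).
tag_page_urls() {
  user=$1; tag=$2; pages=${3:-2}
  page=1
  while [ "$page" -le "$pages" ]; do
    printf 'http://del.icio.us/%s/%s?setcount=100&page=%d\n' "$user" "$tag" "$page"
    page=$((page + 1))
  done
}

# Crawl the pages linked from each tag page, waiting between requests.
# With DRY_RUN=1 the wget command is printed instead of executed.
mirror_tag() {
  tag_page_urls "$1" "$2" "${3:-2}" | while IFS= read -r url; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      printf "wget -r -H -l 2 -w 3 --tries=3 --exclude-domains del.icio.us '%s'\n" "$url"
    else
      wget -r -H -l 2 -w 3 --tries=3 --exclude-domains "del.icio.us" "$url"
    fi
  done
}
```

Running DRY_RUN=1 mirror_tag username mac prints the two wget commands without touching the network, which is a cheap way to check the URLs before crawling for real.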
I think this is sort of what spurl does. They save a copy of your bookmarks, and make the content of all your bookmarks searchable.
I don't know how powerful the search is, but you can import from del.icio.us, so it would not be too much work to try it out.
posted by davar at 1:28 PM on December 12, 2006
Thanks for the input. KJS - your approach sounds interesting, but I know very little about programming and I don't have access to a mac (at least, I think you mean that your code would need to run on a mac.)
Doing some searching on my own, I was able to find searchfeedr. It is the closest thing I have found to what I'm looking for. It searches over delicious tags, feeds, or links on a webpage in a variety of engines. It even includes the option to exclude urls.
It still would be great to find an easy way to import delicious bookmarks into Co-op, though. Under advanced settings, it says that they allow OPML files to be uploaded and the same person behind searchfeedr has a handy delcious2opml script. I've tried saving the results as .xml (not sure if that is right), but uploading it results in error messages.
Spurl looks interesting, but it only searches the pages you have saved. In Co-op, you can specify to search by url or page. Actually the only problem with searchfeedr for me is that search by url doesn't include subdirectories - so rather than searching a small subset of a website, you end up either searching or including the entire thing.
Vertical search seems to be the up and coming thing, so perhaps this won't be a problem in the near future.
posted by imposster at 8:21 PM on December 12, 2006
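On the OPML upload errors: one thing worth ruling out is a file that is not well-formed XML, since bookmark URLs often contain a bare & that has to be written as &amp; inside an attribute. Below is a minimal shell sketch that turns a plain list of URLs into an OPML 2.0 document using link-type outlines. The function names xml_escape and urls_to_opml are hypothetical, and it is an assumption that Co-op accepts this outline form; the thread does not say which attributes Co-op expects.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and whether Google Co-op
# accepts these outline attributes is an unverified assumption.

# Escape the characters that are special inside XML text and attributes.
xml_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

# Read one URL per line on stdin and write an OPML document to stdout.
# The optional first argument is used as the document title.
urls_to_opml() {
  title=$(printf '%s' "${1:-bookmarks}" | xml_escape)
  printf '<?xml version="1.0" encoding="UTF-8"?>\n'
  printf '<opml version="2.0">\n'
  printf '  <head><title>%s</title></head>\n' "$title"
  printf '  <body>\n'
  xml_escape | while IFS= read -r url; do
    [ -n "$url" ] || continue   # skip blank lines
    printf '    <outline text="%s" type="link" url="%s"/>\n' "$url" "$url"
  done
  printf '  </body>\n'
  printf '</opml>\n'
}
```

For example, urls_to_opml mac < urls.txt > mac.opml would write the file, where urls.txt is a hypothetical list with one bookmark URL per line.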
This thread is closed to new comments.
I know the guy behind http://www.givemebackmygoogle.com/ is looking for interesting ways to customize his site—maybe you can suggest this delicious mashup to him.
posted by adamrice at 8:21 AM on December 12, 2006