What language to write a multi-search utility in?
April 4, 2011 9:06 AM

I want to write a tool for multiple-searching a site that only supports single-searches. What's the best implementation language?

The site I want to search is an aggregator of online stores. Each search searches all their affiliated stores and lists which stores carry your desired product, including quantity and price, but only for 1 item. I want to be able to search for multiple items and see a result that lists which store has how many of each of the items I searched for, so that I can hopefully buy all of them from 1 store and save on shipping.

My question is: what is the best tool for doing this task (on Windows)? I have a programming background, but little exposure to web scripting. I'd like this to run locally and not require a web server. I don't need a fancy UI; a simple data grid would be fine. I don't mind learning a new language to do this.

I have access to VS.NET, in case that's one of the better options.

I know that there are probably many tools that allow one to do this sort of thing. If you can mention why you think any given one is a good selection that'd help.

If anyone is interested I can post the site. I hunted around and couldn't find an existing script for it.

I'm not planning on making the script publicly available, and my searches would be at most 20 items, so I'm not concerned that the site will find my usage objectionable, assuming that I code it well.
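
The combined result described above (which store has how many of each item, and which stores carry everything) could be sketched roughly like this. It is only an illustration: the store names, item names, quantities, and prices are made up, and the tuple layout is an assumption about what a scraper might produce.

```python
from collections import defaultdict

# Hypothetical per-item search results: one (store, item, quantity, price)
# tuple for each store that lists the item. All values are invented.
results = [
    ("Store A", "widget", 3, 2.50),
    ("Store A", "gadget", 1, 9.00),
    ("Store B", "widget", 10, 2.25),
    ("Store C", "gadget", 4, 8.75),
]

def stock_by_store(rows):
    """Group rows into {store: {item: (quantity, price)}}."""
    table = defaultdict(dict)
    for store, item, quantity, price in rows:
        table[store][item] = (quantity, price)
    return dict(table)

def stores_with_all(table, wanted):
    """Return the stores that have every wanted item in stock, sorted by name."""
    return sorted(
        store for store, items in table.items()
        if all(item in items and items[item][0] > 0 for item in wanted)
    )

table = stock_by_store(results)
complete = stores_with_all(table, ["widget", "gadget"])
print(complete)  # ['Store A']
```

The nested dictionary maps directly onto a grid with one row per store and one column per item, so it could be fed to whatever data grid the chosen language offers.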
posted by Four Flavors to Technology (7 answers total)
 
Best answer: You want to code a desktop app for Windows? C# is pretty much the way to go. Sane syntax, pretty powerful, and lots of community support in terms of docs, help, and code. You will want to lean on existing libraries as much as possible for the scraping, e.g. HTML Agility Pack.
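
To give a feel for what the scraping step involves, here is a minimal sketch. It is not HTML Agility Pack (that is a .NET library); it swaps in Python's standard-library html.parser to show the same idea of pulling table cells out of a results page. The sample markup and the column order (store, quantity, price) are assumptions, since the real site's HTML is not shown in this thread.

```python
from html.parser import HTMLParser

class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, e.g. the header row
                self.rows.append(self._row)
            self._row = None
            self._cell = None

# Made-up markup standing in for one search-results page.
SAMPLE = """
<table>
  <tr><th>Store</th><th>Qty</th><th>Price</th></tr>
  <tr><td>Store A</td><td>3</td><td>2.50</td></tr>
  <tr><td>Store B</td><td>10</td><td>2.25</td></tr>
</table>
"""

def parse_results(html):
    """Return a list of (store, quantity, price) tuples from the page text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [(store, int(qty), float(price)) for store, qty, price in parser.rows]

print(parse_results(SAMPLE))  # [('Store A', 3, 2.5), ('Store B', 10, 2.25)]
```

A dedicated scraping library copes with messy real-world markup far better than this hand-rolled parser, which is the reason for preferring one.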
posted by Foci for Analysis at 9:23 AM on April 4, 2011


Response by poster: It doesn't have to be a desktop app; it could run in a browser. That pack looks useful, better than this, which is what I found when searching so far.
posted by Four Flavors at 9:41 AM on April 4, 2011


Pretty much any language can do this; I don't think there's a right answer here. Use what you're comfortable with.
posted by empath at 11:09 AM on April 4, 2011


I wouldn't have thought it was worth learning a whole new language for a one-off small script for occasional personal use that doesn't need much of an interface. So unless you'll frequently want to make such things, or like the idea of learning something new anyway, it's probably good to stick with what you already know.

But for what it's worth I use Ruby scripts for little tasks like this. It is very easy to do things like fire off a web query and scan the response for the info you were looking for. Also the development cycle is very fast.

For something like this I probably wouldn't even create an interface, just have a list of items I want in the script, and then edit that every time I want to search the shop.
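
The "list of items in the script" approach might look something like the sketch below, written in Python rather than Ruby. The search URL, the `q` parameter, and the item names are placeholders, because the real site's URL format is not given here. The demo at the end uses a stand-in fetcher so it runs without touching the network.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Edit this list before each run. The item names are placeholders.
ITEMS = ["widget", "gadget"]

# Hypothetical search endpoint and query parameter name.
SEARCH_URL = "http://aggregator.example.com/search"

def build_url(item):
    """Build the single-item search URL for one item."""
    return SEARCH_URL + "?" + urlencode({"q": item})

def fetch(url):
    """Download one page and return it as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")

def search_all(items, fetch_page=fetch, delay=1.0):
    """Run one search per item and return {item: page_text}.

    Waits `delay` seconds between requests to go easy on the site.
    """
    pages = {}
    for index, item in enumerate(items):
        if index and delay:
            time.sleep(delay)
        pages[item] = fetch_page(build_url(item))
    return pages

# Demo with a fake fetcher; drop the fetch_page argument to hit a real site.
pages = search_all(ITEMS, fetch_page=lambda url: "<html>stub for " + url + "</html>", delay=0)
for item, page in pages.items():
    print(item, len(page))
```

Passing the fetcher in as a parameter keeps the loop testable offline; scanning each returned page for the store, quantity, and price is a separate step.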
posted by philipy at 11:14 AM on April 4, 2011


Best answer: You could implement this in a lot of ways, but it sounds like a one-off that's not worth a ton of effort. I'm assuming you have verified that there is no JSON/SOAP/etc API for the site in question?

For a one-off intended for human viewing, I'd probably program it as a Greasemonkey script. If browser viewing doesn't work, use a C#/Perl/Python/Ruby program if you want to run it at scheduled intervals and slurp the data into something else. As empath said, anything that has a decent web scraping library will work.
posted by benzenedream at 11:25 AM on April 4, 2011


Response by poster: How would I go about finding out if the site publishes an API? If I could query their data directly it would certainly save a layer.

I do know of a search service that allows more complex single-item queries. It displays the results from the aggregator site in a table, which it gets by running JavaScript from the aggregator:

[script type="text/javascript" src="http://partner.example.com/syn/Synidcate.ashx?pk=type&pi=details"></script]
posted by Four Flavors at 11:48 AM on April 4, 2011


How would I go about finding out if the site publishes an API? If I could query their data directly it would certainly save a layer.

There is no generic way of determining that for any given site. Best case is checking with each site to find out if they support it.
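
For one specific URL (such as the syndication script quoted above), a rough first check is whether the response is structured data or something meant only for a browser. The sketch below is a heuristic, not a way to detect an API: it assumes you have already saved a response body as text, and the sample strings are invented, since the thread does not say what that script actually returns.

```python
import json
import xml.etree.ElementTree as ET

def guess_format(body):
    """Return 'json', 'xml', or 'other' for a response body string."""
    text = body.strip()
    try:
        json.loads(text)
        return "json"
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    try:
        ET.fromstring(text)
        return "xml"
    except ET.ParseError:
        pass
    return "other"

# Invented sample bodies, one per outcome.
samples = [
    '{"store": "Store A", "qty": 3}',
    '<results><store name="Store A"/></results>',
    'document.write("<table>...</table>");',
]
print([guess_format(s) for s in samples])  # ['json', 'xml', 'other']
```

A result of "other" for a script that writes HTML into the page would suggest you are still scraping, just from a different source. Note that well-formed XHTML would also come back as "xml".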
posted by mmascolino at 12:29 PM on April 4, 2011

