Alert me, please!
January 13, 2014 8:39 AM

I (well, actually Mrs. jferg) have a website that I need to log into on a regular basis (form-based authentication), check a given page for updates, and possibly perform an action if that update meets certain criteria. The website does not offer any push-type notification (no e-mail, text messaging, or anything else), and I am just a consumer of the site. I would like the site to be checked automatically (every 10 minutes or so), with an alert sent to a mobile device (ideally iPhone, but Android is a possibility). For bonus points, I could reply to the alert to make the monitor take the appropriate action, but that's a pony feature. What tools are out there to do this?
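For anyone who ends up rolling their own, the "check every 10 minutes, alert only when the update matters" part can be sketched roughly like this. It is only a sketch under assumptions: `fetch`, `meets_criteria`, and `notify` are hypothetical callables you would supply yourself (nothing here knows about the actual site), and change detection is just a hash of the page text.

```python
import hashlib
import time
from typing import Callable, Optional


def fingerprint(page_text: str) -> str:
    """Return a stable hash of the page so changes can be detected cheaply."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def check_once(
    fetch: Callable[[], str],                # hypothetical: returns the page text
    meets_criteria: Callable[[str], bool],   # hypothetical: your own rule
    notify: Callable[[str], None],           # hypothetical: sends the alert
    last_seen: Optional[str],
) -> str:
    """Fetch once; notify only if the page changed AND matches the criteria.

    Returns the new fingerprint so the caller can pass it to the next check.
    Note that the very first check (last_seen is None) counts as a change.
    """
    text = fetch()
    current = fingerprint(text)
    if current != last_seen and meets_criteria(text):
        notify(text)
    return current


def poll_forever(fetch, meets_criteria, notify, interval_seconds: int = 600) -> None:
    """Run check_once repeatedly; 600 seconds is the 10 minutes from the question."""
    last_seen = None
    while True:
        last_seen = check_once(fetch, meets_criteria, notify, last_seen)
        time.sleep(interval_seconds)
```

On a Windows desktop, an alternative to `poll_forever` would be to have Task Scheduler run a script that calls `check_once` and stores the last fingerprint in a file between runs.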

I could code something up in Java or perl (or shell+curl/wget) without a ton of effort (I may end up using this as an excuse to pick up Android development again), and use something like Prowl/Growl (or just text messages) to get the alerts to the iPhone, but I'm really hoping there is something like IFTTT but more complex (and able to monitor arbitrary websites).
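As a rough illustration of the "just text messages" route, one low-tech option is to send a short e-mail, either to a normal inbox that pushes to the phone or to a carrier's e-mail-to-SMS address if the carrier offers one. The sketch below assumes you have an SMTP account that accepts STARTTLS; the host, port, addresses, and the 140-character cutoff are all placeholders, not anything specific to a real provider.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender: str, recipient: str, summary: str) -> EmailMessage:
    """Build a short plain-text alert suitable for a phone notification."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Site update"
    # Arbitrary cutoff (an assumption) to keep the text short enough for SMS.
    msg.set_content(summary[:140])
    return msg


def send_alert(msg: EmailMessage, host: str, port: int,
               user: str, password: str) -> None:
    """Send the alert through an SMTP server that supports STARTTLS.

    host/port/user/password are placeholders for your own mail account.
    """
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

Keeping message construction separate from sending means the same `build_alert` output could be handed to some other transport later without touching the monitoring code.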

The main piece is that the site requires a login, and so whatever site scraper I use has to be smart enough to establish a session, hold on to a session cookie, and pass that session cookie on subsequent requests. (Security of the username/password is not a big concern, so if there's a cloud-based solution that can help me, that's OK.)
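To make the session requirement concrete, here is a minimal sketch of a form login that keeps the session cookie for later requests, using only the Python standard library. The form field names (`username`, `password`) and the URLs you would pass in are assumptions; the real ones would have to be read off the site's login form, and a site that uses hidden CSRF tokens or JavaScript-driven login would need more than this.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session() -> urllib.request.OpenerDirector:
    """Create an opener whose cookie jar persists across requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def encode_login_form(username: str, password: str,
                      user_field: str = "username",
                      pass_field: str = "password") -> bytes:
    """URL-encode the login form body. The field names are assumptions."""
    form = {user_field: username, pass_field: password}
    return urllib.parse.urlencode(form).encode("ascii")


def log_in(opener: urllib.request.OpenerDirector, login_url: str,
           username: str, password: str) -> None:
    """POST the credentials; any Set-Cookie in the response lands in the jar."""
    body = encode_login_form(username, password)
    with opener.open(login_url, data=body, timeout=30) as resp:
        resp.read()


def fetch_page(opener: urllib.request.OpenerDirector, page_url: str) -> str:
    """GET a page with the same opener, so the session cookie is sent along."""
    with opener.open(page_url, timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

The key point is that one opener is reused for both the login and the later page requests; creating a fresh opener per request would throw the session cookie away.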

I have a hosting account at my disposal, but no virtual server that is conveniently usable, so something that can run on one of my (Windows) desktops at home is probably good enough/ideal for this purpose.
posted by jferg to Computers & Internet (6 answers total) 1 user marked this as a favorite
I think you'd want an "rss scraper" to create your own RSS feed. From there, there must be some good notification options available for RSS in general. A scraper could also pull the content itself into the feed, which may mean you don't have to log in manually to read it.

Searching for "rss scraper" suggested related searches along the lines of "rss scraper [language]", so that's where I'd start in your shoes.
posted by Sunburnt at 10:06 AM on January 13, 2014

If this then that?
posted by Potomac Avenue at 11:21 AM on January 13, 2014

Sunburnt: Trying to scrape into an RSS feed might be a good end result to facilitate alerting, but the site is not natively RSS-enabled, so I still need something that can deal with logging into the site and scraping the information.

Potomac Avenue: Unfortunately, IFTTT doesn't have the capability of doing complex interactions with arbitrary websites. Otherwise I'd be all over it. I need IFTTT++. :-)
posted by jferg at 11:59 AM on January 13, 2014

I think that you are going to have to create the RSS scraper, or hire one that you can program to scrape the site in question, within the constraints of security of the site. I'm not sure how easy that is with a login requirement, but I can't imagine you're the first person with this problem.

Maybe I'm using the term scraper wrong: you want to generate the RSS feed yourself, and then you can use existing RSS tools to handle notification.
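Generating the feed yourself could look something like the following sketch, which turns already-scraped updates into a minimal RSS 2.0 document with the standard library. The shape of the `items` dictionaries (`title`, `link`, `guid`, `published`) is an assumption made for illustration; the scraping step that produces them is not shown.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title: str, link: str, description: str, items: list) -> str:
    """Return a minimal RSS 2.0 feed as a string.

    Each item is a dict with "title", "link", "guid", and "published"
    (a timezone-aware datetime). That layout is a hypothetical convention.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # The guid is an opaque ID here, not a URL, so mark it accordingly.
        ET.SubElement(node, "guid", isPermaLink="false").text = item["guid"]
        # RSS expects RFC 822 style dates; format_datetime produces them.
        ET.SubElement(node, "pubDate").text = format_datetime(item["published"])
    return ET.tostring(rss, encoding="unicode")
```

Writing the returned string to a file on the hosting account would give RSS readers and notification tools something to subscribe to.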
posted by Sunburnt at 12:43 PM on January 13, 2014

If you're semi-comfortable with setting up a system to do this, look into test automation tools, like Selenium. Some of them are made for non-programmers, so you can easily create a script that logs in to a website, checks the page for a given section, and alerts you in some way.
posted by razdrez at 5:35 PM on January 13, 2014

I could code something up in Java or perl

It's been a looooong time since I used it, but perl's WWW::Mechanize is designed for exactly what you want to do, can handle session headers, cookies, etc., and is fairly straightforward to use.
posted by ook at 9:50 PM on January 13, 2014 [1 favorite]
