Alert me, please!
January 13, 2014 8:39 AM
I (well, actually Mrs. jferg) have a website that I need to log into on a regular basis (form-based authentication), check a given page for updates, and possibly perform an action if that update meets certain criteria. The website does not offer push-type notifications (no e-mail, text messaging, or anything else), and I am just a consumer of the site. I would like the checking to happen in an automated manner (every 10 minutes or so) and to send an alert to a mobile device (ideally iPhone, but Android is a possibility). For bonus points, I could respond to the alert to trigger the monitor to take the appropriate action, but that's a pony feature. What tools are out there to do this?
I could code something up in Java or perl (or shell+curl/wget) without a ton of effort (I may end up using this as an excuse to pick up Android development again), and use something like prowl/growl (or just text messages) to get the alerts to the iPhone, but I'm really hoping there is something like IFTTT but more complex (and able to monitor arbitrary websites).
The main piece is that the site requires a login, and so whatever site scraper I use has to be smart enough to establish a session, hold on to a session cookie, and pass that session cookie on subsequent requests. (Security of the username/password is not a big concern, so if there's a cloud-based solution that can help me, that's OK.)
I have a hosting account at my disposal, but no virtual server that is conveniently usable, so something that can run on one of my (Windows) desktops at home is probably good enough/ideal for this purpose.
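The login-then-poll flow described in the question can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not a tested solution for any particular site: the URLs, the form field names (`username`, `password`), the keyword check, and the `notify` callback are all hypothetical placeholders, and a real site may also need hidden form fields (CSRF tokens, for example) or re-login when the session expires.

```python
import time
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical placeholders: replace with the real site's URLs.
LOGIN_URL = "https://example.com/login"
PAGE_URL = "https://example.com/updates"


def make_session():
    """Return (opener, jar). The opener stores cookies it receives in the jar
    and sends them back on later requests made through the same opener."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, username, password):
    """POST the login form. The field names here are assumptions; inspect the
    site's login form to find the real ones."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    with opener.open(LOGIN_URL, data=form.encode("utf-8"), timeout=30) as resp:
        return resp.status


def fetch_page(opener):
    """Fetch the monitored page using the session cookie held by the opener."""
    with opener.open(PAGE_URL, timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def meets_criteria(html, keyword):
    """Stand-in for the real criteria: a case-insensitive keyword match."""
    return keyword.lower() in html.lower()


def poll(username, password, keyword, notify, interval_seconds=600):
    """Log in once, then check the page every interval_seconds (10 minutes by
    default) and call notify(html) when the page changed and matches."""
    opener, _ = make_session()
    login(opener, username, password)
    last_seen = None
    while True:
        html = fetch_page(opener)
        if html != last_seen and meets_criteria(html, keyword):
            notify(html)
        last_seen = html
        time.sleep(interval_seconds)
```

Calling something like `poll("user", "secret", "new item", print)` would run the loop on a desktop machine; swapping `print` for a function that sends a text message or push notification is left open, since that depends on which alerting service is used.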
Response by poster: Sunburnt: Trying to scrape into an RSS feed might be a good end result to facilitate alerting, but the site is not natively RSS-enabled, so I still need something that can deal with logging into the site and scraping the information.
Potomac Avenue: Unfortunately, IFTTT doesn't have the capability of doing complex interactions with arbitrary websites. Otherwise I'd be all over it. I need IFTTT++. :-)
posted by jferg at 11:59 AM on January 13, 2014
I think that you are going to have to create the RSS scraper, or hire one that you can program to scrape the site in question, within the constraints of security of the site. I'm not sure how easy that is with a login requirement, but I can't imagine you're the first person with this problem.
Maybe I'm using the term scraper wrong: you want to generate the RSS feed yourself, and then you can use existing RSS tools to handle notification.
posted by Sunburnt at 12:43 PM on January 13, 2014
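Generating the feed yourself, as suggested in the comment above, could look roughly like the following Python sketch. It assumes the scraping step has already produced a list of `(title, link, description)` tuples; the function name `build_rss` and the channel description text are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(feed_title, feed_link, items):
    """Build a minimal RSS 2.0 document as a string.

    items is a list of (title, link, description) tuples, assumed to come
    from whatever code scraped the logged-in page.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Scraped updates"
    # RSS uses RFC 822 style dates, which format_datetime produces.
    ET.SubElement(channel, "lastBuildDate").text = format_datetime(
        datetime.now(timezone.utc)
    )
    for title, link, description in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = description
    # encoding="unicode" returns a str; special characters are escaped.
    return ET.tostring(rss, encoding="unicode")
```

Writing the returned string to a file that a feed reader can reach (the hosting account mentioned in the question, for instance) would let existing RSS notification tools handle the alerting.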
If you're semi-comfortable with setting up a system to do this, look into test automation tools, like Selenium. Some of them are made for non-programmers, so you can easily create a script that logs in to a website, checks the page for a given section, and alerts you in some way.
posted by jsmith77 at 5:35 PM on January 13, 2014
"I could code something up in Java or perl"
It's been a looooong time since I used it, but perl's WWW::Mechanize is designed for exactly what you want to do, can handle session headers, cookies, etc., and is fairly straightforward to use.
posted by ook at 9:50 PM on January 13, 2014 [1 favorite]
This thread is closed to new comments.
Searching for "rss scraper" displayed other searches which amounted to "rss scraper [language]" so that's where I'd go in your shoes.
posted by Sunburnt at 10:06 AM on January 13, 2014