Daily digest for passive research on multiple topics...
October 8, 2013 12:58 PM

I want to track a variety of topics passively using services on the interwebs. Google Alerts, RSS Feeds, and IFTTT are not solving the problem (or rather, I am not sufficiently solving the problem with them). Help me use available services to get to this promised land.

I am trying to put together a (probably Rube Goldberg-ian) system that spots mentions of certain terms in certain publications, bundles up the matching items (blog posts, for example), and sends them to me daily.

For example, let's say I was interested in what was happening with gold markets (this is not the case, but it is parallel). I would like to receive an email at the end of the day containing any posts (links to posts would be fine) among say 4 or so specified commodity investing blogs that mentioned "gold" that day.

Bonus difficulty: I want to be able to add terms to the list with relative ease. So maybe after a week of getting this daily digest, I also want to include "silver" and "frozen orange juice concentrate" but without a huge rigmarole of re-setup. By the way, I don't care if I get a post twice because it mentioned both gold and silver.

In Unix, this would be a cron job where you would sort of


-Google Alerts does a great job finding the term in the entire internet and emailing it
-IFTTT does a great job finding a specific term in a single feed and doing -something- but it cannot take in programmatic inputs for this search (or trigger off of a spreadsheet)
-RSS is good at defining the feeds to search (though I don't know how to combine multiple channels into a single RSS feed; that would be helpful). Maybe other RSS tools can do searching/parsing?
-I do not know of a batching/caching service that will collect data over the course of a day and deliver it in one "box"

In an ideal world I would have (I think):
-A master RSS that was the cumulative feed of my 4 or so sources
-An IFTTT recipe that triggered when the feed matched entries in
---A Google spreadsheet that contained my terms
-That recorded the content of the posts -somewhere-
-That would be emailed to me at 5pm every day
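
The last step of that list (one email instead of a barrage) might look roughly like the following. This is only a sketch under assumptions: the function name `build_digest`, the `(term, title, link)` tuple layout, and the example addresses are made up for illustration, and actually sending the message (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_digest(matches, day, sender, recipient):
    """Bundle one day's matches into a single email message.

    matches: list of (term, title, link) tuples collected during the day
             (a hypothetical layout, not something any service emits).
    """
    msg = EmailMessage()
    msg["Subject"] = f"Daily digest {day}, matches: {len(matches)}"
    msg["From"] = sender
    msg["To"] = recipient
    if matches:
        # One entry per match; a post matching two terms appears twice.
        lines = [f"[{term}] {title}\n    {link}" for term, title, link in matches]
    else:
        lines = ["No matching posts today."]
    msg.set_content("\n".join(lines))
    return msg
```

On a Unix machine, a cron entry with the schedule `0 17 * * *` would run a script built around this at 5pm every day.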

The scenario I have now is a team of IFTTT recipes that trigger by individual feed and by individual term that email me when they occur (so setup for new terms is a pain and I get a barrage of emails instead of one super email).

The three hurdles are (I think):
-Having a super feed of the 4 or so feeds (maybe this is easy... it should be but maybe I am dumb/ignorant)
-Having the topic list exist on its own such that the process refers to it independently (as opposed to being baked into a recipe or Google Alert, for example)
-Pooling up the matches to send a single digest email daily
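
For what it is worth, the first two hurdles can be sketched in a few lines of Python using only the standard library. This assumes the sources publish plain RSS 2.0 (with `item`, `title`, `link`, and `description` elements) and that the feed XML has already been downloaded; the function names are invented for this example.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_xml):
    """Return the items of one RSS 2.0 document as a list of dicts."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def merge_feeds(feed_documents):
    """Hurdle 1: concatenate the items of several feeds into one 'super feed'."""
    merged = []
    for doc in feed_documents:
        merged.extend(parse_rss_items(doc))
    return merged


def find_matches(items, terms):
    """Hurdle 2: the term list is plain data kept outside the matching logic.

    Returns (term, item) pairs. A post that mentions two terms is listed
    twice, once per term.
    """
    matches = []
    for term in terms:
        needle = term.lower()
        for item in items:
            haystack = (item["title"] + " " + item["description"]).lower()
            if needle in haystack:
                matches.append((term, item))
    return matches
```

Adding "silver" then means adding one entry to the `terms` list, wherever it is stored. Note that this is a case-insensitive substring match, so "gold" would also match "Goldman".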

Any solutions to any of these problems, new services to investigate, different ways to think about the problem, or things I may have overlooked would be very much appreciated!
posted by milqman to Computers & Internet (6 answers total)
Couple of questions.

Is there a reason for the email part of this? Not that I disagree with using it, just wondering if you had a particular need there.

Instead of the single daily email approach (which seems to up the complexity factor), what if you created a mail rule (or rules) that would automatically shunt anything from specific senders into a folder, which you would then peruse at your convenience, deleting or archiving as desired?
posted by Celsius1414 at 1:31 PM on October 8, 2013

If I were figuring out how to do this in Python, I'd have a cron/task scheduler job run a daily script that reads your current search terms from your spreadsheet, composes the search query from those terms, writes the relevant results (with some checks in there for dupes & a diff from previous days*) into a different document (named e.g. <datestring>), and does whatever with that. I agree with the above comment that emailing adds another degree of complexity, but automating an "attach to email and send" step really should be doable programmatically.

* this is where I think it would ultimately get gummed up - Google results are forever. Do you just want novel stuff? You can either compare it to everything you've ever collected or figure out how to use a date delimiter.
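
Both options from that footnote (a date delimiter, and a comparison against everything already collected) can be combined in one small filter. This is a rough sketch, not tested against real feeds: it assumes each item carries an RSS-style `pubDate` string and a `link`, and the name `filter_new` and the in-memory `seen_links` set are placeholders (in practice the set would need to be saved between runs).

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def filter_new(items, seen_links, now, window_hours=24):
    """Keep items published within the window whose link was not sent before.

    items: dicts with 'link' and 'pubDate' (RFC 822 date string, as in RSS 2.0).
    seen_links: set of links from earlier digests; updated in place.
    now: timezone-aware datetime marking the end of the window.
    """
    cutoff = now - timedelta(hours=window_hours)
    fresh = []
    for item in items:
        published = parsedate_to_datetime(item["pubDate"])
        if published.tzinfo is None:
            # Assumption: dates without a usable offset are treated as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published < cutoff or item["link"] in seen_links:
            continue
        seen_links.add(item["link"])
        fresh.append(item)
    return fresh
```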

I have not tried to grind out a working .py script to do this, because while I'm a strong researcher and a good analyst, I'm a crap programmer. Have you tried TrapIt?
posted by Emperor SnooKloze at 1:46 PM on October 8, 2013

Sorry to threadsit!

Replacing the batched email with similar functionality, like email filtering, is a great idea (and exactly the kind of thing I am looking for).

I think this effectively solves the third of my three hurdles.
posted by milqman at 2:20 PM on October 8, 2013

I had another thought that I'll share anyway. IFTTT can post to Google Docs. So you could create a spreadsheet that collects posts from RSS feeds. It could capture the post title, date, & a link. Not exactly a batch notification, but if you check the spreadsheet daily you get all of that day's posts.
posted by Tehhund at 3:05 PM on October 8, 2013

I am probably the only one that cares but for the sake of posterity... after some heartache, I rediscovered Yahoo Pipes. Yahoo Pipes and IFTTT together provide a workable solution.
posted by milqman at 3:17 PM on October 10, 2013

I care, milqman... it would be good to hear how you've made Yahoo Pipes and IFTTT work together, as I'd like to do something similar. Many thanks.
posted by Speculatist at 11:56 AM on October 12, 2013
