Roll-your-own podcast downloading
January 4, 2017 1:57 PM

I'm looking for a piece of software that will monitor a webpage, scan it for URLs of a particular form, and whenever a new such URL is posted, download the MP3 file that this URL points to. In an ideal world, this MP3 file would be added to my iTunes library as well, but it's not strictly necessary. Does such a tool exist for Mac OS?

Background: recently, NPR stopped supporting several of its "legacy" podcast feeds, including the Sunday Puzzle segment with Will Shortz. (I confirmed this via e-mail with NPR tech support.) The MP3 files of the segments are still posted to the NPR website on a weekly basis, and could be downloaded manually if need be; but I would really prefer to have some way to make them show up in iTunes the way they used to, or at least end up on my hard drive somehow.

I have no idea whether this kind of software exists, and if so what it would be called. I'm not averse to learning the basics of some scripting language, poking around in HTML source code, or opening up the Unix terminal if need be; I just don't know where to start, and this seems like a task that someone would have already developed tools for.
posted by Johnny Assay to Computers & Internet (4 answers total) 5 users marked this as a favorite
Huffduffer is a web service that adds MP3 files you find on the internet to an RSS feed you can then follow in iTunes or your podcast app of choice—it's like building your own podcast feed. You just click the bookmarklet, make sure it found the right MP3 file on the page, and go from there.

(So far as I can tell you'd still have to go to the page where they have the MP3 file every week, but you wouldn't have to manually download it or manually make sure it's available to you from iTunes.)
posted by Polycarp at 2:04 PM on January 4, 2017 [4 favorites]

You might be able to use Feedburner for this too. It's an old-as-hell tool (now owned by Google) that turns most blogs or other regularly-updated content into RSS feeds, which you can then plug into your feed-reader, pod-catcher, etc. It's handy for that small handful of podcasts that don't have directly accessible RSS feeds, but may not be a permanent solution, given Google's propensity for shuttering things when they remember they exist.
posted by Strange Interlude at 2:06 PM on January 4, 2017

I agree with you; there has to be software out there which does this... but I don't know it! Since you said you're OK with DIY solutions:

If the URL to the latest mp3 is always the same, or if it just has an incremented number at the end or is based on the current date or something else simple and regular, you could write a quick shell script that uses something like curl (which is pre-installed on your Mac) to grab the mp3 and then runs a command to add it to iTunes. You can then schedule the script to be run on a weekly basis using launchd or cron (both of which are also pre-installed on your Mac).
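A minimal sketch of that simple case, assuming the file name is based on the date. The host (example.com), the URL pattern, the function names, and the script path in the cron comment are all made up for illustration; the real NPR URLs may look nothing like this.

```shell
#!/bin/bash
# Hypothetical URL pattern: one MP3 per episode, named by date.
# example.com is a placeholder, not a real podcast host.
BASE_URL="https://example.com/audio"

# Print the URL for a given date string (YYYY-MM-DD).
episode_url() {
  printf '%s/puzzle_%s.mp3\n' "$BASE_URL" "$1"
}

# Download one episode into the given directory. --fail makes curl
# return a non-zero status on an HTTP error instead of saving the
# server's error page as if it were the MP3.
fetch_episode() {
  local day="$1" dest_dir="$2"
  curl --silent --show-error --fail \
       -o "$dest_dir/puzzle_$day.mp3" "$(episode_url "$day")"
}

# Typical use (not run here):
#   fetch_episode "$(date +%Y-%m-%d)" "$HOME/Downloads"
#
# To schedule it with cron, a crontab line such as
#   0 12 * * 0 /path/to/fetch_puzzle.sh
# would run the script every Sunday at 12:00 (the path is a placeholder).
```

Keeping the URL construction in its own function makes it easy to check the pattern by hand before pointing curl at it.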

If the URL to the mp3 isn't so simple, but a link to the file is always found in the same place on the same webpage, you can still do this. Your script just has to first download the link-containing webpage using curl, then use something like grep to search through the HTML source for the link, then finally curl the mp3 and add it to iTunes (and ideally trash the downloaded webpage so it's not cluttering up your computer).
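The download-then-grep steps above could be sketched roughly as follows. The function names are hypothetical, and the sketch assumes the page contains absolute http(s) links ending in .mp3; a page that uses relative links or builds them with JavaScript would need a different approach.

```shell
#!/bin/bash
# Print the first absolute .mp3 URL found in an HTML file.
# The character class stops the match at quotes, spaces, and angle
# brackets, so one match cannot run on into a second link on the same line.
first_mp3_url() {
  grep -o -E "https?://[^\"' <>]+\.mp3" "$1" | head -n 1
}

# Download a page to a temporary file, pull out the first MP3 link,
# delete the temporary page, then download the MP3 itself.
fetch_latest_mp3() {
  local page_url="$1" out_file="$2" page mp3_url
  page=$(mktemp) || return 1
  if ! curl --silent --show-error --fail -o "$page" "$page_url"; then
    rm -f "$page"
    return 1
  fi
  mp3_url=$(first_mp3_url "$page")
  rm -f "$page"                      # trash the downloaded webpage
  [ -n "$mp3_url" ] || return 1      # no link found: give up
  curl --silent --show-error --fail -o "$out_file" "$mp3_url"
}

# Typical use (not run here; the page URL is a placeholder):
#   fetch_latest_mp3 "https://example.com/puzzle-page" "latest.mp3"
```

Using mktemp for the page means the script never leaves a stray HTML file behind, even if it is run from a different directory each time.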

Of course, this will probably take a little while to throw together if you don't know the tools already, but if you're comfortable in the terminal and know HTML you can definitely do it, and they're good tools to know!
posted by acroyear2 at 3:10 PM on January 4, 2017 [1 favorite]

Thanks for your help, everyone! I finally ended up using the shell script method: HuffDuffer still required me to visit the webpage each week, and FeedBurner didn't parse things correctly to get me the MP3 files. Here's the script I ended up with, which should work until NPR redesigns their webpage or iTunes redesigns their directory structure:

# Download Sunday Puzzle webpage
curl --no-progress-bar --fail -o "temp_puzzle.html"
# Search webpage for most recent URL
PUZZLE_URL=$(grep -o "https:\/\/\/.*\.mp3" temp_puzzle.html -m 1)
# Retrieve current puzzle MP3 and save in date-stamped file
DATE=$(date +%y-%m-%d)
curl "$PUZZLE_URL" --no-progress-bar --fail -o "NPR_puzzle_audio_$DATE.mp3"
# Move file to iTunes
mv "NPR_puzzle_audio_$DATE.mp3" ~/"Music/iTunes/iTunes Music/Automatically Add to iTunes/"
# Clean up
rm temp_puzzle.html

This only downloads the most recent puzzle audio file, but could probably be modified to automatically download multiple audio files. (You'd have to be careful combining that with a cron job, though.)
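One possible way to handle multiple files without a cron job re-downloading everything on each run is to keep a log of URLs already fetched. This is only a sketch: the function names, the log file, and the example.com URLs are hypothetical, and it assumes each MP3 has a distinct file name at the end of its URL.

```shell
#!/bin/bash
# Print every distinct .mp3 URL in an HTML file, in order of first appearance.
all_mp3_urls() {
  grep -o -E "https?://[^\"' <>]+\.mp3" "$1" | awk '!seen[$0]++'
}

# Read URLs on stdin; print only those not already listed in the log file.
unseen_urls() {
  local log="$1" url
  touch "$log"    # make sure the log exists on the first run
  while IFS= read -r url; do
    grep -q -x -F -- "$url" "$log" || printf '%s\n' "$url"
  done
}

# Download a single URL to a file.
fetch() {
  curl --silent --show-error --fail -o "$2" "$1"
}

# Download every MP3 linked from a saved page that is not yet in the log,
# recording each URL only after its download succeeds.
download_new() {
  local page="$1" log="$2" dest_dir="$3" url
  all_mp3_urls "$page" | unseen_urls "$log" | while IFS= read -r url; do
    if fetch "$url" "$dest_dir/$(basename "$url")"; then
      printf '%s\n' "$url" >> "$log"
    fi
  done
}

# Typical use (not run here):
#   download_new temp_puzzle.html "$HOME/.puzzle_seen.log" "$HOME/Downloads"
```

Because a URL is logged only after a successful download, a failed fetch is simply retried the next time the job runs.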
posted by Johnny Assay at 6:43 AM on January 16, 2017 [1 favorite]
