Help me feed my addiction I keep forgetting about!
January 14, 2011 12:46 PM

How do I automatically download a file from a website (specifically, the NYTimes crossword puzzle) every day?

So I have a subscription to the NYT crossword, which is something that I really enjoy having -- when I remember to go and download the puzzle every day. Is there any way that I can automate this process? I'm on OS X (10.6.6), and generally use Firefox, if it matters. I also have Chrome and Safari.

I found nonfunctional download links for an old Windows program called Puzzle Print 1.2, so I know that this is theoretically possible -- but how?
posted by naturalog to Computers & Internet (6 answers total) 4 users marked this as a favorite
 
curl and a cron job I guess.
posted by GuyZero at 12:55 PM on January 14, 2011


I use cron + curl to do similar things. Both are free and ship with OS X by default. There are some challenges, though:

1. You'll need to not screw up cron, like I do on occasion. There are cron generators, if you want to be sure.
2. I think you'll need to provide authentication, if you're a paying member. The options are there within curl (documentation), but knowing what to provide may be difficult.

Alternatively, you could cook up a Selenium script to run daily via cron. Use Selenium IDE to record the login and the save/print process, and cron to launch the recording daily.
posted by pwnguin at 1:00 PM on January 14, 2011


To expand on GuyZero's answer:

If the crossword has the same URL every day (say, http://nyt.com/xword.jpeg), the cURL command to download it would be

$ curl -O http://nyt.com/xword.jpeg

To run it daily, you'll want to use a command line utility called "cron" that's already on your computer. I haven't used it before, but here's a website that should get you going.
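
For anyone who would rather script the download than call curl directly, here is a minimal Python 3 sketch of the same idea. It reuses the placeholder URL from this comment, which is not the real puzzle location, and the names `local_filename` and `download` are made up for this example.

```python
import os
import urllib.parse
import urllib.request

# Placeholder URL from the comment above, not the real puzzle location.
PUZZLE_URL = "http://nyt.com/xword.jpeg"


def local_filename(url):
    """Return the last path segment of the URL, the name `curl -O` would save under."""
    path = urllib.parse.urlparse(url).path
    return os.path.basename(path)


def download(url, directory="."):
    """Fetch url and write the response body into directory under its remote name."""
    target = os.path.join(directory, local_filename(url))
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

Saved as a script with a final line calling `download(PUZZLE_URL)`, this could be launched by a crontab entry along the lines of `0 7 * * * /usr/bin/python3 /path/to/fetch_puzzle.py`, which would run it at 7:00 every morning. Both paths in that entry are hypothetical.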
posted by auto-correct at 1:03 PM on January 14, 2011


The problem you'll run into is that the link is not always the same. It's of the form http://select.nytimes.com/premium/xword/Jan1411.puz (month abbreviation, day, two-digit year), so you'll need to assemble the link name dynamically. Alternatively, if you're comfortable with a bit of programming, you could use Python's urllib (or urllib2) and Beautiful Soup to grab that link (which always has the text "Today's Puzzle" and sits inside a div element with the id "todayPuzzle"). Python's not your only option, and Beautiful Soup has been ported to other languages.
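
Both approaches can be sketched in a few lines of Python 3. This is only a rough illustration under some assumptions: the zero-padded day is a guess from the single Jan1411 example, the standard library's `html.parser` stands in for Beautiful Soup so that nothing needs installing, and `puzzle_url`, `TodayPuzzleLinkFinder` and `find_today_link` are names invented here.

```python
import datetime
from html.parser import HTMLParser

BASE_URL = "http://select.nytimes.com/premium/xword/"
# Fixed English abbreviations so the result does not depend on the system locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def puzzle_url(day):
    """Build the dated link, e.g. Jan1411.puz for January 14, 2011.

    Assumption: day and year are both zero-padded to two digits.
    """
    return "{}{}{:02d}{:02d}.puz".format(
        BASE_URL, MONTHS[day.month - 1], day.day, day.year % 100)


class TodayPuzzleLinkFinder(HTMLParser):
    """Find the first "Today's Puzzle" link inside <div id="todayPuzzle">."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0        # > 0 while inside the todayPuzzle div
        self.current_href = None  # href of the <a> currently being read
        self.current_text = []    # text pieces collected inside that <a>
        self.found = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            # Count nested divs so we know when the target div really ends.
            if self.div_depth > 0 or attrs.get("id") == "todayPuzzle":
                self.div_depth += 1
        elif tag == "a" and self.div_depth > 0:
            self.current_href = attrs.get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag == "a" and self.current_href is not None:
            text = "".join(self.current_text)
            if self.found is None and "Today's Puzzle" in text:
                self.found = self.current_href
            self.current_href = None


def find_today_link(html):
    """Return the href of the puzzle link, or None if the page has none."""
    finder = TodayPuzzleLinkFinder()
    finder.feed(html)
    return finder.found
```

`puzzle_url(datetime.date.today())` gives the link to try for the current day, while `find_today_link(page_html)` takes the already-downloaded page source and returns whatever href the page itself advertises, which is the safer route if the naming scheme ever changes.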

I feel like someone must have built a tool to do this kind of thing. And then it occurred to me that you'll have to go one step further, because the crossword requires a login to get to it (or at least it used to), so you'll need something like cURL or urllib to pass credentials while requesting the link.
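
As a rough sketch of what passing credentials could look like with Python 3's `urllib.request` (the successor to urllib2): log in once with a cookie-aware opener, then request the puzzle with the same opener so the session cookie goes along. The login URL and the form field names below are hypothetical placeholders, since I don't know what the site's login form actually expects; check the real form before relying on any of this.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical placeholder, not the site's real login endpoint.
LOGIN_URL = "https://example.com/login"


def build_cookie_opener():
    """Return an opener that stores cookies from responses and resends them."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_login_request(username, password):
    """Build a form POST; the field names "userid" and "password" are assumptions."""
    form = urllib.parse.urlencode({"userid": username, "password": password})
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(LOGIN_URL, data=form.encode("ascii"))


def fetch_with_login(username, password, puzzle_url):
    """Log in, then download the puzzle using the same cookie-carrying opener."""
    opener = build_cookie_opener()
    opener.open(build_login_request(username, password)).close()
    with opener.open(puzzle_url) as response:
        return response.read()
```

The bytes returned by `fetch_with_login` can be written straight to a .puz file. Keeping the password in a separate file readable only by you, rather than in the script itself, is probably wise if cron is going to run this unattended.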
posted by yerfatma at 1:08 PM on January 14, 2011


Best answer: Here are instructions on how to set this up using Automator on a Mac.
posted by misterbrandt at 1:17 PM on January 14, 2011 [3 favorites]


Response by poster: Aah, thank you, misterbrandt! Not sure how I didn't find this while googling...
posted by naturalog at 1:48 PM on January 14, 2011

