Formatted movie data availability?
January 10, 2007 3:39 PM

I'm working on a site that requires a movie database. Help me find formatted data!

I've spent quite a while researching different ways of obtaining a complete movie database. IMDB and the All Media Guide offer paid services (approximately $3K/month) and would suit my needs perfectly. However, I cannot afford this, so my current plan is to create a fully functional demo site that I can show to investors to try to raise the money needed to purchase a subscription to the previously mentioned data.

The question: Assuming there is no free database of movies available, where can I find a formatted (xml/csv/etc) large set (or subset, doesn't need to be complete) of movie/actor/director data? This data will be used only to show my site to investors and will not be made public.

(Note: IMDB has data available for "personal use". This data is in the form of text files and doesn't seem to be in an easily parseable format.)
posted by null terminated to Computers & Internet (11 answers total) 1 user marked this as a favorite
 
Have you considered using Amazon's API to start with? Their DB may not be complete, but it is huge.
posted by IronLizard at 4:15 PM on January 10, 2007


Response by poster: IronLizard: That was actually my first approach. It looks like they limit requests to 10 movies per call. I may end up having to use this, but I'd like to find something more convenient. I also want to separate the idea of the movie itself from the different versions of the movie on DVD. For example, using Amazon's API there's no way to tell that a widescreen and a fullscreen release are the same movie.
posted by null terminated at 4:19 PM on January 10, 2007


In that case you are in way over my head, sorry.
posted by IronLizard at 4:27 PM on January 10, 2007


Could you post a link to the IMDB "personal use" data (or post a tiny sample)?

When I last saw it (I've lost my own bookmark to it) I thought it was pretty easy to deal with - isn't it in the form of a CSV?

Python has some great facilities for parsing CSV - so great I might be willing to start you off with a script (written in my copious free time ;-) that you could use as a basis for pulling out whatever you like.

(I suppose I should say it's > 1 year since I looked, so my memory of what format it's in might be wonky.)
posted by southof40 at 4:31 PM on January 10, 2007


Response by poster: southof40: This is what I was referring to.

Anything in CSV format would be perfect.
posted by null terminated at 4:33 PM on January 10, 2007


OK will take a look, have to go and look after the offspring now - will respond in approx 12 hours
posted by southof40 at 5:03 PM on January 10, 2007


No, the downloadable lists from IMDB are not in CSV format.

They are plain text files and the lists of movies are like this:
Reservist Before and After the War, A (1902)		1902
Reservoir Bitches (1994) (V)				1994
Reservoir Dogs (1992)					1992
Reservoir Dogs (2006) (VG)				2006
Reservoir Guide Dogs (1995)				1995
Reservoir Wolves (2001)					2001
Reservoirs of Strength... A Burn Recovery Film (1990)	1990
which if you look at the source, is something like

Title, optional article, space, year in brackets, optional notation for things like '(VG)' == 'video game', as many tabs as it takes to make it line up, year again.

What a mess. Plus there are all kinds of weird things going on with quotes and brackets when it comes to TV episodes.
posted by AmbroseChapel at 5:31 PM on January 10, 2007


Having said that of course, if you wanted just to extract titles and years, that could be grepped out for you very easily indeed.
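
As a rough illustration of that extraction, here is a minimal Python sketch based only on the line layout described above (title with trailing article, year in brackets, optional notation, tabs, year again). The names parse_line and to_csv are made up for this example, and the pattern is an assumption drawn from the seven sample lines; anything that doesn't fit, such as the TV episode entries, is simply skipped.

```python
import csv
import io
import re

# Assumed layout: title, " (YYYY)", optional " (V)" / " (VG)" style notation,
# one or more tabs, then the year repeated.
LINE_RE = re.compile(
    r"^(?P<title>.+?) \((?P<year>\d{4})\)"
    r"(?: \((?P<kind>[A-Z]+)\))?"
    r"\t+\d{4}$"
)

def parse_line(line):
    """Return a dict for one list line, or None if it doesn't fit the pattern."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    title = m.group("title")
    # Move a trailing article back to the front: "Foo, A" -> "A Foo".
    head, sep, article = title.rpartition(", ")
    if sep and article in ("A", "An", "The"):
        title = article + " " + head
    return {
        "title": title,
        "year": int(m.group("year")),
        "kind": m.group("kind") or "",
    }

def to_csv(lines):
    """Convert an iterable of list lines to CSV text, skipping unparsed lines."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["title", "year", "kind"], lineterminator="\n"
    )
    writer.writeheader()
    for line in lines:
        row = parse_line(line)
        if row is not None:
            writer.writerow(row)
    return out.getvalue()
```

Passing an open file object for one of the downloaded lists to to_csv should work, since the function only iterates over lines; how many lines of the real files the pattern actually matches is untested here.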
posted by AmbroseChapel at 5:38 PM on January 10, 2007


Best answer: Does your demo have to display real movie data? How about generating your own dataset of fake movie titles, actors, directors, etc?
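
For what it's worth, a fake dataset like that could be sketched in a few lines of Python. Everything here is invented for illustration (the word lists, the fake_movies function, the field names and the 1950-2006 year range), so treat it as a starting point under those assumptions rather than a finished generator.

```python
import random

# Small made-up word lists; swap in larger ones for more variety.
ADJECTIVES = ["Silent", "Crimson", "Last", "Broken", "Hidden"]
NOUNS = ["Harbor", "Witness", "Empire", "Summer", "Signal"]
FIRST_NAMES = ["Alex", "Jordan", "Maria", "Sam", "Priya"]
LAST_NAMES = ["Nguyen", "Okafor", "Larsen", "Moreno", "Whitfield"]

def fake_name(rng):
    """Build a random 'First Last' name from the lists above."""
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"

def fake_movies(count, seed=0):
    """Return `count` fake movie records; the same seed gives the same data."""
    rng = random.Random(seed)
    movies = []
    for movie_id in range(1, count + 1):
        movies.append({
            "id": movie_id,
            "title": f"The {rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}",
            "year": rng.randint(1950, 2006),
            "director": fake_name(rng),
            "actors": [fake_name(rng) for _ in range(3)],
        })
    return movies
```

Seeding the generator keeps the demo data stable between runs, which helps if the same fake titles need to show up every time the site is shown. Titles and names can repeat with lists this small.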
posted by nakedcodemonkey at 5:42 PM on January 10, 2007


Best answer: Does this list (free for noncommercial use) contain enough information for your purposes?

http://www.hometheaterinfo.com/dvdlist.htm

It's not all movies, but it is movies available on DVD.
posted by xiojason at 7:29 PM on January 10, 2007


Response by poster: Thanks for the help. I don't believe I'll be using the "generating your own dataset" tool for this, but it has been bookmarked and will definitely be useful in the future.

xiojason: Since I last saw this site it looks like they added some director and actor information, so that should give me plenty to create a demo site with.
posted by null terminated at 8:29 PM on January 10, 2007

