Data-getting programming for noobs
February 5, 2013 4:10 PM
Can you help me understand what a personal project to write a "data-getting" program (details inside) would involve, given that I would be starting from noob level? Are there alternatives?
I have a pipe dream about writing a program that would create automatically updating Excel files and graphs from free and publicly available economic data on the internet for my own use. Specifically, from the US Bureau of Labor Statistics website, but also from the United Kingdom Office for National Statistics website. I would not want to download everything, just certain time series that I'm interested in. The thing is, I am not a real programmer. The only “programming” I have done is writing scripts for my own use in command-based analysis packages: Stata, a little R, and Matlab. I have a vague notion that the DIY part might teach me something interesting and useful, but then again it might just put me in over my head.
There are a few things I would like to know:
(i) Is there a commonly understood generic name for the kind of automatic data-downloading and updating program (for specific data series) I have in mind?
(ii) Suppose I did set myself this as a sort of personal project. Presumably I would have a lot to learn. What would be on my “syllabus”, in terms of learning the relevant programming, how the data is stored and accessed, and the automatic updating parts? What could I read and where can I find the relevant information? I would love it if my graphs would update themselves when the statistical offices update and revise their data. I wouldn’t object to using a sort of “template” code, but I would want to understand what it was all doing.
(iii) Suppose I put in, say, an hour a day. Given my initial noob level of knowledge, any idea of the timescale for such a project?
(iv) Suppose I were to abandon the self-study part of my pipe dream. Are there other means of acquiring an “automatic data-getter”, and at what cost? (Other than the mouse- and keyboard-intensive copy and paste method I use already.)
(v) Do you think the self-study route, for such a narrow primary goal plus vaguely defined knowledge satisfaction, is definitely foolish or worth a go?
(If it helps my laptop runs on Windows 7.)
posted by mister_kaupungister to computers & internet (12 answers total) 29 users marked this as a favorite
It will be much easier for you to write a program that outputs a comma-separated values (CSV) file than an actual .xls Excel file. A CSV file can't hold multiple tabs or pretty graphs, but you can set those up once in an Excel workbook and then import the CSV into each tab that needs it. The graphs should update themselves after each import.
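To give a feel for how little code the CSV side needs, here is a minimal sketch in Python using its built-in csv module. The function name write_series_csv, the column names, and the sample numbers are all made up for illustration; they are not real statistics.

```python
import csv


def write_series_csv(rows, path):
    """Write (period, value) pairs to a CSV file that Excel can import.

    `rows` is any iterable of two-item tuples. The header names below
    are an assumption for this example, not a required format.
    """
    # newline="" lets the csv module control line endings itself,
    # which avoids blank lines between rows on Windows.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["period", "value"])  # header row
        writer.writerows(rows)                # one line per observation


if __name__ == "__main__":
    # Made-up sample values, purely illustrative.
    sample = [("2012-11", 7.8), ("2012-12", 7.9)]
    write_series_csv(sample, "series.csv")
```

Running it leaves a series.csv file in the current folder, which is the file you would point Excel's import at.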
You want to look at a simple scripting language (I prefer Python). There are packages that will (a) help you download a web page and (b) help you parse it to pull out the numbers you care about. Knowing regular expressions (regex) is useful for the parsing step.
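As a rough sketch of the download-then-parse idea, the example below uses only Python's standard library: urllib.request to fetch a page and re for the regex. The table layout in SAMPLE_HTML, the regex, and the numbers are hypothetical; a real statistics page will be laid out differently, so the pattern would need adapting to whatever the page actually contains.

```python
import re
import urllib.request


def fetch_page(url):
    """Download a page and return its text (assumes the page is UTF-8)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# Hypothetical layout: one table row per observation, for example
# <tr><td>2012-12</td><td>7.9</td></tr>
ROW_PATTERN = re.compile(
    r"<tr>\s*<td>(\d{4}-\d{2})</td>\s*<td>([\d.]+)</td>\s*</tr>"
)


def parse_rows(html):
    """Return a list of (period, value) pairs found in the HTML text."""
    return [(period, float(value)) for period, value in ROW_PATTERN.findall(html)]


# Made-up sample page so the parsing step can be tried offline.
SAMPLE_HTML = """
<table>
<tr><th>Period</th><th>Rate</th></tr>
<tr><td>2012-11</td><td>7.8</td></tr>
<tr><td>2012-12</td><td>7.9</td></tr>
</table>
"""

if __name__ == "__main__":
    # With a real address you would call parse_rows(fetch_page(url)) instead.
    print(parse_rows(SAMPLE_HTML))
```

The header row is skipped simply because it uses th cells, which the pattern does not match. Regex is fine for a simple, stable page like this; for messier HTML a dedicated parsing package is usually less fragile.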
Best of luck! Feel free to MeMail me if you want more details on any of this.
posted by ethidda at 4:17 PM on February 5 [1 favorite]