Concrete steps from academic economics into data science/management
April 4, 2014 5:07 AM

I would like a job that allows me to write programs or scripts for data processing/analysis. I am trying to understand what a move from academic economics to data-science/management would involve for me in terms of specific steps and goals. What would I need to learn, and what should I prioritise learning? Should I study at home or take a course? When (i.e. after how much or what sort of study/self-study etc) should one think about actually applying for those sorts of jobs?

I am about to graduate with an economics PhD. The PhD didn’t go well, and I have more or less decided that academia isn’t for me. During my PhD all I really enjoyed was writing scripts for data processing and for econometric analysis (in Stata and Matlab). You may remember me from a recent anonymous AskMe, where I received some advice to the effect that there are jobs which would allow me to do precisely this. I’ve also been hearing about job titles like data analyst, data scientist, and business analyst. [I have also read some other AskMes about this subject.]
My question is: how would I go about pursuing this in the short term, given my particular circumstances? In May I will have no job and about £5000 (around $8,300 USD). I know a little Python, plus the statistical packages mentioned above. I could manage a few months of intensive self-study, but I would soon need income coming in. Although I can write scripts in the aforementioned packages, I am of no more than average computer literacy otherwise – my academic background is all economics and econometrics.
My current ideas are: get better with Python and complete my data-scraping project (which I asked about here). (I have started this and still intend to finish it, but my progress has been greatly slowed down by having to finish my PhD and teaching duties, both coming to an end soon. As part of the project I decided to learn Python at Codecademy first and have been doing that.) But I am also getting the impression that knowing SQL is really useful – I have seen it as a job requirement in some adverts, and a friend got work at a utility company largely on the strength of SQL. I know a bit of R, and could work on that for the analysis side of things.
I would like to know, for the purposes of pursuing this data-job idea: what it would be best to prioritise in self-study (which languages, projects, etc.); whether it would be a good idea to take a formal course in something; and how long a period of re-education (self-study or course-based) this would involve before it would be worth applying to companies – and to what sort of companies. I am thinking I might just get work wherever I can for the time being, and take a while to build up my skills at home in my spare time. Would it be a good idea to build a “code portfolio”? Is there a sort of skills ladder for these kinds of jobs, so one could go in at a lower ‘rung’ with perhaps only a few months of intensive retraining, and then improve skills while gaining on-the-job experience and getting paid?
tl;dr: I am trying to understand what a move from academic economics to this data-computing area would involve for me in practical terms.
For context: I am late 20s and UK based [not London] but willing to move, with no dependents.
posted by mister_kaupungister to Education (7 answers total) 20 users marked this as a favorite
 
You might be employable right now in Data Analysis.

There are entry-level Data Analyst jobs out there; here are the qualifications for one:

Qualifications

Education and/or Experience
• Bachelor’s degree in related field preferred.
• One (1) year work experience and knowledge of data quality assurance, data analysis, and data modeling required.
• Knowledge and understanding of one or more database management systems preferred.

Other Skills and Abilities
• Must have a solid understanding of logical database design principles.
• Understand the methodologies and technologies that depict the flow of data within and between technology systems and business functions/operations.
• Analytical thinking; accuracy and attention to detail skills.
• Written and verbal communication skills; interpersonal skills.
• Ability to establish and maintain effective working relationships with team members.
• Other related skills and/or abilities may be required to perform this job.


A lot of Data Analysis jobs call for regression analysis but use proprietary systems; you won't need to know the actual systems the company uses per se, as you'll learn them on the job while working with them.

You'll make WAY more dough than you ever would in academia, plus, it's hella fun playing with data.
posted by Ruthless Bunny at 5:45 AM on April 4, 2014


Hi.

I have been taking a bunch of free courses on Coursera in data science, python, R, etc. For 50 bucks you can get a certificate for completing the course, if that is what you think you want/need.

I think the PhD is going to open a lot of doors for you. You seem to be selling yourself short here.

Also, I don't know first-hand, but I have heard that you can learn the basics of SQL really quickly. Like, you can put it on your resume, and in the time between getting called for an interview and actually having the interview, you can learn enough SQL to talk intelligently about it.

I believe there is a short O'Reilly book on SQL that people have recommended to me in the past.
posted by MisantropicPainforest at 6:06 AM on April 4, 2014 [2 favorites]


Maybe check out Insightdata?
posted by paper chromatographologist at 6:12 AM on April 4, 2014


Just for clarification, SQL encompasses two different things. One is Transact SQL, which is the set of statements used to read/write/modify the database from a calling program. If you use Access, for instance, a T-SQL statement is being used under the covers. The other is SQL proper, which is the language of programs executed within the database. Learning enough T-SQL to retrieve data into a stat package is pretty easy. Data manipulation within the database is harder, or at least has a longer learning curve.

The logic used in SQL is very different from the logic you would use in a common procedural language like C or Java, due to the power of the built-in functionality, and it can be difficult for a beginner.

At one point in my career, I thought that learning SASS would help me find work. I spent a month with a tutorial and passed the test to get a certificate. I found a local society of SASS consultants and attended a function or two. Although I ended up going in a different direction, it was pretty clear to me that I would have found SASS work in time.

The level of proficiency required to get a certificate from one of the software companies is pretty low. They have more weight with HR than with the IT Director, but they can open the door.
posted by SemiSalt at 7:11 AM on April 4, 2014 [2 favorites]


Best answer: So I just talked with the people at Insightdata about doing this from psychology. What you've done is essentially what they recommend. I'll MeFi-mail you the list they sent me. I do know that during the talk with them, they mentioned that there is a similar program to theirs in London, so I would also look into that.

In general though, they stressed that having a project to show at interviews, and showing a willingness to learn new techniques is more important than being proficient at any of the particular skill sets.
posted by katers890 at 9:38 AM on April 4, 2014 [2 favorites]


Best answer: I just hired for a similar position to what you are looking for, and saw quite a few people with masters/PhDs in statistics/math applying.

One of the things that stood out to me about these candidates is they didn't have enough experience with messy and incomplete data, or with the various tools used to collect and aggregate that data. It appears as though for a lot of their work they'd just been given data sets.

Depending on the size of the organization you find, you may be responsible for acquiring, cleaning, aggregating and standardizing the data you work with, not just analyzing it. So I would spend some time getting an understanding of what's called ETL (extract, transform, load) processes and data lifecycle stuff.
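
To make the ETL idea concrete, here's a toy sketch of one pass in Python with pandas and SQLite (the file, table, and column names are all made up, and real pipelines are messier, but the extract/transform/load shape is the thing to notice):

    # A toy ETL pass: pull a raw CSV, clean it up, and load it into a local
    # SQLite table. File, table, and column names are invented for illustration.
    import sqlite3
    import pandas as pd

    # Extract: read the raw export (hypothetical file).
    raw = pd.read_csv("monthly_usage_export.csv")

    # Transform: the unglamorous part -- standardise names, fix types, drop junk.
    raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
    raw["reading_date"] = pd.to_datetime(raw["reading_date"], errors="coerce")
    raw["kwh"] = pd.to_numeric(raw["kwh"], errors="coerce")
    clean = raw.dropna(subset=["reading_date", "kwh"]).drop_duplicates()

    # Load: write the cleaned table into a local SQLite database.
    with sqlite3.connect("warehouse.db") as conn:
        clean.to_sql("usage_readings", conn, if_exists="replace", index=False)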

Also, for sure learn some SQL. It's pretty easy. And the logic of it is satisfying in a particular way.

Two other thoughts: One, the ability to communicate is huge. Your job is to distill complex data into knowledge. Make sure you have that nailed. Also, you're not a passive conduit. Understand the processes, have an opinion on what the data says.
posted by Sleddog_Afterburn at 1:39 PM on April 4, 2014 [4 favorites]


Best answer: I'm a data analyst in the private sector and I love my job. None of the companies I've worked for have done the kind of work at the kind of scale where we would need, or even be able to use, the skillset brought by a PhD, so I'm not sure if what I have to say will be relevant. Nevertheless, here are my thoughts; I'll try to provide enough context so you can identify any parts that may be helpful for you.

First, I enthusiastically agree with everything that Sleddog_Afterburn said above.

Ok. So, you'll probably be looking for a company with a large-ish analytics department that can support you as a customer of the people working directly with the data warehouse. This might be a large company, or it might be a specialized consulting-type place, but before anything resembling econometrics can happen there's data-architecture, -engineering, and -manipulation work that needs to take place, and after the analysis comes the visualization and communication required to turn it into "actionable intelligence," as they like to say.

My favorite former coworker was a very smart and experienced economist and SAS adept. He had a lot of trouble producing valuable work (as part of a small 3-person analytics department) because he was too specialized in the analysis itself and struggled when intermediate-level-or-above SQL was needed to prepare his data sets. (Also, we never got him a SAS license, because they're super expensive.) When it came to visualization, I was too often the only one who could read what he produced--the sort of R plots that you'd expect in an academic journal, or ANOVA tables, and so on. No product owner or executive ever asked him how many distributions he fit, or what the R-squared was. No one could appreciate how solid his work was, and he was routinely frustrated because he couldn't make the kind of contribution he would have been capable of, had the company been such that it could support a higher level of specialization.

That trap is, I would guess, your biggest risk.

Learning the basics of SQL is a very important countermeasure, and you can get a ton of mileage out of pretty simple SQL. Practice it, maybe setting up your own MySQL instance to do so. Pull some public data in and play with it. Inner and outer joins, grouping and aggregation functions, nested subqueries. The basics are transferable across almost all the data warehouses you may encounter, which is fortunate, because AFAIK there's no cheap way to practice on the enterprise-level systems you'll actually end up using (Redshift, Vertica, Teradata, SQL Server, Oracle, Hadoop/Hive etc). Read enough to know what people mean when they talk about relational data, normal form, ETL, columnar data stores, indexes, and data cubes.
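
If setting up MySQL feels like too much at first, Python ships with SQLite, which speaks more or less the same core SQL, so you can practise the patterns I mean without installing anything. A made-up sketch covering a join, grouping/aggregation, and a nested subquery:

    # Toy SQL practice using Python's built-in sqlite3. Table and column
    # names are invented; the query patterns are the transferable part.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'north'), (2, 'south'), (3, 'north');
        INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 20.0), (3, 2, 75.0);
    """)

    # Inner join plus grouping/aggregation: order count and total value per region.
    for row in conn.execute("""
        SELECT c.region, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.region
    """):
        print(row)

    # Nested subquery: customers whose total spend beats the average order size.
    for row in conn.execute("""
        SELECT customer_id, SUM(amount) AS spend
        FROM orders
        GROUP BY customer_id
        HAVING SUM(amount) > (SELECT AVG(amount) FROM orders)
    """):
        print(row)

The enterprise systems all have their own quirks, but queries like these run nearly unchanged on most of them.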

Have at least one visualization tool in your pocket. This could be Python (matplotlib, etc) if you're better with it than I am. R is capable of producing readable graphs with something like ggplot2. Excel is poor, in my humble opinion, but it's universal and if you're good with pivot charts it may be enough. I'd suggest playing with Tableau Public, though; Tableau is fun and powerful, Public is free (but restricted to public data), and it's a hot skill to have on a resume right now. (Although learning Tableau has the downside that your next employer may not use it, of course.) R and Python have the great advantage that you can always get budget approval for them :).
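
To give a sense of the Python route, here's a bare-bones matplotlib sketch (numbers invented); the point is just that a title, axis labels, and a clean layout go most of the way with a non-technical audience:

    # Minimal matplotlib chart: the kind of labelled, tidy plot that travels
    # well outside the analytics team. Data here is invented.
    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
    signups = [120, 135, 160, 150, 190, 210]

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(months, signups, marker="o")
    ax.set_title("Monthly signups, first half")
    ax.set_xlabel("Month")
    ax.set_ylabel("Signups")
    ax.grid(True, alpha=0.3)
    fig.tight_layout()
    fig.savefig("signups.png", dpi=150)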

Speaking of hot on a resume, machine learning is the future and it ties together a lot of related skills, and it's a discipline where your higher education could be very useful. I've been moving that way lately, although I still know very little about it; my favorite tool right now is KNIME, which is free and awesome and may be worth checking out. With the "Knime (Labs)" interactive R nodes (not sure if those come by default or if you have to install them separately), it plays very nicely with R, too.

I also recommend having at least a basic familiarity with the Unix command line if you don't already. From what I can see, the industry is moving towards--and here's a wonderful word--a complete embrace of a bricolage-style approach, and Linux supports this well. There are dozens of different tools and technologies out there worth learning, and the best approach for solving any particular problem usually takes a combination of several of them, each used to solve the part of the problem it's strongest at. I may have a python job that once a day runs some SQL on my data warehouse and downloads the result set, which then gets pulled into a KNIME workflow that tests different model fits in an R node, finds the best one, and then sends the result to a python script that builds a report with pretty graphs that are then sent out to an email list. In other words, you and I are not going to have any shortage of things to learn for the next few decades.
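
For a rough idea of what that kind of glue job can look like collapsed into a single script, here's a simplified, all-Python sketch -- scikit-learn standing in for the KNIME/R model-fitting step, SQLite standing in for the warehouse, and the table and column names invented:

    # Simplified daily job: query the (stand-in) warehouse, try a couple of
    # candidate models, keep the better one, and write out the chart that
    # would go into the report. All names here are invented.
    import sqlite3
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    # 1. Pull the day's result set.
    with sqlite3.connect("warehouse.db") as conn:
        rows = conn.execute(
            "SELECT day_index, kwh FROM daily_usage ORDER BY day_index"
        ).fetchall()
    X = np.array([[r[0]] for r in rows], dtype=float)
    y = np.array([r[1] for r in rows], dtype=float)

    # 2. Test different model fits and keep the best cross-validated score.
    candidates = {
        "linear": LinearRegression(),
        "forest": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    scores = {name: cross_val_score(m, X, y, cv=5).mean()
              for name, m in candidates.items()}
    best_name = max(scores, key=scores.get)
    best_model = candidates[best_name].fit(X, y)

    # 3. Build the pretty graph that actually gets circulated.
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.scatter(X[:, 0], y, s=10, label="observed")
    ax.plot(X[:, 0], best_model.predict(X), color="red", label=f"fit ({best_name})")
    ax.set_title("Daily usage vs. fitted model")
    ax.legend()
    fig.savefig("daily_report.png", dpi=150)
    # Emailing it out would be a separate step (e.g. smtplib), omitted here.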

(FWIW, I do not agree with the distinction SemiSalt draws between SQL and T-SQL. Transact SQL is just Microsoft's flavor of SQL for e.g. MS SQL Server, maybe Access too, but I've never used Access. T-SQL is kinda neat, it extends the language to include some flow control stuff for example, but it's not like a fundamental category of the SQL standard and you'll never see mention of it outside of MS products. And there's SAS, and SPSS, but SASS is not a thing.)
posted by kprincehouse at 2:37 AM on April 5, 2014 [10 favorites]

