Do I dare to crunch some data?
October 6, 2012 3:14 PM

Are you a data quality analyst? If so, tell me more...

I've noticed a fair number of "data quality analyst" descriptions on job boards recently. Obviously the titles vary a bit but the job descriptions generally all call for someone to examine and manipulate data and help to ensure data integrity in consultation with business SMEs.

Since I do quite a bit of SQL querying and data examination in my current field, and said field (software testing) is becoming largely the domain of "developers in test" and full-time automators in my neck of the woods, I'm wondering if data quality analysis would be a good fit for me. (My gut tells me yes, but I'd like to hear from some people who really do this.)

So.... is anyone out there a data quality analyst?

If so:

* What is your typical day like? (Which tasks do you do regularly?)
* What kind of educational resources are out there to help you be better at your job? Online forums? Books? Training courses? Professional societies?
* Does your company pay for your training?
* Do you do any programming or scripting?
* Are you also a DBA or have any kind of certification in DB programming, management?
* Are you expected to be a SME in your company's main business (for example, if your company does health care, were you a health care professional at one time)?
* What do you like and dislike about the job?
* Hours? 40 a week? More?
* Stress levels?
* Gestalt feeling about job security and growth opportunities?
posted by Currer Belfry to Work & Money (5 answers total) 16 users marked this as a favorite
 
Best answer: This is an area I'm just getting into at work. Given my manager and I have no experience in this area, we're currently evaluating the online Data quality classes at eLearningCurve. My guess is we'll pursue a series of classes from them as a starting point.
posted by bluesapphires at 4:54 PM on October 6, 2012 [1 favorite]


Best answer: Disclaimer: I am not a full-time expert data quality person, but I have been sort of doing it part-time as part of my job for the last few months (it's maybe 40-60% of my time) in a retail setting. So the following might not all be right, nor will it necessarily be applicable outside a retail environment (i.e. a lot of customer-generated data) but is maybe indicative of what an entry-level position in the work is like -

Tasks:
Generally my day-to-day involves:
(1) Examining the data - generally this starts with a question from a business person or stakeholder ("What percentage of customers have invalid phone numbers?"), sometimes it's to solve a data-related problem ("This function isn't working correctly and I'm sure it's not my code") or sometimes it's just a hunch I've got that something isn't right.
(2) Preparing evaluations of said data - I put together a report that has pretty graphs and/or charts on it to explain what's going on with the data.
(3) If the data is crappy, recommend something to do about it
(4) Make a report/presentation of the above

... and doing the above in a sensible, repeatable, measurable, and comparable-against-other-problems manner. The more automation, the better. I spent a lot of time setting things up so that I could press a button and it would spit out numbers, and all I had to do was analyse said numbers.
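
As a rough illustration of the "press a button" idea, here is a minimal Python sketch that answers the phone-number question from (1). The validity rule (7 to 15 digits once punctuation is stripped), the function names and the sample records are all assumptions made up for this example; a real check would follow whatever rules the business agrees on.

```python
import re

# Assumed rule, for illustration only: a phone number counts as "valid" if it
# has 7 to 15 digits (optionally preceded by +) once common separators are removed.
_SEPARATORS = re.compile(r"[\s\-().]")
_DIGITS = re.compile(r"\+?\d{7,15}")


def is_valid_phone(raw):
    """Return True if raw looks like a phone number under the assumed rule."""
    if raw is None:
        return False
    cleaned = _SEPARATORS.sub("", raw)
    return _DIGITS.fullmatch(cleaned) is not None


def invalid_phone_percentage(phones):
    """Return the percentage (0-100) of entries that fail is_valid_phone."""
    phones = list(phones)
    if not phones:
        return 0.0
    invalid = sum(1 for p in phones if not is_valid_phone(p))
    return 100.0 * invalid / len(phones)


# Hypothetical sample records standing in for a customer table.
sample = ["+44 20 7946 0958", "555-0100", "n/a", "", None, "(02) 9876 5432"]
print(f"{invalid_phone_percentage(sample):.1f}% invalid")  # prints "50.0% invalid"
```

Keeping the rule in one small function means the same check can be rerun later and the numbers stay comparable from one report to the next.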

Education / training:
I've actually done no formal training in anything labelled "Data Quality" - about the most useful education I've had is from documentation on the system whose quality I'm ascertaining, and the rest of it was learnt from people on the floor who had done data quality in the past. That said, it would definitely have been useful to have some training - and my company would have paid for it, if I'd asked - I was just too inexperienced at the start to know anything about it.

Programming / scripting / other knowledge things:
A lot. I use a combination of SQL, scripting, and Excel (including VBA) to do my work. Also some work in enterprise-level tools like IBM QualityStage (which you are not expected to know if it's an entry-level data quality job - they know they have to train you up in it because it's enterprise-only). Basically though, I use what's available, and my bosses don't care how I arrive at the answer so long as I can defend it.

I'm not a Certified DBA (if there's even such a thing; is there?), but I do have a computer science degree so they knew I had the skills.

I am not a SME in the company's main business. I think if you are, that would be a bonus, but it's easier to take a data quality techie and teach them the broad strokes of how (say) healthcare works than the reverse, so it's not generally expected.

Likes and dislikes:
It's a decent amount of analysis and being left alone to gopher around with the data, which is fantastic, but it also involves a lot of careful stakeholder management. You see how the job ads mention "business SMEs" - well, in my workplace at least, those guys are the ones who 'own' the data and my job is to tell them (nicely) that "Your data sucks, do xyz to fix it", and if you haven't done your groundwork to gain their trust they're not going to believe you.


Broadly speaking ... if you enjoy analysis and generally gophering around with data trying to find answers to problems, it's a great place to be. You'll have fun, and you'll also come away with a vast knowledge of the various quirks of the system whose quality you're evaluating, from DB oddities to certain bits of behaviour encouraged by UI quirks, and in my (admittedly biased) opinion, that's very useful.
posted by Xany at 5:52 AM on October 7, 2012


Best answer: From a MeFite who would prefer to remain anonymous:
I've asked the mods to post this for me because I don't want my job to be too much associated with my MeFi username. I work in data for a local authority and one of my team's de facto functions is data accuracy. Our more important tasks are around statistics and reporting, but as you can't do much with the data unless you're sure it's accurate we end up with some responsibility for data accuracy too.

In answer to your specific questions:

* What is your typical day like? (Which tasks do you do regularly?)

Regular tasks linked to data accuracy include:
Reviewing the data, trying to get a sense of what’s right, whether things have changed or whether there’s a recording issue.
Running some regular reports looking for missing or inaccurate data (people without ethnicity for instance).
Being told about missing information and getting hold of operational staff to ask them to record it.
Actually recording or amending data in some circumstances.
Writing guidance notes to support better recording (the training team should do this, but do not).
One-to-one support to staff in recording – we try to avoid this as it’s so labour intensive.
Presenting data to teams and managers with discussions about possible data accuracy issues as part of that.
Reviewing the system to try to ensure it supports good recording (e.g. are there too many items on some of the picklists) and working with the system team to try to correct these things.
Checking all start dates on one process that’s frequently recorded wrongly.
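
For readers wondering what such regular reports might look like, here is a small sketch using SQL run through Python's built-in sqlite3 module so that it is self-contained. The table name, column names and rows are hypothetical and not taken from the system described above; the Python wrapper is only there to make the SQL runnable.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE service_users ("
    "id INTEGER PRIMARY KEY, ethnicity TEXT, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO service_users (id, ethnicity, start_date, end_date) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "Category A", "2012-01-10", "2012-03-01"),
        (2, None, "2012-02-01", None),               # ethnicity not recorded
        (3, "", "2012-05-20", "2012-04-01"),         # blank ethnicity, ends before it starts
        (4, "Category B", None, None),               # no start date recorded
    ],
)


def missing_ethnicity(conn):
    """Return ids of people whose ethnicity is NULL or blank."""
    rows = conn.execute(
        "SELECT id FROM service_users "
        "WHERE ethnicity IS NULL OR TRIM(ethnicity) = '' "
        "ORDER BY id"
    ).fetchall()
    return [r[0] for r in rows]


def suspect_start_dates(conn):
    """Return ids with no start date, or a start date after the end date."""
    rows = conn.execute(
        "SELECT id FROM service_users "
        "WHERE start_date IS NULL "
        "OR (end_date IS NOT NULL AND start_date > end_date) "
        "ORDER BY id"
    ).fetchall()
    return [r[0] for r in rows]


print("Missing ethnicity:", missing_ethnicity(conn))      # [2, 3]
print("Suspect start dates:", suspect_start_dates(conn))  # [3, 4]
```

The date comparison relies on dates being stored as ISO-format text (YYYY-MM-DD), which sorts correctly as a string; that is another assumption of the sketch.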

* What kind of educational resources are out there to help you be better at your job? Online forums? Books? Training courses? Professional societies?

There’s a forum on data for my type of work and we often discuss accuracy on that. We also complete returns for central government and there is guidance for those that covers recording.

* Does your company pay for your training?

I’ve not had any training specifically on data accuracy.

* Do you do any programming or scripting?

Not programming, but writing SQL-based reports to retrieve data.

* Are you also a DBA or have any kind of certification in DB programming, management?

No.

* Are you expected to be a SME in your company's main business (for example, if your company does health care, were you a health care professional at one time)?

Not sure what an SME is in this context – a professional? In that case, I’m not, but some colleagues are, and extensive knowledge about the organisation’s legal framework and policies is required.

* What do you like and dislike about the job?

I quite like doing the actual data accuracy work itself, as it’s a break from the more complicated and stressful parts of my job. Having said that, there are ways in which it can be stressful itself:

- It is an enormous timesink and has to be managed carefully to avoid taking up too much resource. In the past I’ve managed staff who have been too perfectionist about sorting out accuracy issues – we just don’t have the time always to get it exactly right – and this can be difficult.
- We have a number of frustrations about the system we use and ways in which the setup, which we often can’t change, leads users to record wrongly or poorly.
- At the moment we have significant staff turnover and that is also frustrating as it means there are a high number of staff who haven’t had training in what to record.
- There are other things which mean that improving accuracy can be out of our control: high levels of demand for the service increasing workloads, for instance, and the available resource and skill of the training and system support teams.

* Hours? 40 a week? More?

I work 37 hours a week. Around 10-12% of my team’s time is spent on data accuracy, but this varies over the year.

* Stress levels?

I find my job fairly stressful, but data accuracy is usually the least stressful part of it. Times when this aspect has been stressful are:

- System problems sometimes mean some important piece of information about a service user can’t be recorded and I can get stressed about that.
- I also don’t particularly enjoy the staff contact you have to have if you really want to get data right.
- We once had severe and stressful problems with external audit when we were found to have specific data accuracy issues of which we were unaware.

* Gestalt feeling about job security and growth opportunities?

My feeling is that data accuracy in my organisation won’t be prioritised as managers struggle to understand its importance. The demand for reporting on the data is increasing all the time and I think will continue to do so, hopefully meaning our jobs are safe, but the part of our time spent on data accuracy will increasingly be squeezed.

Hope this is useful, and sorry about the length.
posted by jessamyn at 8:09 AM on October 7, 2012


Best answer: You should look into MDM. Master Data Management.

You're just about to break into data warehousing. It's a good field.

My company does pay for training. We send everyone to the TDWI conferences when they're in town.

My boss hates working more than 40 hours a week. I work about the same, although we all know that the week of our trade show will be insane.
posted by krisak at 8:27 AM on October 7, 2012


Response by poster: Threadsquatting for a moment.

Thank you so much everybody, especially for the level of detail.

For those who are interested, I found The International Association for Information and Data Quality. There's what looks to be a good bibliography in the non-members portion of the site, some introductory articles, a certification exam (caveat emptor, as is always the case with certifications), and a job board for members.

if you enjoy analysis and generally gophering around with data trying to find answers to problems, it's a great place to be

This is about as good a description as I could give of me at work when I'm doing something that I like to do. "Gophering" - I love it.
posted by Currer Belfry at 2:45 PM on October 7, 2012

