Open-ended survey data, short time frame: help me brainstorm!
January 16, 2015 6:00 PM

I've been handed a massive chunk of open-ended survey data. I have a short time frame to work with it--just a few days--and I would appreciate advice on ways to analyze or mine what I have beyond the most basic approach I know for this kind of thing. Exciting details inside!

The material: a series of training courses for emerging experimental methods is in full swing, with each course followed by an open-ended survey. This survey was designed by one of the partners involved in hosting the training sessions, so the survey wasn't put together with the intention of hypothesis testing. Consequently, the questions are very basic: most take open-ended responses only, and some pair an open-ended response with a simple categorical variable.

For example, a prompt might read, "Regarding the level of difficulty of the material presented:" followed by a four-part categorical response ("too difficult; difficult; easy; too easy") and then an open-ended elaboration ("tell us why you answered the way you did"). The entirely open-ended questions are really broad ("did we omit anything you would have liked to know about?"), and the responses are understandably highly variable in length and content.

I'm used to working with lots of purely quantitative data in analysis-friendly form, but my knowledge of this kind of analysis is pretty basic. I've already followed this kind of approach--read the responses, group them into categories, identify major themes. And it's been really helpful and illuminating! But it seems incomplete, or at least a waste of a vast resource. I know it sounds dull, but I didn't realize how deep and descriptive the responses would be until I saw these, and I want to wring as much juice out of them as possible in the short time I have.

What do you say, MeFites? Have you worked with loads of open-ended surveys in interesting ways? Can you point me to relevant resources? Have any neat ideas? Let me know!

Limitations:

Time (I need to have an initial analysis next week before we share it with others outside my org for additional input unrelated to the survey data).

$ (I don't have access to SAS through my employer, so if there's something I can do with R or Excel, etc., that'd be very helpful).

Data access (I only have the final reports in .pdf form, not raw data--although I'm checking to see if that can change. I don't know how this information was collected or collated, either).
posted by late afternoon dreaming hotel to Science & Nature (5 answers total) 7 users marked this as a favorite
 
Best answer: Let me give you the answer you would be using if you had time, and I'll leave it to others and you to figure out how to get that done in a few days.

You don't need Excel or SAS or R. Those are for quantitative analysis. You need software for qualitative analysis. The two most popular packages are NVivo and Atlas.ti.

What you would do with this software is import your documents, then add properties to the documents -- those would be the kinds of things you have in the quant data. So survey answers from survey 8293 would be known to be from someone who was female, took the course on a Wednesday, is married, and found the material "moderately difficult," etc. You should be able to import this stuff as a spreadsheet.

Once you've imported the documents and added their properties, you would code the data. Essentially, you have a list of topics/themes (the codes). You select parts of the text and click the code you want to attach. You do this for all your documents. For example, one of your codes might be "pacing." If someone wrote "I thought there was a lot of material crammed into the morning session, but the afternoon felt too slow," you would select that text and add the "pacing" code. You might also attach "positive," "negative," or "neutral" to each comment piece.

The time-consuming part here is coming up with rigorous definitions of each of your codes, especially if you have more than one person coding. If you do, the first thing you will do before coding everything is establish intercoder reliability. All coders code the same few documents. Compare codes. Discuss where you disagree. Rinse and repeat until people are coding very similarly.
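
To make the "compare codes" step concrete, here is a minimal Python sketch of Cohen's kappa for two coders who each assigned exactly one code per comment. Kappa is just one common agreement statistic and is an assumption here, not something the qualitative packages require; the function name and the two coder lists are made up for illustration.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one label per item."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items where both coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement, from each coder's marginal code frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code for every item
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to the same six comments by two coders.
coder_1 = ["pacing", "pacing", "content", "content", "pacing", "content"]
coder_2 = ["pacing", "content", "content", "content", "pacing", "content"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

A value near 1 means the coders agree far more than chance would predict; a value near 0 suggests the code definitions need another round of discussion.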

Once the coding is done you can start pulling up whatever you want. You can look at everything anyone ever said about pacing. You can look at what women said about pacing vs. men. Of course, the software only lets you look at it. You're the one who needs to read the women's comments and men's comments and see if the tone or content are different. You can find out what people who found the material moderately difficult said about the morning session, or whatever you want.
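
For anyone without NVivo or Atlas.ti, the retrieval step can be roughly imitated in plain Python. This is a sketch only, not how either package works internally; the `Segment` class, the `retrieve` helper, the property names, and the sample rows are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One coded piece of a response, plus properties of its respondent."""
    survey_id: str
    text: str
    codes: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)

def retrieve(segments, code, **wanted):
    """Return segments carrying `code` whose properties match every keyword."""
    return [
        s for s in segments
        if code in s.codes
        and all(s.properties.get(k) == v for k, v in wanted.items())
    ]

# Hypothetical coded data; real segments would come from your own coding pass.
segments = [
    Segment("8293", "The morning session felt crammed.",
            {"pacing", "negative"}, {"gender": "female", "difficulty": "difficult"}),
    Segment("8294", "Afternoon pace was just right.",
            {"pacing", "positive"}, {"gender": "male", "difficulty": "easy"}),
    Segment("8295", "More worked examples, please.",
            {"content"}, {"gender": "female", "difficulty": "difficult"}),
]

# Everything women said about pacing; reading and comparing is still your job.
for seg in retrieve(segments, "pacing", gender="female"):
    print(seg.survey_id, seg.text)
```

The code only narrows down which comments to read side by side. Judging differences in tone or content remains a manual step.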

That's not really something you can do in a few days, though. You'll have to adapt as necessary.
posted by If only I had a penguin... at 6:30 PM on January 16, 2015 [7 favorites]


Best answer: IOIHAP has a great overview of qualitative analysis - but yeah, that's weeks of work.

I will say, as someone who does both quantitative and qualitative work, you're likely never going to feel "done" with your analysis. There is never that clear p < .05 moment where you can say "GOT IT." It's all kinds of existential.

It sounds like you're on the right track in looking for themes among the responses, and as far as you can get with a more structured coding process as described above, that's great. The only other "hack" I can think of, if you can get the responses as text data (I'm not clear if your PDFs are just image files or have text you can easily export) is to do some basic text mining. You can even do this with something like word counts in Excel. If you could decide (from your initial readings) on some words of interest, you can maybe capture some descriptive data pretty quickly that might be helpful (e.g. 40% of participants used the word "confusing").
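
The word-count idea also works outside Excel. Below is a rough Python sketch that reports what share of responses mention each word of interest. It assumes the responses are already available as plain-text strings, one per participant; the `keyword_share` helper, the sample responses, and the keyword list are made up for illustration.

```python
import re

def keyword_share(responses, keywords):
    """Fraction of responses that contain each keyword at least once."""
    if not responses:
        return {word: 0.0 for word in keywords}
    # Turn each response into a set of lowercase words (whole-word matching).
    token_sets = [set(re.findall(r"[a-z']+", r.lower())) for r in responses]
    return {
        word: sum(word.lower() in tokens for tokens in token_sets) / len(responses)
        for word in keywords
    }

# Hypothetical responses, one string per participant.
responses = [
    "The second module was confusing and too fast.",
    "Great pace, nothing confusing at all.",
    "I wanted more hands-on practice.",
    "Too fast in the morning.",
    "Confusing slides, but a helpful instructor.",
]
shares = keyword_share(responses, ["confusing", "fast", "practice"])
for word, share in shares.items():
    print(f"{word}: {share:.0%}")
```

Two caveats: exact-word matching misses variants such as "confused", and a raw count cannot tell "confusing" from "nothing confusing", so the numbers are best read alongside the comments themselves.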

Good luck!
posted by pantarei70 at 9:19 PM on January 16, 2015


Best answer: It sounds like you are on the right track. In addition to the excellent suggestions above, I found that when I was working on qualitative data, one of the most high-impact things I could do was to produce a word cloud. It's a nice visual way of showing exactly which ideas came up the most often for a given comment.
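
A word cloud is essentially a picture of word frequencies, so the underlying counts are easy to get first. This sketch uses only the Python standard library; the tiny stopword list, the `word_frequencies` helper, and the sample comments are placeholders made up for illustration, and drawing the cloud itself would need a separate tool.

```python
import re
from collections import Counter

# Deliberately tiny stopword list for illustration; a real one would be longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "was", "were", "it", "in",
             "i", "too", "but", "for", "more"}

def word_frequencies(comments, top_n=5):
    """Count non-stopword words across all comments, most common first."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Hypothetical comments answering a single survey question.
comments = [
    "The pacing was too fast and the examples were rushed.",
    "Loved the examples, but the pacing in the morning was fast.",
    "More examples of the analysis steps.",
]
top_words = word_frequencies(comments)
print(top_words)
```

Running this per question, rather than over the whole survey at once, keeps the counts tied to a specific prompt.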
posted by roshy at 10:38 PM on January 16, 2015


Response by poster: Many thanks, all. By way of follow-up, I used essentially every suggestion presented here, turned this into a pretty wonderful presentation/poster, and have delivered it at a couple of the big conferences in my field. Next up is turning it into a publication. So many thanks!
posted by late afternoon dreaming hotel at 1:39 PM on May 28, 2015


As another follow-up for future readers (or the OP if this ever comes up again), I've just learned that there exists open-source/free qualitative data analysis software, but only for Mac. It's called TAMS and the website is ugly and it's on scourgeforce, but I know at least one person who's used it, so it probably more or less works.
posted by If only I had a penguin... at 7:40 PM on June 2, 2015

