Statistical analysis help for a survey
December 1, 2013 5:20 AM
I have 50+ responses from large companies for a survey that I've written which has approximately 100 questions. There is no other data that can be linked to this survey. I need to know what I can do with these results and how to do it.
The first part of the survey asks for basic information such as location, sales, number of employees, etc.
The second part (the bulk of the questions) asks about specific parts of their business. Each item, such as 'My company has a flat organizational structure', is rated on a scale from 1-5 (how much they agree), followed by a rating from 1-3 of how important that same item is to their business performance.
I understand that this is a broad question, but what can I do with this data? Off the top of my head, I can show:
1) correlations between questions (companies who have flat structures are more likely to have higher sales),
2) correlations between impact (companies who feel that flat structures are important to their business are more likely to also feel that open offices are important),
3) correlations between basic data and 1-5 questions (companies who spend more than 5% of their budget on R&D are more likely to have flat structures).
What else can I show? I also see other statisticians running some complex mathematical calculations- can I do that?
What are some resources that can get me started in the right direction in doing it? What software program will get me there? Stata, R, SPSS?
Thanks!
Regressions to correlate factor X with factor Y. Maybe principal component analysis to see if you can pull out what types of answers seem to be associated with specific industries or outcomes. That's where I would start.
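As a rough sketch of "correlate factor X with factor Y": because the 1-5 answers are ordered categories rather than true measurements, one common starting point is a Spearman rank correlation, which is swapped in here in place of a full regression. The variable names and all the numbers below are made up for illustration, and the helper functions are hypothetical, not taken from any library.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: agreement with "flat structure" (1-5) and sales (millions).
flat_structure = [1, 2, 2, 3, 4, 4, 5, 5]
sales_musd = [12, 15, 11, 20, 22, 30, 28, 41]

rho = spearman(flat_structure, sales_musd)
print(f"Spearman rho = {rho:.2f}")
```

With roughly 50 companies and about 100 questions there are thousands of possible pairs, so some will look strongly correlated by chance alone; it is safer to pick the pairs you care about before looking.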
Not sure this will help you now, but wouldn't it have been better to consult a statistician on what you can do with the data before writing the survey? Your survey would be better-designed if it was set up to give you good comparisons at the beginning.
posted by caution live frogs at 7:20 AM on December 1, 2013
You should have consulted with a statistician before making the survey, when they would have been able to provide much more help, but barring a serendipitous discovery of a time machine, you should call these guys. Sorting out shit exactly like this for researchers is pretty much what they do.
They probably also already have some kind of deal with your University with all of the close ties it has to KU Leuven.
posted by Blasdelb at 8:25 AM on December 1, 2013
The place to start is deciding what your research question is. Though you don't seem clear on what that is, it's likely you have one hidden somewhere; otherwise why engage in this survey exercise at all? Ask: what theories or debates can I weigh in on with this data collection exercise? Even if it's confirmation.
Try to identify dependent variables (outcomes of interest). What do you want to be able to present? Try to state the relationships you expect, and the causal links (the items influencing the outcomes). Some items are key subgroups to be contrasted.
Tables! Make one-way frequencies for everything. Make two-way (and some three-way) tables for every key relationship, comparison, or breakdown that makes sense, and stare at them for a long time. Patterns and ideas should emerge. (Tables are under-appreciated as an analytic tool.) Test to be sure differences are meaningful, but substantial and meaningful differences are more important than "significant" ones. These patterns are what you start to describe and explain.
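For what it's worth, the one-way and two-way tables described above need nothing fancy. Here is a minimal sketch using only the Python standard library; the field names (`region`, `flat_structure`) and the responses are invented for illustration, and the helper functions are hypothetical.

```python
from collections import Counter

# Made-up responses: one dict per company (field names are assumptions).
responses = [
    {"region": "EU", "flat_structure": 4},
    {"region": "EU", "flat_structure": 5},
    {"region": "EU", "flat_structure": 2},
    {"region": "US", "flat_structure": 1},
    {"region": "US", "flat_structure": 2},
    {"region": "US", "flat_structure": 4},
    {"region": "ASIA", "flat_structure": 5},
    {"region": "ASIA", "flat_structure": 5},
]

def one_way(rows, field):
    """Count how often each answer to `field` occurs."""
    return Counter(row[field] for row in rows)

def two_way(rows, row_field, col_field):
    """Cross-tabulate two fields: counts keyed by (row value, column value)."""
    return Counter((row[row_field], row[col_field]) for row in rows)

def print_two_way(rows, row_field, col_field):
    """Print the cross-tabulation as a small text table."""
    counts = two_way(rows, row_field, col_field)
    row_vals = sorted({r for r, _ in counts})
    col_vals = sorted({c for _, c in counts})
    print(f"{row_field:>10} | " + " ".join(f"{c:>4}" for c in col_vals))
    for r in row_vals:
        # A Counter returns 0 for combinations that never occurred.
        print(f"{r:>10} | " + " ".join(f"{counts[(r, c)]:>4}" for c in col_vals))

print(one_way(responses, "flat_structure"))
print_two_way(responses, "region", "flat_structure")
```

The same two functions can be pointed at any pair of fields, which makes it cheap to build the tables for each key comparison and then stare at them.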
With only 50-plus cases, you should be able to add publicly available data about companies to the data set if appropriate for the analysis (e.g. objective measures of performance like sales, growth, or stock value). Or other descriptors of the business, industry, or competition.
Also, one should never try advanced techniques before exhausting all the relevant simple ones. Hierarchical cluster analysis or principal components are possible here, but understand why you are doing them first. Many advanced methods (like, say, structural equation modeling) are inappropriate where the base size is small, as here.
All good comments above.
posted by lathrop at 11:07 AM on December 1, 2013
Response by poster: Thank you, guys/gals, for the above info.
I did consult a statistician when creating the survey, but he has since moved on (and I can't use his services anymore). As statistics is (clearly) not my specialty, I had thought it was a simple matter of, 'OK, I have this data. Which statistical tools can I take off the shelf and use to mold this data into a presentable thing?' Alternatively, I had thought that the magic of 'big data' would be able to find some interesting correlations between all these survey responses.
I'm going to take all of your advice and give it a think and come back to you.
BTW, do you have some sort of theme or sub-topic within statistics that I can look into?
Thanks!
posted by JiffyQ at 12:45 PM on December 1, 2013
Well, there are statistical tools you can take off the shelf and use to mold this data into a presentable thing; the tests you need have been around for generations. You may even be able to do things that are cooler than you are probably expecting, like multivariate analysis. The only problem is that to use these tools responsibly you've really got to understand them and the ways they fail. A graduate-level course in statistics would give you the judgement to decide between the three or four kinds of statistical questions you could ask (depending on the kinds of strengths and weaknesses you want your answers to have), as well as the skills to do the analysis yourself. Alternatively, this should be about an hour's worth of time, billable or otherwise, for someone who knows what they are doing to explore it with you.
The literature is littered with both the painfully obvious and, much worse, deceptively hidden results of overly bold statistical illiteracy.
posted by Blasdelb at 3:19 PM on December 1, 2013
What are your non-descriptive questions measuring exactly? Personality traits? Work-related behaviors? Feelings regarding their organization/roles? It's not exactly clear from your descriptions. If you don't know yourself, then you should first conduct an exploratory factor analysis to see which questions (or "items") correlate to form a specific construct of interest (such as personality traits, work-related behaviors, feelings toward their organization, etc.). If each individual item measures something different... well, you can still do most analyses on single items.
Before you begin ANY analyses, you need to come up with a few research questions or hypotheses in order to direct your analyses. Otherwise just throwing in a bunch of random analyses, like correlations between questions, is meaningless. In other words, what do YOU as the analyst (or whoever requested the survey) want your data to SAY? I could help guide your analyses if you had specific questions/hypotheses you wanted to answer.
"I also see other statisticians running some complex mathematical calculations- can I do that?"
Sure...but do you NEED to? And furthermore, does your data allow it? I think you're jumping the gun here. Complex is not always better when analyzing data. The goal is to find the best test to answer your questions/test your hypotheses.
"What are some resources that can get me started in the right direction in doing it? What software program will get me there? Stata, R, SPSS?"
SPSS and Stata are more user-friendly than R or SAS.
posted by Young Kullervo at 6:05 AM on December 1, 2013