RRPQBBQ
February 29, 2016 2:38 PM
How exactly do you score this Reactions to Research Participation Questionnaire Revised (RRPQ-R) form? I see some hints on the last page, but what steps would a researcher take to turn a stack of completed forms into data the proper way? I could take a guess but I'm curious to know the actual method. Google is seemingly empty of the answer. I do not think I have access to the sources on the last page.
Looks like they did factor analysis on the questionnaire. Factor analysis allows you to take a lot of variables (e.g., the answer to each question) and boil them down to a smaller set of variables (the five factors listed on the last page). So questions 14, 15, 17, and 21 all measure the participant's feelings about participation in some way.
They have the factors and item loadings in their original paper (check your memail); I'm guessing you could either (1) do factor analysis on your results and see if your results match theirs, or (2) use their model to get factor scores for your data.
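For example, in base R you could fit your own five-factor model and eyeball it against theirs. A minimal sketch, assuming the reverse-scored items have already been flipped and the 23 items sit in numeric columns item1 through item23 of a data frame called responses (those names are placeholders, not anything from the paper):

items <- na.omit(responses[, paste0("item", 1:23)])         # factanal needs complete cases
fit <- factanal(items, factors = 5, scores = "regression")  # maximum-likelihood factor analysis
print(fit$loadings, cutoff = 0.3)  # compare these loadings against the published ones
head(fit$scores)                   # per-respondent factor scores under this fit

For option (2), you'd instead weight each item by the published loadings rather than re-estimating them from your own sample.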
posted by damayanti at 3:23 PM on February 29, 2016
The completed sheets would be used to create a survey dataset. There are a lot of ways to do data entry, but unless you have thousands of questionnaires, this one would probably just be hand-typed into Excel by a lowly research assistant, with a row for each response and a column for each variable. The first question would become ten variables: nine to hold the ranking (1, 2, 3, or missing) for each possible response, plus one for the text of the other-specify response. Then you'd type a number into each of twenty-three columns, one per Part 2 question, where 1 = Strongly Disagree, 2 = Disagree, et cetera. Depending on how much analysis you're planning to do, you'd import this into stats software like R or SAS or SPSS or Stata, or just leave it in Excel.
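If you went the R route, pulling the hand-entered sheet in is quick. A minimal sketch, assuming the spreadsheet was saved out as rrpq.csv with one row per respondent (the file and column names are made up):

responses <- read.csv("rrpq.csv", stringsAsFactors = FALSE)
str(responses)  # check that the 23 Likert columns came in as numbers 1 through 5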
Once you have all your data together, you'd recode the eight questions listed in the "Reverse Score Items" so that 1 = 5, 2 = 4, 4 = 2, and 5 = 1 (3 stays put), so that the direction (higher number = better) of the scale is the same for all 23 questions. (By the way, the five-point agree/disagree thing is called a Likert scale.) Then those five factors would be constructed from the listed questions. There are no specifics about how you'd do that, but I suspect the authors just used the mean. So, for example, the Participation Factor would be the average of the numeric responses to questions 14, 15, 17, and 21, calculated for each response. Sometimes questions are given different weights, but I don't see anything about that here.

There are some good arguments for why averaging Likert scores isn't a great idea, mathematically speaking - for example, there's no reason to think that the 'distance' between Strongly Agree and Agree is the same as the distance between Agree and Neutral, and Neutral is often a synonym for "I don't know or care," which arguably shouldn't feed into an indicator calculation - but it's done all the time anyway.
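In R, the recode is short, because flipping a 1-to-5 scale is just subtracting from 6. A minimal sketch, assuming the items are numeric columns item1 through item23 in a data frame called responses (placeholder names):

reverse <- c(3, 5, 6, 10, 16, 18, 19, 20)  # the "Reverse Score Items"
responses[paste0("item", reverse)] <- 6 - responses[paste0("item", reverse)]  # 1<->5, 2<->4, 3 unchanged
responses$participation <- rowMeans(responses[paste0("item", c(14, 15, 17, 21))])  # suspected scoring: mean of the factor's items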
posted by theodolite at 3:23 PM on February 29, 2016 [1 favorite]
Response by poster: Thank you, everyone, for the explanations and the paper from damayanti! It's certainly clearer now.
posted by michaelh at 12:54 PM on March 1, 2016 [1 favorite]
First, I would get the data into a database. If it was a large survey, I would design the form so that I could use Teleform (or something like it) to read and extract the data. If it was a small survey, I would hire someone to do data entry, probably some form of double data entry. Either way, at the end, I would have a database that had all the responses in it.
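As a sketch of what the double-entry check could look like in R, assuming both passes were keyed into identically laid-out CSV files (the file names are made up):

entry1 <- read.csv("entry1.csv")
entry2 <- read.csv("entry2.csv")
which(entry1 != entry2, arr.ind = TRUE)  # row/column positions where the two passes disagree (cells that are NA in either file are skipped)

Anything that turns up gets checked against the paper form before the database is declared final.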
Second, I would transform the data using the rules on the last page. The way the questions are written, most of the items are coded from a less desirable outcome to a more desirable outcome. For example, for the first item -- "I gained something positive" -- strongly agree is likely the more desirable outcome. But 8 of the items (3, 5, 6, 10, 16, 18, 19, and 20) run the other way -- they are coded from a more desirable outcome to a less desirable outcome. For example, for item 3 -- "The research raised emotional issues" -- strongly disagree is likely the more desirable outcome. So I would write a series of recodes to flip those items around. The code (SAS-style, here) would look something like this, although there are shorter ways of doing it.
IF ITEM3=1 THEN ITEM3R=5;
ELSE IF ITEM3=2 THEN ITEM3R=4;
ELSE IF ITEM3=3 THEN ITEM3R=3;
ELSE IF ITEM3=4 THEN ITEM3R=2;
ELSE IF ITEM3=5 THEN ITEM3R=1;
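/* A shorter equivalent, since the recode just mirrors the scale around 3: ITEM3R = 6 - ITEM3; */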
Third, once the data for those items were transformed, I would create subscales for the 5 subfactors that are listed (Participation Factor, Personal Benefits Factor, etc.). There are two basic approaches. You could add up the items in each factor (so for participation, add the values for items 14, 15, 17, and 21), giving a scale that ranges from 4 to 20, or you could take the mean of the four items, giving a scale that ranges from 1 to 5. Whichever way you do it, you'd have to first decide what to do about people who didn't answer one of the items -- throw them out? Impute a value? Do something else? How you handle missing values, and which way you calculate the score (sum versus mean), are partly a matter of what the established literature for this questionnaire does and partly a matter of professional judgement.
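In R, the two options (and the missing-data decision baked into each) might look like this. A minimal sketch, assuming the reverse-scored items have already been flipped and using the data frame and column names from above (placeholders):

part <- responses[, paste0("item", c(14, 15, 17, 21))]
responses$part_sum  <- rowSums(part)                 # 4 to 20; any missing item makes the whole score NA
responses$part_mean <- rowMeans(part, na.rm = TRUE)  # 1 to 5; a missing item is quietly averaged over

Note that na.rm = TRUE is itself an imputation decision -- it averages over whatever items the person did answer -- which is exactly the kind of judgement call I mean.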
Professional judgement is the tricky part -- this looks like a self-administered form to me, which means no one read the questions to the person responding or entered their answers for them. There will be a LOT of missing data on a self-administered form, so that will be a huge issue to grapple with, because not answering could mean many different things.
There are of course other ways to score this kind of scale, using more sophisticated statistical techniques. The only way to know if one of those methods is required is to be informed on the research literature for this questionnaire.
I took a quick peek at some journal articles for this measure and couldn't immediately find any more precise rules for scoring it. I'm sure they exist; they just didn't turn up quickly.
posted by OrangeDisk at 3:21 PM on February 29, 2016 [1 favorite]

This thread is closed to new comments.