How can I work through this bad data?
April 28, 2008 8:18 AM

I'm a graduate student. My advisor asked me to do a favor for a researcher and clean his data for him and put it into SPSS from Excel. Upon looking at the data, I realized that it was totally screwed up. What can I do now?

Specifically, I realized that the undergrad research assistants misnumbered the questionnaires, either on the paper forms or when entering them into the computer (he had them enter the data into SurveyMonkey instead of directly into SPSS for some odd reason), and now I am quite certain that the pre- and post-test results under a given ID number DO NOT necessarily come from the same person.
Additionally there are hundreds of "doubles" (i.e. 2-3 different pre-tests or post-tests for 1 ID number and not always the same number of pre-tests and post-tests) and dozens of IDs with either a pre-test or a post-test but not both. Also, as a secondary problem, there is no consistency in the data entry (some RAs entered 0 for a blank, others put n/a, others put nothing) and LOTS of typos.

I've explained in a dozen different ways why the data is screwed up over the past few months. I've also recommended that the original questionnaires get RE-ENTERED directly into SPSS. The researcher is adamant that the data isn't screwed up (no matter how many ways I show him, like lists of doubles and lists of IDs with only 1 or the other test). He just wants me to "fix" it NOW.

I've already worked way more hours than I've gotten paid for. I have given him an SPSS file with all of the IDs which only had 1 pre-test and 1 post-test and a list of all of the IDs that have multiple pre-tests and post-tests and/or missing pre-tests and post-tests. He could do something with that, if he wanted to.
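For anyone wondering what that split looks like in practice, here is a minimal sketch in Python rather than SPSS. The field names `id` and `phase` and the sample rows are hypothetical, not taken from the actual dataset. It only counts records per ID; it cannot tell whether a pre-test and a post-test filed under the same ID really came from the same person.

```python
from collections import Counter

def split_by_completeness(records):
    """Separate IDs with exactly one pre-test and one post-test from all others.

    `records` is a list of dicts with hypothetical keys "id" and "phase"
    ("pre" or "post"). Returns (complete_ids, problem_ids), where problem_ids
    maps each problematic ID to its (pre_count, post_count).
    """
    counts = Counter((r["id"], r["phase"]) for r in records)
    complete, problem = [], {}
    for pid in sorted({r["id"] for r in records}):
        n_pre, n_post = counts[(pid, "pre")], counts[(pid, "post")]
        if n_pre == 1 and n_post == 1:
            complete.append(pid)
        else:
            problem[pid] = (n_pre, n_post)
    return complete, problem

# Made-up rows: 101 is complete, 102 has a doubled pre-test, 103 has no pre-test.
rows = [
    {"id": 101, "phase": "pre"}, {"id": 101, "phase": "post"},
    {"id": 102, "phase": "pre"}, {"id": 102, "phase": "pre"},
    {"id": 102, "phase": "post"},
    {"id": 103, "phase": "post"},
]
complete_ids, problem_ids = split_by_completeness(rows)
```

The second return value doubles as the list of doubles and orphans, with the counts that show why each ID was set aside.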

I have no reason to work with the researcher (not a professor mind you) again, except possibly as a TA (and TAing for this person is pretty terrible). I do like to be in people's good graces generally though and wouldn't want the faculty thinking that I bail on projects. I have tried to explain to my advisor that this situation really stinks, but he doesn't want to get involved. Also the original grad students that worked on this project have all abandoned ship because this person is tough to work with.

What can I do now? Should I send a final e-mail like this: "I've done all that I think that I can do with this data. Perhaps you can use the SPSS file that I gave you, get rid of all of the data, set SPSS to "label" mode and have some RAs re-enter the data? I realize that this is more RA time than you wanted, but with all of this great data in here, it is worthwhile to get it right."

I just don't know how to proceed. Please help.
posted by anonymous to Human Relations (13 answers total) 1 user marked this as a favorite
 
The way to proceed is to graciously bow out and do no more work. You have been much nicer and more helpful than this person deserves or has paid for (unclear if they paid or your department paid, but someone did).

At this point, they are playing on what they see as your character flaw ("I like to be in people's good graces generally though and wouldn't want the faculty thinking that I bail on projects"), which has kept you on board long after everyone else jumped ship. But even if they actually fix the damaged data, this is not something you need to waste your time fixing.
posted by Scram at 8:36 AM on April 28, 2008


Think of your own well-being. Are you about to take the fall for some dodgy data? Cover your bases and talk to your advisor and anyone whose judgment is important in the dept., so that when you lower the boom, the Researcher does not try to pin it on you and ruin your reputation. This means not being held hostage by the Researcher but pre-empting the Researcher from taking you down with him/her.
posted by jadepearl at 9:03 AM on April 28, 2008


Yeah, I think you know that the right thing to do is to walk away from this. You do not want to be associated with bad research.
posted by ob at 9:05 AM on April 28, 2008 [1 favorite]


Let me see if I can boil this down to the basic facts:

(1) Your adviser asked you to do an unpaid favor for another researcher.

(2) You gave it your best shot, but after putting in a good deal of time you find yourself unable to complete the task because the original data is not in good enough shape.

(3) You politely informed the researcher that this was the case.

(4) Said researcher demands that you complete the (uncompensated) task.

(5) Your adviser - who is responsible for getting you into this mess in the first place - "doesn't want to get involved."

This is an untenable situation in all respects. You ought to dissociate yourself from this researcher immediately, even if it creates some temporary bad feelings. A simple letter of the sort that you suggest will suffice: "I've done all that I think that I can do with this data, and cannot devote any more uncompensated time to this project. I recommend that you hire someone to re-enter the original questionnaires into SPSS."

In addition, make sure to return all information, documents and files to the researcher, and document this fact in writing. And, if possible, keep copies of all written correspondence with him. This researcher sounds like the kind of person who might try to blame you for the errors in the data or some other perceived slight, and you should do whatever you can to pre-emptively protect yourself from this.

You may also want to give some thought to your relationship with your adviser, who has done you two disservices. First, he asked you to do uncompensated work for a colleague who, I imagine, has a less-than-stellar reputation. This is Not Cool. Second, upon learning that you have been mistreated - which you have - your adviser has backed away from a situation of his own creation. This is Really Not Cool. While not technically unethical, this behavior is irresponsible at best, exploitative at worst.

Finally, if your university has an ombudsman, talk to them, if for no other reason than to record your side of the story with an impartial third party right now.
posted by googly at 9:06 AM on April 28, 2008 [3 favorites]


Summarize what you've found in writing. Deliver what you have done. Keep copies. Move on to the next piece of work. Don't apologize and don't feel bad.

If you fudge data, modify data, manipulate it without sound reason and traceability, in my book that is unethical. You can't make a silk purse out of a sow's ear. Ethics count. "Fix it" sounds like a command to be unethical.

One of my pet peeves for years has been when a manager tells me "I don't care why it doesn't work, just fix it". I've never quit a job over it, but I have quit working on problems that come with this command.

It's one thing if someone says, "I don't care how much it costs to fix it; just fix it". That implies an open command to do it right. Those, I jump on eagerly. The other type implies "I don't want to know what you are covering up; just cover it up".

Raw nerve.. grrrrr...
posted by FauxScot at 9:11 AM on April 28, 2008 [1 favorite]


Googly, unfortunately, this sort of stuff is pretty normal at research universities. I know grad students who have to watch labs for professors all weekend long or do all of the background work for an article without getting their name on it, for example. I've heard horror stories about baby-sitting, pet-sitting, house-sitting, etc.

The fact is, at big research universities, grad students ARE getting a free ride and, at least in my field, a yellow brick road to a job at another research university... so, it probably all works out in the end.
posted by k8t at 9:13 AM on April 28, 2008


The Ethical Guidelines for Statistical Practice might come into play:

http://www.amstat.org/profession/index.cfm?fuseaction=ethicalstatistics
posted by tiburon at 10:14 AM on April 28, 2008


k8t, you're right. This isn't that unusual. However, that doesn't make the OP's situation any more tenable, nor the adviser's behavior any more excusable. My main points are that the OP can take steps to make this situation better, and that the adviser's behavior in this case should give them a bit of pause as to how they might be treated in the future.
posted by googly at 10:37 AM on April 28, 2008


You need to go back to your advisor. Your advisor asked you for this favor and you performed the work as a favor to your advisor. Because of this, your advisor is already involved, and needs to continue being involved. Let your advisor know, among other things, that the amount of work involved in further cleaning this data set is going to interfere with the work that he/she is supposed to be supervising.

Tread carefully. You may not know the politics that have led to your being asked to do this scut work.
posted by ikkyu2 at 11:26 AM on April 28, 2008


Sounds like a difficult situation. If I were you, I would fix the parts that can be fixed, and leave the rest for the researcher to decide. I would also ask the researcher what he means by "fix" it. Does he simply mean replace the current missing variable codes, "0" and "n/a", with a unitary missing variable code? That sounds perfectly reasonable and quite ethical to me, and something that should take you only a few minutes to do. Ask the researcher what, specifically, he wants cleaned up, then do that specific task. Once that task is done, give him that file, and you're done with the job.
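A minimal sketch of that kind of recode, in Python rather than SPSS syntax. The sentinel `-99` and the token list are assumptions; use whatever the project's codebook says. Note that "0" is left alone by default, since a zero could be a real answer on some items. Whether the RAs' zeros mean "blank" is something to confirm with the researcher before recoding them.

```python
MISSING = -99  # hypothetical sentinel; use the code the project's codebook specifies

def recode_missing(value, blank_tokens=("", "n/a", "na")):
    """Return MISSING if `value` looks like a blank entry, else return it unchanged.

    Comparison ignores case and surrounding whitespace. "0" is NOT treated as
    blank unless the caller adds it to `blank_tokens`.
    """
    text = "" if value is None else str(value).strip().lower()
    return MISSING if text in blank_tokens else value

# Made-up column showing the inconsistent entries described in the question.
raw = ["3", "n/a", "", "0", None, " N/A "]
cleaned = [recode_missing(v) for v in raw]

# Only if the researcher confirms that 0 was used for blanks on this item:
zero_as_blank = recode_missing("0", blank_tokens=("", "n/a", "na", "0"))
```

Keeping the token list as a parameter means the decision about zeros is made per item and is visible in the code, which helps with traceability.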

Also, FYI, duplicate ID numbers don't necessarily mean that the data is messed up. Let's say you have 2 schools, and you assign each school a School ID. Then, you have students within schools, and each student is assigned an individual ID, starting from 1. So there would be two students with the ID of 1, but they are distinct students, and you can tell them apart in the data because you also have a school code saying which is which. You also have some students who were in school only at T1, and others who were in school only at T2, but you were ethically obliged to allow all students who wanted to participate to do so, so you have a bunch of students who only have a single administration. Also, for some types of designs, you WANT participants who only take part at T1 or T2, so that you can statistically test for a variety of confounds, including maturation, cohort effects, history, etc.
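To make the school example concrete, here is a small Python sketch. The field names `school`, `student`, and `phase` are hypothetical, and nothing in the question says this dataset actually has a school code. It counts repeats on the student ID alone and then on the combined key.

```python
from collections import Counter

def duplicate_keys(records, key_fields):
    """Return the sorted list of key tuples that occur more than once."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)

# Made-up rows: student 1 exists in both schools; student 2 in school A
# really was entered twice.
rows = [
    {"school": "A", "student": 1, "phase": "pre"},
    {"school": "B", "student": 1, "phase": "pre"},
    {"school": "A", "student": 2, "phase": "pre"},
    {"school": "A", "student": 2, "phase": "pre"},
]
by_student = duplicate_keys(rows, ["student", "phase"])
by_school_and_student = duplicate_keys(rows, ["school", "student", "phase"])
```

On the student ID alone, both students look like doubles. With the school code in the key, only the genuinely repeated record is flagged. If no second identifier exists in the data, this check cannot rescue the doubles.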

So, don't assume that the data is messed up unless you have confirmatory evidence and you know for certain that each T1 ID is supposed to have one and only one matching T2 record. Clean the parts that you can, give the researcher the entire cleaned dataset, and call it a day.

Good luck.
posted by jujube at 11:37 AM on April 28, 2008


Poorly managed research, bad data, and even worse researcher/PI - sounds just like a project we had a few weeks ago.

Here's how we usually handle situations like this. Granted, we're a corporate shop, so YMMV.

1. Document all the work you've done to date. Don't leave out any small detail - make a bullet point for everything.
2. Inventory all the issues/problems that are still outstanding down to the individual fields.
3. Make a rough estimate of the total time and resources (including personnel) that would be needed to get this bad data into usable form and ready for analysis. If you're still willing to do this work with the right financial/academic motivations, write that up at the end as well.
4. Take this whole package to your advisor. You need to show how much you've already done and explain that you've already gone above and beyond the duties of a simple "favor." Review your planned options with the advisor as well.
5. Set up a meeting with your advisor and the researcher. By now your advisor should be on your side and willing to moderate on your behalf since he/she was the one who asked for a favor from you. Present the whole package, then the options, and then let the researcher decide how he/she wants to proceed.
posted by junesix at 3:33 PM on April 28, 2008


I would first consider ikkyu2's point and make sure there are no strange political situations here.

Then, I'd make the very best of what you have, email the prof everything you have, and generally assume s/he is going to do the minimum possible in moving forward. Be sure to convey a sense of finality in your email. Your story is that you have completed the task requested, having cleaned what s/he sent to the maximum extent possible. I would cc: your adviser and ask him/her to send a confirmation email, to ensure there's no future funny business from the semi-shady professor.

Dear Professor,

I have cleaned the Project Data to the extent possible at this time, and I am emailing to send you the final files. Please let me know you received this email, since I know attachments can cause messages to bounce.

Attached are two files:
1. FileName#1 -- data with complete records, in SPSS format [the one you already sent]
2. FileName#2 -- data with missing or multiple pre- or post-tests, in SPSS format

I have a few concerns with this data in general, as we have discussed. File #1 may or may not be suitable for statistical analysis. I recommend that as a next step, you have a research assistant ensure that File #1 is correct by:
a)
b)
c)

If you decide to move ahead without taking those steps, I would note these sources of potential error with File #1 data in any report:
1. [write the sentences exactly as you would have them appear in a published paper]
2.
3.
[Optional sentence: Because I believe these issues are potentially so serious, please do not associate me with this data if you choose to go ahead without addressing them.]

Because the records in File #2 are incomplete and/or repetitive, they are not suitable for analysis. If the sample size needed for this project requires that these records be repaired, one of your research assistants could do this through the following steps:
a)
b)
c)

I hope this helps as you move forward. This is a very interesting and worthwhile project, and I wish you the best with the remainder of the research.

anonymous
posted by salvia at 3:39 PM on April 28, 2008 [3 favorites]


Professor (and grad student advisor) speaking: your advisor's willingness to send you out to do another researcher's bidding, and their subsequent unwillingness to cover your ass, speaks volumes. It's reasonable to put you on another study to help you grab an extra (easy?) publication or whatever. But once the situation turns sour, they bear some responsibility to help you step away gracefully. There's plenty of good advice above on the best way to walk away from this project. But if your advisor won't back you up if it goes sour, then you should re-assess whether she/he has your best interests at heart.

If they don't, my advice is to DTMFA.
posted by drmarcj at 8:25 PM on April 28, 2008

