Why would raking/iterative proportional fitting be making my sample *less* efficient?
March 1, 2008 10:50 AM

Praying that some statistically-minded MeFites are online today! I'm trying to run a generalized raking (aka "iterative proportional fitting") algorithm on some survey data as a poststratification approach. The literature I've read suggests that this should both improve the quality of my estimates and potentially reduce my standard errors by a decent amount. Instead, my SEs are increasing by a factor of about 1.4 (ugh!!!). Can someone offer some suggestions as to why this might be happening?

What should I be looking for in my diagnosis of this problem?

More info, if it helps:
- My sample size is not overly small (n=1500 for a population of about 190000)
- I'm using R
- After some initial problems getting convergence on the full range of variables/levels I wanted to rake on, I've got it reduced to a simple combination yielding 12 total cells -- I just don't have the full joint population distribution for these, hence the raking.
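
For readers unfamiliar with the technique, here is a minimal sketch of what raking does when only the marginal population totals are known. It is plain Python rather than the R used above, and the function name `rake`, the variable names, and the toy 2x2 data are all hypothetical; it is an illustration under those assumptions, not the code actually run on this survey.

```python
def rake(rows, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting on unit-level weights.

    rows:    list of dicts, one per respondent, e.g. {"sex": "f", "age": "old"}
    margins: {variable: {level: known population total}}
    Returns a list of weights, one per respondent.
    """
    n = len(rows)
    # Population size, taken from the first variable's margin.
    pop = sum(next(iter(margins.values())).values())
    weights = [pop / n] * n  # start from equal weights
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each level of this variable.
            totals = {level: 0.0 for level in targets}
            for row, w in zip(rows, weights):
                totals[row[var]] += w
            # Rescale so this variable's weighted margin hits its target.
            for i, row in enumerate(rows):
                factor = targets[row[var]] / totals[row[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass over all variables barely moves the weights.
        if max_change < tol:
            break
    return weights


# Toy data (made up): 10 respondents, two raking variables.
rows = (
    [{"sex": "f", "age": "young"}] * 4
    + [{"sex": "f", "age": "old"}] * 2
    + [{"sex": "m", "age": "young"}] * 3
    + [{"sex": "m", "age": "old"}] * 1
)
margins = {
    "sex": {"f": 50, "m": 50},
    "age": {"young": 40, "old": 60},
}
weights = rake(rows, margins)
```

Note that the lone respondent in the sparse "m"/"old" cell ends up with a much larger weight than anyone else. Sparse cells like that are what drive the weights apart.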

I like the face validity of the new estimates I'm getting compared to those run on the non-raked data. But the increase in margin of error is bugging me, and limiting my ability to find meaningful subgroup differences. Any insight on how to wrassle this alligator would be much appreciated!
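
One diagnostic that speaks directly to the margin-of-error question is Kish's approximate design effect from unequal weighting, deff = n * sum(w^2) / (sum(w))^2, which is roughly the factor by which the variance is inflated; the SE is inflated by its square root. An SE increase of about 1.4 corresponds to a deff near 1.96, i.e. an effective sample size of roughly 765 out of 1500. The sketch below is plain Python with hypothetical function names, and Kish's formula assumes the weights are unrelated to the outcome, so treat it as a rough check rather than the variance estimate itself.

```python
def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2, equal to 1 + CV(w)^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Nominal n divided by the Kish design effect."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Toy weights (made up): equal weights give deff = 1; spread-out weights give more.
equal = [1.0, 1.0, 1.0, 1.0]
uneven = [1.0, 3.0]
deff_equal = kish_design_effect(equal)
deff_uneven = kish_design_effect(uneven)
```

If the design effect computed from the raked weights is close to 2, a handful of very large weights (often from sparse cells) is the likely culprit, and collapsing levels or trimming weights are the usual remedies. It is also worth checking whether the variance routine knows the weights were raked: when raked weights are treated as ordinary sampling weights, only the inflation from unequal weighting shows up and none of the gain from calibrating to known margins.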
posted by shelbaroo to Science & Nature (2 answers total)
 
The following answer is based on no knowledge whatsoever of the statistical technique you're using. However, it sounds like you are introducing a model with additional parameters, and you have had some problems getting estimates for those parameters to converge. Could your wider standard errors be a consequence of a model that is not well specified by your data (overfitting) relative to your original model?
posted by drdanger at 12:15 PM on March 1, 2008


Hmmmm. I work more in econometrics and I'm not very familiar with the technique you're using, but IIRC this approach generates a Hessian matrix, which can then be used to generate asymptotic estimators of standard error. Gosh, without getting out a text (away from my books at present), it almost sounds like your data is heteroscedastic. Have you tried visual inspection of the data series?
posted by Mutant at 12:37 PM on March 1, 2008

