# How do I teach myself something when I don't even know what I want to learn?

August 25, 2011 7:29 AM

I think I need to learn about time series analysis. My statistics background is so mediocre that I'm not even sure this is the thing I actually need to learn. How far in over my head am I?

My level of statistics knowledge could be described roughly as follows: a semester of something like "applied statistics for grad students in the social sciences", a summer of doing ANOVA in SPSS for a psychology lab several years ago, several workshops on the use of R, and a lot of casual exposure to linear and logistic regression.

What I'm trying to do now is figure out if two variables that occur at different points in time influence each other. So I have X and Y, both of which would normally be treated as dependent variables and both of which are binary. I also have a few predictors that are known to influence each of these variables (different predictors for each). X and Y never occur at the same exact point in time, nor do they occur in any sort of regular pattern. The time is measured continuously.

What I want to know is whether the value of X (1 or 0) is influenced by the previous instance(s) of Y and/or the previous instance(s) of X itself, with the distance of those previous instances, as well as the values of those other predictors I mentioned, taken into account. In other words, does an instance of Y=0 make it more likely that the next time X occurs it will be 0? I would like to know the same thing for Y, but I have a feeling I need to do each separately. But is either even possible?
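A minimal sketch of how data like this could be laid out before any modeling, assuming the events are available as (time, variable, value) records. The `events` list and the `lagged_rows` helper are hypothetical names made up for illustration, and the numbers are invented. Each X event is paired with the most recent earlier Y and X events and the time elapsed since each, which is the "previous instance plus distance" information described above.

```python
# Hypothetical event log: (time, variable, value). Times are continuous and irregular.
events = [
    (0.4, "Y", 0),
    (1.1, "X", 0),
    (2.7, "Y", 1),
    (3.0, "X", 1),
    (5.2, "X", 1),
    (6.9, "Y", 0),
    (7.5, "X", 0),
]

def lagged_rows(events, target="X", other="Y"):
    """For each target event, attach the most recent earlier event of each variable."""
    last = {target: None, other: None}  # most recent (time, value) seen per variable
    rows = []
    for t, name, value in sorted(events):
        # Only emit a row once both a previous target and a previous other event exist.
        if name == target and last[target] is not None and last[other] is not None:
            rows.append({
                "time": t,
                "value": value,                    # outcome to be modeled
                "prev_other": last[other][1],      # value of the previous Y
                "gap_other": t - last[other][0],   # time since the previous Y
                "prev_self": last[target][1],      # value of the previous X
                "gap_self": t - last[target][0],   # time since the previous X
            })
        last[name] = (t, value)
    return rows

rows = lagged_rows(events)
for row in rows:
    print(row)
```

A table in this shape could then go into something like a logistic regression of `value` on the lagged values, the gaps, and the other predictors. Whether that model is adequate for the real data is exactly the kind of thing to check with a statistician.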

I took out "Applied Time Series Analysis: Vol. I Basic Techniques" (Otnes & Enochson 1978) but it looks...not so basic. Or so applied. It seems to emphasize spectral analysis, digital filters, Fourier transforms...perhaps "time series analysis" isn't really what I'm looking for? Should I be looking into something like hidden Markov models instead? Or something I've never even heard of?

Comments on what kind of statistics I need, plus pointers to easy-to-follow resources for learning about those statistics, would be very appreciated. I prefer to use R, so bonus points for anything that will walk me through it in R. If the gap between my background and how difficult of a problem this is makes you think that I'm in way over my head, please feel free to let me know. I can provide more information on what I'm trying to do if that would help.

I did find this old question from 2007 that bears some similarities to what I want to do. After reading it, I suspect the question I should really be asking is: how can I befriend a statistician and convince them to help me? Answers to that question will also cheerfully be accepted! (I'm at a university so there are surely statisticians lurking somewhere)

Your university might have a social science research center that employs statisticians for just this purpose. If not, can you talk to your academic advisor or another mentor? Ask them what they would do.

posted by k8lin at 7:55 AM on August 25, 2011

Best answer: Take a look at Granger causality and vector autoregression. Granger causality in its basic form is a test between two variables (do past values of X help predict future values of Y?). VAR can include more than two variables.
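To make the idea concrete, here is a rough sketch of a one-lag Granger-style F test in plain Python. It rests on assumptions that do not match the question as asked: the series are regularly spaced and continuous-valued, and the data are simulated so that x really does drive y. The helper names `ols_rss` and `granger_f` are made up for this illustration. The test compares a regression of y on its own past against one that also includes the past of x.

```python
import random

def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit, via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    a = [xtx[i] + [xty[i]] for i in range(k)]  # augmented matrix [X'X | X'y]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [ar - f * ac for ar, ac in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def granger_f(x, y):
    """F statistic for 'lagged x helps predict y beyond lagged y', using one lag."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    n = len(target)
    # One extra parameter in the numerator; three parameters in the full model.
    return (rss_r - rss_u) / (rss_u / (n - 3))

# Simulated data (an assumption for illustration): y depends on the previous x.
random.seed(0)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0, 1))

f_xy = granger_f(x, y)
f_yx = granger_f(y, x)
print(f"x -> y: F = {f_xy:.1f}")
print(f"y -> x: F = {f_yx:.1f}")
```

A large F in one direction and a small one in the other is the pattern the test looks for. In R, the `lmtest` package provides `grangertest()` for the same kind of test. For binary outcomes at irregular times, this only shows the logic and should not be used as-is.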

posted by logicpunk at 8:06 AM on August 25, 2011

Basic Econometrics by Gujarati and Porter is a good text. (I linked to the most recent edition, but any of the earlier editions are fine for your purposes).

Econometrics is mainly time series analysis; though the book's examples are mainly from economics, they should be directly applicable to your problem.

posted by shothotbot at 8:41 AM on August 25, 2011

I find it funny that ROU and I had the same phrase for time series analysis with an eye to causal inference: dark magic.

Your mentor should know how to get a stat collaborator / consultant. As mentioned above, depending on your dept this may be something that people in your field do all the time in a particular way. Some institutions have discipline-specific stat services, others lump everyone into a stat / applied stat department. It's wholly usual, even expected, for grad students / postdocs to consult on complex applied problems.

posted by a robot made out of meat at 8:44 AM on August 25, 2011

I'm not a stats guru but I'd call it a multivariate temporal point process.

posted by Ian Scuffling at 11:24 AM on August 25, 2011

Best answer: I have a tutorial on my hard drive from an old labmate of mine that may help you out a bit. It's a little "here's R" and a little "here's time course analysis" and a little silly, but it might help you out a bit. I'll ask her if she minds if I pass this along. If you're interested, memail me an e-mail address you'd like to receive files at.

posted by knile at 2:53 AM on August 26, 2011

This thread is closed to new comments.

Offer to buy the statistician lunch. I know in my department, poli sci, we have people who do time series pretty regularly.

posted by quodlibet at 7:43 AM on August 25, 2011