What techniques should I use to tease out leading indicators from a set of data?
March 26, 2010 8:35 AM

This is a question for the statistics / data analytics junkies out there. What techniques should I use to tease out leading indicators from a set of data?

So let me set up the question. I have a large data set describing a bunch of individual objects that transition between a finite number of discrete states. I've reduced the data down to a list of state transitions, each saying that this object moved from this state to that state at this time. What I'm trying to do is determine whether any of these objects are "influencers": for example, objects that, when they move to a particular state, are consistently followed by a bunch of others.

Does anyone have any idea where to start looking into how to do this? While I lack statistics experience, I have a mathematics background, so I'm hoping I can handle any references you may have. I just don't know where to start looking. I can't even think of which terms I should Google!

Thanks!
posted by AaRdVarK to Science & Nature (4 answers total) 2 users marked this as a favorite
 
Best answer: Well, as a very simple start, you can test against a null hypothesis that the transitions of each pair of objects are independent. You can probably massage Pearson's chi-square test into something like this.
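
A rough sketch of what that could look like in Python, for one pair of objects. Everything here is an assumption for illustration: time is cut into equal windows, a "follow" means the second object enters the state in the window right after the first one does, and the names follow_table and chi_square_2x2 are made up. With counts this small the chi-square approximation is poor, and testing every pair means many comparisons, so treat the result as a screening score rather than a verdict.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table
        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def follow_table(leader_moves, follower_moves, n_windows):
    """Count windows by (leader moved at t?, follower moved at t + 1?).

    leader_moves / follower_moves: sets of window indices in which the
    object entered the state of interest (hypothetical input format).
    Returns (a, b, c, d) = (led & followed, led only, followed only, neither).
    """
    a = b = c = d = 0
    for t in range(n_windows - 1):
        led = t in leader_moves
        followed = (t + 1) in follower_moves
        if led and followed:
            a += 1
        elif led:
            b += 1
        elif followed:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Toy data: the follower always moves one window after the leader.
leader = {0, 3, 6, 9}
follower = {1, 4, 7, 10}

table = follow_table(leader, follower, n_windows=12)
stat = chi_square_2x2(*table)

# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
CRITICAL_5PCT_1DF = 3.841
print(table, stat, stat > CRITICAL_5PCT_1DF)
```

A statistic above the critical value only says the two objects' moves are not independent at that lag; it does not say which one is driving the other, so you would want to compare both orderings of the pair.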

Or, the semi-stats way that I'd opt for would be to run the chains themselves for a long time and look for correlations in the sequences of states each object goes through. You can google around for correlation of sequences; there's some literature on it.
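
One way to sketch the correlation idea in Python is a lagged cross-correlation between two objects' 0/1 "entered this state" series. This is only an illustration under assumptions: the (object, from_state, to_state, time) tuple layout, integer time steps, and the function names are all hypothetical, and the toy list only records entries into "active".

```python
from math import sqrt


def indicator_series(transitions, obj, state, n_steps):
    """0/1 series: 1 at each time step where `obj` entered `state`."""
    series = [0] * n_steps
    for who, _src, dst, t in transitions:
        if who == obj and dst == state and 0 <= t < n_steps:
            series[t] = 1
    return series


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Toy data: B enters "active" two steps after A does.
transitions = [
    ("A", "idle", "active", 0), ("B", "idle", "active", 2),
    ("A", "idle", "active", 5), ("B", "idle", "active", 7),
    ("A", "idle", "active", 10), ("B", "idle", "active", 12),
]

a = indicator_series(transitions, "A", "active", n_steps=15)
b = indicator_series(transitions, "B", "active", n_steps=15)

# Scan a few lags and keep the one with the strongest positive correlation.
scores = {lag: lagged_correlation(a, b, lag) for lag in range(5)}
best_lag = max(scores, key=scores.get)
print(best_lag, scores[best_lag])
```

A peak at a positive lag for (A, B) with no matching peak for (B, A) is the kind of asymmetry you'd be looking for in a candidate influencer.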
posted by devilsbrigade at 8:40 AM on March 26, 2010


Best answer: Liang et al.'s REVEAL algorithm might be of help.
posted by PMdixon at 8:52 AM on March 26, 2010


Best answer: I've never used it, as I don't much deal with time series, but IIRC this is one of the things vector autoregression is used for.
posted by ROU_Xenophobe at 10:27 AM on March 26, 2010


Best answer: Seconding devilsbrigade's suggestion. If you want to get fancier, you could try principal component analysis or neural network analysis.
posted by surfgator at 3:39 AM on March 27, 2010

