Statistics are hard.
January 17, 2015 8:40 PM
How do you calculate the probability of something when it's not as simple as "do it a bunch of times"? Specifics inside.
I have a system. This system has one button, which always performs the same action. Occasionally, when you press the button, the system crashes and has to be reset.
I've set up an automated button-pusher, but the reset is a manual process. So I end up with several pieces of data that look like this:
- 31 button presses until it crashed
- 42 button presses until it crashed
- 27 button presses until it crashed
It seems to me that the probability of the crash happening isn't as simple as 3% (3/(31+42+27)), because my sampling stops abruptly when the crash occurs.
An analogy: You roll a die, and count the number of times it takes you to roll a 1. The odds of rolling a 1 are one in six, but I don't think you can expect an average of six attempts to roll a 1... Or can you?
Am I just overcomplicating this? I have no idea what sort of statistical language to even search for. All help appreciated, thanks!
It's actually a geometric distribution, and in fact the best estimate of the probability of failure is exactly the figure you're using. (To see the formula, scroll down to the "Parameter estimation" section.) This assumes that the probability of failure is the same for each press of the button, regardless of how long the system has been running to date. Note that in the Wiki article, what you're defining as a "failure" is what they view as a "success", and you're using the first version of the distribution described in the article.
It doesn't matter that the sampling abruptly stops when there's a failure, assuming that the probability of failure remains the same after a reset. Think of it this way: suppose that instead of crashing with probability p, the system dispensed a jellybean to you and kept running. Given your data, you would have ended up with 3 jellybeans after your single run of 100 trials. There's nothing stopping you from viewing your case as the same sort of thing: a concatenated run of 100 trials with 3 failures. I suspect that the fact that you're having to manually reset the system makes you think of each run as "new", but it's really not.
posted by Johnny Assay at 8:58 PM on January 17, 2015
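A minimal Python sketch of the estimate described in the answer above. It assumes each recorded count includes the press that caused the crash and that the crash probability is the same on every press. The function name `estimate_crash_probability` is made up for illustration.

```python
def estimate_crash_probability(presses_per_run):
    """Maximum-likelihood estimate of p for a geometric distribution.

    Assumes each entry counts presses up to and including the one
    that crashed, and that p is the same on every press.
    """
    if not presses_per_run or any(k < 1 for k in presses_per_run):
        raise ValueError("each run needs at least one press")
    crashes = len(presses_per_run)        # exactly one crash ends each run
    total_presses = sum(presses_per_run)  # all runs concatenated
    return crashes / total_presses


runs = [31, 42, 27]  # the three runs from the question
p_hat = estimate_crash_probability(runs)
print(p_hat)      # 0.03
print(1 / p_hat)  # estimated mean number of presses per crash
```

With only three crashes the estimate is very rough; more runs would narrow it down.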
The binomial distribution isn't quite the right model for this kind of problem -- check out the geometric distribution, which is a special case of the negative binomial with r=1.
posted by un petit cadeau at 8:59 PM on January 17, 2015
"An analogy: You roll a die, and count the number of times it takes you to roll a 1. The odds of rolling a 1 are one in six, but I don't think you can expect an average of six attempts to roll a 1... Or can you?"
You can, actually! The expected number of attempts to get a 1 is 1/(1/6), or 6 (from the definition of expectation for the geometric distribution).
posted by un petit cadeau at 8:59 PM on January 17, 2015
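For anyone who wants to see where the 1/p comes from, here is a short derivation. It assumes X counts the attempts up to and including the first success, each attempt succeeds independently with probability p, and q = 1 - p is shorthand introduced here.

```latex
\begin{align*}
E[X] &= \sum_{k=1}^{\infty} k\,(1-p)^{k-1}\,p
      = p \,\frac{d}{dq}\!\left(\sum_{k=0}^{\infty} q^{k}\right)\Bigg|_{q=1-p} \\
     &= p \,\frac{d}{dq}\!\left(\frac{1}{1-q}\right)\Bigg|_{q=1-p}
      = \frac{p}{(1-q)^{2}}\Bigg|_{q=1-p}
      = \frac{p}{p^{2}} = \frac{1}{p}.
\end{align*}
```

With p = 1/6 this gives E[X] = 6, matching the die example.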
Shoot, yes geometric, serves me right for answering at midnight.
posted by peacheater at 8:59 PM on January 17, 2015
Note that this kind of situation is often expressed by engineers as mean time between failures instead of as a single probability. This is done because many real-world failure situations don't follow a geometric distribution (where every button press has an equal chance of failure), but rather a different model (such as the bathtub curve, or just one where failures become more likely over time as a part wears out or a software memory leak eats up RAM).
To put it another way, suppose your system consists entirely of pressing the button over and over again until the button physically breaks. In this case, the probability of failure is not constant and independent; the first press of the brand-new button might have a medium probability of failure (maybe it's defective or installed wrong), the 50th press a lower probability of failure, and the 100th press has a high one because the button is worn out and the plastic cracked.
So the question you need to answer before you can calculate the probability is whether every button press is independent of the last (like rolling dice) or whether the crash becomes more or less likely as time passes. If you're not sure, the data might help you figure it out. Make a histogram of the number of presses until it crashes: do you get a steadily decaying (geometric) shape, which is what a constant per-press probability predicts, or something that peaks around certain ranges?
posted by zachlipton at 10:32 PM on January 17, 2015
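One way to run the check suggested above is to look at the empirical crash rate at each press number: of the runs that reached press k, what fraction crashed on it? If the probability is constant, that fraction should hover around one value; a wearing-out system would show it climbing. This is only a sketch, the function name `empirical_hazard` is made up, and the sample data in the usage lines is invented for illustration.

```python
from collections import Counter


def empirical_hazard(presses_per_run):
    """Return {k: fraction of runs reaching press k that crashed on press k}."""
    if not presses_per_run:
        raise ValueError("need at least one run")
    crashes_at = Counter(presses_per_run)   # how many runs ended at each press
    still_running = len(presses_per_run)    # runs that reached press k
    hazard = {}
    for k in range(1, max(presses_per_run) + 1):
        hazard[k] = crashes_at[k] / still_running
        still_running -= crashes_at[k]
    return hazard


# Invented toy data, not the asker's measurements.
toy_runs = [1, 2, 2, 3]
for press, rate in empirical_hazard(toy_runs).items():
    print(press, rate)
```

With only a handful of runs the per-press rates are extremely noisy, so in practice you would want many runs, or group presses into ranges (1-10, 11-20, and so on) before comparing.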
The situation you describe is actually pretty analogous to a question I answered in my very first MeFi comment. Basically, if you assume that each press of your button is an independent, random event, then it doesn't matter if you selectively choose when to end your sampling: as your number of trials approaches infinity, you'll still reach the right answer. (A property called consistency in statistics.)
posted by fifthrider at 12:07 AM on January 18, 2015
If the button always has the same chance of causing a crash, then yes, the probability of it crashing per press is 3/(31+42+27). If the system is more complicated (e.g., every time you press the button the system gets a little more unstable), then this is not the case.
Here's how it works, using your die example. Say you repeat your experiment of rolling until you get a 1 over and over, until you've rolled 6000 times in total and gotten about 1000 1s. You can find the average number of rolls-until-1 by adding up the number of rolls in each experiment and dividing by the number of experiments. But the rolls across all experiments add up to 6000, and the number of experiments was 1000 (one per 1 rolled), so the average number of rolls until a 1 is six.
This reasoning doesn't work if the chance varies based on the history of each experiment, which is why the first paragraph only applies if the button has the same chance of crashing every time.
posted by dfan at 5:58 AM on January 18, 2015
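The die argument above is easy to try out. The sketch below assumes a fair six-sided die simulated with Python's standard `random` module; the names `rolls_until_one` and `simulate` are made up for illustration. It runs many roll-until-1 experiments and computes both the average rolls per experiment and the pooled fraction of rolls that came up 1. The two numbers are reciprocals of each other by construction, which is the point of the argument.

```python
import random


def rolls_until_one(rng):
    """Roll a fair six-sided die until a 1 appears; return the roll count."""
    rolls = 1
    while rng.randint(1, 6) != 1:
        rolls += 1
    return rolls


def simulate(experiments, seed=0):
    """Run many roll-until-1 experiments and summarize them two ways."""
    rng = random.Random(seed)
    counts = [rolls_until_one(rng) for _ in range(experiments)]
    total_rolls = sum(counts)
    mean_rolls = total_rolls / experiments  # average rolls until a 1
    pooled_p = experiments / total_rolls    # 1s per roll, all rolls pooled
    return mean_rolls, pooled_p


mean_rolls, pooled_p = simulate(20_000)
print(mean_rolls)  # should land close to 6
print(pooled_p)    # should land close to 1/6
```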
Thanks, y'all. Maybe everything in statistics isn't needlessly complicated after all.
I appreciate the explanations and analogies!
posted by Dilligas at 8:30 AM on January 18, 2015
Probability assumes a model. If each event were unique, the concept of probability itself would be meaningless. Various models were suggested above: probability of a crash is the same for each push of the button, probability of a failure increases with each push of the button, the bathtub curve. But there are others. For example, probability of a crash decreases with each push of the button (like life expectancy increasing once you've made it to a certain age) or probability of a crash increases with each crash (say, if a crash damages it in some way) or probability of a crash increases over time independent of how often the button is pushed. The increases (or decreases) themselves can be linear or logarithmic or exponential or whatever. And, all of these models can be combined in various ways (e.g. crashes make future crashes more likely but it levels off after the tenth crash at which point it becomes increasingly unlikely.)
So you need to decide what your model will be. Then you could simulate it on a computer or calculate it directly (if it's simple enough).
posted by Obscure Reference at 8:30 AM on January 18, 2015
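As a rough illustration of "decide on a model, then simulate it", the sketch below compares two candidate models: a constant 3% chance per press, and a purely hypothetical wear-out model where the chance grows by 0.1 percentage points with every press. Both the numbers and the function names are assumptions made for the example, not anything measured from the asker's system.

```python
import random


def presses_until_crash(crash_probability, rng, max_presses=100_000):
    """Simulate one run; crash_probability(k) is the chance press k crashes."""
    for k in range(1, max_presses + 1):
        if rng.random() < crash_probability(k):
            return k
    return max_presses  # gave up; a real analysis should treat this as censored


def mean_run_length(crash_probability, runs=10_000, seed=0):
    """Average number of presses per run under the given model."""
    rng = random.Random(seed)
    total = sum(presses_until_crash(crash_probability, rng) for _ in range(runs))
    return total / runs


def constant(k):
    return 0.03  # same chance on every press


def wearing_out(k):
    return min(1.0, 0.001 * k)  # hypothetical: chance grows with each press


print(mean_run_length(constant))
print(mean_run_length(wearing_out))
```

Comparing simulated run lengths (not just their mean, but their whole spread) against the observed runs is one way to judge which model is plausible.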
posted by peacheater at 8:45 PM on January 17, 2015