April 25, 2010 11:03 AM

Is there a probabilistic or scientific argument that reasons "the longer you've been waiting, the longer you can expect to wait?"

I'll start with a disclaimer: I don't know if the statement exists or even if it's correct. If it is, I don't know what it would be called. On the other hand, something suggests to me that it would come from queueing theory. I can also accept that this is nothing and I'm completely crazy.

I can't think of a good example, so please bear with me, especially the geologists in the room. Let's suppose we have volcanoes A and B. Volcano A erupted a little over a month ago, and volcano B's most recent eruption was just over three hundred years ago. One argument is that volcano B is long overdue for an eruption, so B will erupt next. The argument I'm asking about, though, suggests that we should expect to wait even longer for B, and concludes that A will erupt next.

One of the conditions for this argument to work is that we ignore history. If we know that B erupts every three hundred years or so, and that A erupts once per millennium, then the argument shouldn't apply.

So, these are the questions to be answered:

1. First and foremost, is this even correct?

2. If correct, when can it be applied, and when can it not?

3. If correct, is it better known with a proper name?

4. Is there a prototypical example that demonstrates this more clearly?

Thanks in advance!
posted by cavedirt to Science & Nature (14 answers total) 4 users marked this as a favorite


Actually, the above is wrong. Rather than the uninformative prior on the mean waiting times, we need a prior such that after some point larger waiting times are less likely. I think that works from there.

posted by PMdixon at 11:20 AM on April 25, 2010


A burn-in period is a similar concept. The failure rate for many systems follows a bathtub curve: either the system is defective, in which case it'll probably fail quickly, or else it's not, in which case it'll last a long time and then wear out.

I believe the essential thing here is that the event we're waiting for depends on (at least) two other events--one which becomes less likely over time, and one which becomes more likely.
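A minimal sketch of the early-failure half of that picture, assuming a hypothetical mix of defective units (mean life 1) and good units (mean life 100), each with exponential lifetimes. The numbers and the function name are illustrative assumptions, and wear-out is deliberately left out. It shows the expected remaining life going up the longer a unit has survived, because survival makes "defective" less plausible.

```python
import math

def expected_remaining_life(t, p_defective=0.5, mean_defective=1.0, mean_good=100.0):
    """Expected remaining life of a unit that has already survived to age t.

    Assumes a two-component mixture of exponential lifetimes (hypothetical values).
    """
    # Probability mass of each kind of unit that is still alive at age t.
    alive_defective = p_defective * math.exp(-t / mean_defective)
    alive_good = (1.0 - p_defective) * math.exp(-t / mean_good)
    # Posterior probability that a surviving unit is defective.
    w_defective = alive_defective / (alive_defective + alive_good)
    # Each exponential component is memoryless, so its remaining life is its mean.
    return w_defective * mean_defective + (1.0 - w_defective) * mean_good

for age in (0, 1, 5, 20):
    print(age, round(expected_remaining_life(age), 2))
```

Adding a wear-out term would eventually pull the expected remaining life back down, which is the right-hand side of the bathtub.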

posted by equalpants at 11:37 AM on April 25, 2010


It sounds like you are interested in memoryless distributions. The exponential distribution is memoryless: no matter how long you've been waiting, your expected remaining wait time is the same. Other distributions will not have this property. My Romanian probability professor used the example of wait times for a bus. In Romania, no matter how long you have waited for a bus the probability of it arriving in the next N minutes is constant. In the US, as you wait longer the probability of it arriving in the next N minutes increases.

I think the answer to questions 1 and 2 is that you need more information: the average time between eruptions is not sufficient; you need to know the distribution. For question 3, economists would say that it is a question of whether the hazard rate (the instantaneous probability of eruption given that none has occurred yet, as a function of time since the last eruption) is upward or downward sloping.
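To make the hazard-rate point concrete, here is a rough simulation sketch. The distributions, scale of 10, and shape values are assumptions picked for illustration, not anything measured. It estimates the average remaining wait given that you have already waited a while, for an exponential (flat hazard), a Weibull with shape 3 (rising hazard), and a Weibull with shape 0.5 (falling hazard).

```python
import random

def mean_residual_wait(samples, waited):
    """Average remaining wait among the samples that lasted longer than `waited`."""
    remaining = [x - waited for x in samples if x > waited]
    return sum(remaining) / len(remaining)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Flat hazard: exponential with mean 10 (expovariate takes the rate, 1/mean).
exponential = [rng.expovariate(1 / 10) for _ in range(n)]
# Rising hazard: Weibull with scale 10 and shape 3 ("overdue" intuition holds).
wearing_out = [rng.weibullvariate(10, 3.0) for _ in range(n)]
# Falling hazard: Weibull with scale 10 and shape 0.5 (longer wait, longer to go).
front_loaded = [rng.weibullvariate(10, 0.5) for _ in range(n)]

for name, data in (("exponential", exponential),
                   ("rising hazard", wearing_out),
                   ("falling hazard", front_loaded)):
    print(name, [round(mean_residual_wait(data, w), 1) for w in (0, 5, 10)])
```

Only the falling-hazard case behaves like "the longer you've been waiting, the longer you can expect to wait"; the exponential stays flat and the rising-hazard case goes the other way.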

posted by thrako at 11:53 AM on April 25, 2010


There is at least a reverse phenomenon that sounds scientifically robust to me: if you walk around and suddenly see a band playing on the street, then *on average* you arrive halfway through that concert. So if they play for one more hour, then *probably* the total concert duration was around 2 hours.

Mathematically, if events are randomly distributed over time, and you hit an event at a certain time, on average you hit it halfway through. So the best estimate of the duration before you arrived is the duration afterward.
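A quick sanity check of the "halfway on average" claim, as a sketch: it assumes a single concert of a fixed, hypothetical length (2 hours) and an arrival time that is uniformly random within it. Under those assumptions the average time already played and the average time still to come should both be close to half the total.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
duration = 2.0          # total concert length in hours (hypothetical)
n = 100_000

# Each passer-by arrives at a uniformly random moment during the concert.
elapsed = [rng.uniform(0.0, duration) for _ in range(n)]

mean_elapsed = sum(elapsed) / n           # time played before arrival
mean_remaining = duration - mean_elapsed  # time left after arrival

print(round(mean_elapsed, 3), round(mean_remaining, 3))
```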

posted by willem at 12:00 PM on April 25, 2010


What you're asking is, are there situations where events are dependent such that each event reduces the probability of the event?

Well of course.

Most commonly when there is a finite limit to the occurrence of events. An obvious example would be the probability of drawing aces from the deck:

- from a full deck, the likelihood of drawing an ace is 4/52 (7.7%)

- if you draw an ace immediately, the likelihood of drawing an ace again then becomes 3/51 (5.9%)

- if you don't draw an ace immediately, the likelihood of drawing an ace increases to 4/51 (7.8%)

With each successive draw, the likelihood of the next draw being an ace increases if you didn't just draw an ace, and decreases if you just drew an ace.
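The three figures above can be checked with exact fractions. This is just a small sketch; the helper name `ace_probability` is made up for illustration.

```python
from fractions import Fraction

def ace_probability(aces_left, cards_left):
    """Chance that the next card drawn is an ace, given what remains in the deck."""
    return Fraction(aces_left, cards_left)

p_full_deck = ace_probability(4, 52)      # nothing drawn yet
p_after_ace = ace_probability(3, 51)      # first card drawn was an ace
p_after_non_ace = ace_probability(4, 51)  # first card drawn was not an ace

for label, p in (("full deck", p_full_deck),
                 ("after an ace", p_after_ace),
                 ("after a non-ace", p_after_non_ace)):
    print(f"{label}: {p} = {float(p):.1%}")
```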

posted by randomstriker at 12:01 PM on April 25, 2010


Both PMdixon and equalpants are correct, depending on the specific situation at hand. The essential concept is memorylessness. A process is memoryless if, essentially, "the amount of time that I've been waiting is independent of the time to the next event".

If you have a process which is not memoryless, such as equalpants describes, then by definition the amount of time you've been waiting so far gives you information about how long you have to go. A mixture distribution such as equalpants describes is one example of this, but there are many others.

Even if the process is memoryless, your inference can be correct IF you don't know the mean waiting time. (If you DO know the mean waiting time, then your inference is a fallacy called the gambler's fallacy.) The logic is as PMdixon describes.

Note that every prior has the property "after some point larger waiting times are less likely", as long as it allows that any waiting time in [0, inf) is possible.
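One way to see the unknown-mean case in numbers, as a sketch under assumptions that are mine and not from the thread: waiting times are exponential with an unknown rate, and the prior on that rate is a Gamma distribution with hypothetical shape 2 and rate 10. Seeing no event for `waited` time units updates the prior to Gamma(shape, rate + waited), and the expected remaining wait is the posterior mean of 1/rate, which is (rate + waited) / (shape - 1) when shape > 1.

```python
def expected_remaining_wait(waited, prior_shape=2.0, prior_rate=10.0):
    """Posterior expected remaining wait after `waited` time units with no event.

    Assumes exponential waiting times with a Gamma(prior_shape, prior_rate)
    prior on the event rate (both prior values are hypothetical).
    """
    if prior_shape <= 1.0:
        # The mean of 1/rate is infinite for shape <= 1.
        raise ValueError("prior_shape must be greater than 1")
    # No event for `waited` units multiplies the prior by exp(-rate * waited),
    # which turns Gamma(shape, prior_rate) into Gamma(shape, prior_rate + waited).
    posterior_rate = prior_rate + waited
    # Given the rate, the remaining wait is memoryless with mean 1/rate.
    return posterior_rate / (prior_shape - 1.0)

for waited in (0, 1, 30, 300):
    print(waited, expected_remaining_wait(waited))
```

With these particular assumptions the expected remaining wait grows linearly with the time already waited, even though the underlying process is memoryless once the rate is known.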

posted by sesquipedalian at 12:05 PM on April 25, 2010


Wow, thanks everyone for the replies so far! I'm not very familiar with statistics, so some of the answers are a bit beyond me, but I'll try to do enough reading so that I can understand all of them. If you don't mind however, I'd like to follow up with this question in the interim:

5. Does this mean that the probability that A erupts in the next year is the same as the probability that B erupts in the next year? If not, how do they compare?

posted by cavedirt at 12:55 PM on April 25, 2010


If we're only given what you said in the question--that A erupted last month and B 300 years ago--then we don't have enough information to know that.

posted by equalpants at 1:24 PM on April 25, 2010

I think the simplest way to look at it is this:

Assume that each volcano has some sort of "expected time between eruptions" or similar parameter we will call E.

Volcano A's E-value is somewhat unlikely to be larger because we observed an eruption recently - and as generic observers, we would be less likely to observe an eruption for volcanos with large E-values.

Volcano B's E-value is even more unlikely to be smaller, because it is extremely unlikely that a volcano that generally erupts frequently would go for 300 years without erupting.
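Here is a rough sketch of that reasoning in numbers. Everything in it is an assumption for illustration: eruptions are treated as a memoryless (Poisson) process, E is restricted to a small hypothetical grid of values with equal prior weight, and the "generic observer" effect is modelled by the density of time since the last eruption, exp(-t/E)/E. It then estimates the chance of an eruption within the next year for each volcano.

```python
import math

# Hypothetical candidate values for E, the mean years between eruptions.
CANDIDATE_E = [1.0, 10.0, 100.0, 1000.0, 10000.0]

def posterior_over_e(years_since_last):
    """Posterior weight of each candidate E, starting from equal prior weights."""
    # For a Poisson process seen at a random moment, the time since the last
    # eruption has density exp(-t/E)/E.
    weights = [math.exp(-years_since_last / e) / e for e in CANDIDATE_E]
    total = sum(weights)
    return [w / total for w in weights]

def prob_eruption_within(years_since_last, horizon=1.0):
    """Chance of an eruption in the next `horizon` years, averaged over E."""
    posterior = posterior_over_e(years_since_last)
    return sum(p * (1.0 - math.exp(-horizon / e))
               for p, e in zip(posterior, CANDIDATE_E))

p_a = prob_eruption_within(1.0 / 12.0)  # volcano A: erupted about a month ago
p_b = prob_eruption_within(300.0)       # volcano B: erupted about 300 years ago
print(p_a, p_b)
```

Under these assumptions A comes out far more likely than B to erupt in the next year; with real eruption histories or a rising hazard rate the comparison could easily change.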

posted by Earl the Polliwog at 3:01 PM on April 25, 2010


I have no statistics background either. But what I think you're talking about reminds me of something that I think about: "every No gets you one step closer to a Yes," which people say about resumes/job-hunting, for example. Well, if you know that out of one million job openings, one of them is guaranteed to accept you, then yes, every "no" is just one "no" out of the way and you're one step closer to a "yes." But if you have no clue whether you'll ever get a job offer, to me, every "no" makes it more probable that you'll never get a "yes," that something about you is inherently unemployable or some other reason. So that's kind of similar to your question about waiting for the volcanoes, as I understand it. If we know that the volcano is guaranteed to erupt at some point, then as time passes, as we wait, the closer we're getting to that inevitable eventual eruption. But if we have no idea whether the volcano will ever erupt, the longer we wait without seeing an eruption makes it more likely that it might never erupt--maybe it's dormant or what have you. But again, I have absolutely no statistics under my belt so this could be completely wrong.

posted by thebazilist at 3:47 PM on April 25, 2010


I'll offer an anthropic argument. Absent any other information, if you assume you are not special, then the time you have been waiting should not be much longer than the average wait time. Turning this around, it implies that the average wait time is as long or longer than the time you have been waiting. The longer the average wait time, the less likely it is that your wait will be over in the next t seconds. (I am assuming that the probability that the event "done waiting" will occur in the next infinitesimal amount of time stays constant and is independent of the time spent waiting so far). This reasoning may or may not be fruitful for individuals applying it.

posted by dsword at 5:07 AM on April 26, 2010


This thread is closed to new comments.

Suppose time between events is, say, exponentially distributed. (I think this argument or something like it works for other distributions.) If the only observation we have is that we have not observed an event in the past T units of time, then under the uninformative prior our MLE of the mean waiting-time before events is T. Hence we should expect to wait T longer.

I think this is mostly roughly correct. Real statisticians feel free to correct me.

posted by PMdixon at 11:12 AM on April 25, 2010