
# Coin flip odds?

January 3, 2008 8:47 AM

Statistics question: How can I calculate the odds of flipping a coin to get x consecutive heads anywhere within a data set of y flips. (more inside)


Here's the background: I was attempting to explain the phenomenon that unlikely outcomes become more likely to occur as the number of total events increases, to someone who attributes those unlikely outcomes to evidence of "unseen forces".

So my response was along the lines that if you flip a coin ten times, the odds of flipping ten heads are very slim (1023 to 1 against, I believe), but at some larger number of flips (N), the odds of having ten consecutive heads are even (1:1), and at some yet larger number of flips (M) the odds of *not* having ten consecutive heads are 1023 to 1 against.

1. How do I calculate N and M? and

2. Is there a name for this phenomenon? (It's similar to the law of large numbers, but not quite)

I'm not a statistics expert, so I don't have a formula that will calculate that, but what you are describing is called a "run".

For example, in TTHTHHHHTHH the longest run of heads is 4.

There was a similar statistics question posted earlier, and there are several suggestions for how to do statistics calculations.

posted by burnmp3s at 8:55 AM on January 3, 2008


Roughing this out on paper, I come up with:

odds of at least a 10 head run in a sequence of N tosses = sum(from i=0 to 9; 2**(N-10)) / 2**N

So, for 10 tosses = sum(1) / 1024

11 tosses = sum(2,2) / 2048

12 tosses = sum(4,4,4) / 4096

...

posted by cameradv at 9:05 AM on January 3, 2008


This isn't an exact computation (without thinking too hard about it, I think the exact computation would involve a straightforward but hard-to-type-into-Metafilter linear recurrence) but here's a seat of the pants lower bound. If there's something that happens with very small probability p, but it has T chances to happen, the chance of it NEVER happening is about e^(-pT). So for instance, if you flip 100,000 times, and break the flips into 10,000 bins of ten flips each, the chance of any particular bin being all heads is 1/1024 (p = 1/1024); but the chance that NONE of the 10,000 bins is all heads is about e^-(10,000/1024) which is about .00005. And that ignores the possibility that there might be a run of 10 heads which starts in one bin and finishes in another!

I think it would be correct to call this a manifestation of the law of large numbers, by the way. The LLN predicts that the number of 10-head runs in a large sample approaches the expected proportion, which is small, but -- here is the point -- nonzero.

posted by escabeche at 9:27 AM on January 3, 2008
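escabeche's e^(-pT) rule of thumb is easy to check numerically; here's a quick Python sketch of the bin argument (the variable names are mine, not from the comment):

```python
from math import exp

# escabeche's bin argument: 100,000 flips split into 10,000 disjoint
# bins of 10 flips each. A bin is all heads with probability 1/1024.
p = 1 / 1024
T = 10_000

exact = (1 - p) ** T       # exact chance that no bin is all heads
approx = exp(-p * T)       # the e^(-pT) approximation

print(f"exact  = {exact:.6f}")    # 0.000057
print(f"approx = {approx:.6f}")   # 0.000057
```

Both come out near the .00005 quoted above, and as noted this still ignores runs that straddle two bins, so the true chance of seeing a run is even higher.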


Free helping of pedantry: I would call this a probability question, not a statistics question. A statistics question would be more like "Somebody flipped a coin 100,000 times and reported no runs of 10 heads -- should I suspect something's wrong with this data?"

posted by escabeche at 9:28 AM on January 3, 2008


The probability of "n" consecutive heads is .5 ^ n. So for 10 consecutive flips, the probability is .5 ^ 10 or 0.0009765625. The probability of this not happening is 1 - (.5 ^ 10) or 0.999023438.

The probability of not seeing 10 heads in a row can be expressed as (0.999023438) ^ #attempts. Thus at toss 710, it becomes more likely than not that you will have seen at least one run of 10 heads -- (0.999023438 ^ 710 = 0.499724591). If you toss the coin 5,000 times you will see at least one run of ten heads 99.3% of the time. If you toss the coin 10,000 times you will see at least one run of ten heads 99.9942882% of the time.

posted by Lame_username at 9:41 AM on January 3, 2008


escabeche: I think your concept of dividing total flips into groups of ten is the right way to do this. Eleven flips can be seen as two overlapping groups of ten...twelve is three overlapping groups, and so on. Each group has an independent 1/1024 chance of being all heads, so at 521 total flips, I have 512 groups of ten, each with an equal chance of 1/1024, making the odds 512 in 1024, or even money.

posted by rocket88 at 9:47 AM on January 3, 2008


On closer look, it appears the above method is wrong, since the groups aren't independent (by nature of their overlapping).

posted by rocket88 at 9:54 AM on January 3, 2008


There's a fairly decent exposition of this problem here.

posted by Neiltupper at 10:55 AM on January 3, 2008


Represent a series of coin flips as a string (S) of H's and T's. Let S_l represent S with the rightmost character removed.

Define a function V of such a string as follows:

- V(null string) = 0
- for all other S,
-- if V(S_l) = 10, V(S) = 10
-- if V(S_l) < 10,
--- if the rightmost character of S is H, V(S) = V(S_l) + 1
--- if the rightmost character of S is T, V(S) = 0

(In English: V is the number of consecutive heads occurring at the *end* of the series of flips, unless there have been ten consecutive heads at any point in the series, in which case V is 10.)

Let P(v,N) be the probability that a series of N flips has a function V with value v.

Now we can define P(v,N) recursively:

P(0,0) = 1

P(v,0) = 0 for v = 1, 2, ..., 10

for N > 0:

P(0,N) = [P(0,N-1) + P(1,N-1) + P(2,N-1) + ... + P(9,N-1)]/2

P(v,N) = P(v-1,N-1)/2 for v = 1, 2, ..., 9

P(10,N) = P(9,N-1)/2 + P(10,N-1)

You can set these up in a spreadsheet, which I've done here. P(10,N) gives the probability that there is a run of at least 10 heads in N flips. The magic number N at which P(10,N) first exceeds 1/2 (i.e., has odds better than 1:1) is 1421. Google spreadsheets doesn't give you enough rows to find the point at which the probability exceeds 1023/1024 (i.e., has odds better than 1023:1), but if you do the same thing in Excel, that happens at 14131 flips.

Or, if you like matrix math, let W_N be the column vector (P(0,N), P(1,N), ..., P(10,N))^T. Then you can calculate it as W_N = T^N W_0, where W_0 = (1, 0, 0, ..., 0)^T and

T = | 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2  0 |
    | 1/2  0   0   0   0   0   0   0   0   0   0 |
    |  0  1/2  0   0   0   0   0   0   0   0   0 |
    |  0   0  1/2  0   0   0   0   0   0   0   0 |
    |  0   0   0  1/2  0   0   0   0   0   0   0 |
    |  0   0   0   0  1/2  0   0   0   0   0   0 |
    |  0   0   0   0   0  1/2  0   0   0   0   0 |
    |  0   0   0   0   0   0  1/2  0   0   0   0 |
    |  0   0   0   0   0   0   0  1/2  0   0   0 |
    |  0   0   0   0   0   0   0   0  1/2  0   0 |
    |  0   0   0   0   0   0   0   0   0  1/2  1 |

posted by DevilsAdvocate at 11:42 AM on January 3, 2008
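For anyone who'd rather not set up a spreadsheet, the recursion drops straight into code. A Python sketch (the function name is mine), using exact rational arithmetic so the values can be checked precisely:

```python
from fractions import Fraction

# DevilsAdvocate's recursion: P[v] holds P(v, N) for the current N.
# States v = 0..9 are the current run of heads at the end of the
# sequence; state 10 is absorbing ("a run of 10 heads has occurred").
def head_run_probabilities(max_n):
    half = Fraction(1, 2)
    P = [Fraction(1)] + [Fraction(0)] * 10        # N = 0: empty sequence
    out = []
    for _ in range(max_n):
        new = [Fraction(0)] * 11
        new[0] = sum(P[:10]) * half               # next flip is tails
        for v in range(1, 10):
            new[v] = P[v - 1] * half              # next flip is heads
        new[10] = P[9] * half + P[10]             # run completed, or already done
        P = new
        out.append(P[10])                         # out[N-1] = P(10, N)
    return out

probs = head_run_probabilities(1421)
print(probs[10])    # 3/2048 = P(10, 11)
print(min(n + 1 for n, q in enumerate(probs) if q > Fraction(1, 2)))   # 1421
```

Each pass of the loop is exactly one multiplication by the matrix T above, so this is the W_N = T^N W_0 computation in disguise.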

Lame_username has the answer.

Would like to adjust your lingo a bit, so that you can be correct in your claim:

*I was attempting to explain the phenomenon that unlikely outcomes become more likely to occur as the number of total events increases, to someone who attributes those unlikely outcomes to evidence of "unseen forces".*

An event does not become "more likely to occur" within the total possible number of events. Rather, you are more likely to __observe__ the event as you increase the number of observations. The probability of a penny falling on heads or tails is always theoretically the same: 0.5 for each side, no matter how many times you flip the penny. However, the likelihood that you will see the event increases with the number of observations.

Basically, the claim should be that as you watch more, you will see a wider range of outcomes occur, so you are more likely to see something unusual. If you spend a lot of time out under the open sky at night, you are more likely to see a shooting star. If you flip a coin a million times, you are more likely to see 25 heads in a row than if you flipped the coin only 100 times.

posted by zennie at 2:04 PM on January 3, 2008

(Dang, hit post prematurely.)

If you have seen something occur a lot of times-- for example, a coin flipped 5 times and not coming out 5 tails in a row-- it will appear, to you, to be an extraordinary/unusual event. Really, it's no more extraordinary than the coin landing in any specific mix of heads and tails.

T T T T T has the same probability of occurring as H T H H T.

The only difference is you notice it happened and perceive it to be unusual.

posted by zennie at 2:13 PM on January 3, 2008


*Lame_username has the answer.*

No, he doesn't. This can be demonstrated by considering the probability of a run of at least 10 heads in a series of 11 flips.

There are 2^11 = 2048 possible sequences of 11 flips, each equally likely. Only 3 of those 2048 contain a run of at least 10 heads:

HHHHHHHHHHH

HHHHHHHHHHT

THHHHHHHHHH

So any proposed exact answer had better give us 3/2048 for a sequence of at least 10 consecutive heads out of 11 coin flips. Lame_username's solution fails this test. As rocket88 notes, whether flips 1-10 are all heads, and whether flips 2-11 are all heads, are not independent of each other, so you cannot simply multiply probabilities.

posted by DevilsAdvocate at 2:41 PM on January 3, 2008
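The enumeration is small enough to check by machine; a brute-force Python sketch:

```python
from itertools import product

# Enumerate all 2^11 = 2048 equally likely sequences of 11 flips and
# count those containing a run of at least 10 consecutive heads.
seqs = ["".join(s) for s in product("HT", repeat=11)]
hits = sum("H" * 10 in s for s in seqs)

print(f"{hits}/{len(seqs)}")   # 3/2048
```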

*T T T T T has the same probability of occurring as H T H H T*

If you only flip the coin five times, sure.

If you flip the coin six times, seeing HTHHT at least once is in fact more probable than seeing TTTTT at least once.

Out of sixty-four possible sequences, four contain HTHHT:

HTHHTH

HTHHTT

HHTHHT

THTHHT

So the probability is 4/64 that you'll see the sequence HTHHT within a series of 6 flips.

But there are only three sequences which contain TTTTT:

TTTTTT

TTTTTH

HTTTTT

So the probability is only 3/64 that you'll see the sequence TTTTT within a series of 6 flips.

This is because TTTTTT has TTTTT "twice" within it, once on flips 1-5 and once on flips 2-6. There's no 6-flip sequence which has HTHHT twice within it. If we were considering the *number* of times you expected to see TTTTT or HTHHT within a sequence of 6 flips, you'd come up with 4/64 for both. But that's not what's being asked: the question is the probability of seeing the given sequence *at least once*, and in that case the probabilities are not necessarily equal.

posted by DevilsAdvocate at 2:55 PM on January 3, 2008 [1 favorite]
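Again, brute force over all 64 sequences confirms the counts; a short Python sketch:

```python
from itertools import product

# All 2^6 = 64 equally likely sequences of 6 flips.
seqs = ["".join(s) for s in product("HT", repeat=6)]

hthht = sum("HTHHT" in s for s in seqs)
ttttt = sum("TTTTT" in s for s in seqs)

print(f"HTHHT appears in {hthht}/64 sequences")   # 4/64
print(f"TTTTT appears in {ttttt}/64 sequences")   # 3/64
```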

First Google result for "probability of a run". It's a nice short paper but the math is not trivial. (Note that this paper is newer than the Weisstein paper that Neiltupper links to at MathWorld, and in fact cites it to say "The most detailed presentation of the theory of runs on the internet is to be found in Weisstein, but the explicit formula is not even mentioned there!")

In slightly clumsy notation, the explicit formula the paper claims to prove is:

y(n) = 1 - ( β(n,r) - p^r · β(n-r,r) )

where p := the probability of a single success, q := 1 - p, and

β(n,r) := the sum as L varies from 0 to int(n/(r+1)) of (-1)^L · b(n - Lr, L) · (q·p^r)^L

where b(n, k) := the binomial coefficient n!/(k!(n-k)!).

Unless this intimidates you a lot less than it does me, I'd give up on proving it with math, and go with a more intuitive inductive argument: as DevilsAdvocate shows it's easy by enumeration and division to show that a run of ten is more likely in 11 flips than in ten, and more likely in 12 flips than that; since the likelihood of a run increases with the number of flips, it will eventually become as likely as 1023/1024. (If he brings up Zeno's paradox or asymptotes, happy-slap him and run away.)

posted by nicwolff at 4:06 PM on January 3, 2008
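If you'd rather trust a computer than the algebra, the formula transcribes directly; a Python sketch (the function names are mine, using exact fractions), which reproduces the small cases enumerated earlier in the thread:

```python
from fractions import Fraction
from math import comb

# The explicit run formula quoted above: r is the run length,
# p the probability of a single success, q = 1 - p.
def beta(n, r, p, q):
    return sum(
        (-1) ** L * comb(n - L * r, L) * (q * p**r) ** L
        for L in range(n // (r + 1) + 1)
    )

def run_prob(n, r, p=Fraction(1, 2)):
    """Probability of at least one run of r successes in n trials."""
    q = 1 - p
    return 1 - (beta(n, r, p, q) - p**r * beta(n - r, r, p, q))

print(run_prob(10, 10))   # 1/1024
print(run_prob(11, 10))   # 3/2048
```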


For the purposes of demonstrating to your superstitious friend the general principle that remarkable events are more likely in larger samples, a computer simulation ought to suffice, and would probably be more convincing than the rather elaborate calculation of the probability you're asking about.

posted by Coventry at 7:12 PM on January 3, 2008
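Coventry's suggestion is easy to act on; here's a rough Monte Carlo sketch in Python (the names are mine). With a fixed seed it's repeatable, and the observed frequencies land near the exact probabilities computed above: rare at 100 flips, roughly even money at 1421, near-certain at 10,000.

```python
import random

def has_run(n_flips, run_len=10, rng=random):
    """True if n_flips simulated fair-coin flips contain a run of
    at least run_len heads."""
    streak = 0
    for _ in range(n_flips):
        if rng.random() < 0.5:          # heads
            streak += 1
            if streak >= run_len:
                return True
        else:
            streak = 0
    return False

random.seed(0)                          # fixed seed for repeatability
trials = 1_000
results = {}
for n in (100, 1421, 10_000):
    results[n] = sum(has_run(n) for _ in range(trials)) / trials
    print(f"{n:>6} flips: run of 10 heads in {results[n]:.1%} of trials")
```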


Just realized it's possible to significantly simplify my earlier solution by expressing everything in terms of P(10,N) and getting rid of all the other terms.

Since P(0,N) + P(1,N) + P(2,N) + ... + P(10,N) = 1, P(0,N) = 1/2[1 - P(10,N-1)] for N > 0.

Since P(v,N) = 1/2 P(v-1,N-1) for 1 ≤ v ≤ 9, we can also say P(v,N) = 2^-v P(0,N-v) for 1 ≤ v ≤ 9, N ≥ v.

From these along with the earlier relations, we get P(10,N) = P(10,N-1) + P(0,N-10)/1024 for N ≥ 10, and then P(10,N) = P(10,N-1) + ((1 - P(10,N-11))/2)/1024.

To simplify the expressions and show that this can be determined solely by the P(10,N) terms, let's define Q(N) = P(10,N). And we can express the probability as:

Q(N) = 0, for N < 10
Q(N) = 1/1024, for N = 10
Q(N) = Q(N-1) - Q(N-11)/2048 + 1/2048, for N > 10

It may be possible to find a closed-form solution for this, but that would involve finding the roots (real and complex) of 2048x^11 - 2048x^10 + 1, so I'll let someone else take a stab at that if they want to.

posted by DevilsAdvocate at 7:52 PM on January 3, 2008
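The simplified recurrence is even easier to compute, since it needs only the eleven most recent Q values; a quick Python sketch (exact arithmetic, function name mine) confirming both the 11-flip answer and the crossover at 1421:

```python
from fractions import Fraction

# Q(N) = P(10, N), built from the simplified recurrence alone.
def q_sequence(max_n):
    Q = [Fraction(0)] * 10 + [Fraction(1, 1024)]   # Q(0)..Q(9) = 0, Q(10) = 1/1024
    for n in range(11, max_n + 1):
        Q.append(Q[n - 1] - Q[n - 11] / 2048 + Fraction(1, 2048))
    return Q

Q = q_sequence(1421)
print(Q[11])                                # 3/2048
print(Q[1420] < Fraction(1, 2) < Q[1421])   # True: N = 1421 is the magic number
```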

*sigh* I should never attempt math when I don't have time to work through it.

posted by zennie at 7:53 AM on January 4, 2008


Natalie Angier once mentioned this very same problem in a lecture. She talked about a statistics professor who used this problem as a warm-up exercise. She divided the class into two groups. One group was told to flip a coin some number n of times and record the resulting sequence of heads and tails. The other group was asked to imagine that they were flipping a coin and write down an n-length sequence of heads and tails. She would leave the room while they were doing this so that there was no way for her to figure out which person was in which group. After collecting all the papers she would effortlessly divide the papers into two groups. How? Because the people who were faking it were invariably too careful not to include long runs of heads and tails. The truly random data invariably contained several long runs of the same type.

posted by peacheater at 7:24 PM on January 4, 2008


posted by signal at 8:53 AM on January 3, 2008

This thread is closed to new comments.