The Gambler's Fallacy, of course. But maybe...
January 8, 2014 9:37 AM

A coin flips three times and comes up heads 2/3. Not suspect. But a coin flips 100,000 times and comes up heads 2 out of 3 times, that starts to look fishy. The standard probability of this is always roughly 50-50, but assuming a 2/3 ratio pointing to a "rigged" coin, how could you plot the increasing likelihood that a given coin is rigged?

I've always come back to this when people discuss the Gambler's Fallacy. "The probability of x is ALWAYS 50/50." But how does that probability change, as the likelihood that a rigged coin is in play increases?

The Gambler's Fallacy says the probability of a pull never changes, regardless of streaks. But suppose the streak points to a hidden fact, again, over a large number of pulls. How could those streaks be used to point to a statistical anomaly, or signify that perhaps the playing field isn't as level as it could be?
posted by ASoze to Grab Bag (17 answers total) 5 users marked this as a favorite

Would some element of the CLT be relevant here, with the conventional expectation that things should normalize somewhere between 30-50 observations?
posted by .kobayashi. at 9:46 AM on January 8, 2014

This is what p-value is - how likely it is that some result occurred by chance.
posted by ssg at 9:49 AM on January 8, 2014 [3 favorites]

Yup, p-value; this concept is also referred to with the term "statistical significance" - that is, you'd be looking for a result that is statistically significantly different from 50/50. This also takes the sample size (the number of flips of the coin) into consideration.
posted by entropone at 9:51 AM on January 8, 2014 [2 favorites]

The difference between the observed proportion of heads and the true, unknown proportion approximately follows a normal distribution with a known variance, and the approximation improves as the number of flips increases. Moreover, the variance decreases as the number of flips grows. Therefore, as the number of flips increases, if the observed proportion of heads stays constant (say at 2/3), then the strength of the evidence that the coin is not fair increases.
posted by deadweightloss at 9:51 AM on January 8, 2014 [1 favorite]
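
One way to see the effect described above is to hold the observed proportion at 2/3 and watch the z-score grow with the number of flips. This is only a sketch using the normal approximation to the binomial; the function name `z_score` and the choice of flip counts are illustrative assumptions, and the approximation is poor for very small samples such as 3 flips.

```python
import math

def z_score(heads: int, flips: int, p_null: float = 0.5) -> float:
    """Standardized distance between the observed heads proportion and p_null."""
    observed = heads / flips
    # Standard error of a proportion under the null hypothesis: sqrt(p(1-p)/n).
    std_error = math.sqrt(p_null * (1.0 - p_null) / flips)
    return (observed - p_null) / std_error

# Hold the observed proportion at 2/3 and let the number of flips grow.
for flips in (3, 30, 300, 3_000, 30_000):
    heads = 2 * flips // 3
    print(f"{flips:>6} flips, {heads:>6} heads: z = {z_score(heads, flips):7.2f}")
```

Because the standard error shrinks like 1/sqrt(n), the z-score for a fixed 2/3 proportion grows like sqrt(n): a hundred times as many flips gives ten times the z-score.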

See also the aptly named Wikipedia article Checking whether a coin is fair.
posted by bassooner at 9:53 AM on January 8, 2014 [1 favorite]

A frequentist would say: the coin is rigged, or it is not; whether the coin is rigged is not up to chance. But you can calculate the probability of observing results at least as extreme as yours, given a fair coin, and if that probability is low enough, you can reject the idea that this is a fair coin. More explanation here.

A Bayesian would say: the probability here is about our belief. Our belief at first is that the coin could be rigged to give any frequency of heads vs. tails. Later observations will cause us to refine that belief.
posted by Jpfed at 9:54 AM on January 8, 2014 [4 favorites]

A coin, by definition, isn't necessarily fair. Just because it has two sides and is called a coin doesn't make it fair. What makes it fair is that the share of flips landing on each side approaches 50% the more you flip it.

So if you flip a coin and continue getting one side significantly more than the other, it is NOT a fair coin.
posted by hal_c_on at 10:00 AM on January 8, 2014

I'd like to know more about the frequentist vs Bayesian thing. Because I got beat up in a conversation only recently about this exact thing. Here was the puzzle at hand:

A jar has 1000 coins, of which 999 are fair and 1 is double headed. Pick a coin at random, and toss it [the same coin] 10 times. Given that you see 10 heads, what is the probability that the next toss of that coin is also a head?

So apparently I'm a frequentist because I insisted that p(heads) was 0.505 (.999 chance of a fair coin, .001 chance of a rigged heads coin). The fate was decided when you picked the coin. The Bayesians insisted this was totally wrong and the probability was much higher since it's now a much higher certainty than before that the coin is the rigged one.

Can there really be a difference here?
posted by JoeZydeco at 10:06 AM on January 8, 2014 [1 favorite]

I don't know a whole lot about statistics, but I am versed in combinatorics. You can calculate exactly how likely such an outcome is to occur with a fair coin in the following way:

First, the number of ways you can get 33,334 tails out of 100,000 flips is computed with this formula:

n!/(k!(n-k)!)

Where n is the total number of flips, and k is the number of tails. So, 100,000!/(33,334! * 66,666!), which is nowhere near 2^100,000, the number of all possible outcomes of your coin-flipping spree. So the probability of getting exactly that number of tails is (100,000!/(33,334! * 66,666!)) / 2^100,000.

So, suppose you only get suspicious of a coin's fairness when it comes up tails 1/3 of the time or less. If you want to know how likely a rigged coin is, what you really want is to count the number of ways you can get AT MOST 33,334 tails. Ergo:

sum k=0 to 33,334 (n!/(k!(n-k)!))

where n is 100,000 again, and then divide THAT by 2^100,000 to get the probability that a fair coin would behave in such a way. (Wolfram Alpha tells me it's about 1.2 * 10^-2462, i.e. stupefyingly small.)
posted by Androgenes at 10:26 AM on January 8, 2014 [2 favorites]
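
The counting argument above can be carried out with Python's arbitrary-precision integers. This is a minimal sketch, not the Wolfram Alpha computation mentioned in the post: the function name `log10_lower_tail` is an illustrative assumption, the sum starts at k = 0 so that "at most" includes zero tails, and the result is reported as a base-10 logarithm because the probability itself is far too small for a float.

```python
import math

def log10_lower_tail(flips: int, max_tails: int) -> float:
    """log10 of P(tails <= max_tails) for a fair coin, using exact integer counts."""
    term = 1   # C(flips, 0)
    total = 0
    for k in range(max_tails + 1):
        total += term
        # C(n, k+1) = C(n, k) * (n - k) / (k + 1); the division is always exact.
        term = term * (flips - k) // (k + 1)
    # Dividing by 2**flips would underflow a float, so subtract logarithms instead.
    return math.log10(total) - flips * math.log10(2)

log10_p = log10_lower_tail(100_000, 33_334)
print(f"P(at most 33,334 tails in 100,000 fair flips) is about 10^{log10_p:.2f}")
```

The exponent should land in the neighborhood of -2462, the same order of magnitude as the figure quoted above.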

So apparently I'm a frequentist because I insisted that p(heads) was 0.505 (.999 chance of a fair coin, .001 chance of a rigged heads coin). The fate was decided when you picked the coin. The Bayesians insisted this was totally wrong and the probability was much higher since it's now a much higher certainty than before that the coin is the rigged one.

I think I'll take the other side of this one in two different ways.

1) If you carry your argument out to an infinite number of tosses do you still think p(heads) would be 0.505? If you get 1000 consecutive heads will you still think the next flip is just a bit better than 50/50 for heads?

2) If you get a tails in your 10 flips, you can exclude the double headed coin and would agree that p(heads) was 0.500. That uses information after you picked the coin to change the probability - if a tails flip can do that why can't a heads flip?
posted by true at 10:28 AM on January 8, 2014 [4 favorites]

So apparently I'm a frequentist because I insisted that p(heads) was 0.505 (.999 chance of a fair coin, .001 chance of a rigged heads coin). The fate was decided when you picked the coin. The Bayesians insisted this was totally wrong and the probability was much higher since it's now a much higher certainty than before that the coin is the rigged one.

Let's say you toss the coin a million times. It comes up heads every single time. Do you still think the probability of the next toss being heads is 0.505? If I then offered you a bet that you would give me $10 if it came up heads and I gave you $1000 if it came up tails, would you take that bet? Call your model whatever you want (and I don't think a frequentist statistician would agree with your approach), but it's not one that is useful.
posted by grouse at 10:30 AM on January 8, 2014

This is what p-value is - how likely it is that some result occurred by chance.
More accurately, it is where observed data falls on a given known probability distribution function used for statistical testing purposes. This function is different for different kinds of statistical tests/data, and in your case of a binary type event like a coin flip, it's a binomial function.

So basically what you're asking for is a plot of the binomial distribution of a known fair coin against that of a suspect coin. The probability of a coin coming up heads X times out of N flips is described by a cumulative binomial distribution, and a fair coin would follow the distribution function with p=0.5 for any individual flip, over N flips. Streaks don't matter in the math, only total numbers of events (each flip being an event). You could visualize evidence that the coins are different - so for instance the difference between the blue and green dotted lines in the second figure in the wiki article would describe the distribution for a fair coin over 20 flips and an unfair coin with p=0.7 over 20 flips. Or you could take the known distribution for your fair coin, see where your X value for your suspect coin lies on that distribution, and assign a p value to that (different from the first p for the individual event I described), and that's what the binomial statistical test does. But the big point is that the binomial distribution does change with both p (the "fairness" of your coin) and N (number of flips) and describes the total range of what we could expect given N flips with a p coin, and this is the entire basis of all the stats stuff people are hinting at in the above answers. That's why it remains true that the probability for a given flip never ever changes with the same coin even though the cumulative probability of X heads out of N flips does change.

(on preview, the math Androgenes just laid out is also what the wiki is describing)
posted by slow graffiti at 10:32 AM on January 8, 2014
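
The comparison described above (a fair coin against a p=0.7 coin over 20 flips, then a p-value read off the fair-coin distribution) can be tabulated directly. This is a small sketch; the function names and the example count of 14 heads are illustrative assumptions, not values from the thread.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k heads in n flips) for a coin with heads probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def upper_tail(k: int, n: int, p: float) -> float:
    """P(at least k heads in n flips): the one-sided p-value when p is the null."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

n = 20
for k in range(n + 1):
    fair = binomial_pmf(k, n, 0.5)
    biased = binomial_pmf(k, n, 0.7)
    print(f"{k:>2} heads: fair {fair:.4f}   p=0.7 {biased:.4f}")

# One-sided p-value for a hypothetical observation of 14 heads in 20 flips.
print(f"P(>= 14 heads | fair coin) = {upper_tail(14, n, 0.5):.4f}")
```

The per-flip p stays fixed inside each column; what changes from row to row is only the probability of the total count.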

This is what p-value is - how likely it is that some result occurred by chance.

No, the p-value is the probability that you get a result like this or more extreme, given that the null hypothesis is true. See "misunderstandings" in the Wikipedia p-value article.
posted by grouse at 10:34 AM on January 8, 2014 [2 favorites]

So apparently I'm a frequentist because I insisted that p(heads) was 0.505 (.999 chance of a fair coin, .001 chance of a rigged heads coin).

No, a frequentist would consider the results of running the experiment many times (in contrast to the degree-of-belief calculation a Bayesian might perform). If you run your experiment one million times, for example, you'll pick the rigged coin 0.1% of the time (thereby obtaining a guaranteed 10 heads 1000 times) and a non-rigged coin 99.9% of the time (thereby obtaining the relatively rare 10 heads about 976 times). After the 10 heads, you have the rigged coin 50.6% of the time. The next flip comes up heads 75.3% of the time, not 50.5% of the time.

(As I understand it, a Bayesian would come up with the same result but would feel confident in expressing the state of your mystery coin in terms of a likelihood, even though its nature has already been decided.)
posted by Mapes at 12:13 PM on January 8, 2014 [2 favorites]
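
The jar puzzle can also be checked exactly with Bayes' rule and rational arithmetic. This is a sketch under the puzzle's stated setup (1 double-headed coin among 1000, 10 heads observed); the function names are illustrative assumptions.

```python
from fractions import Fraction

def posterior_rigged(prior_rigged: Fraction, heads_seen: int) -> Fraction:
    """P(double-headed coin | heads_seen heads in a row), by Bayes' rule."""
    prior_fair = 1 - prior_rigged
    like_rigged = Fraction(1)                  # a double-headed coin always shows heads
    like_fair = Fraction(1, 2) ** heads_seen   # a fair coin does so with probability 2**-n
    evidence = prior_rigged * like_rigged + prior_fair * like_fair
    return prior_rigged * like_rigged / evidence

def next_flip_heads(prior_rigged: Fraction, heads_seen: int) -> Fraction:
    """P(next flip is heads), averaging over which kind of coin you hold."""
    rigged = posterior_rigged(prior_rigged, heads_seen)
    return rigged * 1 + (1 - rigged) * Fraction(1, 2)

prior = Fraction(1, 1000)
print(posterior_rigged(prior, 10), float(posterior_rigged(prior, 10)))
print(next_flip_heads(prior, 10), float(next_flip_heads(prior, 10)))
```

The exact values are 1024/2023 for holding the rigged coin and 3047/4046 for the next flip being heads, which is the same counting as the million-run argument above without the rounding.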

Determining whether your coin is fair (p(H)=0.5) or rigged (p(H)=0.66) would be done by calculating the odds ratio, whether you were taking a frequentist or a Bayesian approach. (See also: Hypothesis testing, and the Wikipedia link about the Bayes factor.)

As your streak (or run of experiments) got longer and longer, the odds ratio would change to express your relative confidence between the two different hypotheses.
posted by RedOrGreen at 12:43 PM on January 8, 2014 [1 favorite]
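
A sketch of how that ratio grows with the length of the run, for two simple hypotheses. Several things here are assumptions made for illustration: the rigged coin is taken to have p(H) = 2/3 rather than 0.66, the prior is an even 50/50 between the two hypotheses, and the function names are made up.

```python
import math

def log10_likelihood_ratio(heads: int, tails: int,
                           p_rigged: float = 2 / 3, p_fair: float = 0.5) -> float:
    """log10 of P(data | rigged) / P(data | fair); the binomial coefficients cancel."""
    return (heads * math.log10(p_rigged / p_fair)
            + tails * math.log10((1.0 - p_rigged) / (1.0 - p_fair)))

def posterior_prob_rigged(heads: int, tails: int, prior_rigged: float = 0.5) -> float:
    """Posterior probability of the rigged hypothesis: prior odds times the ratio."""
    log10_odds = (math.log10(prior_rigged / (1.0 - prior_rigged))
                  + log10_likelihood_ratio(heads, tails))
    # Guard the extremes so the power below stays within float range.
    if log10_odds > 300:
        return 1.0
    if log10_odds < -300:
        return 0.0
    return 1.0 / (1.0 + 10.0 ** (-log10_odds))

for flips in (3, 30, 300, 3_000):
    heads = 2 * flips // 3
    tails = flips - heads
    print(f"{flips:>5} flips: log10 ratio = {log10_likelihood_ratio(heads, tails):8.2f}, "
          f"P(rigged) = {posterior_prob_rigged(heads, tails):.6f}")
```

With 2 heads in 3 flips the ratio is only 32/27, barely better than even; because the log ratio adds up flip by flip, the same 2/3 proportion over thousands of flips becomes overwhelming.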

I think Mapes explained the flaw in my logic the best. Thanks!
posted by JoeZydeco at 1:21 PM on January 8, 2014

Hmm ... I would start out by assigning a prior probability over the space of possible values for the bias of the coin. Use a beta distribution to do this. If you're like Laplace, you'll want a uniform or flat prior, which you get by setting both of the shape parameters of the beta distribution to one. (Usually the shape parameters are called "alpha" and "beta," which I find a bit unfortunate, but there you go.)

Then update your prior on the basis of your evidence. The beta distribution is a conjugate prior, so the update step turns out to be really easy: just add the number of observed successes to your initial assignment to the first shape parameter (alpha) and add the number of failures you observed to your initial assignment to the second shape parameter (beta).

In your case, you flip 100,000 times and see something like 66,667 heads (let's call those successes) and 33,333 tails (the failures). Then your posterior beta distribution has alpha equal to 66,668 and beta equal to 33,334. The posterior density is zero, up to machine error, over most of the range from 0 to 1, but zoomed in, it looks like this. If you know how to use R, you can use code supplied by John Kruschke here to get a 95% highest density credible interval for the bias towards heads, which turns out to be about (0.664, 0.670). A 99% HDI credible interval isn't much different: (0.663, 0.671). Given the evidence, I would conclude that the coin (or rather, the chance process -- which includes how the coin is flipped) is very probably biased.

Of course, if you were completely convinced prior to flipping the coin that the coin was fair, and I mean completely convinced, then no matter how many heads you saw as you flipped the coin, you would still maintain that the coin was fair. ;)
posted by Jonathan Livengood at 1:52 AM on January 9, 2014
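
The conjugate update described above takes only a few lines. The sketch below is in Python and is not the Kruschke R code the post refers to; it swaps in a normal approximation to a central credible interval, which should sit very close to the HDI here because the posterior is nearly symmetric with this much data. The function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def beta_posterior(heads: int, tails: int, alpha: float = 1.0, beta: float = 1.0):
    """Conjugate update: add heads to alpha and tails to beta (flat prior by default)."""
    return alpha + heads, beta + tails

def approx_credible_interval(alpha: float, beta: float, mass: float = 0.95):
    """Normal approximation to a central credible interval of Beta(alpha, beta)."""
    total = alpha + beta
    mean = alpha / total
    std = math.sqrt(alpha * beta / (total**2 * (total + 1.0)))
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    return mean - z * std, mean + z * std

a, b = beta_posterior(66_667, 33_333)
lo, hi = approx_credible_interval(a, b, 0.95)
lo99, hi99 = approx_credible_interval(a, b, 0.99)
print(f"posterior: Beta({a:.0f}, {b:.0f})")
print(f"95% interval: ({lo:.3f}, {hi:.3f})")
print(f"99% interval: ({lo99:.3f}, {hi99:.3f})")
```

Passing a sharply peaked prior instead of the flat one (large, equal alpha and beta) shows the closing point numerically: the stronger the prior belief in a fair coin, the more flips it takes to move the interval away from 0.5.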
