Canadian calculus
April 9, 2010 10:26 AM

Mathfilter. Two part question. Bonus: involves hockey and beer!

So Bud Light is selling cases of 28 bottles (yes, 28) with NHL team logos on the caps. I swear I am collecting these for a friend (not a huge burden as I do enjoy a frosty BL)

So...

1. There are 30 teams in the NHL. I bought 3 cases and have 27 of the 30 teams. Am I lucky, unlucky or right on probability-wise?

2. How many more cases SHOULD get me to 30/30.

It seems to me each bottle has a 1/30 chance of being any given team, therefore 3/30 chance of being one of the 3 I need with diminishing odds as I approach 30. But 10% per bottle x 28/case obviously doesn't compute, just as if you roll a die 6 times you don't have a 100% chance of getting a given number. What am I missing? I know I should be able to figure this out but my brain is starting to melt and hoping a few peeps might have fun with this.
posted by raider to Education (25 answers total)
 
The answer to your question depends upon whether all teams are equally represented in the population. It further depends upon whether the beer distributor who sells beer to wherever you get your beer from has received a random sample of beer from the bottling plant.

All of which is to say that it is impossible for us to answer your question definitively.
posted by dfriedman at 10:30 AM on April 9, 2010 [1 favorite]


Building on what dfriedman said: Anheuser-Busch isn't doing this because they love hockey, they're doing it because they love to sell beer. And they know that people will attempt to collect all 30 teams. If I was in charge of this promotion, I'd figure out what the smallest-market or most-unpopular NHL team is (maybe one of the two teams in Florida), then make sure their caps were very, very underrepresented.
posted by box at 10:37 AM on April 9, 2010


It seems to me each bottle has a 1/30 chance of being any given team

This is not at all a given. There might be various reasons why they wouldn't have an even distribution of teams on the caps (due to relative popularity of the teams, cost to make each cap, purposely increasing the collectible value of some caps, etc.).

therefore 3/30 chance of being one of the 3 I need with diminishing odds as I approach 30. But 10% per bottle x 28/case obviously doesn't compute, just as if you roll a die 6 times you don't have a 100% chance of getting a given number.

The calculation you are looking for is basically the odds that you will not find the caps you are looking for in X new bottles. If you are really right about the 1/30 chance for each, the odds of each new bottle being useless right now is 27/30 = 9/10. So if you want to know the odds of coming up empty after buying X new bottles, calculate .9^X. For one new case of 28, that would be around a 5% chance of getting all teams you already have. If you get it down to one cap left that you need, you'll have a 40% chance of not getting it in a new case. Again, this all depends on the odds being 1/30 and not some lower number.
posted by burnmp3s at 10:39 AM on April 9, 2010
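
The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch under the same assumption the comment makes (all 30 teams equally likely, caps independent of each other); the function name `p_no_needed_cap` is made up for illustration.

```python
# Assumption: every cap is equally likely to show any of the 30 teams,
# independently of all other caps.
TEAMS = 30
CASE_SIZE = 28


def p_no_needed_cap(missing: int, bottles: int) -> float:
    """Probability that none of `bottles` new caps shows one of the `missing` teams."""
    return ((TEAMS - missing) / TEAMS) ** bottles


if __name__ == "__main__":
    # Three teams still missing: chance that a whole case is useless.
    print(f"{p_no_needed_cap(3, CASE_SIZE):.3f}")
    # One team still missing: chance that a whole case is useless.
    print(f"{p_no_needed_cap(1, CASE_SIZE):.3f}")
```

Under that assumption the two values come out to roughly 5% and 39%, in line with the "around 5%" and "40%" figures quoted above.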


I'm going to assume that you're correct about the 30 teams all being equally probable, and since you've ended up with 27/30 already then it must be true that duplicates are allowed within a single case.

As burnmp3s just said, what you do is figure out the probability that each additional bottle is not one of the 3 you still need, and multiply that probability by itself for each new bottle.

So there is a 90% chance (or .9 probability) that a single additional bottle is NOT one of the ones you need. For a case of 28, multiply .9 by itself 28 times (.9^28) and you get a 0.05 probability of not getting one of the ones you still need in the next case. Or 95% chance that you WILL.

So what are the chances you'll get all three of the missing ones in your next case? I think it's just .95 cubed, which is .85 roughly. So there's an 85% chance you'll get all 3 of the missing ones in your next case.

How about if you buy 2 more cases? Now there are 56 new bottles instead of 28. Going through the math again with that number of bottles, it looks like about a 99% chance you'll get the 3 missing ones in the next 2 cases.

Theoretically you might never get all of them, but after 2 more cases you're 99% sure you will. If all 30 are equally likely, which as others have pointed out, for marketing reasons they might not be. And I might also be wrong.
posted by FishBike at 10:46 AM on April 9, 2010


Assuming that each team is equally probable and independent of the team on any other bottle, it's easiest to look at it bottle by bottle, i.e., imagine that you're buying bottles one at a time, then convert back to cases.

The first bottle is guaranteed to give you a team you didn't have before. 1 team requires 1 bottle.

Once you have one team, the next bottle has a 29/30 probability of giving you a team you don't already have, and a 1/30 probability of the one team you already have. Once you have one team, you will have to purchase, on average, 30/29 (≈1.034) bottles to get your second team. Or, to put it another way, you would have to, on average, purchase 2.034 bottles to get two different teams. (That the average number of bottles you need to purchase to get a second team is the reciprocal of the probability of the next bottle having a new team may not be immediately obvious, but is left as an exercise for the reader.)

Once you have two teams, the next bottle has a 28/30 probability of having a team you don't already have, so it takes another 30/28 (≈1.071) bottles on average to get the third team, or a total of 3.106 (rounding) bottles on average to get three distinct teams.

Thus, to get 27 distinct teams, you would have to, on average, purchase 30/30 + 30/29 + 30/28 + 30/27 ... + 30/4 bottles. Or about 64.840 bottles. You've bought 84 bottles, so you're doing worse than average.

Given that you have 27 teams already, you would, on average, have to buy 30/3 + 30/2 + 30/1 = 55 more bottles to get the last three. 2 cases is 56 bottles.

Again, all this is based on the assumptions that all teams are equally likely and the team on any bottle is independent of the team on any other bottle, which may or may not be the case in practice.
posted by DevilsAdvocate at 10:48 AM on April 9, 2010 [2 favorites]
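
The bottle-by-bottle sums in the comment above are easy to reproduce. A minimal sketch, assuming equally likely and independent teams as stated there; `expected_bottles` is a hypothetical helper name, not anything from the thread.

```python
from fractions import Fraction

TEAMS = 30


def expected_bottles(have: int, want: int, teams: int = TEAMS) -> float:
    """Average number of bottles needed to go from `have` to `want` distinct teams."""
    # Holding k teams, the next bottle is new with probability (teams - k)/teams,
    # so the wait for the next new team averages teams/(teams - k) bottles.
    return float(sum(Fraction(teams, teams - k) for k in range(have, want)))


if __name__ == "__main__":
    print(round(expected_bottles(0, 27), 3))   # first 27 teams
    print(round(expected_bottles(27, 30), 3))  # last 3 teams
    print(round(expected_bottles(0, 30), 3))   # full set from scratch
```

The sum for the first 27 teams lands nearer 64.85 than 64.84, which agrees with the correction DevilsAdvocate posts later in the thread; the last three teams come out to exactly 55.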


Actually it occurs to me that, assuming the sample in question is randomly generated and is large it doesn't matter what the underlying population is like. We can assume that the sample is large but we can't know if it was randomly generated.
posted by dfriedman at 10:48 AM on April 9, 2010


So what are the chances you'll get all three of the missing ones in your next case? I think it's just .95 cubed, which is .85 roughly

No, it's not. That would be the probability that if he bought three cases, each one of the cases would contain at least one of the three teams not among his first 27. And even then, it doesn't guarantee that those three bottles are three different teams, just that they're not one of the first 27—you've also included the probability that two or all three are of the same team.
posted by DevilsAdvocate at 10:53 AM on April 9, 2010 [2 favorites]


or, just buy the set
posted by HuronBob at 10:53 AM on April 9, 2010 [2 favorites]


The relevant bit of statistics here is the central limit theorem http://en.m.wikipedia.org/wiki/Central_limit_theorem?wasRedirected=true
posted by dfriedman at 10:54 AM on April 9, 2010


So what are the chances you'll get all three of the missing ones in your next case? I think it's just .95 cubed, which is .85 roughly. So there's an 85% chance you'll get all 3 of the missing ones in your next case.

How about if you buy 2 more cases? Now there are 56 new bottles instead of 28. Going through the math again with that number of bottles, it looks like about a 99% chance you'll get the 3 missing ones in the next 2 cases.


I don't remember the real way to calculate this, but I think these calculations are incorrect. The calculation for figuring out the odds of one of the three missing caps being found (regardless of whether the others are or not) is just 1 - (29/30)^X. As I said above, there is a 40% chance of not finding a specific one in a case, which corresponds to a 60% chance of finding it. If that's true, your 85% chance can't be right, because the outcomes where all three are found are just a (relatively small) subset of the outcomes where one particular one is found, so it has to be significantly less than 60%. The 2 case calculation seems to be off for the same reason, because I get an 85% chance of finding one specific cap in two cases.
posted by burnmp3s at 10:57 AM on April 9, 2010


Response by poster: Right, I should've added that we'd need to assume they are printing equal numbers of each team and they are distributed randomly. Rather than, say, McDonald's doing a Monopoly promotion and having a few "properties" in extremely limited supply.

My missing teams are Buffalo, Washington and Columbus. I would've thought if the folks at Bud wanted to drive me crazy (not sure why they would -- it's not like you win anything for completing the set and how many dinks like me can there be who would devote themselves to the task) they'd pick more blue-chip teams. But obviously my sample is small.
posted by raider at 10:58 AM on April 9, 2010


Er, assuming equi-probable teams, this is just the Coupon collector's problem, no?
posted by PMdixon at 11:14 AM on April 9, 2010


Response by poster: HuronBob, think I have 99 cents to piss away on stupid beer caps? You don't get beer with those.
posted by raider at 11:14 AM on April 9, 2010 [1 favorite]


Finding the probability that you'll get the last three teams within N bottles (which you didn't specifically ask about, but may also be interested in, and others have brought up), is a different and somewhat more difficult question than the average number of bottles you need to get the last three teams (which I've answered above).

A general outline of how to calculate that is: for each k from 3 to N, calculate the probability that exactly k bottles have one of the final 3 (F3) teams on them. Then, for each k, calculate the probability that, having k bottles with the F3 teams on them, each of the F3 is represented at least once. Multiply those two probabilities for each k, then add them up.

Or, to put it another way,

Probability of finding all 3 F3 teams among N bottles =
(probability of exactly 3 out of N bottles having one of the F3 teams)*(probability that, given exactly 3 bottles with F3 teams, each team is represented once)
+ (probability of exactly 4 out of N bottles having one of the F3 teams)*(probability that, given exactly 4 bottles with F3 teams, each team is represented at least once)
+ (probability of exactly 5 out of N bottles having one of the F3 teams)*(probability that, given exactly 5 bottles with F3 teams, each team is represented at least once)
+ ...

The probability that exactly k bottles out of N are labeled with one of the F3 teams is:

(27/30)^(N-k) * (3/30)^k * (N!/(k!(N-k)!))

The probability that, given exactly k bottles with one of the F3 teams, all three teams will be represented at least once, is:

1 - [3*(2/3)^k - 3*(1/3)^k]

Running this through Excel, I find the probability of getting all three of the last three teams to be:

≈0.2212 with 28 bottles (1 case)
≈0.6109 with 56 bottles (2 cases)
≈0.8351 with 84 bottles (3 cases)
≈0.9340 with 112 bottles (4 cases)
≈0.9741 with 140 bottles (5 cases)
≈0.9899 with 168 bottles (6 cases)
≈0.9961 with 196 bottles (7 cases)
posted by DevilsAdvocate at 11:33 AM on April 9, 2010 [4 favorites]
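
For anyone without Excel handy, the sum described in the comment above can be sketched in Python. This assumes equally likely, independent caps, as the comment does. The second function is not from the thread: it is the standard inclusion-exclusion shortcut for the same quantity, included only as a cross-check, and both function names are made up.

```python
from math import comb


def p_all_three_by_sum(n: int) -> float:
    """P(all 3 missing teams appear among n bottles), summing over k as described."""
    total = 0.0
    for k in range(3, n + 1):
        # Exactly k of the n bottles carry one of the final 3 (F3) teams.
        p_exactly_k = comb(n, k) * (27 / 30) ** (n - k) * (3 / 30) ** k
        # Given k F3 bottles, each of the 3 teams shows up at least once.
        p_all_present = 1 - (3 * (2 / 3) ** k - 3 * (1 / 3) ** k)
        total += p_exactly_k * p_all_present
    return total


def p_all_three_closed_form(n: int) -> float:
    """Same probability via inclusion-exclusion over which F3 teams are absent."""
    return 1 - 3 * (29 / 30) ** n + 3 * (28 / 30) ** n - (27 / 30) ** n


if __name__ == "__main__":
    for cases in range(1, 8):
        n = 28 * cases
        print(n, round(p_all_three_by_sum(n), 4), round(p_all_three_closed_form(n), 4))
```

The two functions should agree to within floating-point error, which is a reasonable sanity check on the k-by-k formula.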


No, it's not. That would be the probability that if he bought three cases, each one of the cases would contain at least one of the three teams not among his first 27. And even then, it doesn't guarantee that those three bottles are three different teams, just that they're not one of the first 27—you've also included the probability that two or all three are of the same team.

Ugh, yeah, my response is pretty much entirely wrong. Please disregard it.
posted by FishBike at 11:46 AM on April 9, 2010


Er, assuming equi-probable teams, this is just the Coupon collector's problem, no?

Yes, but what fun is looking it up on Wikipedia when you can derive it yourself?
posted by DevilsAdvocate at 11:52 AM on April 9, 2010


Response by poster: OK, let's get to the bottom of this and answer the definitive question: Is Labatt evil (they brew Bud Light in the GWN) or am I just a loser?

Off to the beer store!
posted by raider at 12:49 PM on April 9, 2010


Best answer: Another way to look at it is to ask: if you buy cases one at a time, checking the bottles in each case after you buy it and buying another only if you still don't have a complete set, what is the probability that you'll buy exactly N cases, and what is the average number of cases you'll buy?

If you already have 27 teams, the possible outcomes of the next case and probabilities (approximate) are:
* no new teams: 0.0523
* one new team: 0.2777
* two new teams: 0.4488
* three new teams: 0.2212

With 28 teams:
* no new teams: 0.1449
* one new team: 0.4843
* two new teams: 0.3708

With 29 teams:
* no new teams: 0.3870
* one new team: 0.6130

The probability that it will take exactly N cases (buying and checking one case at a time) to get to 30 teams is:
1 0.2212
2 0.3896
3 0.2242
4 0.0989
5 0.0401
6 0.0158
7 0.0062
8 0.0024
9 0.0009
10 0.0004
11 0.0001

And the average number of cases you'll need to buy is 2.441.
posted by DevilsAdvocate at 2:20 PM on April 9, 2010 [1 favorite]
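
The case-at-a-time numbers in the comment above can be reproduced by tracking how many teams are still missing after each case. A rough sketch, assuming equally likely, independent caps; the function names and the 60-case cutoff are arbitrary choices for illustration, not the method actually used in the thread.

```python
from typing import List

TEAMS = 30
CASE_SIZE = 28


def case_transition(missing: int) -> List[float]:
    """dist[m] = probability that m teams are still missing after one more case."""
    dist = [0.0] * (missing + 1)
    dist[missing] = 1.0
    for _ in range(CASE_SIZE):
        nxt = [0.0] * (missing + 1)
        for m, p in enumerate(dist):
            nxt[m] += p * (TEAMS - m) / TEAMS   # cap shows a team already held
            if m > 0:
                nxt[m - 1] += p * m / TEAMS     # cap shows one of the m missing teams
        dist = nxt
    return dist


def cases_needed(missing: int, max_cases: int = 60) -> List[float]:
    """result[n] = probability that the set is completed by exactly the n-th case."""
    trans = [case_transition(m) for m in range(missing + 1)]
    state = [0.0] * (missing + 1)
    state[missing] = 1.0
    result = [0.0]
    for _ in range(max_cases):
        nxt = [0.0] * (missing + 1)
        for m in range(1, missing + 1):         # only incomplete sets buy another case
            for m2, q in enumerate(trans[m]):
                nxt[m2] += state[m] * q
        result.append(nxt[0])                   # sets completed by this case
        nxt[0] = 0.0                            # completed sets stop buying
        state = nxt
    return result


if __name__ == "__main__":
    probs = cases_needed(3)
    print([round(p, 4) for p in probs[1:12]])
    print(round(sum(n * p for n, p in enumerate(probs)), 3))  # average cases
```

Calling `case_transition(3)`, `case_transition(2)` and `case_transition(1)` gives the three per-case outcome lists, indexed by teams still missing rather than by new teams found.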


Response by poster: OK, I got two more cases. Opened one, got Columbus! Two to go.

Interestingly, one cap was a plain BL cap. Maybe this is my Golden Ticket to the Stanley Cup Finals. Or just Labatt wants to give me an aneurism.

I will enjoy one of these then open the second case to offer DevilsAdvocate some more time to astound us...
posted by raider at 3:27 PM on April 9, 2010


A plain BL cap! No, nooooo, that throws all my calculations off! How likely are those??! AAUUUGGHHH!
posted by DevilsAdvocate at 4:03 PM on April 9, 2010


Response by poster: Victory!! Opened second case (5th overall) and got Washington and Buffalo. Whew.

DevilsAdvocate, I've been loving your breakdowns but out of curiosity, what are the odds if starting from scratch rather than partway through the experiment?
posted by raider at 4:04 PM on April 9, 2010


Starting from scratch, the one-bottle-at-a-time model, per my first comment in this thread, requires 30/30 + 30/29 + 30/28 + 30/27 ... + 30/2 + 30/1 ≈ 119.840 bottles, on average. (Assuming each team is equally likely, each bottle is independent of the others, and NO FRIGGIN' PLAIN BUD LIGHT CAPS OR OTHER CAPS WITHOUT A TEAM OR CAPS WITH TWO OR MORE TEAMS OR BOTTLES WITHOUT CAPS OR ANY OTHER WEIRD SHIT.)

The one-case-at-a-time model gets pretty complex. As I'm off to have a few beers myself, I'll defer for now, but I might take a shot at it later this weekend if no one else gets to it first. Might end up being easier doing a Monte Carlo simulation than calculating it exactly.

Congratulations on completing the set!
posted by DevilsAdvocate at 4:21 PM on April 9, 2010
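
The Monte Carlo approach mentioned above might look something like the following. This is only a sketch under the thread's idealized assumptions (30 equally likely teams, independent caps, 28 caps per case, no plain caps); the trial count and seed are arbitrary, and the function names are invented for illustration.

```python
import random

TEAMS = 30
CASE_SIZE = 28


def cases_to_complete(rng: random.Random) -> int:
    """Simulate one collector buying whole cases until all teams are seen."""
    seen = set()
    cases = 0
    while len(seen) < TEAMS:
        cases += 1
        for _ in range(CASE_SIZE):
            seen.add(rng.randrange(TEAMS))  # each cap is a uniformly random team
    return cases


def simulate(trials: int = 5000, seed: int = 2010) -> float:
    """Estimate the average number of cases needed, starting from zero teams."""
    rng = random.Random(seed)
    return sum(cases_to_complete(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(simulate())
```

A simulation like this only gives an estimate that wobbles with the seed and trial count, but it is a cheap way to sanity-check an exact calculation.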


Er, 119.850, not 119.840. And replace "64.840" with "64.850" in my first comment in the thread.
posted by DevilsAdvocate at 4:24 PM on April 9, 2010


Response by poster: Hey DA, can I offer you a Bud Light?
posted by raider at 4:29 PM on April 9, 2010


Best answer: Turns out it's not as hard as I initially thought. The key is to begin with a bottle-by-bottle model, just that now you have to calculate the probability for each possible number of distinct teams for each given number of bottles, which can be done recursively. For example, the probability that, given 40 bottles, you have 23 distinct teams (call this P(40,23)) is the probability that there were 23 distinct teams among the first 39 bottles and the 40th bottle is not a new team, plus the probability that there were 22 distinct teams among the first 39 bottles and the 40th bottle is a new team, i.e.,

P(40,23)=P(39,23)*(23/30)+P(39,22)*(8/30)

which is not too hard to do with Excel. Then, for each number of bottles, the corresponding number of cases is the number of bottles divided by 28, rounded up to the nearest integer. E.g., if all 30 teams are collected in 145 bottles, that's six cases. Just add up the probabilities corresponding to the number of bottles needed from 141 to 168, and that gives you the total probability that the set will be completed with the sixth case.

Running all that through Excel gives me the following probabilities:

2 cases: 0.0023
3 cases: 0.1366
4 cases: 0.3520
5 cases: 0.2748
6 cases: 0.1374
7 cases: 0.0584
8 cases: 0.0234
9 cases: 0.0092
10 cases: 0.0036
11 cases: 0.0014
12 cases: 0.0005
13 cases: 0.0002
14 cases: 0.0001

And the average number of cases required is 4.7621; the median number is 5, so you were neither particularly lucky nor unlucky.

You can buy me a Bud Light if we're ever at a meetup together.
posted by DevilsAdvocate at 5:45 PM on April 9, 2010 [1 favorite]
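
The recursion described in the comment above translates fairly directly into Python. A sketch only, assuming equally likely, independent caps; it tracks the first bottle on which the 30th team appears and then groups bottles into cases of 28. The function names and the 40-case cutoff are choices made here, not part of the original Excel work.

```python
from typing import List

TEAMS = 30
CASE_SIZE = 28


def completion_by_bottle(max_bottles: int) -> List[float]:
    """result[n] = probability that the 30th distinct team first appears on bottle n."""
    # p[k] = probability of holding exactly k distinct teams so far (set incomplete)
    p = [0.0] * (TEAMS + 1)
    p[0] = 1.0
    result = [0.0]
    for _ in range(max_bottles):
        nxt = [0.0] * (TEAMS + 1)
        for k in range(TEAMS):
            nxt[k] += p[k] * k / TEAMS                # bottle repeats a held team
            nxt[k + 1] += p[k] * (TEAMS - k) / TEAMS  # bottle is a new team
        result.append(nxt[TEAMS])                     # set completed on this bottle
        nxt[TEAMS] = 0.0                              # stop tracking completed sets
        p = nxt
    return result


def cases_distribution(max_cases: int = 40) -> List[float]:
    """dist[c] = probability that the set is completed with the c-th case."""
    by_bottle = completion_by_bottle(max_cases * CASE_SIZE)
    # Case c covers bottles (c - 1) * 28 + 1 through c * 28.
    return [0.0] + [
        sum(by_bottle[(c - 1) * CASE_SIZE + 1 : c * CASE_SIZE + 1])
        for c in range(1, max_cases + 1)
    ]


if __name__ == "__main__":
    dist = cases_distribution()
    for c in range(2, 15):
        print(c, round(dist[c], 4))
    print(round(sum(c * q for c, q in enumerate(dist)), 4))  # average cases
```

Summing `n * completion_by_bottle(...)[n]` over all n should also recover the roughly 119.85-bottle average from the one-bottle-at-a-time model, which is a useful consistency check.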

