# Probability and a rather unique deck of cards.

November 2, 2013 9:44 PM Subscribe

Each card in a certain deck has three letters on it.
The first letter is either A, B, or C.
The second letter is either D, E, F, or G.
The third letter is either H, I, J, K, or L.
Every possible combination is represented exactly once in the deck. Ergo, there are 3x4x5=60 cards in the deck.
How can I determine the probability that a hand of X cards, drawn randomly from the deck, will include at least one of each of the letters?

You have to count permutations of x cards that have all the letters out of the set of all combinations of x cards. Might want to try the minimum case of x=5 first.

posted by empath at 10:10 PM on November 2, 2013

posted by empath at 10:10 PM on November 2, 2013

Well, since the third letter group has 5 letters, you need X to be at least 5 in order for there to have a non-zero probability of having one of each letter.

I don't remember all about probability, but here's how I would think about it, if x = 5. (Disclaimer: I might not have thought it through as carefully as I should... and the reasoning may be wrong.)

The probability that you have H, I, J, K, and L in your set of 5 cards is 60/60*48/59*36/58*24/57*12/56, since you have less probability of picking a previously unpicked letter, with each additional card.

The probability that you will get D, E, F, and G in your set is more complicated. You can have a duplicate in any position. I believe the probability for this is the following:

60/60*14/59*45/58*30/57*15/56

+ 60/60*45/59*28/58*30/57*15/56

+ 60/60*45/59*30/58*42/57*15/56

+ 60/60*45/59*30/58*15/57*56/56

My reasoning for this is that there are 4 different ways to pick a full set of 4 cards. Specifically, you are allowed one duplicate card. That card can be in any position but the first (the first can't be a duplicate because you haven't picked a card yet). So I calculate with the probability of each possible duplicate card, with the rest being unique.

The probability for A, B, and C being in the set is the same as above, but I believe it's even longer, because there are two possible duplicate cards, and they can be in any of 4 positions. So there are six terms to sum up. (Six terms, because the combination of 4 [positions] choose 2 [duplicate cards] is 6.)

Then, all you have to do is multiply together the three probabilities to calculate the probability of having a full set across all 12 letters.

posted by ethidda at 11:06 PM on November 2, 2013

I don't remember all about probability, but here's how I would think about it, if x = 5. (Disclaimer: I might not have thought it through as carefully as I should... and the reasoning may be wrong.)

The probability that you have H, I, J, K, and L in your set of 5 cards is 60/60*48/59*36/58*24/57*12/56, since you have less probability of picking a previously unpicked letter, with each additional card.

The probability that you will get D, E, F, and G in your set is more complicated. You can have a duplicate in any position. I believe the probability for this is the following:

60/60*14/59*45/58*30/57*15/56

+ 60/60*45/59*28/58*30/57*15/56

+ 60/60*45/59*30/58*42/57*15/56

+ 60/60*45/59*30/58*15/57*56/56

My reasoning for this is that there are 4 different ways to pick a full set of 4 cards. Specifically, you are allowed one duplicate card. That card can be in any position but the first (the first can't be a duplicate because you haven't picked a card yet). So I calculate with the probability of each possible duplicate card, with the rest being unique.

The probability for A, B, and C being in the set is the same as above, but I believe it's even longer, because there are two possible duplicate cards, and they can be in any of 4 positions. So there are six terms to sum up. (Six terms, because the combination of 4 [positions] choose 2 [duplicate cards] is 6.)

Then, all you have to do is multiply together the three probabilities to calculate the probability of having a full set across all 12 letters.

posted by ethidda at 11:06 PM on November 2, 2013

This is equivalent to having three decks of 60 cards, the first labelled A, B, C, the second D, E, F, G, and the third H, I, J, K, L with even distributions (i.e. 20 each in the first, 15 each in the second, 12 each in the third). It doesn't matter that there is more than one letter on each card.

Imagining three decks, it is easy to calculate the probability of picking one of each letter from each deck in a hand of X cards. Then, multiply those probabilities together to find the probability of these three independent events happening concurrently.

posted by ssg at 11:09 PM on November 2, 2013

Imagining three decks, it is easy to calculate the probability of picking one of each letter from each deck in a hand of X cards. Then, multiply those probabilities together to find the probability of these three independent events happening concurrently.

posted by ssg at 11:09 PM on November 2, 2013

Let me work it out in writing, and you tell me how it sounds.

1.If you have 60 cards, then 20 will have A on them, 20 will have B, and 20 will have C.

2. 15 will have D, 15 will have E, 15 will have F, and 15 will have G.

3. 12 will have H, 12 will have I, 12 will have J, 12 will have K, and 12 will have L.

So. In order to ENSURE that you have all A, B, C, you would definitely need 41 cards.

In order to ENSURE that you have all D, E, F, G, you would have 46 cards.

In order to ENSURE that you have all H, I, J, K, L, you would have to have 49 cards.

Thats as far as I can get. I'm not 100% certain its the right track...but I do know that if you would want at least 49 cards to get H, I, J, K and L.

I think.

posted by hal_c_on at 11:10 PM on November 2, 2013

1.If you have 60 cards, then 20 will have A on them, 20 will have B, and 20 will have C.

2. 15 will have D, 15 will have E, 15 will have F, and 15 will have G.

3. 12 will have H, 12 will have I, 12 will have J, 12 will have K, and 12 will have L.

So. In order to ENSURE that you have all A, B, C, you would definitely need 41 cards.

In order to ENSURE that you have all D, E, F, G, you would have 46 cards.

In order to ENSURE that you have all H, I, J, K, L, you would have to have 49 cards.

Thats as far as I can get. I'm not 100% certain its the right track...but I do know that if you would want at least 49 cards to get H, I, J, K and L.

I think.

posted by hal_c_on at 11:10 PM on November 2, 2013

*How can I determine the probability that a hand of X cards, drawn randomly from the deck, will include at least one of each of the letters?*

So the lowest number of cards you can use to satisfy this would be 5. The highest would be 49.

So that means if you were drawing a random amount of cards from the deck (1-60), there would be a 55 out of 60 chance of you getting all the cards. Only with 1 to 4 picks are you guaranteed NOT to have all the letters.

As for what the chances of you having all letter if you pick up 20 cards is...I don't know.

posted by hal_c_on at 11:14 PM on November 2, 2013

*This is equivalent to having three decks of 60 cards, the first labelled A, B, C, the second D, E, F, G, and the third H, I, J, K, L with even distributions (i.e. 20 each in the first, 15 each in the second, 12 each in the third). It doesn't matter that there is more than one letter on each card.*

Imagining three decks, it is easy to calculate the probability of picking one of each letter from each deck in a hand of X cards. Then, multiply those probabilities together to find the probability of these three independent events happening concurrently.

Imagining three decks, it is easy to calculate the probability of picking one of each letter from each deck in a hand of X cards. Then, multiply those probabilities together to find the probability of these three independent events happening concurrently.

This isn't right--the events in the original problem are not independent. In your reformulation it's possible to pick ADH (for example) twice in a row. But in the original problem you can only get ADH once, and then the card is gone.

empath has the right approach. To expand on his answer a bit: to start with let's say we want to find the probability of getting all the letters in a draw of 5 cards. The probability is equal to (number of possible 5-card sets with all letters) / (number of possible 5-card sets, period). The denominator is easy, it's just C(60,5). The numerator is the tricky part--but the key to this approach is that you don't need to compute any conditional probabilities at all, you're just trying to *count* all the possible 5-card sets that have all the letters. Makes it a bit easier to think about.

posted by equalpants at 1:00 AM on November 3, 2013 [2 favorites]

ethidda's and ssg's answers, modeling the three sets of letters as independent, are incorrect. They're fine as far as they go in determining the probability of at least one each of A, B, and C being drawn; the probability of at least one each of D, E, F, and G being drawn; and the probability of one each of H, I, J, K, and L being drawn. But because they are not independent events, you cannot multiply those three probabilities together to get the overall probability.

An illustration of the flaw: suppose you draw two cards from the deck. What is the probability that a) the first letter matches; b) the second letter matches; c) the third letter matches, and d) all three letters match?

(a), (b), and (c) are all non-zero, obviously. If you could model the deck as three independent decks, you would have (d)=(a)*(b)*(c), also a non-zero number. But in the original deck the probability of drawing two cards on which all three letters match is zero; the deck is defined to have no two cards identical. The three letters are not independent, as drawing (ADH) precludes drawing (ADH) again on a later card, whereas if you model three independent decks, after drawing (A)(D)(H), it is possible to draw (A)(D)(H) again.

Another demonstration that it is incorrect to simply multiply the probabilities can be had by actually calculating the probabilities for the simplest nontrivial case of X=5. ethidda is correct as far as working out the initial probabilities, which work out to be:

Probability of drawing H, I, J, K, L: 29859840/655381440 = 10368/227563

Probability of drawing D, E, F, G: 170100000/655381440 = 16875/65018

Probability of drawing A, B, C: 424080000/655381440 = 7750/11977

If you multiply those three probabilities together, you get 1355940000000/177207992711918 = 677970000000/88603996355959. Here's the rub: there are C(60,5)=5461512 ways to draw five cards from a deck of 60 (not considering the order of the cards drawn), so the correct probability

For the case of X=5 it might just be easiest to have a computer look at all 5461512 combinations; it's not that many for a computer to handle. As X gets larger, the number of combinations rapidly grows; unless you need an exact answer, it might be easiest to just run a Monte Carlo simulation—have a computer run a few million trials—to get an approximate probability.

posted by DevilsAdvocate at 1:01 AM on November 3, 2013 [1 favorite]

An illustration of the flaw: suppose you draw two cards from the deck. What is the probability that a) the first letter matches; b) the second letter matches; c) the third letter matches, and d) all three letters match?

(a), (b), and (c) are all non-zero, obviously. If you could model the deck as three independent decks, you would have (d)=(a)*(b)*(c), also a non-zero number. But in the original deck the probability of drawing two cards on which all three letters match is zero; the deck is defined to have no two cards identical. The three letters are not independent, as drawing (ADH) precludes drawing (ADH) again on a later card, whereas if you model three independent decks, after drawing (A)(D)(H), it is possible to draw (A)(D)(H) again.

Another demonstration that it is incorrect to simply multiply the probabilities can be had by actually calculating the probabilities for the simplest nontrivial case of X=5. ethidda is correct as far as working out the initial probabilities, which work out to be:

Probability of drawing H, I, J, K, L: 29859840/655381440 = 10368/227563

Probability of drawing D, E, F, G: 170100000/655381440 = 16875/65018

Probability of drawing A, B, C: 424080000/655381440 = 7750/11977

If you multiply those three probabilities together, you get 1355940000000/177207992711918 = 677970000000/88603996355959. Here's the rub: there are C(60,5)=5461512 ways to draw five cards from a deck of 60 (not considering the order of the cards drawn), so the correct probability

**must**be expressible as an integer divided by 5461512—which that product is not.For the case of X=5 it might just be easiest to have a computer look at all 5461512 combinations; it's not that many for a computer to handle. As X gets larger, the number of combinations rapidly grows; unless you need an exact answer, it might be easiest to just run a Monte Carlo simulation—have a computer run a few million trials—to get an approximate probability.

posted by DevilsAdvocate at 1:01 AM on November 3, 2013 [1 favorite]

Draw one card from the deck. It will contain one letter from the first group, one from the second and one from the third.

40 out of the remaining 59 cards will have a different first-group letter from the first one drawn; 45 out of the 59 will have a different second-group letter; and 48 out of the 59 will have a different third-group letter. The probability of drawing a second card that shares no letters with the first is the product of these, or 40×45×48 ÷ 59

Assuming you've managed to do that: 20 out of the remaining 58 cards, or 10 in 29, will have a different first-group letter from the first two drawn; 30 out of the 58 or 15 in 29 will have a different second-group letter; and 36 out of the 58 or 18 in 29 will have a different third-group letter. The probability of drawing a third card that shares no letters with the first two is the product of all of these, or 40×45×48 × 10×15×18 ÷ (59

We now have complete coverage for the first group and can start concentrating only on the second and third.

15 out of the remaining 57 cards, or 5 out of every 19, will have a different second-group letter from any of the three already drawn. 24 out of 57, or 8 in 19, will have a different third-group letter. So the probability of drawing a set of four cards with complete coverage of the first and second groups and no duplicates in the third is 40×45×48 × 10×15×18 × 5×8 ÷ (59

Finally, 12 out of the remaining 56 cards, or 3 out of every 14, will have a different third-group letter from any of the four already drawn. So the probability of drawing five cards with complete letter coverage becomes 40×45×48 × 10×15×18 × 5×8 × 3 ÷ (59

I'll think about this a bit more before extending to X = 6+; intuition is telling me that there's a reasonably neat way to make it generalize but I'm not yet convinced I know what that is.

posted by flabdablet at 1:03 AM on November 3, 2013

40 out of the remaining 59 cards will have a different first-group letter from the first one drawn; 45 out of the 59 will have a different second-group letter; and 48 out of the 59 will have a different third-group letter. The probability of drawing a second card that shares no letters with the first is the product of these, or 40×45×48 ÷ 59

^{3}= 86400 / 205379.Assuming you've managed to do that: 20 out of the remaining 58 cards, or 10 in 29, will have a different first-group letter from the first two drawn; 30 out of the 58 or 15 in 29 will have a different second-group letter; and 36 out of the 58 or 18 in 29 will have a different third-group letter. The probability of drawing a third card that shares no letters with the first two is the product of all of these, or 40×45×48 × 10×15×18 ÷ (59

^{3}×29^{3}) = 233280000 / 5008988431.We now have complete coverage for the first group and can start concentrating only on the second and third.

15 out of the remaining 57 cards, or 5 out of every 19, will have a different second-group letter from any of the three already drawn. 24 out of 57, or 8 in 19, will have a different third-group letter. So the probability of drawing a set of four cards with complete coverage of the first and second groups and no duplicates in the third is 40×45×48 × 10×15×18 × 5×8 ÷ (59

^{3}×29^{3}×19^{2}) = 9331200000 / 1808244823591.Finally, 12 out of the remaining 56 cards, or 3 out of every 14, will have a different third-group letter from any of the four already drawn. So the probability of drawing five cards with complete letter coverage becomes 40×45×48 × 10×15×18 × 5×8 × 3 ÷ (59

^{3}×29^{3}×19^{2}×14) = 20×45×48 × 10×15×18 × 5×8 × 3 ÷ (59^{3}×29^{3}×19^{2}×7) = 13996800000 / 12657713765137 and that's your answer for X = 5.I'll think about this a bit more before extending to X = 6+; intuition is telling me that there's a reasonably neat way to make it generalize but I'm not yet convinced I know what that is.

posted by flabdablet at 1:03 AM on November 3, 2013

*40 out of the remaining 59 cards will have a different first-group letter from the first one drawn; 45 out of the 59 will have a different second-group letter; and 48 out of the 59 will have a different third-group letter. The probability of drawing a second card that shares no letters with the first is the product of these, or 40×45×48 ÷ 59*

^{3}= 86400 / 205379.You have succinctly illustrated why it is wrong to consider the three sets of letters independently. After you draw one card, there are 59 cards left in the deck. Some integral number of those 59 cards have no letters in common with the first card, so the probability of the second card having no letters in common with the first must be [integer]/59.

posted by DevilsAdvocate at 1:08 AM on November 3, 2013

equalpants and DevilsAdvocate are both quite correct in pointing out why my painstakingly worked out solution is bogus.

Back to the old drawing board.

posted by flabdablet at 1:11 AM on November 3, 2013

Back to the old drawing board.

posted by flabdablet at 1:11 AM on November 3, 2013

Wrote a program to check the X=5 case, for anyone who wants to validate their solution: code is here.

The probability for X=5 is 36000/5461512.

posted by equalpants at 1:44 AM on November 3, 2013 [1 favorite]

The probability for X=5 is 36000/5461512.

posted by equalpants at 1:44 AM on November 3, 2013 [1 favorite]

I do software. Math is hard, probability is harder, but software is something I can do.

I whipped up a simulator. It ran three million hands for each value of "X" in the question (three million hands of five cards, three million hands of six cards...) I stopped it around 40 because the results stopped being interesting, and started taking forever to calculate.

So I can't give you formulas or speak intelligently about how the probability behind it all works, but here are some numbers:

For hands of 5, percentage is 0.6601

For hands of 6, percentage is 4.029167

For hands of 7, percentage is 11.4972

For hands of 8, percentage is 22.4702

For hands of 9, percentage is 35.14347

For hands of 10, percentage is 47.66656

For hands of 11, percentage is 59.0409

For hands of 12, percentage is 68.58963

For hands of 13, percentage is 76.4663

For hands of 14, percentage is 82.58344

For hands of 15, percentage is 87.258

For hands of 16, percentage is 90.77547

For hands of 17, percentage is 93.37683

For hands of 18, percentage is 95.29597

For hands of 19, percentage is 96.67873

For hands of 20, percentage is 97.71747

For hands of 21, percentage is 98.421

For hands of 22, percentage is 98.9152

For hands of 23, percentage is 99.26524

For hands of 24, percentage is 99.5066

For hands of 25, percentage is 99.67226

For hands of 26, percentage is 99.7891

For hands of 27, percentage is 99.8626

For hands of 28, percentage is 99.91354

For hands of 29, percentage is 99.94773

For hands of 30, percentage is 99.96767

For hands of 31, percentage is 99.98037

For hands of 32, percentage is 99.98824

For hands of 33, percentage is 99.99377

For hands of 34, percentage is 99.99696

For hands of 35, percentage is 99.99844

For hands of 36, percentage is 99.99897

For hands of 37, percentage is 99.99953

For hands of 38, percentage is 99.99973

For hands of 39, percentage is 99.99987

For hands of 40, percentage is 99.99997

For hands of 41, percentage is 99.99997

As with all code, this is not guaranteed to be correct :)

I can figure out a place to throw the code up tomorrow if anyone is interested.

posted by Dilligas at 1:28 AM on November 3, 2013 [2 favorites]

I whipped up a simulator. It ran three million hands for each value of "X" in the question (three million hands of five cards, three million hands of six cards...) I stopped it around 40 because the results stopped being interesting, and started taking forever to calculate.

So I can't give you formulas or speak intelligently about how the probability behind it all works, but here are some numbers:

For hands of 5, percentage is 0.6601

For hands of 6, percentage is 4.029167

For hands of 7, percentage is 11.4972

For hands of 8, percentage is 22.4702

For hands of 9, percentage is 35.14347

For hands of 10, percentage is 47.66656

For hands of 11, percentage is 59.0409

For hands of 12, percentage is 68.58963

For hands of 13, percentage is 76.4663

For hands of 14, percentage is 82.58344

For hands of 15, percentage is 87.258

For hands of 16, percentage is 90.77547

For hands of 17, percentage is 93.37683

For hands of 18, percentage is 95.29597

For hands of 19, percentage is 96.67873

For hands of 20, percentage is 97.71747

For hands of 21, percentage is 98.421

For hands of 22, percentage is 98.9152

For hands of 23, percentage is 99.26524

For hands of 24, percentage is 99.5066

For hands of 25, percentage is 99.67226

For hands of 26, percentage is 99.7891

For hands of 27, percentage is 99.8626

For hands of 28, percentage is 99.91354

For hands of 29, percentage is 99.94773

For hands of 30, percentage is 99.96767

For hands of 31, percentage is 99.98037

For hands of 32, percentage is 99.98824

For hands of 33, percentage is 99.99377

For hands of 34, percentage is 99.99696

For hands of 35, percentage is 99.99844

For hands of 36, percentage is 99.99897

For hands of 37, percentage is 99.99953

For hands of 38, percentage is 99.99973

For hands of 39, percentage is 99.99987

For hands of 40, percentage is 99.99997

For hands of 41, percentage is 99.99997

As with all code, this is not guaranteed to be correct :)

I can figure out a place to throw the code up tomorrow if anyone is interested.

posted by Dilligas at 1:28 AM on November 3, 2013 [2 favorites]

Shouldn't the hand for 41 be 100% ?

I'd like to see code. This was an awesome way to go through it.

posted by hal_c_on at 2:18 AM on November 3, 2013

I'd like to see code. This was an awesome way to go through it.

posted by hal_c_on at 2:18 AM on November 3, 2013

Best answer: (tl;dr: scroll down to the answer in bold near the end of this comment)

There are C(60,X) ways to draw X cards from a deck of 60, not counting the order in which the cards are drawn. The trick will be to count all of the ways of drawing X cards where one or more of the letters is missing, and subtract that from C(60,X) to get the number in which all twelve letters are present.

It looks simple at first:

Number of ways to draw X cards with "A" missing = C(40,X)

Number of ways to draw X cards with "B" missing = C(40,X)

Number of ways to draw X cards with "C" missing = C(40,X)

Number of ways to draw X cards with "D" missing = C(45,X)

Number of ways to draw X cards with "E" missing = C(45,X)

Number of ways to draw X cards with "F" missing = C(45,X)

Number of ways to draw X cards with "G" missing = C(45,X)

Number of ways to draw X cards with "H" missing = C(48,X)

Number of ways to draw X cards with "I" missing = C(48,X)

Number of ways to draw X cards with "J" missing = C(48,X)

Number of ways to draw X cards with "K" missing = C(48,X)

Number of ways to draw X cards with "L" missing = C(48,X)

So we might think at this point we can just add all of these up and get the number of ways to draw X cards with a letter missing: 3*C(40,X)+4*C(45,X)+5*C(48,X)

But!! We've counted some ways of drawing X cards more than once. Some sets might be missing "C" and "K", and we counted those sets twice. Some sets might be missing "B", "F", "G", and "L," and we've counted those four times! Still, let's call that sum the "naive" count of ways to draw X cards with 1 letter missing, because we'll refer back to it.

To correct our naive count, let's subtract all the ways we can draw X cards with

with (AB), (AC), or (BC) missing: C(20,X) each

with (AD), (AE), (AF), (AG), (BD), (BE), (BF), (BG), (CD), (CE), (CF), or (CG) missing: C(30,X) each

with (AH), (AI), (AJ), (AK), (AL), (BH), (BI), (BJ), (BK), (BL), (CH), (CI), (CJ), (CK), or (CL) missing: C(32,X) each

with (DE), (DF), (DG), (EF), (EG), or (FG) missing: C(30,X) each

with (DH), (DI), (DJ), (DK), (DL), (EH), (EI), (EJ), (EK), (EL), (FH), (FI), (FJ), (FK), (FL), (GH), (GI), (GJ), (GK), or (GL) missing: C(36,X) each

with (HI), (HJ), (HK), (HL), (IJ), (IK), (IL), (JK), (JL), or (KL) missing: C(36,X) each

So, we (naively) get 3*C(20,X)+18*C(30,X)+15*C(32,X)+30*C(30,X) ways to draw X cards with two letters missing, which we double-counted in the naive count of ways to draw X cards with one letter missing.

So, we subtract that and we're done, right? Nope, because we're still messed up on ways of drawing X cards with three or more letters missing. For a set of cards missing ABE, for example, we counted it three times (missing A, missing B, missing E) in the naive count of one letter missing, then subtracted it out three times (missing AB, missing AE, missing BE) in the naive count with two letters missing. So we have to add those back in.

It turns out that each subsequent sum of all the naive counts up to a given number of letters missing alternately over- or undercounts the number for the next greater number of letters missing by one. We can stop the process once we get to six letters missing (which also happens to be six letters accounted for) because we can show that five cards must have at least six different letters among them.

That is, for X≥5, the number of ways to draw X cards with one or more letters missing is:

naive count of ways with one letter missing

minus naive count of ways with two letters missing

plus naive count of ways with three letters missing

minus naive count of ways with four letters missing

plus naive count of ways with five letters missing

minus naive count of ways with six letters missing.

Then all that has to be subtracted from C(60,X) to get the number of ways to draw X cards with all letters appearing at least once. Divide by C(60,X) to get the probability.

Do all the math, and that turns out to be:

As partial verification, the numerator is 36000 for X=5

Note: the formula as written is valid for X≥5 only, because I left off at terms for six letters. It's possible to extend it with terms for C(4,X) to C(0,X) to make it work out to the zero it should be for X≤4, but since we know the probability is zero in those cases I didn't bother.

On preview: hal_c_on, it's possible to draw as many as 48 cards while still missing a letter (one of H, I, J, K, or L).

posted by DevilsAdvocate at 2:33 AM on November 3, 2013 [2 favorites]

There are C(60,X) ways to draw X cards from a deck of 60, not counting the order in which the cards are drawn. The trick will be to count all of the ways of drawing X cards where one or more of the letters is missing, and subtract that from C(60,X) to get the number in which all twelve letters are present.

It looks simple at first:

Number of ways to draw X cards with "A" missing = C(40,X)

Number of ways to draw X cards with "B" missing = C(40,X)

Number of ways to draw X cards with "C" missing = C(40,X)

Number of ways to draw X cards with "D" missing = C(45,X)

Number of ways to draw X cards with "E" missing = C(45,X)

Number of ways to draw X cards with "F" missing = C(45,X)

Number of ways to draw X cards with "G" missing = C(45,X)

Number of ways to draw X cards with "H" missing = C(48,X)

Number of ways to draw X cards with "I" missing = C(48,X)

Number of ways to draw X cards with "J" missing = C(48,X)

Number of ways to draw X cards with "K" missing = C(48,X)

Number of ways to draw X cards with "L" missing = C(48,X)

So we might think at this point we can just add all of these up and get the number of ways to draw X cards with a letter missing: 3*C(40,X)+4*C(45,X)+5*C(48,X)

But!! We've counted some ways of drawing X cards more than once. Some sets might be missing "C" and "K", and we counted those sets twice. Some sets might be missing "B", "F", "G", and "L," and we've counted those four times! Still, let's call that sum the "naive" count of ways to draw X cards with 1 letter missing, because we'll refer back to it.

To correct our naive count, let's subtract all the ways we can draw X cards with

*two*letters missing:with (AB), (AC), or (BC) missing: C(20,X) each

with (AD), (AE), (AF), (AG), (BD), (BE), (BF), (BG), (CD), (CE), (CF), or (CG) missing: C(30,X) each

with (AH), (AI), (AJ), (AK), (AL), (BH), (BI), (BJ), (BK), (BL), (CH), (CI), (CJ), (CK), or (CL) missing: C(32,X) each

with (DE), (DF), (DG), (EF), (EG), or (FG) missing: C(30,X) each

with (DH), (DI), (DJ), (DK), (DL), (EH), (EI), (EJ), (EK), (EL), (FH), (FI), (FJ), (FK), (FL), (GH), (GI), (GJ), (GK), or (GL) missing: C(36,X) each

with (HI), (HJ), (HK), (HL), (IJ), (IK), (IL), (JK), (JL), or (KL) missing: C(36,X) each

So, we (naively) get 3*C(20,X)+18*C(30,X)+15*C(32,X)+30*C(30,X) ways to draw X cards with two letters missing, which we double-counted in the naive count of ways to draw X cards with one letter missing.

So, we subtract that and we're done, right? Nope, because we're still messed up on ways of drawing X cards with three or more letters missing. For a set of cards missing ABE, for example, we counted it three times (missing A, missing B, missing E) in the naive count of one letter missing, then subtracted it out three times (missing AB, missing AE, missing BE) in the naive count with two letters missing. So we have to add those back in.

It turns out that each subsequent sum of all the naive counts up to a given number of letters missing alternately over- or undercounts the number for the next greater number of letters missing by one. We can stop the process once we get to six letters missing (which also happens to be six letters accounted for) because we can show that five cards must have at least six different letters among them.

That is, for X≥5, the number of ways to draw X cards with one or more letters missing is:

naive count of ways with one letter missing

minus naive count of ways with two letters missing

plus naive count of ways with three letters missing

minus naive count of ways with four letters missing

plus naive count of ways with five letters missing

minus naive count of ways with six letters missing.

Then all that has to be subtracted from C(60,X) to get the number of ways to draw X cards with all letters appearing at least once. Divide by C(60,X) to get the probability.

Do all the math, and that turns out to be:

**[C(60,X) -5*C(48,X) -4*C(45,X) -3*C(40,X) +30*C(36,X) +15*C(32,X) +18*C(30,X) -40*C(27,X) -130*C(24,X) -15*C(20,X) +220*C(18,X) +105*C(16,X) -16*C(15,X) -245*C(12,X) +30*C(10,X) -180*C(9,X) -15*C(8,X) +550*C(6,X) -12*C(5,X)] / C(60,X)**As partial verification, the numerator is 36000 for X=5

**Important**: C(i,X) = (i!)/((X!)((i-X)!)) for i≥X, and**0 for i<X**.Note: the formula as written is valid for X≥5 only, because I left off at terms for six letters. It's possible to extend it with terms for C(4,X) to C(0,X) to make it work out to the zero it should be for X≤4, but since we know the probability is zero in those cases I didn't bother.

On preview: hal_c_on, it's possible to draw as many as 48 cards while still missing a letter (one of H, I, J, K, or L).

posted by DevilsAdvocate at 2:33 AM on November 3, 2013 [2 favorites]

Devils advocate.

Yeah, it would take the 49th card to ensure that you had one of each letter. I don't know what I was saying before.

Thanks.

posted by hal_c_on at 2:52 AM on November 3, 2013

Yeah, it would take the 49th card to ensure that you had one of each letter. I don't know what I was saying before.

Thanks.

posted by hal_c_on at 2:52 AM on November 3, 2013

Let's design a valid 5 card hand. There will be one each of an H card, an I card, a J card, a K card and an L card. How will we distribute the other letters across these cards? D can be on any of the 5, E on any of the remaining 4, F on any of the (now) remaining 3, leaving 2 spots for the G. On the fifth card, we have 4 choices of letter: D,E,F, or G and we care not which. Now we have to place the A, B, and C. A has 5 cards to choose from, B any of the remaining 4 and C any of the (now) remaining 3. Then we have a choice of any of the 3 (A, B, or C) in either of the 2 remaining cards, and, again, any of the 3 in the last.

How many valid hands did we design?

The one choice of HIJKL, then 5*4*3*2*4, then 5*4*3*6*3. That comes to 518400.

To calculate how probable this is, we then divide by 60 choose 5: 60*59*58*57*56/(5!) or 5461512. This gives about .095 which differs from the simulations above, so one of us must be wrong.

posted by Obscure Reference at 5:53 AM on November 3, 2013

How many valid hands did we design?

The one choice of HIJKL, then 5*4*3*2*4, then 5*4*3*6*3. That comes to 518400.

To calculate how probable this is, we then divide by 60 choose 5: 60*59*58*57*56/(5!) or 5461512. This gives about .095 which differs from the simulations above, so one of us must be wrong.

posted by Obscure Reference at 5:53 AM on November 3, 2013

This is a variant of the coupon collector's problem, by the way.

posted by escabeche at 5:59 AM on November 3, 2013

posted by escabeche at 5:59 AM on November 3, 2013

Two problems, Obscure Reference: a) you did not eliminate the possibility of duplicate cards; b) you counted some ways of assigning (D, E, F, G) and (A, B, C) multiple times. E.g., if you number the cards 1-5, you have counted (D4, E1, F5, G3, F2) and (D4, E1, F2, G3, F5) as different assignments.

posted by DevilsAdvocate at 6:32 AM on November 3, 2013

posted by DevilsAdvocate at 6:32 AM on November 3, 2013

Correction: only (b) is the issue. You've used each of H-L once, so there is no possibility of duplicates. My mistake.

posted by DevilsAdvocate at 6:50 AM on November 3, 2013

posted by DevilsAdvocate at 6:50 AM on November 3, 2013

I'd noticed I was counting some things multiple times and I'm working on release 2.0 (but I got a phone call). I still think it's the way to do this.

posted by Obscure Reference at 6:58 AM on November 3, 2013

posted by Obscure Reference at 6:58 AM on November 3, 2013

*It turns out that each subsequent sum of all the naive counts up to a given number of letters missing alternately over- or undercounts the number for the next greater number of letters missing by one. We can stop the process once we get to six letters missing (which also happens to be six letters accounted for) because we can show that five cards must have at least six different letters among them.*

Just wanted to note that in formal probability theory this is called the Inclusion-Exclusion Principle.

(sorry, new math student, excited to have something to contribute in these threads for once!)

posted by downing street memo at 7:10 AM on November 3, 2013 [1 favorite]

For the DEFG placement, we have 4 choices for which letter we duplicate. We place these in 5 choose 2 slots. The remaining 3 can be in any of 3! orders. A total of 60 choices.

The ABC placements are hared (and yet we say "easy as ABC--go figure). Either we duplicate 2 of the letters (a choice of 3 for which isn't duplicated) or we triplicate one (also a choice of 3).

In the first case, 5 choose 2 places for the first duplicated letter (that's 10) followed by three choose 2 (that's 3) for the second. The last letter is forced. This gives 3*10*3 for the first case.

There's 5 choose 3 (10) slots for the triplicated letter and 2 orders for the remaining 2 letters.

That's 3*10*2. This we add (since it's an 'or') to the 3*10*3 giving (60+90) 150.

So our numerator is now 60 * 150 which is 9000. 9000/5461512 which is about .0016. We're no coming in with a lower value than the simulation. What am I doing wrong?

posted by Obscure Reference at 7:18 AM on November 3, 2013

The ABC placements are hared (and yet we say "easy as ABC--go figure). Either we duplicate 2 of the letters (a choice of 3 for which isn't duplicated) or we triplicate one (also a choice of 3).

In the first case, 5 choose 2 places for the first duplicated letter (that's 10) followed by three choose 2 (that's 3) for the second. The last letter is forced. This gives 3*10*3 for the first case.

There's 5 choose 3 (10) slots for the triplicated letter and 2 orders for the remaining 2 letters.

That's 3*10*2. This we add (since it's an 'or') to the 3*10*3 giving (60+90) 150.

So our numerator is now 60 * 150 which is 9000. 9000/5461512 which is about .0016. We're no coming in with a lower value than the simulation. What am I doing wrong?

posted by Obscure Reference at 7:18 AM on November 3, 2013

No--I left out a factor of 4 which makes my answer the same as equalpants. And now matches the simulation.

posted by Obscure Reference at 7:20 AM on November 3, 2013 [1 favorite]

posted by Obscure Reference at 7:20 AM on November 3, 2013 [1 favorite]

Hmm. I wrote a Monte Carlo simulation as well, and got a different set of results. My sample size was 1 million, though. Here's the code: http://pastebin.com/w89uKNzJ

sampling size = 1000000

For a hand size of 5, probability = 0.553%

For a hand size of 6, probability = 3.232%

For a hand size of 7, probability = 9.113%

For a hand size of 8, probability = 17.704%

For a hand size of 9, probability = 28.025%

For a hand size of 10, probability = 38.632%

For a hand size of 11, probability = 48.780%

For a hand size of 12, probability = 57.953%

For a hand size of 13, probability = 65.885%

For a hand size of 14, probability = 72.400%

For a hand size of 15, probability = 77.956%

For a hand size of 16, probability = 82.368%

For a hand size of 17, probability = 86.004%

For a hand size of 18, probability = 88.872%

For a hand size of 19, probability = 91.138%

For a hand size of 20, probability = 93.001%

For a hand size of 21, probability = 94.441%

For a hand size of 22, probability = 95.579%

For a hand size of 23, probability = 96.490%

For a hand size of 24, probability = 97.254%

For a hand size of 25, probability = 97.797%

For a hand size of 26, probability = 98.260%

For a hand size of 27, probability = 98.624%

For a hand size of 28, probability = 98.904%

For a hand size of 29, probability = 99.119%

For a hand size of 30, probability = 99.305%

For a hand size of 31, probability = 99.461%

For a hand size of 32, probability = 99.569%

For a hand size of 33, probability = 99.657%

For a hand size of 34, probability = 99.722%

For a hand size of 35, probability = 99.781%

For a hand size of 36, probability = 99.828%

For a hand size of 37, probability = 99.861%

For a hand size of 38, probability = 99.887%

For a hand size of 39, probability = 99.908%

For a hand size of 40, probability = 99.927%

posted by suedehead at 9:52 AM on November 3, 2013

sampling size = 1000000

For a hand size of 5, probability = 0.553%

For a hand size of 6, probability = 3.232%

For a hand size of 7, probability = 9.113%

For a hand size of 8, probability = 17.704%

For a hand size of 9, probability = 28.025%

For a hand size of 10, probability = 38.632%

For a hand size of 11, probability = 48.780%

For a hand size of 12, probability = 57.953%

For a hand size of 13, probability = 65.885%

For a hand size of 14, probability = 72.400%

For a hand size of 15, probability = 77.956%

For a hand size of 16, probability = 82.368%

For a hand size of 17, probability = 86.004%

For a hand size of 18, probability = 88.872%

For a hand size of 19, probability = 91.138%

For a hand size of 20, probability = 93.001%

For a hand size of 21, probability = 94.441%

For a hand size of 22, probability = 95.579%

For a hand size of 23, probability = 96.490%

For a hand size of 24, probability = 97.254%

For a hand size of 25, probability = 97.797%

For a hand size of 26, probability = 98.260%

For a hand size of 27, probability = 98.624%

For a hand size of 28, probability = 98.904%

For a hand size of 29, probability = 99.119%

For a hand size of 30, probability = 99.305%

For a hand size of 31, probability = 99.461%

For a hand size of 32, probability = 99.569%

For a hand size of 33, probability = 99.657%

For a hand size of 34, probability = 99.722%

For a hand size of 35, probability = 99.781%

For a hand size of 36, probability = 99.828%

For a hand size of 37, probability = 99.861%

For a hand size of 38, probability = 99.887%

For a hand size of 39, probability = 99.908%

For a hand size of 40, probability = 99.927%

posted by suedehead at 9:52 AM on November 3, 2013

suedehead, it looks to me like you pick N cards from the deck, rather than picking N cards *which have not already been picked* from the deck. As a result, your hands contain duplicate cards.

(My Python is terribly weak though...)

posted by Dilligas at 10:07 AM on November 3, 2013

(My Python is terribly weak though...)

posted by Dilligas at 10:07 AM on November 3, 2013

Here's my code:

https://github.com/dilligas/CardProbability

posted by Dilligas at 10:17 AM on November 3, 2013

https://github.com/dilligas/CardProbability

posted by Dilligas at 10:17 AM on November 3, 2013

Here's my updated code:

http://pastebin.com/MMXgTYiG

I love python -- a single function, random.sample(population, k) to the rescue!

It looks like my new results are pretty similar to yours, Dilliagas - will post when it finishes running.

posted by suedehead at 12:11 PM on November 3, 2013

http://pastebin.com/MMXgTYiG

I love python -- a single function, random.sample(population, k) to the rescue!

It looks like my new results are pretty similar to yours, Dilliagas - will post when it finishes running.

posted by suedehead at 12:11 PM on November 3, 2013

New results -- which seem to match Dilligas's results. A larger sampling size would probably enhance the accuracy of the larger hand sizes.

sampling size = 1000000

For a hand size of 5, probability = 0.663%

For a hand size of 6, probability = 4.028%

For a hand size of 7, probability = 11.503%

For a hand size of 8, probability = 22.384%

For a hand size of 9, probability = 35.097%

For a hand size of 10, probability = 47.653%

For a hand size of 11, probability = 58.922%

For a hand size of 12, probability = 68.641%

For a hand size of 13, probability = 76.437%

For a hand size of 14, probability = 82.597%

For a hand size of 15, probability = 87.278%

For a hand size of 16, probability = 90.747%

For a hand size of 17, probability = 93.328%

For a hand size of 18, probability = 95.322%

For a hand size of 19, probability = 96.689%

For a hand size of 20, probability = 97.672%

For a hand size of 21, probability = 98.406%

For a hand size of 22, probability = 98.917%

For a hand size of 23, probability = 99.260%

For a hand size of 24, probability = 99.501%

For a hand size of 25, probability = 99.679%

For a hand size of 26, probability = 99.790%

For a hand size of 27, probability = 99.866%

For a hand size of 28, probability = 99.918%

For a hand size of 29, probability = 99.946%

For a hand size of 30, probability = 99.966%

For a hand size of 31, probability = 99.982%

For a hand size of 32, probability = 99.989%

For a hand size of 33, probability = 99.994%

For a hand size of 34, probability = 99.997%

For a hand size of 35, probability = 99.998%

For a hand size of 36, probability = 99.999%

For a hand size of 37, probability = 100.000%

For a hand size of 38, probability = 100.000%

For a hand size of 39, probability = 100.000%

For a hand size of 40, probability = 100.000%

posted by suedehead at 1:23 PM on November 3, 2013 [1 favorite]

sampling size = 1000000

For a hand size of 5, probability = 0.663%

For a hand size of 6, probability = 4.028%

For a hand size of 7, probability = 11.503%

For a hand size of 8, probability = 22.384%

For a hand size of 9, probability = 35.097%

For a hand size of 10, probability = 47.653%

For a hand size of 11, probability = 58.922%

For a hand size of 12, probability = 68.641%

For a hand size of 13, probability = 76.437%

For a hand size of 14, probability = 82.597%

For a hand size of 15, probability = 87.278%

For a hand size of 16, probability = 90.747%

For a hand size of 17, probability = 93.328%

For a hand size of 18, probability = 95.322%

For a hand size of 19, probability = 96.689%

For a hand size of 20, probability = 97.672%

For a hand size of 21, probability = 98.406%

For a hand size of 22, probability = 98.917%

For a hand size of 23, probability = 99.260%

For a hand size of 24, probability = 99.501%

For a hand size of 25, probability = 99.679%

For a hand size of 26, probability = 99.790%

For a hand size of 27, probability = 99.866%

For a hand size of 28, probability = 99.918%

For a hand size of 29, probability = 99.946%

For a hand size of 30, probability = 99.966%

For a hand size of 31, probability = 99.982%

For a hand size of 32, probability = 99.989%

For a hand size of 33, probability = 99.994%

For a hand size of 34, probability = 99.997%

For a hand size of 35, probability = 99.998%

For a hand size of 36, probability = 99.999%

For a hand size of 37, probability = 100.000%

For a hand size of 38, probability = 100.000%

For a hand size of 39, probability = 100.000%

For a hand size of 40, probability = 100.000%

posted by suedehead at 1:23 PM on November 3, 2013 [1 favorite]

This is an interesting (and much worse) variation on the Rook Problem.

Here's a really ugly brute force solution in Python (I'll let it run overnight). (I haven't written Python in a long time and it shows, sorry.)

posted by anaelith at 7:19 PM on November 3, 2013

Here's a really ugly brute force solution in Python (I'll let it run overnight). (I haven't written Python in a long time and it shows, sorry.)

import itertools; from operator import mul; from time import time; def numberOfGoodHands(symbolCounts, handSize): deck = list(itertools.product(*map(lambda x: range(x), symbolCounts))); hands = itertools.combinations(deck,handSize); goodHands = 0; for hand in hands: for position in range(len(symbolCounts)): if(len(set(map(lambda card: card[position], hand))) != symbolCounts[position]): break; else: goodHands += 1; return goodHands; def main(): symbolCounts = input("Symbol counts as a tuple, ex. (3,4,5): "); interestingHandSizes = range(max(symbolCounts),reduce(mul,symbolCounts)-(reduce(mul,symbolCounts)/max(symbolCounts))+1); for handSize in interestingHandSizes: start = time(); print "{:>2}: {:>50} time:{:g}s".format(handSize, numberOfGoodHands(symbolCounts, handSize), time() - start); main();

posted by anaelith at 7:19 PM on November 3, 2013

5: 36000 time:22.638s

6: 2016000 time:258.457s

7: 44400000 time:2384.31s

8: 574336200 time:17399.2s

36000/5461512 ~ 0.00659158123

2016000/50063860 ~ 0.040268569

44400000/386206920 ~ 0.114964279

574336200/2558620845 ~ 0.224471008

I ended up stopping this and got distracted with other stuff, but for posterity here are the first few probabilities. Memory use seemed to be reasonable, so you can modify it to get a single hand size and it should return you a result.... eventually.

posted by anaelith at 2:52 PM on November 10, 2013

6: 2016000 time:258.457s

7: 44400000 time:2384.31s

8: 574336200 time:17399.2s

36000/5461512 ~ 0.00659158123

2016000/50063860 ~ 0.040268569

44400000/386206920 ~ 0.114964279

574336200/2558620845 ~ 0.224471008

I ended up stopping this and got distracted with other stuff, but for posterity here are the first few probabilities. Memory use seemed to be reasonable, so you can modify it to get a single hand size and it should return you a result.... eventually.

posted by anaelith at 2:52 PM on November 10, 2013

This thread is closed to new comments.

posted by bruce at 10:07 PM on November 2, 2013 [1 favorite]