Card deck probability distribution with reshuffles?
February 3, 2019 1:23 PM   Subscribe

What is the probability distribution (??) of card deals in a deck if the deck is reshuffled before the end is reached?

I have been googling around trying to get help on this question but I don't think I am using the right terminology. Suppose you have a standard deck of playing cards, shuffled, and you flip one card after another, noting the number of the card that is drawn (like in a histogram). When you draw an Ace, the deck is reshuffled. After thousands of draws, all of the cards will have roughly the same number of occurrences except for Ace (or whatever cards are used as the shuffle trigger), which will have the highest. I can prove this by simulation, but I don't know how to write a calculation to come up with the expected number of times that Ace will be drawn.

This is not for homework, it's for a board game.
posted by muddgirl to Science & Nature (10 answers total) 6 users marked this as a favorite
 
If drawing an ace is a trigger to reshuffle the whole deck, wouldn't the number of aces drawn be equal to the number of times you reshuffle? Or are you looking for the probability distribution for the other (non-ace) cards?
posted by Paragon at 2:21 PM on February 3, 2019 [1 favorite]


Best answer: On average, you will draw a certain number of cards (call it N) before you draw an ace. Since you reshuffle after you draw an ace each time, this means that in the long run, one out of every N cards is an ace. So we need to figure out what N is. Let's go through a few cases:
  • The probability of drawing just one card is 4/52: the first card in the deck is one of the four aces out of the 52 cards.
  • The probability of drawing two cards is (48/52)*(4/51): the first card is one of the 48 non-aces out of 52 cards, and then the second one is one of the aces out of the remaining 51 cards.
  • The probability of drawing three cards is (48/52)*(47/51)*(4/50): the first card is one of the 48 non-aces out of 52 cards, the second one is one of the 47 remaining non-aces out of the remaining 51 cards, and the third is an ace out of the remaining 50 cards.
Continuing in this way, the probability of drawing i cards works out to be p(i) = 48! * (52 - i)! * 4 / [(48 - i + 1)! * 52!]. The average number of cards drawn is then N = the sum from i = 1 to 49 of i * p(i). Whacking this in to Wolfram Alpha yields an answer of N = 53/5, which means that in the long run, one card out of 5/53 ≈ 9.4339...% will be an ace. The remaining twelve ranks will, on average, come up one time out of 4/53 ≈ 7.5472...%.

Note that the fraction of cards that are aces, in the long run, is just equal to (number of aces + 1)/(number of cards in the deck + 1). Running the same calculation for n "aces" out of a deck of m cards in Wolfram Alpha confirms this. The fact that this answer is so simple probably means that there's an easier way of deriving it; I look forward to seeing what other Mefites come up with to solve this problem.

My kingdom for a MathJax...
posted by Johnny Assay at 2:22 PM on February 3, 2019 [8 favorites]


I considered the case of a two-card deck (say, an ace and a deuce) and came to the conclusion that 2/3 of the time the next card would be an ace. That jibes with Johnny Assay's formula (aces+1)/(decksize+1).
posted by the antecedent of that pronoun at 2:31 PM on February 3, 2019 [1 favorite]


Best answer: If the experiment is defined as "From a shuffled deck, draw cards without replacement until an ace is drawn", then the number of aces drawn will always be 1. However, if you want the expected number of cards drawn in the experiment, then we can approach this as a "linearity of expectation" problem. Brief explanation: if a random variable X can be written as the sum of two random variables A + B, then the expected value of X, E(X), is equal to the sum of the expected values of A and B, E(A) + E(B).

In this problem, we can define a random variable X_i for each non-ace card, where X_i = 1 if it is drawn before the first ace, and 0 if it is not drawn. Now, if we define the random variable X to be the number of cards drawn in the experiment, then X is equal to 1 (for the ace) plus the number of non-ace cards drawn before the first ace. Very conveniently, the number of non-ace cards drawn before the first ace is exactly the sum of the random variables X_i, because if a card is drawn, its random variable X_i contributes 1 to the sum, and if it is not drawn, it contributes 0 to the sum.

Also conveniently, the expected value of X_i is easy to calculate: 1 times the probability that the card is drawn before the first non-ace, plus 0 times the probability that the card is not drawn. In other words, E(X_i) is the probability that the card is drawn before the first non-ace. (Notice that this is the same for all non-ace cards, regardless of how they end up ordered in the shuffle.) Given one non-ace and four aces shuffled somewhere in the deck, the probability that the non-ace appears first is simply 1/5. Therefore, E(X_i) = 1/5, and the expected number of non-ace cards drawn before the first ace is 48/5. Therefore, the expected number of cards drawn in the experiment is 1 + 48/5 = 53/5.
posted by J.K. Seazer at 2:46 PM on February 3, 2019 [5 favorites]


Imagine a hypothetical infinite deck. This deck is nominally random, but any 52 card section you remove from it contains all 52 cards.

When you draw an ace, it is tantamount to having an ace in the infinite deck, and a full four-ace 52-card deck behind it. Thus, in 53 cards, there are 5 aces, yielding Johnny Assay's formula above.
posted by Maecenas at 2:47 PM on February 3, 2019 [3 favorites]


Also conveniently, the expected value of X_i is easy to calculate: 1 times the probability that the card is drawn before the first non-ace, plus 0 times the probability that the card is not drawn. In other words, E(X_i) is the probability that the card is drawn before the first non-ace.

Oops, that should have been "ace", of course, not "non-ace" >_<
posted by J.K. Seazer at 2:54 PM on February 3, 2019


Johnny's answer looks good, but just to make sure I coded up a quick simulation to draw 10,000,000 cards from a deck following these rules. There were 944,043 aces and between 753,861 and 755,747 of each other card. These numbers are well within expected random variation of his result, so there's empirical evidence.
posted by jackbishop at 7:14 AM on February 4, 2019


Response by poster: Hm, I marked some best answers but my simulation doesn't match - for a poker deck I am getting about 9% aces instead of 7% and for my specific case of 2 shuffle cards in a deck of 20, I'm getting 0.15 instead of 1/7, so I'm glad I checked numerically.
posted by muddgirl at 8:40 AM on February 4, 2019


Hm, I marked some best answers but my simulation doesn't match - for a poker deck I am getting about 9% aces instead of 7% and for my specific case of 2 shuffle cards in a deck of 20, I'm getting 0.15 instead of 1/7, so I'm glad I checked numerically.

I'd say those numbers look like confirmation of the answers above: 5/53 ≈ 9.4% and 3/21 ≈ 14.3% (which isn't exactly 15%, but the difference between them is likely just random variation unless you did a huge number of trials -- back of envelope says 30,000+ draws to reach statistical significance).

I agree with J. K. Seazer's answer but I thought a gloss might be helpful. If you name 5 cards, then shuffle a deck, then the 5 named cards have an equal chance of appearing in any order in the deck. This is just by symmetry -- there are an equal number of ways of shuffling the deck that accomplish each of the possible orderings of the 5 cards. If you apply this reasoning to the four aces and any given fifth card, that fifth card is equally likely to be first, second, third, fourth, or fifth among the five. So the four aces divide the deck into five parts, and each non-ace card is equally likely to be in any of the five parts. That means that, on average, the first part of the deck -- the cards above the topmost ace -- will contain 48/5 = 9.6 cards. Include the ace and you'll be drawing 53/5 = 10.6 cards on average before reshuffling.

If you want the full probability distribution (the probability of drawing each possible number of cards before the first ace) as opposed to just the expected value (average), that is given by the binomial distribution with N = 48 trials and success rate p = 1/5.
posted by aws17576 at 9:02 AM on February 4, 2019 [1 favorite]


Response by poster: I was simulating a million draws; thankfully the bug was in my deck building so I actually simulated a deck with 19 cards: (2+1)/(19+1) = 0.15. Thanks all!
posted by muddgirl at 9:06 AM on February 4, 2019


« Older What happened to my taxes?   |   Adjusting to major metro area after life in a... Newer »
This thread is closed to new comments.