How many pairs in a deck drawn two cards at a time?
September 26, 2021 4:54 AM
I shuffle a deck of 52 regular playing cards. I draw two cards, then another two, then another two. I don't return any cards to the deck. If the two cards I draw match in value, they're a pair and I set them to one side. By the time I finish the deck, how many pairs can I expect to have?
The probability of person N having a pair is more complicated because after the first person, some cards have been dealt. So person 1 has 3/51 chance of a pair, but it's different for person 2 and so on. (I don't have a piece of paper to hand to work it out.)
posted by hoyland at 6:03 AM on September 26, 2021
posted by hoyland at 6:03 AM on September 26, 2021
hoyland: The probability that person 2 has a pair is different conditioned on whether person 1 has a pair, but the overall probability for person 2 is the same as for person 1 (by symmetry), and the beauty of linearity of expectation is that you don't have to care about whether the variables are independent.
posted by dfan at 6:06 AM on September 26, 2021
posted by dfan at 6:06 AM on September 26, 2021
At first, I made the same calculation as dfan, then I though maybe a hoyland-type calculation was better, and now I'm back on team dfan.
The complication envisioned by hoyland arises if you try to calculate the probabilities for each hand sequentially. Then the odds are affected by what has gone on before. But we don't need to do that. The symmetry argument prevails.
posted by SemiSalt at 7:54 AM on September 26, 2021
The complication envisioned by hoyland arises if you try to calculate the probabilities for each hand sequentially. Then the odds are affected by what has gone on before. But we don't need to do that. The symmetry argument prevails.
posted by SemiSalt at 7:54 AM on September 26, 2021
I agree with dfan. If "linearity of expectation" seems too good to be true, think of it this way: consider an arbitrary card in the deck, before you deal any cards. The chance that the card in the position to become its pair-partner has the same value is 3/51=1/17. This is true for every card, so when you deal 26 pairs, you'll on average get 26/17=1.53 pairs.
posted by zeptoweasel at 9:10 AM on September 26, 2021
posted by zeptoweasel at 9:10 AM on September 26, 2021
Or, to do this experimentally (via python):
0 20969
1 33215
2 26015
3 13199
4 4821
5 1355
6 348
7 63
8 11
9 4
The expected value is 1.
posted by miguelcervantes at 9:29 AM on September 26, 2021
import numpy as np
cards = np.arange(52)
hit_list = []
nsims = 100000
for ll in range(nsims):
draws = np.random.choice(cards,52,replace=False)
pairs = 0
for ii in range(26):
if (draws[ii*2] % 13) == (draws[ii*2+1]%13):
pairs +=1
hit_list.append(pairs)
hit_list = np.array(hit_list)
for ll in range(10):
print(ll,np.shape(np.where(hit_list == ll))[1])
0 20969
1 33215
2 26015
3 13199
4 4821
5 1355
6 348
7 63
8 11
9 4
The expected value is 1.
posted by miguelcervantes at 9:29 AM on September 26, 2021
I ran miguelcervantes' simulation and got an expected value of 1.53, as expected (pun intended).
posted by dfan at 9:41 AM on September 26, 2021
posted by dfan at 9:41 AM on September 26, 2021
Another experimental, in Perl:
A couple of runs (just shy of 19 seconds on my 2018 MacBook Pro) are giving me roughly 1.53.
posted by straw at 9:54 AM on September 26, 2021
perl -le 'use Data::Dumper; use List::Util qw/shuffle/; my $count = 1000000; my $dups = 0; for my $round (0..$count) { my @deck; for $suit ('spade', 'heart', 'club', 'diamond' ) { for my $num (1..13) { push @deck, [ $num, $suit]; } } @deck = shuffle(@deck); for (my $deal = 0; $deal < scalar(@deck); $deal += 2) { if ($deck[$deal]->[0] == $deck[$deal + 1]->[0]) { $dups += 1;} } }; print ($dups/$count);'
A couple of runs (just shy of 19 seconds on my 2018 MacBook Pro) are giving me roughly 1.53.
posted by straw at 9:54 AM on September 26, 2021
btw, if you want to do it hoyland's way, there are 5 ways for the third and fourth cards to be a pair:
(1/17 * 2/50 * 1/49) + (1/17 * 48/50 * 3/49) + (16/17 * 3/50 * 2/49) + (16/17 * 3/50 * 2/49) + (16/17 * 44/50 * 3/49).
multiplying out fractions:
(2 / 41650) + (144 / 41650) + (96 / 41650) + (96 / 41650) + (2112 / 41650),
adding numerators:
2450 / 41650,
and dividing by 2450:
1 / 17
it does end up being the same!
posted by panic at 5:25 PM on September 26, 2021
- the first two cards are a pair (1/17), and the third and fourth cards make four of a kind (2/50 * 1/49);
- the first two cards are a pair (1/17), and the third and fourth cards make a different pair (48/50 * 3/49);
- the first two cards aren't a pair (16/17), and the third and fourth cards make three of a kind with the first card (3/50 * 2/49);
- the first two cards aren't a pair (16/17), and the third and fourth cards make three of a kind with the second card (3/50 * 2/49); or
- the first two cards aren't a pair (16/17), and the third and fourth cards make a pair that doesn't involve either of the first two cards (44/50 * 3/49).
(1/17 * 2/50 * 1/49) + (1/17 * 48/50 * 3/49) + (16/17 * 3/50 * 2/49) + (16/17 * 3/50 * 2/49) + (16/17 * 44/50 * 3/49).
multiplying out fractions:
(2 / 41650) + (144 / 41650) + (96 / 41650) + (96 / 41650) + (2112 / 41650),
adding numerators:
2450 / 41650,
and dividing by 2450:
1 / 17
it does end up being the same!
posted by panic at 5:25 PM on September 26, 2021
$ raku -e 'my $x=10000;my $s; for ^$x {$s+=((((1..13) xx 4).flat.pick(2)) xx 26).map({$^a[0]==$^a[1], $^a}).grep({$^a[0]}).elems}; say $s/$x' 1.5274There's still a bit of debugging/checking things extra characters in there.
posted by zengargoyle at 10:19 PM on September 26, 2021
I totally missed that my attempt to clarify that the answer is the mode was already set out - so 1 - 1 is what you should expect
posted by birdsquared at 11:11 PM on September 26, 2021
posted by birdsquared at 11:11 PM on September 26, 2021
linearity of expectation
Mind blown. It really does seem too good to be true!
I also appreciated the python and perl examples I could drop into the terminal and see the magic for myself. (The python example also saw me learn how to calculate expected values in Excel, which was nice.) Finally, thanks to hoyland for circling the hole I'd already fallen into and prompting some really interesting explanations for why it wasn't there.
The context:
There's an abstract-ish print and play game called Bahama Taxi. Pairs of cards are drawn to represent a passenger. The first card shows their pickup location, and the other their destination.
I'm thinking about retheming this as a 'deliver cargo to different systems' Elite-style affair, and wondering about mechanisms to spice it up without changing the central "draw two cards" mechanic. Perhaps there are space pirates, asteroids, or some sort of wormhole.
I wondered whether somebody drawing a pair was a frequent enough occurrence to generate a decent number of such events before the deck was depleted. The answer is clearly "probably not"! Then again, maybe there only needs to be a couple of things to make it interesting - one wormhole that lets you jump to any point in the quadrant, or one pirate you can choose to move on your turn to snafu your opponents.
posted by some little punk in a rocket at 2:06 AM on September 27, 2021
Mind blown. It really does seem too good to be true!
I also appreciated the python and perl examples I could drop into the terminal and see the magic for myself. (The python example also saw me learn how to calculate expected values in Excel, which was nice.) Finally, thanks to hoyland for circling the hole I'd already fallen into and prompting some really interesting explanations for why it wasn't there.
The context:
There's an abstract-ish print and play game called Bahama Taxi. Pairs of cards are drawn to represent a passenger. The first card shows their pickup location, and the other their destination.
I'm thinking about retheming this as a 'deliver cargo to different systems' Elite-style affair, and wondering about mechanisms to spice it up without changing the central "draw two cards" mechanic. Perhaps there are space pirates, asteroids, or some sort of wormhole.
I wondered whether somebody drawing a pair was a frequent enough occurrence to generate a decent number of such events before the deck was depleted. The answer is clearly "probably not"! Then again, maybe there only needs to be a couple of things to make it interesting - one wormhole that lets you jump to any point in the quadrant, or one pirate you can choose to move on your turn to snafu your opponents.
posted by some little punk in a rocket at 2:06 AM on September 27, 2021
Yeah, linearity of expectation really is kind of magic, and I always have to pinch myself to remind myself that it's justified. Here's a little rabbit hole about why it works:
Imagine that you decide to solve this problem by brute force by setting up a ridiculously giant spreadsheet. There are 52! rows, one for each possible deck shuffle. There are 26 main columns, one for each player. In row R column C you put a 1 if player C got a pair when the deck configuration was R and 0 otherwise (this is called an "indicator variable"). In a final column you sum up all the 0-1 cells for that row; this is the number of pairs for that deck configuration. At the very bottom right, you average that final column; that's your answer.
Note that all you did to get that final answer was add up every single 0-1 cell in the whole spreadsheet and divide by 52!.
So let's do that a different way. In that final row, for column C, average all the 52! cells for player C. This is the chance that they were dealt a pair (and, by symmetry, it will be the same for every column), which is the same thing as the average number of pairs they were dealt. Then add all the values in that row to get the final answer. That's the calculation I made in my reply. All we did was once again add up all the 0-1 cells and divide by 52!, we just did it in a different order.
That's all linearity of expectation is, just summing the whole rectangle by columns instead of by rows. It turns out that a lot of the time that's a hugely more simple calculation.
Thank you for the good example!
posted by dfan at 3:59 AM on September 27, 2021
Imagine that you decide to solve this problem by brute force by setting up a ridiculously giant spreadsheet. There are 52! rows, one for each possible deck shuffle. There are 26 main columns, one for each player. In row R column C you put a 1 if player C got a pair when the deck configuration was R and 0 otherwise (this is called an "indicator variable"). In a final column you sum up all the 0-1 cells for that row; this is the number of pairs for that deck configuration. At the very bottom right, you average that final column; that's your answer.
Note that all you did to get that final answer was add up every single 0-1 cell in the whole spreadsheet and divide by 52!.
So let's do that a different way. In that final row, for column C, average all the 52! cells for player C. This is the chance that they were dealt a pair (and, by symmetry, it will be the same for every column), which is the same thing as the average number of pairs they were dealt. Then add all the values in that row to get the final answer. That's the calculation I made in my reply. All we did was once again add up all the 0-1 cells and divide by 52!, we just did it in a different order.
That's all linearity of expectation is, just summing the whole rectangle by columns instead of by rows. It turns out that a lot of the time that's a hugely more simple calculation.
Thank you for the good example!
posted by dfan at 3:59 AM on September 27, 2021
I agree with the linearity of expectation, which gives (3/51)*26 = 26/17 = 1.53.
The problem with linearity of expectation, though, is that it doesn't give you a good idea of how spread out the distribution is, which sounds like it might be important for your game mechanism. A much longer calculation involving quadruplets of cards could get you the variance, and in a past life I was good at that sort of thing. But I'm not any more, so let's simulate a million decks.
In R:
posted by madcaptenor at 6:52 AM on September 27, 2021
The problem with linearity of expectation, though, is that it doesn't give you a good idea of how spread out the distribution is, which sounds like it might be important for your game mechanism. A much longer calculation involving quadruplets of cards could get you the variance, and in a past life I was good at that sort of thing. But I'm not any more, so let's simulate a million decks.
In R:
> f = function(){s = sample(52) %% 13; return(sum(s[seq(1,51,2)] == s[seq(2,52,2)]))} > x = replicate(1000000, f()) > table(x) x 0 1 2 3 4 5 6 7 8 9 10 209869 334512 258986 130747 48318 13703 3134 605 108 16 2 > mean(x) [1] 1.530579which basically agrees with miguelcervantes' simulation. In particular, there's a 21% chance of no pairs, and a 79% chance of at least one pair.
posted by madcaptenor at 6:52 AM on September 27, 2021
This thread is closed to new comments.
posted by dfan at 5:36 AM on September 26, 2021