Manufacturing randomness over email!
January 31, 2011 3:15 PM

Statistics whizzes, help settle an argument! How random are the results if a friend thinks of a number from 1-100, I think of a 1-100 number separately and then we add them together (subtracting 100 if the result is over 100)? So: Friend thinks of 29, I think of 90, result is 19. To me that seems properly random even if one or both of us tries to game it, but am I being statistically naive?
posted by Sebmojo to Science & Nature (29 answers total) 2 users marked this as a favorite
 
For starters, the numbers people think of when asked "think of a random number between 1 and 100" are not very random at all: people are much more likely to choose certain numbers.

People are not very good random number generators. You need a better source of randomness :)
posted by pharm at 3:22 PM on January 31, 2011 [6 favorites]


What do you mean by "statistically naive"?

If you choose a number at random and your friend chooses a number at random, the result of adding the two (and subtracting 100 when needed) will be random.

To test this, do the test 100 times. You'll see that you rarely will get the same end result, as should be expected from random chance.
posted by dfriedman at 3:22 PM on January 31, 2011


If both inputs are truly random, the outputs are truly random.

But as pharm says, it's very hard for humans to be truly random ... especially, ironically, if they're trying to be random. If you were to plot it out with enough data points, you'd see it in an instant.
posted by Cool Papa Bell at 3:28 PM on January 31, 2011 [1 favorite]


Yeah, I'd slightly modify my response to take into account the notion that people are not really random in decisions of these kinds.

But, assuming that people were capable of choosing numbers randomly, the outcome would be random as well.
posted by dfriedman at 3:30 PM on January 31, 2011


It is random only inasmuch as people can choose random numbers (they don't - asked to pick a random number between 1 and 10, people are more likely to pick 3 or 7 and less likely to pick 1 or 10). It is possible to game this system by intuiting which number your friend is more likely to select.
posted by Paragon at 3:30 PM on January 31, 2011


I think this will depend a lot on what you mean by "how random." For settling informal disputes between friends, like whose turn it is to bring beer to the poker game or something, this seems fine. Of course that's not a very high standard.

If you mean something else, you'll have to clarify. One link that might be of interest to you is this Sept. 2009 post from the old 538 blog indicating that "human beings are really bad at randomization. Tell a human to come up with a set of random numbers, and they will be surprisingly inept at trying to do so. Most humans, for instance, when asked to flip an imaginary coin and record the results, will succumb to the Gambler's Fallacy and be more likely to record a toss of 'tails' if the last couple of tosses had been heads, or vice versa." (from the link).

He goes on to point out some specific errors that people make in the situation you bring up, such as picking "7" as a final digit way too often. Is that the kind of thing you mean? If so, the answer to your question is that your method is, statistically speaking, not at all likely to be really random.
posted by rkent at 3:31 PM on January 31, 2011


Yeah, depends on whether you need truly random or "random enough." If humans were capable of being truly random, your method would be, too.

But research (the 538 post rkent links to mentions some of it) has shown that people tend to confuse "random" with "unusual". A nice round number like 40 is less likely to be chosen, because people think it's too common a number and steer away from it. If both of your participants steer away from certain numbers, your final results will also have some outcomes more likely than others.
posted by Chanther at 3:37 PM on January 31, 2011


Best answer: What pharm said. People are bad at doing random, and you can't generate 'proper' randomness by combining two not-particularly-random selections. It might be 'random enough' to satisfy a particular purpose, which is generally the criterion for most applications of 'random' selection.

ScienceBlogs' Cognitive Daily had a few threads comparing 'think of a number' numbers to computer-generated [pseudo]random numbers, and you might get something from the In Our Time on the topic.
posted by holgate at 3:40 PM on January 31, 2011


It's not gameable assuming simultaneity. Except that's a big assumption: how do you know that the other guy didn't wait to see your email before sending his?

To solve this, see "Coin Flipping by Telephone: A Protocol for Solving Impossible Problems", by M. Blum.
posted by novalis_dt at 4:01 PM on January 31, 2011 [1 favorite]


Best answer: Let X1, X2, ..., Xn be independent random variables that have the same distribution function m. Then Y = X1 + X2 + ... + Xn also has distribution function m.

In your case, a random number between 1 and 100 has a uniform distribution. And I am supposing that you both are choosing your numbers independently of each other. Therefore, the sum is also uniformly distributed.
posted by sbutler at 4:14 PM on January 31, 2011


Response by poster: Right. To clarify, it's for a game - I'm asking him for the number when I've already thought of a number, then I add them together and subtract 100 if necessary. He definitely wants the number to be high (he's the player), whereas I don't really care (DM).

I don't see how it's particularly gameable (though the point about numbers in the middle being perceived as 'less random' is an interesting one).
posted by Sebmojo at 4:19 PM on January 31, 2011


Don't people tend to overwhelmingly favour odd numbers over even when selecting random numbers? I can't think of a source off the top of my head.
posted by obiwanwasabi at 4:24 PM on January 31, 2011


Response by poster: Sbutler that's right, but of course the numbers aren't random - they're being chosen by him on the basis of perceived value (i.e. what's going to ultimately produce the highest result).

So he'd go low because I'd go high but I'd know that so I'd go low but he'd know that so I'd go high...

It's all very Princess Bride :)
posted by Sebmojo at 4:27 PM on January 31, 2011


To remove the human factor: you have two six-sided dice and you take their sum modulo six. If one of the dice is loaded and one is honest, is the result statistically equivalent to rolling one honest die?

I don't feel like recalling enough stats for this, but I'm pretty sure there's a way to write that down as an equation and derive a proof that they're equivalent. (edit: oops, too late, sbutler got it.) Instead, for all the die-hard pen-and-paper RPG players in the room, I'll link to this MeFi post from 2008 I just came across featuring a Youtube dissertation by Colonel Zocchi himself (inventor of the Zocchihedron!) concerning the design and manufacture of more accurate polyhedral dice.
posted by XMLicious at 4:32 PM on January 31, 2011


sbutler: I thought that the sum of RVs would result in the convolution of the PDFs. So the sum of two uniform RVs is not uniform.

At least that's what I remember from my stats course 20 years ago.
posted by schrodycat at 4:49 PM on January 31, 2011 [1 favorite]


Actually, hang on a second sbutler - don't you actually want the sum of two distributions, do you? I would think you'd want Y = X + a, a being a constant, because we're assuming that one of the inputs isn't random. (Though I think there's a theorem proving that Y has the same distribution as X too in that case, probably somewhere in sbutler's link.)
posted by XMLicious at 5:05 PM on January 31, 2011 [1 favorite]


That should be "you don't actually want the sum of two distributions, do you?"
posted by XMLicious at 5:07 PM on January 31, 2011


SButler: The numbers are most certainly not uniform; the distribution is a triangular distribution. Think of it: in order for the sum to be 2, both of you have to pick 1. However, for the total of 101, there are 100 ways of getting that sum: 1 + 100, 2+99, 3+98, ... , 100 + 1 so it's more likely.

Now, (X + Y) mod 100 will be uniformly random when either X or Y is uniform from 1 to 100 and chosen independently of the other. Think of it this way: no matter what X is, there is exactly one Y to make (X + Y) mod 100 = 0, exactly one Y to make it equal to 1, etc ... and since Y is chosen uniformly at random, so will the final result be.
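
The counting argument above can be written out as a short derivation. This is only a sketch under two assumptions that the comment leaves implicit: X and Y are chosen independently, and results are labelled 0 to 99 (with 0 standing in for 100). Here p_x is whatever distribution X happens to have, uniform or not.

```latex
\begin{align*}
\Pr\big[(X+Y) \bmod 100 = k\big]
  &= \sum_{x=1}^{100} \Pr[X = x]\,\Pr\big[Y \equiv k - x \pmod{100}\big] \\
  &= \sum_{x=1}^{100} p_x \cdot \frac{1}{100} \\
  &= \frac{1}{100}, \qquad k = 0, 1, \dots, 99 .
\end{align*}
```

The second line uses the fact that, for each x, exactly one value of Y in 1..100 satisfies the congruence, and a uniform Y hits it with probability 1/100.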
posted by bsdfish at 5:10 PM on January 31, 2011 [3 favorites]


Best answer: So he'd go low because I'd go high but I'd know that so I'd go low but he'd know that so I'd go high...

This is the key to your problem. The optimal strategy in your game is to generate a random number, since anything else would be predictable by the opponent. And if everyone's following their optimal strategy, the winner will be random. Sure, it's a hard strategy for humans to actually follow due to the way our brains are wired, but in this game there's an incentive to follow it.

In game theory terms, there is a "mixed strategy equilibrium" -- that term might help you find out more.
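
A minimal Python sketch of why uniform play leaves nothing to exploit, assuming the DM and the player pick independently and the player wants a high result. The biased DM below (37 half the time) is purely hypothetical, and the names `expected_result` and `best_response` are made up for illustration.

```python
from fractions import Fraction

N = 100

def wrap(total, n=N):
    """Subtract n when the sum exceeds n, as in the original question."""
    return total - n if total > n else total

def expected_result(player_pick, dm_dist, n=N):
    """Expected final number for a fixed player pick against the DM's distribution."""
    return sum(prob * wrap(dm_pick + player_pick, n)
               for dm_pick, prob in dm_dist.items())

def best_response(dm_dist, n=N):
    """Player pick (1..n) that maximises the expected final number."""
    return max(range(1, n + 1),
               key=lambda pick: expected_result(pick, dm_dist, n))

# A DM who really does pick uniformly: every player pick is worth the same.
uniform_dm = {d: Fraction(1, N) for d in range(1, N + 1)}

# Hypothetical biased DM: picks 37 half the time, otherwise uniform over 1..100.
biased_dm = {d: Fraction(1, 2 * N) for d in range(1, N + 1)}
biased_dm[37] += Fraction(1, 2)

uniform_values = {expected_result(p, uniform_dm) for p in range(1, N + 1)}
best = best_response(biased_dm)

print(uniform_values)  # a single value: no pick beats any other
print(best, float(expected_result(best, biased_dm)))
```

Against the uniform DM every pick has the same expected value, so the player cannot do better than guessing. Against the biased DM the best response is the pick that turns the DM's favourite into 100.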
posted by miyabo at 5:14 PM on January 31, 2011


I just want to comment on the statistics of this question a little more. sbutler linked to a stats pdf, and I would point you to page 4, which clearly shows that the sum of uniform variables is NOT a uniform distribution, but a normal distribution (a bell curve). For instance, when you roll two six-sided dice, the most common value is 7 (as any Settlers player will tell you). The values 2-12 are no longer equally likely.

The same thing is happening here. If we roll two 100-sided dice, the most common value you will get is 101.

However, that's not what you asked! You asked for the sum modulo 100. Which is great, because every "die roll" of 1-100 is uniform modulo 100, and you are left, again, with a uniform distribution.
posted by Phredward at 6:38 PM on January 31, 2011 [1 favorite]


I should have mentioned that I hate stats. I admit I'm wrong and too busy to read even two more pages into a PDF I cite!
posted by sbutler at 8:47 PM on January 31, 2011


Both inputs don't even have to be random. One random, uniformly distributed input will be enough to create a random, uniformly distributed output here.

My hunch is that there exists a simpler way to realize your goal, if your goal is simply to generate a random number from 1-100 in a verifiable way. Note here that, if the number you choose is truly random, the number he chooses doesn't matter. It doesn't matter at all. So why even have that step in there? Any participation on his part is an illusion, a distraction from what in fact reduces to a very simple process of you picking a random number.

One verifiable option: You use a random number generator to make a long list of random numbers (numbered for easy reference) like so:
1: 8
2: 16
3: 34
4: 10
5: 79

Save that list to a text file, encrypt it, and email it to him. In your duties as DM, when you need a random number start at the top of the list and work your way down, crossing them off as you go. At any point you can send him the decryption key and he can verify that you aren't cheating (of course you'll need to use a new list after he decrypts the first one).
posted by kprincehouse at 8:56 PM on January 31, 2011


(hm, although the list method I suggested doesn't allow him to verify how the list was generated in the first place... so if he's worried about you stacking the deck, that solution won't work.)
posted by kprincehouse at 9:01 PM on January 31, 2011


kprincehouse,

You can use a hash function to generate numbers that two parties agree are random.

Alice chooses R, sends hash(R) to Bob
Bob chooses S, sends hash(S) to Alice
Alice sends R to Bob
Bob sends S to Alice
Bob verifies hash(R), Alice verifies hash(S)
Alice and Bob use hash(R||S) as their random number for their game.

The reason you need both parties to generate independent values is that otherwise, Alice could try 1000 different choices for R and pick the most advantageous in the context of the game.
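
The exchange above might be sketched in Python roughly as follows. This is an illustration only, with several assumptions not in the comment: SHA-256 as the hash, 32-byte random secrets for R and S (a bare number from 1 to 100 could be guessed from its hash by trying all 100 values), and a simple modulo to map the final hash onto 1..100. The function names are hypothetical.

```python
import hashlib
import secrets

def commit(secret: bytes) -> str:
    """Commitment that is sent first: the SHA-256 digest of the secret."""
    return hashlib.sha256(secret).hexdigest()

def verify(secret: bytes, commitment: str) -> bool:
    """Check a revealed secret against the commitment received earlier."""
    return secrets.compare_digest(commit(secret), commitment)

def shared_roll(r: bytes, s: bytes, sides: int = 100) -> int:
    """Map hash(R || S) to a number in 1..sides."""
    digest = hashlib.sha256(r + s).digest()
    return int.from_bytes(digest, "big") % sides + 1

# Each side picks a long random secret and sends only its hash.
r = secrets.token_bytes(32)   # Alice's R
s = secrets.token_bytes(32)   # Bob's S
alice_commitment, bob_commitment = commit(r), commit(s)

# The secrets are revealed and each side checks the other's commitment.
assert verify(r, alice_commitment) and verify(s, bob_commitment)

# Both sides compute the same roll from R || S.
roll = shared_roll(r, s)
print(roll)
```

Because both commitments are exchanged before either secret is revealed, neither side can tailor its secret to the other's. Fixed-length secrets keep the concatenation R || S unambiguous.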

Diffie-Hellman key exchange is a much smarter but more mathematically complicated way for two parties to agree on a random number -- without even ever sending the number over the wire!
posted by miyabo at 9:17 PM on January 31, 2011


> sbutler linked to a stats pdf, and I would point you to page 4, which clearly shows that the sum of uniform variables is NOT a uniform distribution, but a normal distribution (a bell curve).

Of course this is also not quite correct: the sum of finitely many uniform random variables is neither uniform nor normal. Infinitely many are needed for their mean to be normally distributed. This is the Central Limit Theorem.

Your original solution is correct, as is the proof given by bsdfish. Another way to see this is to imagine taking the triangle function of the X+Y distribution, cutting the right half off, and adding it to the left half. And even though neither player will pick uniformly (by virtue of being human), this effect is harder to exploit when two people are involved.
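
A rough way to check the last point numerically, assuming the two picks are independent. The "favourite number" distributions below are invented for illustration, not measured from real people, and `wrapped_sum_dist` and `favouring` are hypothetical helper names.

```python
from fractions import Fraction

N = 100

def wrapped_sum_dist(dist_a, dist_b, n=N):
    """Exact distribution of a + b, minus n if over n, for independent picks."""
    out = {k: Fraction(0) for k in range(1, n + 1)}
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            total = a + b
            out[total - n if total > n else total] += pa * pb
    return out

def favouring(favourites, weight, n=N):
    """Hypothetical human-like picker: `weight` of the probability is split
    evenly across `favourites`, the rest is spread uniformly over 1..n."""
    dist = {k: (1 - weight) / n for k in range(1, n + 1)}
    for k in favourites:
        dist[k] += weight / len(favourites)
    return dist

uniform = {k: Fraction(1, N) for k in range(1, N + 1)}
player = favouring([37, 73, 77], Fraction(3, 10))
dm = favouring([17, 42], Fraction(1, 5))

one_honest = wrapped_sum_dist(player, uniform)
both_biased = wrapped_sum_dist(player, dm)

print(set(one_honest.values()))  # only 1/100 appears
print(float(max(player.values())),
      float(max(dm.values())),
      float(max(both_biased.values())))
```

With one uniform input the result is exactly uniform. With two biased inputs it is not, but the most likely result is never more likely than either player's own favourite number, which is one sense in which the combined pick is harder to exploit.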
posted by Talisman at 10:36 PM on January 31, 2011


To my mind, the answer is you are being 'statistically naive', because both numbers come from human-driven mental stuff, which, as many have mentioned, excludes perfectly common results.

The ideal solution is to find an online random number generator and use the results. Use the same one - you will have to trust in that. Amusingly enough, if one person uses a different method, that equates to a random seed.
posted by Sparx at 10:39 PM on January 31, 2011 [1 favorite]


Don't sit around and philosophize—test! You should test to see how random it is. Do it 100 times and then plot the results. Do they look random? Or are they grouped? My hypothesis is that you will see grouping... at least until one or both of you get better at being more random. Compare with a computer program that uses a reasonable random number generator.
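
One possible way to run that comparison in Python, sketched under a few assumptions: picks are grouped by final digit (each digit covers ten of the numbers 1..100), the "human" list is made up to exaggerate a fondness for 7, and the seed is fixed only so the computer run is repeatable. `chi_square_last_digit` is a hypothetical helper, not a library function.

```python
import random
from collections import Counter

def chi_square_last_digit(picks):
    """Chi-square statistic for picks in 1..100 grouped by final digit.
    Under uniform picking each of the 10 digits is equally likely."""
    counts = Counter(p % 10 for p in picks)
    expected = len(picks) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))

rng = random.Random(2011)  # fixed seed so the comparison run is repeatable
computer_picks = [rng.randint(1, 100) for _ in range(100)]

# Hypothetical "human" picks: heavily favouring numbers ending in 7.
human_picks = [37, 77, 17, 47, 73, 67, 27, 87, 57, 97] * 10

print(chi_square_last_digit(computer_picks))
print(chi_square_last_digit(human_picks))
```

With 10 groups there are 9 degrees of freedom, and a statistic above roughly 16.9 would be unusual (at the 5% level) for truly uniform picks. Grouping by final digit is just one choice; grouping by decade or plotting a histogram would probe different kinds of clumping.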
posted by jeffamaphone at 11:39 PM on January 31, 2011


If you are picking really random numbers in a particular way, it does not matter what the other fellow picks. That is the principle of a one-time pad.
posted by a robot made out of meat at 8:27 AM on February 1, 2011


Best answer: What bsdfish said; the unadjusted sum of the two numbers (assuming you were picking them uniformly and independently of each other) would definitely not be uniformly distributed. However, since you're subtracting 100 if necessary (i.e., taking the sum modulo 100), the result will be uniformly distributed between 1 and 100, which is what I'm guessing you mean by "random."

To take a smaller example, if you were only picking numbers between 1 and 6, adding them, and subtracting 6 if the result were greater than 6, the table of possible outcomes would look like this:

  | 1 2 3 4 5 6
--+------------
1 | 2 3 4 5 6 1
2 | 3 4 5 6 1 2
3 | 4 5 6 1 2 3
4 | 5 6 1 2 3 4
5 | 6 1 2 3 4 5
6 | 1 2 3 4 5 6

Here, the row is the first player's pick and the column is the second player's pick. By assumption, all the possible pairs are uniformly likely (big assumption, for the reasons other people have mentioned). And you can see that each number 1 through 6 appears an equal number of times (6 out of 36) in the results table.
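
The table can also be generated and counted programmatically. A small Python sketch, assuming the same "subtract n if the sum is over n" rule; `wrapped_table` is a hypothetical helper name.

```python
from collections import Counter

def wrapped_table(n=6):
    """Rows: first player's pick; columns: second player's pick.
    Each cell is the sum, minus n if the sum is greater than n."""
    return [[(a + b - n) if a + b > n else a + b for b in range(1, n + 1)]
            for a in range(1, n + 1)]

table = wrapped_table(6)
for row in table:
    print(" ".join(str(v) for v in row))

# Count how often each result appears across all 36 equally likely pairs.
counts = Counter(v for row in table for v in row)
print(counts)  # each of 1..6 appears 6 times
```

Calling `wrapped_table(100)` gives the full-size version, where each result from 1 to 100 appears 100 times out of 10,000 pairs.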
posted by albrecht at 12:39 PM on February 1, 2011

