September 16, 2012 5:10 PM

Does Benford's Law apply to Numbers Games?

Based on the mechanism for choosing winning numbers it seems like Benford's law would apply to Numbers Games, skewing the results slightly towards lower digits. Is that right?
posted by Tell Me No Lies to Science & Nature (10 answers total)

"It tends to be most accurate when values are distributed across multiple orders of magnitude."

To use Benford's law, one would have to be playing a variety of different games with outcomes of many different sizes.

posted by wobh at 5:30 PM on September 16, 2012
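
To make the "multiple orders of magnitude" point concrete, here is a minimal Python sketch. Both data sets are hypothetical and not taken from any real game: one is spread evenly on a log scale from 1 to 1,000,000, the other stays between 200,000 and 500,000. It compares their leading-digit shares with the Benford prediction log10(1 + 1/d).

```python
import math
import random


def first_digit(x: float) -> int:
    """Leading digit of a positive number, read from its scientific notation
    (which rounds to 7 significant figures)."""
    return int(f"{x:e}"[0])


def first_digit_shares(values):
    """Fraction of values whose leading digit is 1..9."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    n = len(values)
    return {d: counts[d] / n for d in range(1, 10)}


rng = random.Random(2012)  # fixed seed so reruns match
N = 100_000

# Hypothetical "wide" data: spread evenly on a log scale from 1 to 1,000,000.
wide = [10 ** rng.uniform(0, 6) for _ in range(N)]
# Hypothetical "narrow" data: totals that stay between 200,000 and 500,000.
narrow = [rng.uniform(200_000, 500_000) for _ in range(N)]

wide_shares = first_digit_shares(wide)
narrow_shares = first_digit_shares(narrow)

for d in range(1, 10):
    benford = math.log10(1 + 1 / d)
    print(f"{d}  Benford {benford:.3f}  "
          f"wide {wide_shares[d]:.3f}  narrow {narrow_shares[d]:.3f}")
```

The wide set should track the Benford column closely, while the narrow set can only ever start with 2, 3, 4, or (at the very top of the range) 5.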

Actually it applies to all of the digits, just with increasingly less accuracy for each place. In addition it is not expected to work on random numbers.

I'm not looking for high accuracy, just statistical significance.

Since the final numbers are a sum of many smaller numbers of lesser magnitudes (if I'm understanding the mechanism correctly), I keep thinking it should apply.

posted by Tell Me No Lies at 7:24 PM on September 16, 2012
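
One way to probe the "sum of many smaller numbers" intuition is a toy simulation. This is only a sketch: the bet sizes (uniform between $0 and $50) and the growth factors (uniform between 0.5 and 1.5) are made-up assumptions, not the real handle mechanism. It contrasts totals built by adding many similar-sized amounts with totals built by multiplying many factors, since products are the kind of process that spreads across orders of magnitude.

```python
import random


def first_digit(x: float) -> int:
    """Leading digit of a positive number, read from its scientific notation."""
    return int(f"{x:e}"[0])


def share_by_digit(values):
    """Fraction of values with each leading digit 1..9."""
    n = len(values)
    return {d: sum(1 for v in values if first_digit(v) == d) / n
            for d in range(1, 10)}


rng = random.Random(7)  # fixed seed so reruns match
TRIALS = 4_000
TERMS = 100

sums, products = [], []
for _ in range(TRIALS):
    # Hypothetical bets: each between $0 and $50, added into one total.
    sums.append(sum(rng.uniform(0, 50) for _ in range(TERMS)))
    # Hypothetical growth factors: each between 0.5 and 1.5, multiplied together.
    p = 1.0
    for _ in range(TERMS):
        p *= rng.uniform(0.5, 1.5)
    products.append(p)

sum_shares = share_by_digit(sums)
product_shares = share_by_digit(products)

for d in range(1, 10):
    print(f"{d}  sums {sum_shares[d]:.3f}  products {product_shares[d]:.3f}")
```

In this toy model the sums cluster tightly around a typical total, so nearly all of them share one leading digit, while the products come out close to Benford. Real pool totals vary from day to day for other reasons, so treat this as an illustration of the mechanism rather than a prediction.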

I have only a layman's familiarity with Benford's Law—I've applied it to large datasets a few times, mostly for my own amusement—but I do believe you're right!

posted by waldo at 7:50 PM on September 16, 2012

"increasingly less accuracy with each place" is another way of saying "if it applies, its influence on the outcome is not great". However, you should feel free to try and graph it and see. The empirical distribution of the numbers is a fact, and one should not argue about facts, and instead one should graph them and look at them and argue about the conclusions.

Whenever I have looked for Benford's Law effects on digits past the first or second, I have not found a strong relation. Doesn't mean it's not there, just means I didn't manage to see it.

posted by pmb at 7:53 PM on September 16, 2012

I assume you're talking about the race track 'handle' method of choosing the winning numbers, as I can't see any particular reason the other methods--particularly spinning a wheel, etc--would follow Benford's law.

I found a pretty good source of some 'handle' numbers and because I obviously don't have anything better to do, analyzed a year's worth of them. It would be better to have 5-10-20 years, but this is a start.

Analyzing the first digit gives this distribution:

Count  First digit
35     1
65     2
45     3
13     4
22     5
32     6
7      7
3      8
3      9

That is clearly somewhat Benford-like, but the difference is that most of these numbers have an expected range. In one column, almost all of them fall between 200K and 500K, so that puts a definite bias on the digits in the first column--but not necessarily exactly a Benford distribution.

When I analyze the dollar digit (the one just to the left of the decimal point) the difference is pretty striking:

Count  Dollar digit
16     0
16     1
19     2
18     3
19     4
19     5
22     6
28     7
28     8
22     9

This looks to me like a smooth distribution, or if there is a bias to it, it is pretty small and going to take a lot more than a sample of 200-300 to tease it out.

Just for comparison, here is a sample of purely random numbers about the same size as my sample above:

Count  Digit
23     0
20     1
16     2
20     3
17     4
13     5
21     6
24     7
21     8
25     9

posted by flug at 8:02 PM on September 16, 2012 [1 favorite]
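
The eyeball comparison above can be backed by a chi-square goodness-of-fit check. This is a rough sketch using only the counts quoted in the comment. The choice of test and the 5% significance level are assumptions added here, not something the original analysis used.

```python
import math

# Counts copied from the comment above.
first_digit_counts = [35, 65, 45, 13, 22, 32, 7, 3, 3]          # digits 1..9
dollar_digit_counts = [16, 16, 19, 18, 19, 19, 22, 28, 28, 22]  # digits 0..9
random_counts = [23, 20, 16, 20, 17, 13, 21, 24, 21, 25]        # digits 0..9


def chi_square(observed, probabilities):
    """Pearson chi-square statistic for observed counts vs expected shares."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, probabilities))


benford = [math.log10(1 + 1 / d) for d in range(1, 10)]  # first-digit law
uniform = [0.1] * 10                                      # all digits equal

chi_first = chi_square(first_digit_counts, benford)
chi_dollar = chi_square(dollar_digit_counts, uniform)
chi_random = chi_square(random_counts, uniform)

# Standard 5% critical values of the chi-square distribution.
CRIT_DF8 = 15.507  # 9 categories -> 8 degrees of freedom
CRIT_DF9 = 16.919  # 10 categories -> 9 degrees of freedom

print(f"first digit vs Benford:  {chi_first:.1f} (threshold {CRIT_DF8})")
print(f"dollar digit vs uniform: {chi_dollar:.1f} (threshold {CRIT_DF9})")
print(f"random digit vs uniform: {chi_random:.1f} (threshold {CRIT_DF9})")
```

With these counts the first-digit statistic lands far above its threshold, so the first digits are skewed but not in an exactly Benford way, while both units-digit statistics fall below theirs, which fits the "looks smooth at this sample size" reading.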

Does a version of Benford's law apply to the second digit? Yes, though the non-uniformity is very weak indeed. Does a version of Benford's law apply to the third digit? Yes, though now it's so weak that only a gigantic sample would allow you to reliably tell the difference between Benford and uniform distribution. Does a version of Benford's law apply to the kth digit for any k? Yes.

But does a version of Benford's law apply to the

Well, what if you only look at numbers with a fixed number of digits, like 6? You're welcome to do that -- say, look at a set of numbers whose values naturally fluctuate between 250,000 and 500,000. But once you do that you've killed off the essential "distributed across multiple orders of magnitude," and with it the Benford distribution; in the given example, the first digit is NEVER 1, indeed is never anything other than 2, 3, 4, or 5; very un-Benfordlike behavior.

The best way to get clear on this is to do what flug did above. Look at, say, vote totals from a bunch of recent elections. The first digits should look Benford-like. The last digits should look uniform.

posted by escabeche at 8:17 PM on September 16, 2012
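
How quickly the kth-digit effect fades can be computed directly. The sketch below uses the standard generalization of Benford's law, where the probability of digit d in a later position is the sum of log10(1 + 1/(10k + d)) over all possible leading blocks k. The function name `benford_digit_probs` is just a label chosen for this example.

```python
import math


def benford_digit_probs(position: int):
    """Benford probabilities for the digit at `position` (1 = leading digit).

    Returns a dict mapping digit -> probability. The leading digit is never 0.
    """
    if position == 1:
        return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
    # k runs over every possible block of digits preceding this position.
    lo, hi = 10 ** (position - 2), 10 ** (position - 1)
    return {
        d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(lo, hi))
        for d in range(10)
    }


spreads = {}
for position in (1, 2, 3, 4):
    probs = benford_digit_probs(position)
    # Gap between the most and least likely digit at this position.
    spreads[position] = max(probs.values()) - min(probs.values())
    row = " ".join(f"{d}:{p:.4f}" for d, p in probs.items())
    print(f"position {position} (spread {spreads[position]:.4f}): {row}")
```

The spread shrinks sharply with each position, which is why a sample of a few hundred draws has little hope of telling the third or fourth digit apart from uniform.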

Or to put it another way, given the example in the Wikipedia article (Win $1004.25, Place $583.56, Show $27.61 and you choose the 'dollar' place to get 437) if those numbers are typical daily totals for Win, Place, and Show then the Win digit is probably pretty well randomized, the Place digit a little less so but probably not enough to give an advantage worth pursuing, but the Show--that could be a different story.

Let's say the Show is a normal distribution centered on $20 with a standard deviation of 10.

Using Excel to simulate 65000 rounds of that distribution and then counting up the 2nd digit gives this distribution:

Digit  Share
0      10.6%
1      10.3%
2      10.2%
3      10.2%
4      10.0%
5      10.0%
6      9.7%
7      9.6%
8      9.7%
9      9.6%

So . . . that's an obvious and usable bias. However, this is pretty sensitive to particularities of the data. For example, centering the distribution on $30 with SD of $10 gives this, which is about as close to random as you'll find:

Digit  Share
0      10.2%
1      9.9%
2      10.1%
3      9.9%
4      10.0%
5      10.0%
6      9.8%
7      10.1%
8      10.0%
9      10.1%

If you change that to centered on $500 with $200 SD (so now you're looking at the 3rd digit, usually) then you're looking at a very random looking distribution again:

Digit  Share
0      10.3%
1      10.0%
2      9.8%
3      9.9%
4      10.0%
5      10.0%
6      10.0%
7      9.9%
8      9.9%
9      10.2%

So, in short, there could be some kind of Benford-like skew in the digits in some very specific situations where the typical numbers are small. Particularly if you're choosing the smallest dollar digit and the total dollar number is typically far smaller than $100.

But if you're choosing the smallest dollar digit and the numbers are in the hundreds to thousands of dollars or more, there won't be any usable bias to the numbers.

posted by flug at 8:43 PM on September 16, 2012
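
For anyone who wants to rerun this kind of experiment outside Excel, here is a minimal Python sketch. It is not the spreadsheet used above. The comment does not say how negative draws were handled or exactly how the digit was extracted, so this version assumes negative draws are discarded and the digit is the units digit of the whole-dollar amount. The function name and seed are arbitrary.

```python
import random


def dollar_digit_shares(mean, sd, rounds=65_000, seed=0):
    """Share of each units-of-dollars digit (0..9) among simulated totals.

    Assumptions (not stated in the thread): negative draws are discarded,
    and the digit is taken from the whole-dollar part of the amount.
    """
    rng = random.Random(seed)
    counts = [0] * 10
    kept = 0
    while kept < rounds:
        amount = rng.gauss(mean, sd)
        if amount < 0:
            continue  # a pool total cannot be negative, so redraw
        counts[int(amount) % 10] += 1
        kept += 1
    return [c / rounds for c in counts]


# The three scenarios described in the comment above.
for mean, sd in [(20, 10), (30, 10), (500, 200)]:
    shares = dollar_digit_shares(mean, sd)
    print(f"mean ${mean}, SD ${sd}: " + " ".join(f"{s:.1%}" for s in shares))
```

Exact percentages will differ from the Excel figures, since the seed, the treatment of negative draws, and the digit extraction are all assumptions. Changing how draws below zero are handled is worth trying, because in the $20 case a couple of percent of draws fall below zero.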

One last--here is a distribution centered at $9000 with SD of $3000 and we are looking at the last (least significant) dollar digit. We would expect this to be pretty much a random distribution and it is:

Digit  Share
0      9.9%
1      10.0%
2      10.0%
3      10.0%
4      10.1%
5      9.9%
6      10.0%
7      9.8%
8      10.0%
9      10.2%

posted by flug at 8:50 PM on September 16, 2012

If you are playing number games that contain the same bias as electricity bills, for instance, then Benford's observation would apply. With purely random number games, Benford would not apply.

In a casino, there is a clear bias toward the house. I wonder whether that bias could be taken advantage of using Benford-type observations of bias.

posted by lake59 at 6:57 AM on September 17, 2012

posted by dfan at 5:29 PM on September 16, 2012