November 3, 2006 11:00 AM

1 is more common than 2? Help me find a recent article that talks about "data" having a tendency to start with 1.

I recently (last ten days) saw an article from a study that said that numbers that begin with 1 such as 12,000, 112,322, 1,022, were much more common in "data" than similar numbers that started with 9, for example. They went on to say that accountants (or the IRS maybe?) use this to determine whether false data follows the same pattern. Basically the point was that in real-world data we would think that 123 would be as common as 923, but in fact numbers that begin with 1 are more common. It doesn't make sense to me now, so I wanna re-read it.

My google fu isn't working because I want to search on "numbers" "1" "9" "accountant" "distribution". I read digg, reddit, lifehacker, mefi, and slashdot and it is likely it came from one of those.
posted by wogbat to Computers & Internet (7 answers total)

I remember the article, and I don't have a link to it, but I think I can summarize the reasoning. Financial data is a type of data that naturally accumulates; that is, it is generally incremented or decremented by discrete amounts. In other words, its distribution follows a pattern that is different from a random, or Gaussian, distribution.

So, say you start with $10. To change the first digit to a 2, the total value must increase by 100%. To then change the first digit to a 3, the value must increase by only 50%. Then, for the first digit to change to a 4, the total value must increase by only 33%. When you finally get to a value having 9 in the first digit, it only takes an 11% increase to change the first digit back to 1.

This pattern means that a 1 in the first digit should occur more often, statistically, than any other digit.
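A rough way to see this in code (a sketch, not something from the article): start at $10, grow the value by a fixed percentage each step, and count how often each digit leads. The function name, the 1% growth rate, and the step count are all made up for illustration, and the sketch assumes a positive growth rate.

```python
import math


def leading_digit_shares(start=10.0, rate=0.01, steps=100_000):
    """Grow `start` by `rate` per step and tally the leading digit at each step.

    Assumes rate > 0. Returns {digit: share of steps with that leading digit}.
    """
    # Only the mantissa matters for the leading digit, so keep the value
    # scaled into [1, 10). This also avoids overflow over many steps.
    value = start
    while value >= 10:
        value /= 10
    while value < 1:
        value *= 10

    counts = [0] * 10
    for _ in range(steps):
        counts[int(value)] += 1  # int() of a value in [1, 10) is its leading digit
        value *= 1 + rate
        while value >= 10:
            value /= 10
    return {d: counts[d] / steps for d in range(1, 10)}


if __name__ == "__main__":
    shares = leading_digit_shares()
    for d in range(1, 10):
        # Second column is the Benford's law value for comparison.
        print(d, f"simulated={shares[d]:.3f}", f"log10(1+1/d)={math.log10(1 + 1 / d):.3f}")
```

With steady percentage growth the value spends the longest stretch with a leading 1 and the shortest with a leading 9, so the shares should fall off from 1 to 9.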

posted by ijoshua at 11:11 AM on November 3, 2006

This is all great, and yes this is what I was talking about...I know more now than when I first read "the article". Anyone else remember the actual article/post I'm talking about? I bet it is on page 1 or 11 or 111.

It just drives me nuts when I can't retrace my steps, even though I've already received more than I hoped in the above linked articles.

posted by wogbat at 11:22 AM on November 3, 2006

I found it!

http://plus.maths.org/issue9/features/benford/

I googled Benford and found the article... Now to solve the puzzle: how did I get there in the first place from reddit, slashdot, digg, etc.?

(the other articles that were submitted were much clearer to me, so many thanks!)

posted by wogbat at 11:37 AM on November 3, 2006

Mathematician here. Benford's law has been known for a while, and it applies not just to financial data.

Think about it like this: out of the first 10 positive integers, two of them start with 1. But out of the first 20, 11 of them start with 1. Out of the first 100, 12 of them start with 1. But out of the first 200, 111 of them start with 1. And so on.
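These counts are easy to check by brute force. Here is a small Python sketch (the function name is just for illustration) that counts how many of the first n positive integers start with 1, for a few cutoffs:

```python
def count_leading_one(n: int) -> int:
    """Count the integers in 1..n whose decimal form starts with the digit 1."""
    return sum(1 for k in range(1, n + 1) if str(k)[0] == "1")


if __name__ == "__main__":
    for n in (10, 20, 100, 200, 1000, 2000):
        ones = count_leading_one(n)
        print(f"first {n}: {ones} start with 1 ({ones / n:.0%})")
```

The share of leading 1s swings up and down depending on where you stop counting, but it never drops to the one-in-nine you might naively expect.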

This phenomenon appears in many "naturally occurring" data sets. For instance, if we consider all the areas of rivers in the United States, measured in square miles, the list of numbers we get would obey Benford's Law. But also if we measured in square inches, the numbers would obey Benford's Law. In fact, this is exactly the property that is needed to prove Benford's Law: scale invariance. For those of you that understand what this means, it comes down to the fact that Haar measure on the circle (i.e. Lebesgue measure) is unique.
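For the curious, Benford's Law says the leading digit d shows up with probability log10(1 + 1/d), which is about 30% for d = 1 and under 5% for d = 9. Here is a hedged Python sketch of the scale-invariance idea. It does not use real river data: it assumes made-up "areas" spread evenly on a log scale over six orders of magnitude, and the helper names are hypothetical. It then compares leading-digit shares in square miles and in square inches.

```python
import math
import random

SQ_INCHES_PER_SQ_MILE = 63_360 ** 2  # 63,360 inches in a mile, squared


def leading_digit(x: float) -> int:
    """First significant digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def digit_shares(values):
    """Return a list where index d holds the share of values with leading digit d."""
    counts = [0] * 10
    for v in values:
        counts[leading_digit(v)] += 1
    return [c / len(values) for c in counts]


def benford(d: int) -> float:
    """Benford's law probability that the leading digit is d."""
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    # Assumption: synthetic data, log-uniform between 1 and 1,000,000 square miles.
    sq_miles = [10 ** rng.uniform(0, 6) for _ in range(100_000)]
    sq_inches = [a * SQ_INCHES_PER_SQ_MILE for a in sq_miles]

    miles_shares = digit_shares(sq_miles)
    inches_shares = digit_shares(sq_inches)
    for d in range(1, 10):
        print(d, f"benford={benford(d):.3f}",
              f"sq_miles={miles_shares[d]:.3f}",
              f"sq_inches={inches_shares[d]:.3f}")
```

Changing units multiplies every value by the same constant, which just shifts everything along the log scale, so both columns should stay close to the Benford values.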

posted by number9dream at 12:09 AM on November 4, 2006

This thread is closed to new comments.

posted by jplank at 11:06 AM on November 3, 2006