March 18, 2012 4:58 PM

Math/sports question! I entered a "Square pool" (purely theoretical money at stake :) ), for the NCAA basketball tournament. I am curious if there are better numbers to have, and why that would be the case. More information inside.

This type of pool is very popular for the Super Bowl. You create a 10 x 10 grid, and people purchase squares (let's say five bucks apiece). Once all the squares have been purchased, you randomly assign digits 0-9 to the rows and columns, as well as a team. Each square corresponds to the last digit of the score. So if the Super Bowl ended 42-35, Giants beating the Patriots, the person who had the square Giants 2, Patriots 5, would win.

Now in football, there are clearly a few numbers which are better to have. By virtue of the scoring opportunities in football, you're obviously going to do OK with numbers like 7, 4, 1 or 0. Numbers like 5 and 2 don't come up as frequently, just because these require less likely scenarios like kicking four field goals to get to twelve points.

However, I am curious if there are ideal numbers to have for the **basketball version** of this pool. So far this year, there have been a couple number combinations that have come up more than once. I am guessing this is just random odds, but I was wondering if anyone who knows statistics or probability or just plain math better than I do might be able to offer some advice or theories about which numbers would be better to have in a pool like this for basketball.
posted by JoeGoblin to Sports, Hobbies, & Recreation (7 answers total) 1 user marked this as a favorite

Benford's Law would apply to the _first_ digit of a different type of counting phenomenon. In this case, you could take a pretty good crack at the first digit being 6, 7, or 8 probably 90% of the time. The last digit (the one you are looking for here) is going to be far more random.

If I were going to venture a guess, I would do so by first looking up the final scores to all of last year's games, mod 10-ing them and looking for a pattern.

We did something like that here. Good luck.

posted by milqman at 5:33 PM on March 18, 2012 [1 favorite]

Shouldn't be too hard to find out - grab the scores for a few years' worth of games in the particular league, ditch all but the least significant digit, and do a histogram of the frequency of each combination.

The only real trick beyond that would be to account for which team won. Assuming that the assignment of teams to an axis is random, you shouldn't need to - but if not (e.g. lowest team alphabetically is always on the Y axis) then you may be subject to some unaccounted-for bias (i.e. do teams starting with A-L, or "odd-numbered" alphabetical names, win/lose more often?)

On preview: basically, what milqman said.
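For anyone who wants to try the "mod 10 and histogram" idea, here is a minimal Python sketch. The function name `digit_pair_histogram` and the sample scores are made up for illustration (they are not real tournament results), and it assumes each game is recorded as a (favorite score, underdog score) pair.

```python
from collections import Counter


def digit_pair_histogram(games):
    """Return the share of games landing on each (favorite, underdog) last-digit pair.

    `games` is a list of (favorite_score, underdog_score) tuples.
    """
    # Keep only the least significant digit of each score.
    counts = Counter((fav % 10, dog % 10) for fav, dog in games)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical scores, for illustration only.
sample_games = [(84, 79), (109, 104), (70, 62), (81, 80), (66, 56)]
freqs = digit_pair_histogram(sample_games)

# Print the most frequent combinations first.
for pair, share in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(pair, f"{share:.0%}")
```

Keeping the pair ordered as (favorite, underdog) preserves the axis information; if the axes were assigned at random instead, you could sort each pair before counting.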

posted by Pinback at 5:39 PM on March 18, 2012

On consideration: if the axis for the team is determined by something like home vs away status, or some other 'theoretically randomised by the league, but potentially outcome-biasing' factor, then definitely factor it in.

posted by Pinback at 5:43 PM on March 18, 2012

Some more info:

- In the basketball pool, one axis is for the lower seed number (favored team), and the other is for the higher seed number (underdog).

- As far as I can tell, in the pool I'm in, the only number that matters is the final score. My initial instinct is to believe that having the same number on both axes, 6-6 or 9-9, is a disadvantage, since a game would have to end with a ten-point differential (or some other multiple of ten) to make this happen, but this may just be an uninformed opinion. (A friend is in a pool that sidesteps this by having the score at the end of regulation count, so a tie is a possible outcome.)
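The same-digit hunch can be checked by brute force. This is only a sketch: the 40 to 129 point range is an assumed span of plausible final scores, and `same_last_digit` is a hypothetical helper name.

```python
def same_last_digit(winner, loser):
    """True when both final scores end in the same digit."""
    return winner % 10 == loser % 10


# Assumed range of plausible final scores: 40 through 129 points.
LOW, HIGH = 40, 130

# A final score has no ties, so the winner always has strictly more points.
for loser in range(LOW, HIGH):
    for winner in range(loser + 1, HIGH):
        margin = winner - loser
        # Matching last digits happen exactly when the margin is a multiple of ten.
        assert same_last_digit(winner, loser) == (margin % 10 == 0)

# Collect every winning margin that produces a matching-digit square.
margins = sorted(
    {w - l for l in range(LOW, HIGH) for w in range(l + 1, HIGH) if same_last_digit(w, l)}
)
print(margins)
```

So a 6-6 or 9-9 square only pays on blowouts of 10, 20, 30 points and so on, which is consistent with those squares being a weaker draw when only the final score counts.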

I am mainly curious to see if the basketball strategy impacts the result at all, since many of these games seem to end with the team that is losing fouling a lot in the final two minutes and then taking faster, riskier plays that don't pay off as much when they get on offense. My numbers are Underdog: 0 and Favorite: 1, which I thought might work out OK, but it hasn't hit yet. I wonder if having numbers that are further apart might be better.

Anyways, thanks for any insights you can provide!

posted by JoeGoblin at 5:44 PM on March 18, 2012

This site (search for "basketball") has a link to old NCAA basketball scores you can play around with. (They're old enough that the first link I found to the dataset was a gopher link.) It covers ~1700 games to 1995, and unless the game has changed significantly, here are a few quick datapoints:

- Scores ending with 7, 5 and 2 are less common; scores ending with 0 and 9 are slightly more common. 7 was the least common; 9.1% of the scores (obviously 10% is the expected outcome) and 0 the most common, 10.8% of the scores. So it's still pretty even.

- Your opinion about the axis is correct; results where the difference in digits is 0 make up 6.7% of the results. The most likely are 2 points off (14.2%), followed by 1 point off (12.4%) and 4 points off (12.0%).

- The most common two-digit combo is 9-4 (e.g. 84-79, 109-104). It occurs 2.8% of the time (you'd expect 2.0%). The next most common are 0-2 (2.7%), followed by 0-6 and 4-6 (2.6% each). The 0-1 combination occurs 2.2% of the time. The dataset doesn't say who the underdogs and who the favourites are. The worst combo is 7-7 (0.8%), followed by 3-3 (0.9%).

- In a 68-team tournament (67 games), you would expect about 1.47 games to end 0-1; while more will end with underdog 0, favourite 1, I'd expect at least a third to go the other way (which could be Underdog 81, Favourite 80; but also Underdog 81, Favourite 90).

- If I had to pick, Favorite 2 Underdog 0 would be my choice. But you could do a hell of a lot worse than your choice.
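The baseline figures in the list above follow from simple arithmetic, sketched here in Python. It assumes perfectly uniform, independent last digits for the baseline and a 67-game bracket (a 68-team single-elimination field); the observed rates are the ones quoted in the list, and the names are made up for illustration.

```python
GAMES = 67  # assumption: 68 teams, single elimination, so 67 games

# Baseline if each last digit were uniform and independent (10% each).
uniform_distinct_pair = 2 * 0.1 * 0.1  # unordered pair like 9-4: either team can hold either digit
uniform_same_pair = 0.1 * 0.1          # matching pair like 7-7


def expected_games(rate, games=GAMES):
    """Expected number of games in the bracket landing on a combo with the given rate."""
    return rate * games


# Observed rates quoted in the list above (from the ~1700-game dataset).
observed = {"9-4": 0.028, "0-1": 0.022, "7-7": 0.008}

print(f"uniform baseline, distinct digits: {uniform_distinct_pair:.1%}")
print(f"uniform baseline, same digits:     {uniform_same_pair:.1%}")
for combo, rate in observed.items():
    print(f"{combo}: about {expected_games(rate):.2f} games out of {GAMES}")
```

Note that these are rates for unordered combinations; a square in the pool fixes which team has which digit, so its share is only part of the combination's total.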

posted by Homeboy Trouble at 12:19 AM on March 19, 2012

Thanks for running the numbers! I guess part of the boost for the higher-probability combinations you cited probably comes from the identical-digit sets, whose outlier status can be explained by the rules of the game.

I didn't doubt that the numbers would probably be fairly even, but was curious if any freaky discrepancies would arise. Looks like it's a fairly normal set of data. Thanks for your help!

posted by JoeGoblin at 1:20 PM on March 19, 2012

posted by no regrets, coyote at 5:10 PM on March 18, 2012