Gameshow Inspired Statistics Question
January 21, 2015 2:15 PM

I recently came up with a statistical thought experiment from watching too many reality game shows, but I'm having trouble remembering how to solve a problem like this. Help me figure out this problem that's been bugging me (and by extension relearn some statistics I've forgotten).

Say I have a competition where I rate a group of people of sample size n from a population of size P on some particular skill (best chef, best singer, most attractive, etc). From this rating, I'm able to find the best person in the sample. How many people do I need to have in my sample to be 95% certain I've also found the best person in my overall population?

The main difficulty for me (despite having totally forgotten how to do this problem since I last took a statistics class way long ago) is that I don't know the distribution or standard deviation of scores in the population. I assume I could derive that from my sample, assuming my sample was randomly selected. But what if it wasn't? What if it were an audition where everyone in the sample believed they had a shot at being the best?

How do I solve this and how could I compensate for a non-random sample?
posted by fremen to Science & Nature (4 answers total)
This is very close to, but not exactly, what's usually called the "secretary problem". I believe it doesn't make any difference what the distribution or standard deviation is. Googling around about that may help you answer this.
posted by brainmouse at 2:22 PM on January 21, 2015 [2 favorites]
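(For context on that pointer: the classic secretary problem is about *sequential* hiring — you see candidates one at a time and can't go back — and the well-known strategy is to observe roughly the first n/e candidates, then take the next one who beats them all, which succeeds with probability about 1/e ≈ 0.37. A quick sanity-check simulation, with function and variable names of my own choosing:)

```python
import random

def secretary_success(n, trials=20000, seed=0):
    """Estimate the chance the classic 1/e stopping rule picks the
    single best of n randomly ordered candidates."""
    rng = random.Random(seed)
    # Observe the first ~n/e candidates without committing, then
    # take the first later candidate who beats all of them.
    k = round(n / 2.718281828)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))   # 0 = worst, n - 1 = best
        rng.shuffle(ranks)
        best_seen = max(ranks[:k]) if k else -1
        # If nobody after the observation phase beats best_seen,
        # the rule is forced to take the last candidate.
        pick = next((r for r in ranks[k:] if r > best_seen), ranks[-1])
        wins += (pick == n - 1)
    return wins / trials
```

Running `secretary_success(100)` should land near 0.37, regardless of how the candidates' underlying scores are distributed — only the relative ranks matter, which is why the distribution and standard deviation are irrelevant.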

How many people do I need to have in my sample to be 95% certain I've also found the best person in my overall population?

As stated, this is just the probability that said person is in your sample, so n/P.
posted by PMdixon at 2:30 PM on January 21, 2015 [6 favorites]
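(A minimal simulation backing this up — names are my own. Since being the "best" only depends on rank, the chance a simple random sample of size n contains the population's best person is exactly n/P, no matter what the score distribution looks like:)

```python
import random

def prob_best_in_sample(n, P, trials=20000, seed=0):
    """Empirically estimate the chance that a simple random sample of
    size n from a population of size P contains the single best person.
    Theory says this is exactly n / P."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Label people 0..P-1 and, without loss of generality, call
        # person P-1 the best; draw a uniform sample without replacement.
        sample = rng.sample(range(P), n)
        hits += (P - 1) in sample
    return hits / trials
```

For example, `prob_best_in_sample(50, 100)` should come out very close to 0.5.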

Don't you use the sample population sigma as an estimate of the overall population sigma? Then calculate your population mean using the central limit theorem and then calculate your confidence interval.

Khan Academy video on confidence intervals.
posted by St. Peepsburg at 3:29 PM on January 21, 2015

Best answer: She doesn't want the mean, she wants the max. The word escapes me right now, but there's a word in statistics for an estimator that grows more precise (that is, the confidence interval shrinks) as the sample gets larger. Means are such estimators. Maximums are not.

Means and maximums are different, because every unit you sample tells you something about the mean. With maximums, only one unit (the largest in your sample) tells you anything about the max. It's at least that big. The central limit theorem doesn't apply to maximums. The confidence interval tells you the probability that the mean falls within that range, not the probability that the max does.

I think the probability that you sampled it is the right answer. So solve for n such that n/P >= 0.95.
posted by If only I had a penguin... at 3:45 PM on January 21, 2015 [5 favorites]
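(Spelling out that last step as a tiny helper — the function name is my own. Since P(best person is in the sample) = n/P, hitting a given confidence level means sampling that fraction of the whole population, which shows how little a random sample guarantees here without distributional assumptions:)

```python
import math

def min_sample_for_best(P, confidence=0.95):
    """Smallest sample size n such that a simple random sample of size n
    from a population of size P contains the best person with probability
    at least `confidence`, using P(best in sample) = n / P."""
    return math.ceil(confidence * P)
```

So for a population of 1,000, `min_sample_for_best(1000)` gives 950 — you'd have to audition 95% of everyone.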
