# Probability and Truth

April 12, 2008 12:59 PM Subscribe

*Suppose you take a test for a rare type of cancer that affects 0.01 percent of the population. The test is 98 percent reliable. You get a positive reading. What are the chances you have the cancer?*

I read this probability puzzle today and the writer said the statistical chances of you having the cancer in this scenario are less than half a percent. I don't get it. Isn't the rarity factor irrelevant compared with the test reliability? Please explain.

In short, the chances are higher that the test is incorrect. The NY Times explains a similar problem here.

posted by jessamyn at 1:02 PM on April 12, 2008

What you want to look at is Bayes' Theorem. In particular, look at example 2.

posted by chndrcks at 1:03 PM on April 12, 2008


I heard this on a TED talk a while ago, and I remember it had something to do with whether the test was more likely to report false positives or false negatives. The rarity factor is still relevant. Something to do with the fact that if you get a positive, then there can be one of two situations: either you belong to that 0.01% of the population who have the cancer, or you have received a false positive on the test. A false positive is more likely.

I don't know, I'm just talking out of my ass, trying to remember a statistical discussion from a podcast over a year ago...

posted by arcticwoman at 1:05 PM on April 12, 2008


On preview, Jessamyn's answer is correct. This one's just a little more worked out.

Okay, assuming 98 percent reliable means the test gives the wrong answer 2% of the time (IRL tests don't necessarily give the same rate of false positives and negatives, but whatever), then roughly 2% of the people test positive, while only 0.01 percent of the people (and only about 1/2 of 1% of the people testing positive) actually have cancer. The numbers I gave above are approximations, but close enough to make the point.

Now say the cancer occurs in 1% of the population. We'll work the problem beyond the approximation level this time. Let's look at expectations.

Say I give the test to 1000 people. Here's what I'd expect:

N: 1000, Occurrence of cancer = 1%, Test reliability = 98%

People without cancer: 990

People without cancer, testing negative: 0.98*990 = 970.2

People without cancer, testing positive: 0.02*990 = 19.8

People with cancer: 10

People with cancer, testing negative: 0.02 * 10 = 0.2

People with cancer, testing positive: 0.98 * 10 = 9.8

So in this case, we'd expect 19.8 + 9.8 = 29.6 positives on average. Odds are still that you don't have cancer if you test positive, but in this case the odds are much worse: 9.8/29.6 or about 1/3 that you really have cancer, 2/3 you don't.

So, rarity definitely makes a difference. If this still isn't clear, work the problem out as above only with the original numbers.

posted by Opposite George at 1:18 PM on April 12, 2008
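The expected-count arithmetic above can be sketched in a few lines of Python — a quick check, assuming (as the comment does) that the 2% error rate applies equally to false positives and false negatives:

```python
# Expected counts for the worked example: N = 1000, cancer occurrence 1%,
# test reliability 98% (same 2% error rate assumed in both directions).
N = 1000
prevalence = 0.01
accuracy = 0.98

with_cancer = N * prevalence                        # 10 people
without_cancer = N - with_cancer                    # 990 people

true_positives = accuracy * with_cancer             # 9.8
false_positives = (1 - accuracy) * without_cancer   # 19.8

total_positives = true_positives + false_positives  # 29.6
p_cancer_given_positive = true_positives / total_positives

print(round(p_cancer_given_positive, 3))  # 0.331 -- about 1/3, as stated
```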


Here's the Peter Donnelly talk at TED that refers to, among other things, this problem. An excellent talk.

posted by JakeWalker at 1:32 PM on April 12, 2008


I found the wikipedia page on Bayesian probability, linked earlier, pretty helpful in understanding this kind of problem, if you work through it step by step. The key for me to understand this intuitively is that, yes, the rarity does matter. Because if the subject population of noncancerous folks is very very large - in other words, if the cancer is very rare - then even a test that seems fairly accurate (98%) is going to produce a pretty large number of false positives compared to accurate positives.

posted by chinston at 1:34 PM on April 12, 2008


Looks good to me.

posted by Opposite George at 1:50 PM on April 12, 2008


Best answer: I'll try and make this more intuitive.

The test comes back positive. This means that either you have cancer and the test is right, or you don't have cancer and the test is wrong. How can you figure this out?

The short answer is you can compare the size of the two groups: people with cancer and a correct positive, and people with no cancer and a false positive.

The people with cancer and a true positive are 98% (the accuracy) times 0.01% (the percent who have cancer). This is .000098, or .098% of the population.

The people with no cancer and a false positive are 2% (the error rate) times 99.99% (the percent without cancer). This is .019998, or 1.9998% of the population.

Now, if you had to bet, would you bet you were in the .098%, or in the 1.9998%? Or to put it differently, would you bet you're in Group A, or Group B if you know that Group B is about 20 times larger? Obviously, you'd bet on Group B. The odds that you're in Group A or B are determined by the relative sizes of Groups A and B. In this case, the odds that you have cancer are about 1 in 20.

*I remember it had something to do with whether or not the test was more likely to report false positives or false negatives*

Nope. The driving factor is just that 2% of almost everyone is a much larger group than 98% of almost nobody.

posted by ROU_Xenophobe at 1:50 PM on April 12, 2008 [6 favorites]
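ROU_Xenophobe's group comparison, run with the question's original numbers, is a one-liner's worth of Bayes' rule. A minimal Python sketch, again assuming the 2% error rate covers both kinds of mistakes:

```python
# Bayes' rule with the original numbers: 0.01% prevalence, 98% "reliable" test
# (assuming the 2% error rate applies to both false positives and negatives).
prevalence = 0.0001       # 0.01 percent of the population
accuracy = 0.98

true_pos = accuracy * prevalence               # P(cancer and positive test)
false_pos = (1 - accuracy) * (1 - prevalence)  # P(no cancer and positive test)

p_cancer_given_positive = true_pos / (true_pos + false_pos)
print(f"{p_cancer_given_positive:.2%}")  # 0.49%
```

This reproduces the "less than half a percent" figure from the question — roughly 1 in 200.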


I recall reading a story once about how a bunch of doctors who were asked a question just like this were overwhelmingly unable to answer it correctly. It's tricky stuff!

But one thing to remember in all this is that there is valuable information in a positive result. Your probability of having cancer is much higher with a positive result than a negative. From ROU's work above, about 1/20 rather than 1/10,000.

So an alarmist would say that you have a 500x (!!!) greater chance of having cancer if you get a positive result, even if that positive result is likely to be a false positive.

posted by Hello, Revelers! I am Captain Lavender! at 2:21 PM on April 12, 2008


But why did you order the test in the first place - because you suspect there to be a problem.

What is the probability that you have a *different* kind of cancer than the really rare kind?

posted by porpoise at 2:48 PM on April 12, 2008 [1 favorite]

"I recall reading a story once about how a bunch of doctors who were asked a question just like this were overwhelmingly unable to answer it correctly. It's tricky stuff!"

Daniel Kahneman won a Nobel Prize for work broadly related to that study you are talking about. Amos Tversky would have won as well, but he had passed away by the time the committee got around to them.

It's also hugely influential in the area of Behavioral Finance.

posted by JPD at 3:04 PM on April 12, 2008


**JPD** writes

*"Amos Tversky would have won as well, but he had passed away by the time the committee got around to them."*

Of cancer. No, really. Of metastatic melanoma, a rarer form of skin cancer that nevertheless causes the majority of skin cancer deaths.

posted by orthogonality at 3:50 PM on April 12, 2008

This goes once again to how poorly a lot of scientists do with complex statistics. Jessamyn's got it right, dead-on. See also the problem with psychologists and the Monty Hall Problem for another case where researchers misunderstood the statistics behind a test.

posted by Leon-arto at 4:00 PM on April 12, 2008


This is what finally helped me understand Bayes.

Warning - lots of java applets - page will hang your browser for 30 seconds.

posted by dmd at 4:16 PM on April 12, 2008


Looking again, it appears I misplaced a decimal point -- curse google calculator and its near-immediate switch to exponential notation -- and that about 1 in 200 people who test positive actually have the cancer. This corresponds to the 1/2% the original poster stated.

*But why did you order the test in the first place - because you suspect there to be a problem.*

Or because everybody is supposed to have the test at age X.

The other fun way to imagine Bayesian updating is with crime. Imagine the justice system as a filter that picks people up and sorts them into the innocent, who go free, and the guilty, who get sent to prison. The pool of people who did a particular crime is very, very small, and the population of people the system examines is large. So even if the system is highly accurate, we'd still expect a significant proportion of prisoners to be "false positives," i.e., innocent of that particular crime.

posted by ROU_Xenophobe at 5:24 PM on April 12, 2008

If Yudkowsky's Bayes explanation two posts up works for you, he posts religiously (and verbosely) on Overcoming Bias on similar themes. We as a species are **stunningly incompetent** at judging probabilities.

posted by Skorgu at 5:48 PM on April 12, 2008

ROU_Xenophobe's explanation is excellent. The way the original question is worded, though, is confusing - obviously intentionally on the part of the questioner, who understands the issue.

Tests are not commonly spoken of as having "reliability," because that is an ambiguous phrase. A doctor wants to rely on a test to make a diagnosis, but there are different factors to consider.

One thing to consider is how good the test itself is. Presented with a patient who 100%, for-sure, no lie, is a positive case, how often will the test show a positive result when performed on that patient? This percentage, which is an intrinsic property of the test, is called the **sensitivity** of that test for making that diagnosis.

Also, presented with a patient who 100%, for-sure, pure-d, does NOT have the disease in question, how often will the test show a negative result? That is called the **specificity** of the test. Specificity is also an intrinsic property of the test.

Another thing to consider is, if the test shows positive, what is the probability that the patient actually has the disease in question - in other words, what is the **Predictive Value of a Positive test** (PVP)? This is the question that the original poster is asking. As it turns out, the predictive value of a positive test, performed on a member of a given population, depends on the **population prevalence** of the disease under study. The same thing is true of the PVN - the predictive value of a negative test. ROU_X explained why quite eloquently and about as simply as it's possible to do.

As a doctor using a test to make a diagnosis, I "rely" on my understanding of the Predictive Value of a Positive test to guide my decision making. For instance, the CSF test for anti-borrelial antibodies, called Lyme titer, has a false positive rate of about 10% - its specificity is only about 90%. Combine that with an extremely low population prevalence of disseminated neuroborreliosis - it is almost unheard of, maybe 1 per 100,000 - and you will see that a positive CSF Lyme titer carries **no weight** in my medical decision making, because it's so much more likely to be a false positive than a true positive that it's worthless even to consider it.

As a dude in a lab trying to design a better test for Baxter Healthcare Corporation to bring to market, I "rely" on improving the **sensitivity** and **specificity** of a test. Improving the specificity of a test will always improve the predictive value of a positive test, regardless of the population prevalence. Likewise, improving the sensitivity of the test will improve the predictive value of a negative test. (It may not improve enough to make the test clinically useful!)

This was taught to me, and others, in medical school; I don't think doctors are so ignorant of these issues as most of the gloaters in this thread would like to make out. Patients, on the other hand, are in my experience almost universally ignorant of these issues, and most of them show no wish or capability to be educated about it.

posted by ikkyu2 at 6:17 PM on April 12, 2008
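The PVP ikkyu2 describes falls straight out of sensitivity, specificity, and prevalence. A small sketch of the calculation, with the Lyme titer's sensitivity made up, since the comment only gives its specificity:

```python
# Predictive value of a positive test (PVP) from sensitivity, specificity,
# and population prevalence, following the definitions above.
def pvp(sensitivity: float, specificity: float, prevalence: float) -> float:
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# CSF Lyme titer: specificity ~90%, prevalence ~1 per 100,000.
# Sensitivity is NOT given in the comment; 0.95 is a hypothetical placeholder.
print(f"{pvp(0.95, 0.90, 1e-5):.4%}")  # ~0.0095% -- effectively no weight
```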

*I don't think doctors are so ignorant of these issues as most of the gloaters in this thread would like to make out.*

There are studies about this; the most relevant is Eddy 1982. Quoting from another study,

*He asked American physicians to estimate the probability that a woman had breast cancer given a positive screening mammogram and provided them with the relevant information: a base rate of 1 percent, a sensitivity of 80 percent, and a false-positive rate of 9.6 percent. Approximately 95 out of 100 physicians wrongly reckoned this probability to be around 75 percent, whereas the correct answer is 7.7 percent.*

A more recent study in Germany (the one I was quoting from above, Hoffrage and Gigerenzer 1998) found that the modal answer to the probability of cancer given a positive mammogram was 90%, and reported similar findings when they tested gynecologists and AIDS workers (presumably with things other than mammograms).

But when the question is presented with frequencies instead of in probability language, more people get it right because then it's easy to see that the positive|cancer group is much smaller than the positive|no-cancer.

Not a slam on physicians, for what it's worth. Whatever they found, the proportion of physicians who get it right is surely vastly higher than the population proportion. I only happen to know about these studies because I use them in introductory methodology courses to try to slam the point home that you need to DO. THE. FUCKING. NUMBERS. because human intuition is very, very bad at probability.

posted by ROU_Xenophobe at 8:58 PM on April 12, 2008
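Eddy's numbers check out when recast as the natural frequencies ROU_Xenophobe mentions. Here they are imagined over 10,000 women — 10,000 is just a convenient round number, not from the study:

```python
# Eddy's mammography problem as natural frequencies: base rate 1%,
# sensitivity 80%, false-positive rate 9.6%, over 10,000 women.
N = 10_000
with_cancer = N * 0.01                   # 100 women have breast cancer
true_pos = with_cancer * 0.80            # 80 correct positives
false_pos = (N - with_cancer) * 0.096    # 950.4 false positives

p = true_pos / (true_pos + false_pos)
print(f"{p:.1%}")  # 7.8% -- close to the study's 7.7 percent
```

Framed this way it is easy to see why the answer is small: 80 correct positives are swamped by roughly 950 false ones.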

*For instance, the CSF test for anti-borrelial antibodies, called Lyme titer, has a false positive rate of about 10% - its specificity is only about 90%. Combine that with an extremely low population prevalence of disseminated neuroborreliosis - it is almost unheard of, maybe 1 per 100,000 - and you will see that a positive CSF Lyme titer carries no weight in my medical decision making, because it's so much more likely to be a false positive than a true positive that it's worthless even to consider it.*

But 1/100000 is only relevant if you're screening everybody without any suspicion of having disseminated neuroborreliosis. If you have some other reason to suspect that the patient might have disseminated neuroborreliosis, then you start with an informative prior instead of a null prior, and the test might be useful.

posted by ROU_Xenophobe at 9:07 PM on April 12, 2008
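The informative-prior point can be made concrete with the same hypothetical test numbers as before (sensitivity 0.95 is again an assumption; the ~90% specificity is from ikkyu2's comment, and the 5% prior suspicion is invented for illustration):

```python
# Same test, different priors: mass screening vs. testing a patient
# you already have some reason to suspect.
def posterior(prior: float, sensitivity: float = 0.95,
              specificity: float = 0.90) -> float:
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)

print(f"{posterior(1e-5):.4%}")  # 0.0095% -- null prior: near-useless result
print(f"{posterior(0.05):.1%}")  # 33.3% -- 5% prior suspicion: worth knowing
```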

Sure, ROU_Xenophobe. The symptoms of neuroborreliosis are wonderfully specific things like fatigue, gait imbalance, dizziness, sore backs, headaches, and stiff necks. You can just pick these folks out of the crowd - imagine having one of those rare symptoms! Crazy, isn't it?

posted by ikkyu2 at 10:07 PM on April 12, 2008


Response by poster: Thanks very much to you all for your excellent answers. Yes, intuition on such matters is misleading.

posted by binturong at 11:34 PM on April 12, 2008


This thread is closed to new comments.