# Covid-19 testing, statistics, and time

June 16, 2020 10:54 AM

Has anyone proposed a method whereby your "true" odds of having Covid-19 can be determined given the rates of false-negative/false-positive results for a particular test?

I know three people whose doctors said "yeah, you've got it". All three recovered, and all three later tested negative (two of them using the same IgG test, third I'm not sure of).

It's been established that the test has a very significant error rate, which fluctuates between roughly 15% and nearly 70% over time, relative to the apparent infection date. Given this, how many tests would a person need to take in order to have some statistical certainty that they either did, or did not, have Covid-19 at a point in the past?

For the purpose of this question, let us assume the infection dates are reasonably knowable (pretty rapid onset), but I'm not averse to running the algo multiple times using different dates.

The ideal result from all this would be something like "If you are tested X times using the IgG test, the majority of the results will be the correct answer Y% of the time."

It may very well be that no such answer is yet possible; I'm mainly curious given the experiences of people I know, and some light reading in the Annals of Internal Medicine which looked at test error rates over time (the tests appear to be mostly appallingly bad), which made me wonder if multiple testing and ensuing analysis could override the error rate.
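As a rough illustration of the "tested X times, majority is correct Y% of the time" idea, here is a minimal sketch. It assumes things the thread does not establish: that repeat tests are independent of one another, that each test has the same fixed chance of being right, and that false positives and false negatives are equally likely. The function name `majority_correct` is made up for this example, and the error rates are just the 15% and 70% extremes quoted above plus one value in between.

```python
from math import comb


def majority_correct(n_tests: int, per_test_accuracy: float) -> float:
    """Chance that a strict majority of n independent tests gives the right answer.

    Assumption (not from the thread): every test is an independent draw
    with the same probability of being correct.
    """
    if n_tests % 2 == 0:
        raise ValueError("use an odd number of tests so a majority always exists")
    needed = n_tests // 2 + 1  # smallest number of correct results that wins the vote
    return sum(
        comb(n_tests, k)
        * per_test_accuracy**k
        * (1 - per_test_accuracy) ** (n_tests - k)
        for k in range(needed, n_tests + 1)
    )


# Error rates taken from the range quoted in the question; 0.30 is an arbitrary midpoint.
for error_rate in (0.15, 0.30, 0.70):
    for n in (1, 3, 5, 9):
        y = majority_correct(n, 1 - error_rate)
        print(f"error rate {error_rate:.0%}, {n} tests: majority correct {y:.1%} of the time")
```

Under these assumptions, repeating a test only helps when a single test is right more often than not. At a 70% error rate the majority vote gets worse as tests are added, and if the errors are correlated between repeat tests the real benefit will be smaller than this sketch suggests.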

This is exactly what Bayes' theorem and likelihood ratios are for, but I don't think you have all the information you need to answer your questions. To know the chances of a *particular* test result being a true positive or true negative, you need to know not only the sensitivity and specificity of the test but also the base rate in the population, and, if you want to be even more accurate, more information about the particular person you are discussing. If Person A is a prison doctor living in New York City (high base rate, high-risk individual), a positive antibody test is more likely to be correct than for Person B, who lives in a low-base-rate city and has low exposure, even if they are taking the same test with the same sensitivity and specificity. Counterintuitively to many people, if the base rate is very low, the probability of a positive test being a true positive is also very low, even if you have a very sensitive and specific test. Conversely, if base rates are low, a negative test is more likely to be a true negative.

Adding multiple tests to this Bayesian equation brings in a number of complications. First, your question presupposes that the antibody tests are independent events. But that's not necessarily true. A person who has antibodies that aren't detected by one test (false negative) may be more likely to have those antibodies fail to be detected in a second or subsequent test, such that taking additional tests won't change your posterior odds (the odds of having the disease post-test) much (i.e., it's not telling you any additional information). Or perhaps the tests all fail for the same reasons under similar scenarios, etc. If you can find out that the tests are independent, then it's just a matter of calculating a likelihood ratio for each test and multiplying them together, then multiplying by the prior odds (the odds of having the disease pre-test). If they are dependent in some way, the math gets more complicated, but the general through line is the same. There's a ton on Bayes online, not all of it good, but this article has a good example of this kind of medical math.
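The odds-and-likelihood-ratio calculation for independent tests can be sketched roughly as follows. Everything numeric here is a placeholder, not a measured value: the sensitivity of 0.70, the specificity of 0.95, and the two prior probabilities are made up to show the mechanics, and the function names are invented for this example. The sketch also assumes the tests really are independent, which the comment above warns may not hold.

```python
def likelihood_ratio(sensitivity: float, specificity: float, positive: bool) -> float:
    """Likelihood ratio for one test result (assumes 0 < sensitivity, specificity < 1)."""
    if positive:
        return sensitivity / (1 - specificity)  # LR+ : how much a positive raises the odds
    return (1 - sensitivity) / specificity      # LR- : how much a negative lowers the odds


def posterior_probability(prior: float, results: list[bool],
                          sensitivity: float, specificity: float) -> float:
    """Update a pre-test probability with a series of independent test results."""
    odds = prior / (1 - prior)  # convert the pre-test probability to odds
    for positive in results:
        odds *= likelihood_ratio(sensitivity, specificity, positive)
    return odds / (1 + odds)    # convert the post-test odds back to a probability


# Hypothetical numbers: a high prior (say, a clinical diagnosis) and a low base rate.
for label, prior in (("high prior", 0.80), ("low base rate", 0.02)):
    for results in ([True], [False], [False, False, False]):
        p = posterior_probability(prior, results, sensitivity=0.70, specificity=0.95)
        print(f"{label}, results {results}: post-test probability {p:.1%}")
```

With these placeholder numbers, someone with a high pre-test probability still has a substantial chance of having been infected after one negative result, which is consistent with the experience described in the question, although real values depend on the actual test and timing.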

posted by MeadowlarkMaude at 11:54 AM on June 16, 2020 [12 favorites]

3blue1brown gives an explanation of Bayes theorem if you want something a bit more visual.

MeadowlarkMaude is right on point. The usual example for explaining medical Bayes is breast cancer, or some other medical problem that we've studied long enough to know the base rate for entire populations and for groups with different risk factors. That's needed to fill in the whole table of outcomes: true negative, false positive, false negative, true positive.

posted by zengargoyle at 3:56 PM on June 16, 2020

Ah, I knew there was one about medical screening: Can We Trust Maths? - with Kit Yates. The first part is medical and screening for breast cancer.

posted by zengargoyle at 4:27 PM on June 16, 2020

The FDA actually provides a handy-dandy calculator on their page to estimate the Positive Predictive Value and Negative Predictive Value given the sensitivity and specificity of a given test. On this page, scroll until you come to this paragraph:

*Always refer to the complete instructions for use to put these estimates into the proper context and to understand how to use and interpret these tests. FDA also is providing a calculator that will allow users to see the estimated performance of a single test or two independent tests based on their performance characteristics and the estimated prevalence of SARS-CoV-2 antibodies in the target population.*

There should be a link to an Excel spreadsheet from the word calculator which will do what you are asking for (and yes, it does look like an intern put it together in an afternoon's work).

posted by peacheater at 6:34 PM on June 16, 2020 [2 favorites]

Yes, like peacheater mentions, you'll want to calculate the positive predictive value and the negative predictive value. These ask the questions:

1. If I test positive, what is the probability that I truly have the disease?

2. If I test negative, what is the probability that I truly don't have the disease?
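For reference, these two questions correspond to the standard Bayes' theorem expressions below. The symbols are introduced here for illustration and do not appear in the thread: $p$ is the prevalence (base rate), $Se$ the sensitivity, and $Sp$ the specificity.

```latex
\mathrm{PPV} = P(\text{disease} \mid +) = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}
\qquad
\mathrm{NPV} = P(\text{no disease} \mid -) = \frac{Sp \cdot (1 - p)}{Sp \cdot (1 - p) + (1 - Se) \cdot p}
```

As a purely hypothetical example, with $p = 0.05$, $Se = 0.90$, and $Sp = 0.95$, this gives $\mathrm{PPV} = 0.045 / 0.0925 \approx 0.49$ and $\mathrm{NPV} = 0.9025 / 0.9075 \approx 0.99$, which shows how a low base rate drags down the positive predictive value even for a fairly accurate test.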

posted by stripesandplaid at 5:55 PM on June 17, 2020

But as the wise ones say: Garbage in, garbage out.

posted by basalganglia at 11:30 AM on June 16, 2020 [2 favorites]