Help the stats-challenged understand an odds ratio
April 19, 2012 5:09 PM

Statistics make my head go wobbly. But I need to understand something about odds ratios. Actually, I don't need to understand -- I just need to make sure I'm not screwing up this particular figure. Please help!

Please treat me like I'm very, very dumb, because when it comes to statistics, I am.

I'm fact-checking an article that quotes a study which says a certain behavior is 124 percent more likely to occur in group A than group B. The copyeditor argued that this isn't right because you can't be more than 100 percent more likely to do anything -- once you are at 100 percent, you've done it.

I have nothing but respect for his copyediting skills, but I suspect he's wrong here. I don't quite have the language to express it, however.

Well...the study does in fact use the language "124 percent more likely." Going to the appendix of the study, I see that the odds ratio for this behavior, group A vs. group B, is 2.237. The percentage is 123.7%.

In my math-less life, I have never before encountered the idea of odds ratio. I did a little Googling, and I get the basic idea, but if I read more than a paragraph, my head starts to swim. Also, this warning from Language Log about using odds ratios in journalism makes me nervous -- but I can't really understand it either.

So, for people who actually understand all this, my questions:

1. Is there actually anything wrong (stats-wise or language-wise) with using the phrase "124 percent more likely"? If there isn't, how do I explain that to the copyeditor?
2. By repeating this figure, will I be making any of the errors that the Language Log post warns about?
3. What's the difference between "124% more likely" and "124% as likely"?
4. Is there a reason to use a percentage? Why can't one say "x times more likely"? And, um, the x here would be what? 2.237?

Please feel free to explain the basic ideas behind all this, but really the specific case here is what I'm concerned with.
posted by neroli to Writing & Language (15 answers total) 5 users marked this as a favorite
 
I would say that, if A has a 10% chance of happening and B has a 22.4% chance of happening, that B is 124% more likely than A. That is, the probability of B is 2.24 times the probability of A.
posted by monkeymadness at 5:23 PM on April 19, 2012


Best answer: I have no opinion on whether "124 percent" applies to the particular case that you're interested in. However:

(1) In and of itself, there's nothing wrong with "124% more likely", or with "(something over 100%) more likely" in general. For example, if you roll a standard, fair, six-sided die, you have about a 17% chance of rolling a one, and you have a 50% chance of rolling a one, two, or three. Rolling a one, two, or three is three times as likely, and two times more likely, than rolling a one. Two times more likely is 200% more likely.

Your copyeditor doesn't understand the "can't be over 100%" thought that is in his head, and is applying it to a situation to which it is inapplicable.

(2) I just looked at the first bit of the article, so I don't know, but you definitely wouldn't be making the error described in the first bit, which is more along these lines:

(A) You have a 1 in 6 chance of rolling a 1, and a 5 in 6 chance of rolling something else, so you'll likely roll 20% as many ones as you will "something elses". (this is true)

(B) You have a 2 in 6 chance of rolling a 1 or 2, and a 4 in 6 chance of rolling something else, so you'll likely roll 50% as many ones and twos as you will "something elses". (this is also true)

(C) Take that 20% and divide by that 50% for no apparent reason, and announce to the world that you're 40% as likely to roll a one as you are to roll a one or a two. (this is completely mathematically unsound, and the conclusion it gives is totally false)
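The three steps above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the `odds` helper and the variable names are made up for illustration and come from neither the thread nor the Language Log post.

```python
from fractions import Fraction

def odds(p):
    """Odds of an event: chance it happens divided by chance it doesn't."""
    return p / (1 - p)

p_one = Fraction(1, 6)         # probability of rolling a 1
p_one_or_two = Fraction(2, 6)  # probability of rolling a 1 or a 2

odds_one = odds(p_one)                # step (A): 1/5, i.e. 20%
odds_one_or_two = odds(p_one_or_two)  # step (B): 1/2, i.e. 50%

odds_ratio = odds_one / odds_one_or_two   # step (C): 2/5, i.e. 40%
probability_ratio = p_one / p_one_or_two  # direct comparison: 1/2, i.e. 50%

print(f"ratio of odds:          {float(odds_ratio):.0%}")
print(f"ratio of probabilities: {float(probability_ratio):.0%}")
```

Steps (A) and (B) are each fine on their own. The trouble is only in reading the 40% from step (C) as if it compared probabilities, where the right figure is 50%.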

(3) If I were using those terms, "124% more likely" would be equivalent to "224% as likely". For example, something that is 100% more likely is twice as likely (i.e. 200% as likely). Another example: something that is exactly as likely (i.e. 0% more likely) is 100% as likely.

(4) No, there's no reason you'd have to use a percentage. One can say "x times more likely". In this case x would be 1.237, not 2.237; 2.237 would be "x times as likely".
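For anyone who wants to see the conversions in (3) and (4) side by side, here is a rough Python sketch. The function names are invented for this example, and the "more" vs. "as" convention is the one used in this answer, which not every reader shares.

```python
def percent_as_likely(percent_more):
    """'X% more likely' expressed as 'Y% as likely'."""
    return 100 + percent_more

def times_as_likely(percent_more):
    """'X% more likely' expressed as 'x times as likely'."""
    return 1 + percent_more / 100

def times_more_likely(percent_more):
    """'X% more likely' expressed as 'x times more likely' (this answer's convention)."""
    return percent_more / 100

# 123.7 is the percentage from the study's appendix; 0 and 100 are the
# "exactly as likely" and "twice as likely" examples from point (3).
for pct in (0, 100, 123.7):
    print(
        f"{pct}% more likely = {percent_as_likely(pct):.1f}% as likely"
        f" = {times_as_likely(pct):.3f} times as likely"
        f" = {times_more_likely(pct):.3f} times more likely"
    )
```

The study's 2.237 shows up as the "times as likely" figure, and 1.237 as the "times more likely" figure.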
posted by Flunkie at 5:37 PM on April 19, 2012


Best answer: 1. Is there actually anything wrong (stats-wise or language-wise) with using the phrase "124 percent more likely"? If there isn't, how do I explain that to the copyeditor?

No, there isn't anything wrong with that, absent any other context.

An illustrative example could be cost: If an iPhone costs $200, and an iPad costs $600, then the price of the iPad is 300% of the price of the iPhone. (i.e., price of iPad = price of iPhone x 3).
If the iPad were twice as expensive as the iPhone, it'd be called 100% more expensive (=one iPhone more in cost). Since the iPad is triple the iPhone, it's called 200% more expensive (=two iPhones more in cost).

2. By repeating this figure, will I be making any of the errors that the Language Log post warns about?

From what you mentioned about the study appendix, it does sound like this error is being made. If you have access to the study, you can probably figure out the ratio of rates and report that, as the LL post encourages.

(note: because, as LL discussed, ratios of odds are so not helpful, the rest of my answer answers your other questions assuming that "124% more" is a ratio of rates)

3. What's the difference between "124% more likely" and "124% as likely"?

iPad is 2x more expensive than iPhone. iPad is 3x as expensive as iPhone.

Also, "a Nokia phone is 10% as expensive as an iPhone" means that it costs $20: 10% of $200. (You can't really use "x% less expensive than" without lots of extra context -- "as" can be used, though.)

4. Is there a reason to use a percentage? Why can't one say "x times more likely"? And, um, the x here would be what? 2.237?

Depending on the context, either of these could be the better choice. For instance, 5% more likely is easier to understand than 1.05x as likely (or, worse, .05x more likely). On the other hand, as above, it is easier to say that an iPad is 3x as expensive as an iPhone (or 2x more expensive) rather than 200% more expensive.

Here, the x would be 1.237. If you said "x times as likely", the x would be 2.237.
posted by milestogo at 5:37 PM on April 19, 2012 [1 favorite]


From what you mentioned about the study appendix, it does sound like this error is being made.
Sorry - in my answer I missed that you said "odds ratio". It may or may not be the error mentioned in the Language Log article, depending upon whether they're/you're using "odds ratio" in the same manner as that article is.
posted by Flunkie at 5:41 PM on April 19, 2012


Response by poster: Great, this is really helpful so far.

Part of my problem with the Language Log post is that I don't really understand the idea of "ratio of rates" as opposed to "ratio of odds." Still, I'm pretty sure I'm not dealing with anything I would think of as rates. The stat in question is the equivalent of "people who went to grad school are 124% more likely to own wine glasses than people who never finished college." (But obviously not actually that.)
posted by neroli at 6:07 PM on April 19, 2012


Best answer:
The stat in question is the equivalent of "people who went to grad school are 124% more likely to own wine glasses than people who never finished college."
If their numbers leading up to that are something like:
  • 45% of people who went to grad school own wine glasses
  • 20% of people who never finished college own wine glasses
  • 45% is about 2.2 times 20%
... then that "1.24" (that is, "124% more likely") is right (or, right unless they did something else wrong).

On the other hand, if their numbers leading up to that are something like:
  • 45% of people who went to grad school own wine glasses
  • 55% of people who went to grad school do not own wine glasses
  • 45% is about 81% of 55%
  • 27% of people who never finished college own wine glasses
  • 73% of people who never finished college do not own wine glasses
  • 27% is about 37% of 73%
  • 81% is about 2.2 times 27%
... then they're making a mistake like the one described in the Language Log article.
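To make the contrast between the two scenarios concrete, here is a small Python sketch using the same made-up wine-glass percentages. The `odds` helper and variable names are illustrative only; nothing here comes from the actual study.

```python
def odds(p):
    """Odds: share who do, divided by share who don't."""
    return p / (1 - p)

# First scenario: a ratio of rates.
rate_ratio = 0.45 / 0.20  # 2.25, "about 2.2 times"

# Second scenario: a ratio of odds.
odds_grad = odds(0.45)        # 0.45 / 0.55, roughly 0.82
odds_no_college = odds(0.27)  # 0.27 / 0.73, roughly 0.37
odds_ratio = odds_grad / odds_no_college  # roughly 2.21

# What the ratio of rates would be for the second scenario's own numbers.
rate_ratio_second = 0.45 / 0.27  # roughly 1.67

print(f"scenario 1, ratio of rates: {rate_ratio:.2f}")
print(f"scenario 2, ratio of odds:  {odds_ratio:.2f}")
print(f"scenario 2, ratio of rates: {rate_ratio_second:.2f}")
```

Both scenarios produce a figure of about 2.2, but in the second one the rates themselves differ by a factor of only about 1.67, so "124% more likely" would overstate the gap.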
posted by Flunkie at 6:22 PM on April 19, 2012


Ugh, and where I said (in the last step of the "they're wrong" example) "81% is about 2.2 times 27%", I meant "81% is about 2.2 times 37%".
posted by Flunkie at 6:24 PM on April 19, 2012


Response by poster: Wow, I can't thank you enough for taking the time to break this down for me. The light is dawning.

The study doesn't actually include raw numbers, so I can't confirm how they got their ratio, but I think this is the kind of thing that's on them. (The source is pretty solid.)

Just for my own curiosity, though...there's something I still don't get about your "wrong" example: the 81% and the 37% -- why would anyone calculate those? I mean, what are these figures measuring exactly? Glass-ownership parity?
posted by neroli at 7:00 PM on April 19, 2012


Best answer: The stat in question is the equivalent of "people who went to grad school are 124% more likely to own wine glasses than people who never finished college."

While there's nothing inherently wrong with odds ratios and expressions like "124% more likely", what you and your copyeditor have hit on is that they can be quite confusing. Doubly so given that they're often reported incorrectly, so you can never be sure what a particular piece actually means.

You(r authors) can completely avoid the whole problem by saying "10 out of every hundred people who didn't finish college own wine glasses, but 24 out of every hundred grad-school graduates do," or whatever the actual percentages are. No ambiguity there.
posted by ROU_Xenophobe at 7:06 PM on April 19, 2012


Best answer:
Just for my own curiosity, though...there's something I still don't get about your "wrong" example: the 81% and the 37% -- why would anyone calculate those?
The 81% and 37% examples could conceivably be useful things to know. For example, it's the same idea as saying "you're half as likely to roll either a 1 or a 2 as you are to roll a 3, 4, 5, or 6".
posted by Flunkie at 7:08 PM on April 19, 2012


Just for my own curiosity, though...there's something I still don't get about your "wrong" example: the 81% and the 37% -- why would anyone calculate those?

Adding to what Flunkie said... the "odds" (the 81% and 37% figures in that example) and "odds ratio" (the ratio of those two) are reported commonly in studies that use logistic regression. That's because the dependent variable in logistic regression is a log-odds, and the model coefficients are log-odds-ratios. So yes, relative risks are more intuitive to non-statisticians, but reporting odds and odds ratios is the standard in certain kinds of studies.
posted by TSGlenn at 7:37 PM on April 19, 2012


If the iPad were twice as expensive as the iPhone, it'd be called 100% more expensive (=one iPhone more in cost). Since the iPad is triple the iPhone, it's called 200% more expensive (=two iPhones more in cost).

This is a distinction that many people will miss or misinterpret, and assume that when you say something is 200% more expensive you actually mean that it's twice as expensive. So if you're going to use this kind of language it's far clearer to stick with the unambiguous "as" and avoid "more".

When you're quoting results from published research, though, you really do need to word it the same way as the research itself does unless you are, and have good reason to be, absolutely certain that what your new wording says is what the wording in the research means.
posted by flabdablet at 8:58 PM on April 19, 2012


Best answer: In the hopes that reading more than one explanation of this will help it sink in (it usually does for me), here goes my attempt at ratios of odds vs ratios of rates...

First you have to understand odds. When you talk about odds, you're talking about the chance that something WILL happen divided by the chance that it will NOT happen. So for instance, to use the numbers from Flunkie's example above, the chance a grad student owns a wine glass is 45%, so the chance a grad student does not own a wine glass is 55%, so the odds of a grad student owning a wine glass are 0.45/0.55 = 0.818.

An odds ratio, then, is the odds of a grad student owning a wine glass divided by the odds of a non-grad-student owning a wine glass. If 20% of non-grad-students own wine glasses, the odds of a non-grad-student owning a wine glass are 0.2/0.8=0.25. So the odds ratio is .818/.25 = 3.272. This means that the odds of a grad student owning a wine glass are 3.272 times greater than the odds of a non-grad-student owning a wine glass.

Crucially, it does NOT mean that the rate at which grad students own wine glasses (45%) is 3.272 times greater than the rate at which non-grad-students own wine glasses (20%). We started from knowing the rates in this case, so it's clear to us how wrong that interpretation would be (45 is not 3 times greater than 20). Making that very wrong interpretation, though, is the mistake that Language Log is calling out.

The odds ratio is a very useful thing to know if you're comfortable around statistics. But it turns out that people don't naturally think about things in terms of odds, they think about them in terms of rates. And even correctly stating that you're talking about odds does not prevent people from interpreting your statement as being about rates, because it's pretty normal to not actually know what odds are.

I understood from your question that you're editing an article meant for a general audience, which is reporting on a study published in a scientific article. It sounded a bit to me like Flunkie was suggesting that you might have to doubt the credibility of the statements in the original research paper. What Language Log is cautioning about in the post you linked is the case where a journalist looks at a published scientific study, sees the odds ratio, and puts the odds ratio into their piece with wording like "3.272 times more likely than". That wording will be interpreted by all normal readers as being a statement about a ratio of rates: instead of the rates being 45% and 20%, the readers will be imagining something like 65% and 20%. This is what they mean by "Thou shalt not report odds ratios" -- "If thou art a journalist reporting on a scientific study, thou shalt not say anything about the odds ratios you see in the study because there is no good way of getting a general audience to interpret them correctly." Language Log is not telling scientists not to report odds ratios in their research papers; odds ratios are a pretty basic statistical concept and if you have to doubt that the researchers got them right, you have way bigger things to doubt as well.
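The misreading described above can be put into numbers with a short Python sketch. It reuses the 45% and 20% wine-glass figures from this answer; the variable names are just for illustration.

```python
def odds(p):
    """Chance it happens divided by chance it does not."""
    return p / (1 - p)

rate_grad = 0.45      # grad students who own wine glasses
rate_non_grad = 0.20  # non-grad-students who own wine glasses

# About 3.27 (the post rounds the odds to 0.818 first and gets 3.272).
odds_ratio = odds(rate_grad) / odds(rate_non_grad)

# What the figures actually say about rates.
rate_ratio = rate_grad / rate_non_grad  # 2.25

# What a reader pictures on seeing "3.27 times more likely":
# the odds ratio applied directly to the 20% rate.
imagined_rate = rate_non_grad * odds_ratio  # roughly 0.65

print(f"odds ratio:             {odds_ratio:.3f}")
print(f"ratio of rates:         {rate_ratio:.2f}")
print(f"actual grad rate:       {rate_grad:.0%}")
print(f"rate the reader infers: {imagined_rate:.0%}")
```

The gap between the actual 45% and the inferred 65% or so is the whole problem with dropping an odds ratio into "times more likely" wording.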

Also, please please please don't let that copyeditor impose his innumeracy on this article! Yikes. Yes, to claim that something is 124% more likely, as others have explained very competently, is an entirely reasonable statement. Talk about a little bit of knowledge being a dangerous thing...
posted by ootandaboot at 9:02 PM on April 19, 2012


TSGlenn is right about the odds ratio. If you need to fit a regression model where the outcome must be between zero and one, you need a logistic regression, which models the log of the odds. For example, if I want to know what proportion of diabetic patients got their cholesterol checked last year, if I use linear regression, the model would let that proportion be 10, or -3, which is clearly incorrect. (Here is a case where a percentage truly cannot be more than 100%). So instead, you fit a model for the log odds, which can be any number, positive or negative, while the proportion it translates back to always stays between zero and one, and all is well.
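A rough Python sketch of the log-odds idea, for the curious. It uses only the standard `math` module; the function names are made up for this example, and the 45% and 20% figures are borrowed from the wine-glass example earlier in the thread, not from any real model.

```python
import math

def log_odds(p):
    """Log of the odds for a probability p strictly between 0 and 1."""
    return math.log(p / (1 - p))

def probability_from_log_odds(z):
    """Inverse of log_odds: maps any real number back into (0, 1)."""
    return 1 / (1 + math.exp(-z))

# However extreme the log-odds, the probability stays between 0 and 1.
for z in (-10, -3, 0, 3, 10):
    print(f"log-odds {z:>3} -> probability {probability_from_log_odds(z):.4f}")

# A difference in log-odds between two groups is a log-odds-ratio;
# exponentiating it gives the odds ratio that studies report.
log_odds_ratio = log_odds(0.45) - log_odds(0.20)
print(f"odds ratio: {math.exp(log_odds_ratio):.3f}")
```

This is why a study built on logistic regression naturally ends up with an odds ratio in its appendix: it is what the model's coefficients turn into when exponentiated.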

It is FAR less intuitive. But there are some study designs where you CANNOT determine the relative risk, and you NEED to use the odds ratio (which approximates other measures of association). Look up "case-control study" and you'll see what I mean.

I can't tell whether your writers designed their study correctly or whether they are reporting the right number. My only point is that sometimes a study MUST use an odds ratio, and sometimes it is impossible to estimate the relative risk, though it is a hell of a lot harder for the average person to interpret.
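As a hedged illustration of how far an odds ratio can sit from the relative risk, the Python sketch below converts the study's odds ratio of 2.237 into a ratio of rates under several baseline rates for group B. The baseline rates are pure assumptions, since the study reportedly gives no raw numbers, and the function name is invented for this example. The conversion follows from the definition of odds and needs the baseline rate, which is exactly what some study designs cannot supply.

```python
def relative_risk_from_odds_ratio(odds_ratio, baseline_rate):
    """Ratio of rates implied by an odds ratio, given the comparison group's rate.

    Derived from odds = p / (1 - p): solve for the other group's rate,
    then divide by the baseline rate.
    """
    return odds_ratio / (1 - baseline_rate + baseline_rate * odds_ratio)

STUDY_ODDS_RATIO = 2.237  # from the study's appendix, as quoted in the question

# Hypothetical baseline rates for group B; none of these come from the study.
for baseline in (0.01, 0.10, 0.27, 0.50):
    rr = relative_risk_from_odds_ratio(STUDY_ODDS_RATIO, baseline)
    print(f"baseline {baseline:.0%}: {rr:.2f} times as likely ({rr - 1:.0%} more likely)")
```

When the behavior is rare in group B, the two measures nearly coincide and "124 percent more likely" is a fair reading; the more common the behavior, the more that phrase overstates the difference in rates.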
posted by teragram at 3:20 AM on April 20, 2012


The stat in question is the equivalent of "people who went to grad school are 124% more likely to own wine glasses than people who never finished college."

My only comment here is that it's easy to end up phrasing things in a way that can mislead people, even people who understand stats.

I would work on getting clear what exactly the facts are, and how to most simply and clearly express them. That might be saying outright something like: "While 88% of people who went to grad school own wine glasses, only 40% of those who never finished college do." i.e. Skip all the ratio and rate comparison stuff and lay out the underlying basic facts in a way that anyone can grasp. And incidentally that's probably a lot more useful to the reader, because if they care about who owns wine glasses at all, they probably care quite a lot whether it's 88% of people with grad degrees that have wine glasses or just 0.088% of them.
posted by philipy at 12:11 PM on April 20, 2012

