Is there a name for this logical fallacy? It has to do with statistics.
March 12, 2013 8:44 AM

The fallacy is assuming that statistical information about a thing is more relevant in dealing with a particular instance of that thing than available first-hand data.

The basic form of the fallacy is this:

Premise: I am in situation X.
Premise: Statistics show that X most commonly follows the pattern of situation Y.
Premise: I should base my actions on the available statistics.
Conclusion: I will react to situation X according to my rules for reacting to situation Y.

(Forgive me if I put that poorly; my formal logic is rusty.)

An example:
Jack is walking in the park when he sees two people walking dogs. One is walking a golden retriever, and the other is walking a pit bull. The golden retriever is straining at its owner's leash, barking at everyone who passes, growling, and baring its teeth. The pit bull is walking calmly by its owner's side and letting people pet it without complaint. Jack has to pass by one of the dogs to continue. He recalls reading a very reliable and well-sourced study which said that pit bulls are 35% more likely than golden retrievers to be aggressive and dangerous towards strangers; therefore, in order to stay safe, Jack chooses to walk past the golden retriever and avoid the pit bull.

Obviously, statistics have their use, and if Jack did not have first-hand data, his choice would have been logical. If, for instance, he didn't see the dogs, but was simply told by a friend that one path would lead past a golden retriever and the other path would lead past a pit bull, it would be logical to conclude that, given the limited information, there was a higher probability of meeting an unfriendly dog if he chose to walk past the pit bull. But Jack should have realized that he had access to first-hand information (his observation of the dogs' behavior) that was more likely to represent the temperaments of those particular two dogs than the statistical average was.

Is there a name for this fallacy? Or a way of expressing it mathematically? The closest I can find is the ludic fallacy, but I suspect that there are better ways to express it.
posted by CustooFintel to Science & Nature (17 answers total) 3 users marked this as a favorite
 
Overfitting?

This is an iffy one.
posted by oceanjesse at 8:52 AM on March 12, 2013


The closest thing I can think of is that this is a failure to condition on total information, but that's a Bayesian way of understanding what's going wrong here.
posted by Jonathan Livengood at 8:55 AM on March 12, 2013 [3 favorites]


The problem you're running into is that formal logic doesn't usually cover uncertainty and statistics. Bayesian logic might be more appropriate as it expresses things in terms of prior probabilities.
posted by atrazine at 8:56 AM on March 12, 2013 [2 favorites]


Actually - what you are suggesting may itself be the fallacy, depending on the math.

Probability-wise, you are looking at two things: the likelihood of a pit bull attacking and the likelihood of a golden retriever attacking, combined with the probability of a barking, snarling dog attacking. If the probability of that last item is sufficiently high, statistics will still tell you to avoid the golden, even if goldens in general are less likely to attack than pit bulls.
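
Here's a minimal sketch of that combination in Python - every number is made up for illustration (the 5% golden base rate and the snarl likelihoods are assumptions, and I've applied the study's "35% more likely" as a simple multiplier):

    # Illustrative Bayes update -- every number here is an assumption.
    def posterior_aggressive(prior, p_obs_given_aggr, p_obs_given_not):
        """P(aggressive | observed behavior) via Bayes' rule."""
        num = p_obs_given_aggr * prior
        return num / (num + p_obs_given_not * (1 - prior))

    p_aggr_golden = 0.05        # assumed base rate of aggression for goldens
    p_aggr_pit = 0.05 * 1.35    # the study's "35% more likely," as a multiplier

    p_snarl_given_aggr = 0.60   # aggressive dogs often snarl (assumed)
    p_snarl_given_not = 0.05    # friendly dogs rarely snarl (assumed)

    snarling_golden = posterior_aggressive(
        p_aggr_golden, p_snarl_given_aggr, p_snarl_given_not)
    calm_pit = posterior_aggressive(
        p_aggr_pit, 1 - p_snarl_given_aggr, 1 - p_snarl_given_not)

    print(f"P(aggressive | snarling golden) = {snarling_golden:.2f}")  # ~0.39
    print(f"P(aggressive | calm pit bull)   = {calm_pit:.2f}")         # ~0.03

With these invented numbers, the first-hand observation swamps the breed difference: the posterior for the snarling golden comes out more than ten times higher than for the calm pit bull.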

(ETA: This is the Bayes stuff everyone up top is talking about.)
posted by JPD at 9:00 AM on March 12, 2013 [2 favorites]


ETA: and even if the math still told you to walk by the golden and it still attacked you, well - statistics provide a prediction, but there is always more than one outcome. That doesn't mean it's a logical fallacy.
posted by JPD at 9:01 AM on March 12, 2013 [1 favorite]


I am neither a statistician nor a person trained in logic, but what strikes me is that Jack might be making the right decision if he were going to run into this situation enough times for his own sample to be statistically relevant. In an individual case, the mistake is in using the statistics at all. Only in the absence of additional information should he rely on the statistics. In this case there is additional information, making reliance on the stats a poor application of law-of-large-numbers statistics. I think the term would be "inappropriately applied statistics."
posted by JohnnyGunn at 9:05 AM on March 12, 2013


I think the cure for a fallacy involving statistics is - more statistical terms! I don't think it's a logical fallacy, per se, or at least I wouldn't call it that. I'd say it's an error in interpreting stats. I'd express it as: "Jack is responding to the probabilities for the population at large, not the set that's in front of him." Even in formal analysis of a data set (as opposed to this off-the-cuff example, where frankly statistics aren't much help), you're obviously supposed to use the most specific relevant data. If I'm a TV exec in Memphis, or an advertiser in Memphis, I want to know the ratings for the NBC Nightly News IN MEMPHIS if I can get them, not nationwide. Or in Philly.

If there's a statistic for what % of dogs who are displaying very aggressive behavior go on to bite someone in the next ten minutes, I bet it's higher than the % of pit bulls who bite someone in any given ten minutes. So both statistically and in terms of common sense, Jack is an idiot.

And people really do this all the time. If I'm an ad sales guy in Memphis and the NBC ratings are pants, I'll tout the national ratings. If local is better than national, of course I'll talk about that instead.
posted by randomkeystrike at 9:07 AM on March 12, 2013


Best answer: I think you're talking about the ecological fallacy: deriving conclusions about an individual based on characteristics of the group to which he or she (or it!) belongs.
posted by iminurmefi at 9:16 AM on March 12, 2013 [9 favorites]


Yep, iminurmefi nailed it. I was thinking sets vs. population, but in your example that's perfect, and also a thing people do a lot.
posted by randomkeystrike at 9:19 AM on March 12, 2013


Also, fallacy of division.
posted by meese at 9:26 AM on March 12, 2013 [1 favorite]


Since you asked for a name, I'd say Jack is failing to update his Bayesian priors. Your dog scenario is such a good example of this I hope you don't mind if I steal it as an example of how prior knowledge (distribution of aggressiveness by breed) can be combined with knowledge of a particular sample (signals of aggressiveness in individuals) to make a decision.
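
To make the "combine and decide" step concrete, here's a hypothetical sketch built on the posteriors from the made-up numbers upthread (the bite-cost figure is invented too):

    # Hypothetical decision rule: walk past whichever dog has the lower
    # expected harm. The posteriors echo the illustrative sketch upthread;
    # BITE_COST is an arbitrary invented figure.
    BITE_COST = 100.0  # arbitrary units of harm from a bite

    posterior = {
        "snarling golden retriever": 0.39,  # P(aggressive | behavior, breed)
        "calm pit bull": 0.03,
    }

    expected_harm = {dog: p * BITE_COST for dog, p in posterior.items()}
    safest = min(expected_harm, key=expected_harm.get)
    print(f"Walk past the {safest}")  # -> calm pit bull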
posted by drdanger at 9:29 AM on March 12, 2013


I think you're talking about the ecological fallacy: deriving conclusions about an individual based on characteristics of the group to which he or she (or it!) belongs.

But wait a second - if all you know about an individual is that they are part of a population - nothing else at all - then assuming they resemble the mean member of the population is actually the correct assumption to make. It is quite likely a poor assumption, but given the information you have, it is the best one you can make.

To take another example: if you have two math classes with different mean scores (a difference large enough to be significant), the correct assumption is that a student in the higher-scoring class is likely to have a higher score than a student in the lower-scoring class. There is a meaningful likelihood that this is incorrect, but again, it is the best estimate you can make.
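
A quick simulation of that point, with invented class means and spreads:

    # Knowing only which class a student is in, guessing from the class
    # mean beats chance -- but any individual pair can buck the trend.
    # The means (80 vs 70) and spread (10) are invented for illustration.
    import random

    random.seed(0)
    class_a = [random.gauss(80, 10) for _ in range(10_000)]  # higher-mean class
    class_b = [random.gauss(70, 10) for _ in range(10_000)]  # lower-mean class

    a_wins = sum(a > b for a, b in zip(class_a, class_b))
    print(f"Class A student outscores class B student "
          f"{a_wins / len(class_a):.0%} of the time")  # ~76%

The group-based guess wins about three times out of four: better than chance, but it loses instantly to a direct look at the two students' actual scores.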
posted by JPD at 9:45 AM on March 12, 2013 [1 favorite]


JPD is right that this is more than the ecological fallacy. Jack isn't just ignoring the variability in populations by assuming that all golden retrievers are the same. He's also ignoring his personal measurement that this golden retriever differs from the mean in a particular (dangerous) direction.
posted by drdanger at 10:16 AM on March 12, 2013 [2 favorites]


In these kinds of decision-making situations, I'm informed by Dave Snowden and Cynthia Kurtz's Cynefin framework.

I have sometimes described this fallacy as the "fallacy of retroactive coherence." It is similar to why investment pitches contain the words "past performance is no guarantee of future results." I use it a lot to help people avoid a decision-making error whereby they confuse a known past with an unknowable future. In other words, applying technical fixes to complex problems can get you in trouble, especially if you assume that having data puts you on a safe platform.

If you know that 5% of retrievers attack, 65% of snarling dogs attack, and 34% of snarling retrievers attack, that still tells you nothing certain about your current circumstance, because the retriever may have rabies, be hungry, and have just been kicked by a guy wearing your hat. Then again, the retriever may also be restrained by a radio collar. Having data is helpful, but in a complex system you must use a different decision-making model.

Emergent properties make reliable prediction impossible, and the more specific your situation gets, the worse this becomes. The way to move in this situation is to carefully probe the system and see what it does, without risking a catastrophic failure. When you discover a useful outcome ("dog is not biting"), continue to carefully pursue that strategy. If the situation tips into chaos ("dog is now chasing me"), the rules of the game change and you are in a different decision-making realm, one where you must act by doing ANYTHING until the situation stabilizes.

Many people make life-and-death decisions like this every day, and decision-making based on this fallacy can get them killed. Police officers, soldiers, and firefighters all fall into this category.
posted by salishsea at 10:39 AM on March 12, 2013 [1 favorite]


Wikipedia has a list of cognitive biases here.

The closest one on the list is anchoring -- the tendency to rely too heavily, or "anchor," on a past reference or on one trait or piece of information when making decisions -- although it seems to me that you would like to highlight the fact that the anchor is to *statistical* information.

There's a business cliche "you can't manage what you don't measure." Conversely, managers may be too inclined to focus on what they can measure, neglecting more important qualities that are harder to quantify. You see this issue come up in education debates about student and teacher performance, for example. I'm not aware of a name for this tendency however -- I might call it "quantification bias."

Of course your example assumes that personal observation is more reliable than statistical information. There are times when this assumption is itself a fallacy ("mere exposure effect" -- the tendency to express undue liking for things merely because of familiarity with them; or "neglect of probability").
posted by leopard at 10:46 AM on March 12, 2013


Closely related, if not exactly what you're after: The Median Isn't the Message by Stephen Jay Gould (concerning his own probability of surviving cancer).
posted by pont at 11:12 AM on March 12, 2013


This reminds me of one criticism of Evidence Based Medicine. Here one is applying information about a population while neglecting information about the specific instance at hand.
posted by neutralmojo at 3:17 PM on March 12, 2013


This thread is closed to new comments.