What is the name of this kind of statistic?
September 1, 2022 8:20 AM

There's a kind of data I have seen and would like to see more examples of, but I don't know how to search for it because I don't know what it's called, assuming it is called anything. Basically, it is a way of determining the most popular distinctive thing in a given place/population.

So, as an example, I recall seeing a map of each US state with the most popular fast food chain in that state. But it wasn't, like, every state is just McDonald's or Chick-fil-A. They had done some kind of analysis to figure out which states liked a specific chain that wasn't necessarily popular (or even available) nationally. Perhaps by determining which fast food chain had a popularity that was the furthest from its national popularity average or something? Or by picking their most popular chain that wasn't the most popular in any other state?

I recall seeing something similar for worldwide sports (and would love to find this one again) where if you take away things like soccer/football and cricket that are massively popular around the world, what are you left with as each nation's distinctive obsession sport?

Does this type of analysis have a name? Or is there another way to find charts or maps with data like this?
posted by jacquilynne to Grab Bag (9 answers total) 3 users marked this as a favorite
 
Maybe location quotient?
posted by catquas at 8:33 AM on September 1, 2022 [1 favorite]


I think what you'd want to know is if something overindexes with a group / population.

For instance, looking at the whole country, maybe there's an average of 1 Subway for every 1,000 people. But, oh, here's Oregon, they have 1 Subway for every 500 people! Oregon would overindex with Subways compared to the rest of the country.
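As a rough sketch of the arithmetic in that example, the index is just the local per-capita rate divided by the national per-capita rate. The function name and the raw counts below are made up for illustration; only the 1-per-1,000 and 1-per-500 rates come from the example.

```python
def overindex(local_count, local_population, national_count, national_population):
    """Return the local per-capita rate divided by the national per-capita rate."""
    local_rate = local_count / local_population
    national_rate = national_count / national_population
    return local_rate / national_rate


# Hypothetical counts chosen to match the example rates:
# 1 store per 1,000 people nationally, 1 store per 500 people locally.
index = overindex(
    local_count=2, local_population=1_000,
    national_count=1_000, national_population=1_000_000,
)
print(index)  # a value above 1 means the place overindexes; below 1, it underindexes
```

With these numbers the index comes out at 2, meaning twice the national rate.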

So, it takes having good data to determine things like overindexing / underindexing. For things like sports, maybe you could use numbers of sporting events, but will they be collected the same way across different countries?

So, while you mention "popularity," it's really important to tie that to something that's measurable. Maybe it's Google search trends. Maybe it's attendance that you feel good about measuring. Maybe it's streaming statistics.

But, in general, you want to know what sports/foods/shows/things overindex with a specific population.
posted by bbqturtle at 8:41 AM on September 1, 2022 [5 favorites]


If you could find the maps you're talking about, that would be helpful. This fast food map is just absolute popularity, although the source it pulls from has different data, and the source that source pulls from has different data as well, so uh, yeah. But in any case, it's not just all McDonald's and Chick-fil-A. It'd be useful to hear the purported explanation of what the data is.

I'm not aware of a name for analysis like this (although I could just not know about it). I think that's maybe because it's hard to do in a principled way: you can compare popularity relative to the national average, but actually interpreting data in that form in a reasonable way is pretty hard.

I searched Google Images for "relative popularity map", which didn't turn up much that was useful. Searching for "popularity compared to national average" mostly isn't useful either, but it does find this map of sport popularity (warning: Facebook link) that is the kind of thing you're interested in.
posted by wesleyac at 8:45 AM on September 1, 2022


Response by poster: Just to clarify: I don't want to do any of this analysis myself, so I'm not married to any particular measure of popularity. I just want to see things other people have created that fall in this vein. Rigorous accuracy in their work is also not required -- this is purely a personal interest, not data I am going to use for any specific purpose.
posted by jacquilynne at 8:53 AM on September 1, 2022


Well you know, XKCD's State Word Map...
posted by brainmouse at 8:55 AM on September 1, 2022 [4 favorites]


Best answer: HuffPo and Yelp call this "disproportionately popular" in this article about cuisines per state, which also shows the process (highest options above national average). It does allow for repeated results but that's probably fair.
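A minimal sketch of that kind of process, assuming "above national average" means the ratio of a cuisine's local share to its national share (the article may well use a difference or some other measure). The function name and all numbers are invented for illustration.

```python
def most_disproportionate(shares_by_region, national_shares):
    """For each region, pick the item with the highest local-to-national share ratio."""
    result = {}
    for region, shares in shares_by_region.items():
        result[region] = max(
            shares, key=lambda item: shares[item] / national_shares[item]
        )
    return result


# Made-up shares, not real data.
national = {"pizza": 0.30, "mexican": 0.20, "cajun": 0.02}
regions = {
    "Louisiana": {"pizza": 0.25, "mexican": 0.10, "cajun": 0.15},
    "Texas": {"pizza": 0.25, "mexican": 0.40, "cajun": 0.03},
}
picks = most_disproportionate(regions, national)
print(picks)  # {'Louisiana': 'cajun', 'Texas': 'mexican'}
```

Pizza is the biggest or second-biggest cuisine in both made-up states, but it never wins because it sits below its national share everywhere. Nothing stops two regions from getting the same answer, which matches the repeated results mentioned above.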

That phrase comes back with all sorts of charts:
posted by Nonsteroidal Anti-Inflammatory Drug at 9:04 AM on September 1, 2022 [2 favorites]


I just want to see things other people have created that fall in this vein.

The management book First, Break All the Rules is based on a very large survey of managers and employees aimed at this question:
What do great managers do that good and okay managers don't do?
I found it a very worthwhile read.
posted by gauche at 10:21 AM on September 1, 2022


For languages, you can look at the second or third most widely spoken language in a given region (assuming English is generally first and Spanish is second in most US states).
posted by nouvelle-personne at 10:28 AM on September 1, 2022


A statistic that is used for this kind of thing is tf-idf (term frequency-inverse document frequency):
The tf–idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general.
I think some version of this is used by Amazon to report "statistically improbable phrases" (SIPs) that distinguish a book from others. These are phrases that appear rarely in general but frequently in the given book.
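To connect this to the original question, here is a small sketch that treats each state as a "document" and each fast food visit as a "word". It uses the plain textbook form of tf-idf (relative frequency times the log of total documents over documents containing the term), with no smoothing; real systems, including whatever Amazon uses, may differ. The data is invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Map each document name to a dict of term -> tf-idf score."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for words in documents.values():
        doc_freq.update(set(words))
    scores = {}
    for name, words in documents.items():
        counts = Counter(words)
        total = len(words)
        scores[name] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores


# Invented data: each list is a handful of fast food visits in that state.
docs = {
    "Texas": "burger burger taco taco taco".split(),
    "Maine": "burger burger lobster".split(),
    "Ohio": "burger chili chili".split(),
}
scores = tf_idf(docs)
top = {name: max(s, key=s.get) for name, s in scores.items()}
print(top)  # {'Texas': 'taco', 'Maine': 'lobster', 'Ohio': 'chili'}
```

Because "burger" shows up in every state, its inverse document frequency is log(1) = 0, so it scores zero everywhere no matter how common it is. That is the same effect as removing soccer and cricket from the sports map.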
posted by grobstein at 10:58 AM on September 1, 2022 [6 favorites]

