How much can you know about a person with just 6 items?
March 25, 2011 8:02 AM

If I have 6 bits of information about a person (first name, gender, height, weight, US zip code, age), what sort of statistical information can I get, and from what source? For example, I could possibly find out the typical BMI in that person's area, etc.
posted by JiffyQ to Health & Fitness (6 answers total)
Do you mean about one person, or about a population of people? You can't get statistical information from a population of 1.
posted by empath at 8:07 AM on March 25, 2011

Response by poster: Statistical information about people who have the same sort of characteristics (again, using only those 6 bits of info).
posted by JiffyQ at 8:20 AM on March 25, 2011

The CDC conducts a nationwide annual survey titled the Behavioral Risk Factor Surveillance System (BRFSS) that provides such information. The SMART BRFSS Quick View looks especially useful.

About the BRFSS: it's essentially a LONG survey conducted over the phone that the CDC manages in all 50 states throughout the year. It asks questions related to health behaviors including fruit/veg consumption, smoking, alcohol consumption, etc. It also collects demographic data including sex, weight, height (BMI is calculated in the datasets), race, age, etc.
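
Since height and weight are two of the six items, BMI can be computed directly for each person before comparing against survey figures. The following is a minimal sketch, assuming US units (pounds and inches) and the standard adult BMI cutoffs; the function names and the sample person are made up for illustration and are not part of any BRFSS dataset.

```python
def bmi_from_us_units(weight_lb: float, height_in: float) -> float:
    """Body mass index from pounds and inches: 703 * lb / in^2."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    return 703.0 * weight_lb / (height_in ** 2)


def bmi_category(bmi: float) -> str:
    """Standard adult BMI categories (assumed cutoffs: 18.5, 25, 30)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "healthy weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"


# Hypothetical person: 70 inches tall, 180 pounds.
bmi = bmi_from_us_units(180, 70)
print(round(bmi, 1), bmi_category(bmi))  # prints: 25.8 overweight
```

The category a person falls into could then be compared with the share of survey respondents in the same category for that area.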
posted by hammerthyme at 8:37 AM on March 25, 2011 [1 favorite]

Your question is not entirely clear. If you are limited to those six bits of information, then the only thing you can do is combine them in different ways, and those could get into the realm of trivial or even silly. You mentioned "and from what source" - does that imply that you can get more information from another source?

Here are some (legitimate?) examples:
- Are there any first names that have both male and female instances? That gives you a list of gender-neutral names.
- Names compared to age can tell you about baby-naming trends over the years.
- Studying weight per zip code can give you information about obesity and general health across an area. Zip codes are numbered to correlate with states, so if you can get that mapping you can make statements about the residents of a state. If you can get socio-economic data per zip code, you can correlate that as well.
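
Two of the ideas above (gender-neutral names and weight per area) can be sketched in a few lines. This is only an illustration under stated assumptions: the `people` records are invented, the tuple layout is hypothetical, and grouping by the first three zip digits is one possible choice of "area", not something the data dictates.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (first_name, gender, height_in, weight_lb, zip_code, age)
people = [
    ("Jordan", "M", 70, 180, "15213", 34),
    ("Jordan", "F", 65, 140, "15217", 29),
    ("Mary",   "F", 64, 150, "90210", 52),
    ("Robert", "M", 72, 210, "90212", 61),
    ("Taylor", "F", 66, 135, "15213", 23),
]


def gender_neutral_names(records):
    """Names that appear with more than one gender in the data."""
    genders_by_name = defaultdict(set)
    for name, gender, *_ in records:
        genders_by_name[name].add(gender)
    return sorted(n for n, g in genders_by_name.items() if len(g) > 1)


def mean_weight_by_zip_prefix(records, digits=3):
    """Average weight per zip-code prefix (zip codes kept as strings)."""
    weights = defaultdict(list)
    for _, _, _, weight, zip_code, _ in records:
        weights[zip_code[:digits]].append(weight)
    return {prefix: mean(ws) for prefix, ws in weights.items()}


print(gender_neutral_names(people))
print(mean_weight_by_zip_prefix(people))
```

Zip codes are stored as strings so that leading zeros survive. With only a handful of people per prefix, the averages say very little on their own and are better treated as a prompt for comparison with a larger outside source.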

And some trivial/silly examples:
- There are certain names that have a connotation of being overweight. Does your data support that? Which names are most likely to be overweight/underweight?
- There are certain names that are associated with various ethnicities. Are there any trends related to either zip code or weight? (This relies on some huge assumptions, and may be considered offensive; I'm just throwing out the idea.)
- If you are building a website with this data behind it, you could let people type in their name and then tell them where they are most/least likely to live, their high/low/avg weight, height and age.
posted by CathyG at 9:01 AM on March 25, 2011

You should check out Latanya Sweeney's work on re-identification. I think about 87% of the US population could be uniquely identified based on gender, date of birth, and 5-digit zip code.
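
To get a feel for the re-identification idea on a small list, one can count how many records have a combination of attributes that nobody else in the list shares. The sketch below assumes hypothetical records stored as dictionaries and uses age rather than birth date, since age is what the question has; it measures uniqueness within the list only, not within the general population.

```python
from collections import Counter


def unique_fraction(records, key):
    """Fraction of records whose key(record) value occurs exactly once."""
    keys = [key(r) for r in records]
    if not keys:
        return 0.0
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys)


# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"gender": "F", "age": 29, "zip": "15217"},
    {"gender": "F", "age": 29, "zip": "15217"},
    {"gender": "M", "age": 34, "zip": "15213"},
    {"gender": "F", "age": 52, "zip": "90210"},
]

frac = unique_fraction(records, key=lambda r: (r["gender"], r["age"], r["zip"]))
print(frac)  # prints: 0.5
```

Swapping the key function for a coarser one (for example, gender and zip prefix only) shows how quickly uniqueness drops when fewer attributes are combined.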

There are also databases you can buy or buy access to for things like mother's maiden name.

Alessandro Acquisti (a colleague of mine) did work on inferring social security numbers.
posted by jasonhong at 11:26 AM on March 25, 2011

Response by poster: Thanks for your responses. To clarify: as an intellectual exercise, I have these 6 bits of information on about 100 people. I wanted to see what else I could find out about them, or what socio-economic data relates to them, and tell them about it.
posted by JiffyQ at 12:49 AM on March 26, 2011
