I'm designing a survey with the aim of finding out how inclusive a particular process is at work. This survey will go out to about 1000 people, who are based all around the world. I've got most of the survey done; it's the ethnic origin question that is causing me headaches.

Designing this question is really complex, as there is no real consensus on what an ethnic group is. Looking at national guidelines, each country has its own set of categories for collecting statistical data on ethnic origin. Compiling them into one drop-down list would be very confusing, and the list would have a lot of overlap, as these categories are not distinct. By leaving the question out we will miss important information about the inclusiveness of our process. By leaving it as a blank box to be filled in, we will get many different answers that are difficult to get useful information from. It's also important that I am able to compare these categories to other data to make them meaningful.

I've got various individual country category lists that follow that country's census data, and have logic in the question that sends the person filling in the survey to the ethnic origin list of their country. But having done this for a few countries, doing it for the entire list of world countries seems exhausting. Does anyone have any experience in designing this sort of survey, and can you point me in the right direction?
I'm in the process of designing a global survey right now, and just had this conversation with our external survey vendor, as I was worried our initial list of responses was too US-centric (while our population of respondents will be very US heavy, we anticipate getting replies from at least 20 countries). Here's our current list of response options, and respondents can select as many as they would like:

Black or African American
Hispanic or Latino
East Asian
South Asian
Southeast Asian
Middle Eastern
Native Hawaiian or Other Pacific Islander
Make it a free response box and then take the time after to go through the answers you get and compile them into categories that are useful to you. People will write (for example) "White" and "Caucasian," and you'll have to put them into a category together before you run the data. But that's the best way to get real answers from everyone and also have categories that are meaningful and useful for your data--do it manually afterward.
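If it helps to picture the "compile them into categories afterward" step, here is a minimal sketch in Python. It is only an illustration under assumptions: the `RECODE` table, the `UNCODED` label, and the sample answers are hypothetical, and the category scheme is something you would have to decide on yourself.

```python
import re
from collections import Counter

# Hypothetical recode table: normalized free-text answer -> analysis category.
# Only the "White"/"Caucasian" pairing comes from the comment above; extend it
# by hand as you read through the real answers.
RECODE = {
    "white": "White",
    "caucasian": "White",
}

def normalize(answer: str) -> str:
    """Lowercase the answer and collapse punctuation and extra spaces."""
    return " ".join(re.sub(r"[\W_]+", " ", answer.casefold()).split())

def recode(answer: str) -> str:
    """Return the analysis category, or 'UNCODED' so a person can review it."""
    return RECODE.get(normalize(answer), "UNCODED")

# Made-up sample answers, for illustration only.
responses = ["White", " caucasian ", "Caucasian.", "Welsh/Chinese"]
counts = Counter(recode(r) for r in responses)
print(counts)
```

Anything that lands in `UNCODED` is an answer you still need to read and assign by hand, which is where most of the manual work goes.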
This is not my area of expertise, but thinking more about what "inclusion" means (or what your precise research question is) will probably be helpful. For instance, NotMyselfRightNow's list doesn't work if you're concerned that you're failing to serve minority populations in regions X and Y. On the other hand, it's totally possible that those coarse categories coupled with location give you sufficient granularity to answer your question.
It depends on what you are interested in. If, for example, you were interested in how diverse local hiring is (are they hiring local minorities), you could ask something like this:

"Do you identify as an ethnic minority in your current location? If so, how do you identify?"

But if what you want to ask is what primary regional ancestry a person has, you could ask:

"From what region of the world do your ancestors originally come?

- Europe
- Africa
- South Asia
- East Asia
- West or Central Asia
- North America
- South America
- South-east Asia
etc etc"

But these questions are always fraught - and questions of race are complicated by how the schema is culturally dependent. You can't export US racial categories to Asia, for example, and expect people to understand them.
None of the suggestions to date include Native American/Alaskan Natives, or indeed account for any of the indigenous populations in other nations. Hoyland's concern about failing to serve minority populations doesn't work in countries where it's the majority population that is under-served and faces barriers.

To do it country by country will be an enormous task, as there are 18 government-recommended ethnic groups for England, but they are different in Wales, and different in Scotland, and different in Northern Ireland. You will be doing this forever.

My suggestion would be to leave it open and then prepare to manually harmonize and aggregate the data. How the UK does this is explained here and it's a helpful approach IMHO.
The risk with presenting a narrowed list of ethnic categories to a very broad audience is that they may not understand them and consequently may select them inaccurately.


A good friend of mine is ethnically English/Welsh and Chinese, and he is first generation Canadian. In a university class he was asked to complete a multiple choice demographic survey that included a question about his ethnicity, and for the first time in his life he was presented with multiple types of "Asian" to choose from: East-Asian, South-Asian, and some others that I don't remember. He was confused and, figuring that Hong Kong was in the South of China, chose South-Asian, which by the definitions the survey givers were using he is not. In the circles we grew up in, Asian meant Chinese/Korean/Vietnamese/etc, South-Asian was unknown, and if you wanted to refer to Indians from India you said East-Indian to differentiate them from regular Indians (a sub-group of Indigenous Canadians/Americans). (Language is messy!)

Which groups are considered significant and worth documenting varies so broadly by region too. I look at NotMyselfRightNow's list and it immediately jumps out at me that, with the exception of Native Hawaiian, all Indigenous populations (for North & South America, Australia, New Zealand, etc, etc) are lumped in under/with Other, rendering them rather invisible.
The last Canadian census had 279 ethnic origins/ancestries, with forty percent of people choosing multiple ethnicities (noting that because I hope you are giving people the option of choosing as many ethnicities as they wish). It seems a little weird to me that on a survey specifically on diversity you want to put people in small boxes to make the survey easier to manipulate. Is your ultimate goal to have usable data or easy data? See the Canadian list, their sorting, and the rationale in their footnotes for the choices they made here. A huge flaw in the Canadian list is that individual Nations and Tribes are not specified, but the census release DID say how many self-identified as Cree, Rama Nation, etc., so it must have been a write-in option.
Aiming for a single global set may be over-constraining. Per Secret Sparrow, lots of our understanding of and names for ethnic categories are themselves local cultural constructs.

When we do research where we're interested in these types of questions, we work with local vendors per country to develop lists (which they tend to have on file). National censuses are a good starting point for this if you're doing it solo. They're often regressive and slow to reflect modern norms, but they do represent a consensus view of some sort about how that culture wants to talk about the people who live there. Plus, they will be lists that people in those countries have seen before, so they will already have decided how to represent themselves, even if the options are not ideal.
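For what it's worth, the per-country routing can be a plain lookup with a write-in fallback, so you only need lists for the countries you actually expect replies from. This is a rough sketch, not any survey tool's API: `OPTIONS_BY_COUNTRY`, `FREE_TEXT`, and `options_for` are hypothetical names, and the list entries are placeholders to be filled from each census or local vendor.

```python
# Placeholder lists keyed by country code. The entries are NOT real census
# categories; replace them with the list from each country's census or vendor.
OPTIONS_BY_COUNTRY = {
    "CA": ["<Canadian census option 1>", "<Canadian census option 2>"],
    "US": ["<US list option 1>", "<US list option 2>"],
}

# Always offered, so nobody is forced into a box that does not fit.
FREE_TEXT = "Prefer to self-describe (write in)"

def options_for(country_code: str) -> list[str]:
    """Return the local list plus a write-in; unknown countries get write-in only."""
    local = OPTIONS_BY_COUNTRY.get(country_code.upper(), [])
    return [*local, FREE_TEXT]

print(options_for("ca"))
print(options_for("zz"))  # no list on file: write-in only
```

Respondents should still be able to tick more than one option from whatever list they are shown.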
How do you define "inclusive"?

If you want a long list of ethnicities so you can say "look how inclusive we are" then many of these suggestions are great, probably.

But if you want to statistically analyze disproportionate impact on some groups over others, then this is the wrong way of going about it. The thinner you slice your data, the more apparent evidence of impact you will find, some of it by chance alone.
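To put a rough number on the slicing problem: if you run one significance test per group at the usual 5% level, and assume (as a simplification) that the tests are independent and there is no real effect anywhere, the chance of at least one false alarm is 1 - (1 - 0.05)^k for k groups. A small sketch, where `chance_of_false_alarm` is a hypothetical helper and the group counts are borrowed from lists mentioned in this thread:

```python
def chance_of_false_alarm(num_groups: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent tests.

    Assumes one test per group, all independent, with no true effect.
    """
    return 1 - (1 - alpha) ** num_groups

# 7 options in the short list above, 18 groups for England, 279 in the
# Canadian census.
for k in (1, 7, 18, 279):
    print(k, round(chance_of_false_alarm(k), 3))
```

With about 1000 respondents, many of the finer groups would also contain only a handful of people, so their estimates would be noisy on top of this.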
