What can I do with this data?
December 1, 2015 2:10 PM

I volunteer at an NGO that provides legal aid to asylum seekers. I'm trying to create graphs that will help us better visualize the population we serve and perhaps make connections that we haven't before. Our data points are: age, sex, nationality, arrival date in our service, length of asylum procedure, and outcome of procedure (refusal or acceptance). We have this information from every person who has used our service. What interesting and useful things can I do with this data?

Please note that I have no experience in data analysis!
posted by Blissful to Science & Nature (10 answers total) 6 users marked this as a favorite
 
You could get a trial of Tableau and try throwing all the data in there to see what it comes up with.
posted by the agents of KAOS at 2:34 PM on December 1, 2015 [1 favorite]


Is there a question you want to find an answer to?

Visualizing your demographics is fairly easy; the challenging part is deciding what to do with it.

Are there different categories of asylum procedures? Do you have their origin points and where those who got through ended up?
posted by canine epigram at 2:35 PM on December 1, 2015


Do the basics. Plot age vs. length of procedure, and change the symbol plotted based on gender or outcome. Plot boxplots of age for each nationality. Make a chart of the four values showing the number of people binned by outcome vs. gender. However, you should do some statistical tests to see whether what looks like a difference really is one.
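A minimal sketch of the outcome-vs-gender table and a follow-up test, assuming Python and some made-up records (the counts and labels below are hypothetical, not the asker's data). It builds the 2x2 table of counts and runs a Pearson chi-square test by hand, without a continuity correction; with small counts an exact test would be a safer choice.

```python
import math
from collections import Counter

# Hypothetical records: (sex, outcome). Real data would come from the NGO's files.
records = (
    [("F", "accepted")] * 30 + [("F", "refused")] * 20
    + [("M", "accepted")] * 25 + [("M", "refused")] * 45
)

def two_by_two(rows):
    """Return row labels, column labels, and the table of counts."""
    counts = Counter(rows)
    sexes = sorted({s for s, _ in rows})
    outcomes = sorted({o for _, o in rows})
    table = [[counts[(s, o)] for o in outcomes] for s in sexes]
    return sexes, outcomes, table

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

sexes, outcomes, table = two_by_two(records)
stat, p_value = chi_square_2x2(table)
print(sexes, outcomes, table)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value only says the difference is unlikely to be chance in this sample; it says nothing about why the difference exists.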

Tableau is cool, but it does cost a grand after the free trial is over.
posted by demiurge at 2:48 PM on December 1, 2015 [1 favorite]


Our data points are: age, sex, nationality, arrival date in our service, length of asylum procedure, and outcome of procedure (refusal or acceptance).

Start by asking your coworkers questions - and then ask questions of the data. To me, the obvious ones are:

What is the demographic breakdown of your clients? [describing your client pool, using the data]
What is the acceptance rate, and how has it changed over time?
Do any of the age, sex, nationality, or arrival date variables affect the outcomes (length of procedure, and refusal/acceptance?)
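As a rough illustration of the second question, here is a minimal Python sketch that computes the acceptance rate per arrival year. The records, and the choice to group by calendar year of arrival, are assumptions for the example only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (arrival date in the service, outcome).
cases = [
    (date(2013, 3, 4), "accepted"),
    (date(2013, 9, 18), "refused"),
    (date(2014, 1, 22), "accepted"),
    (date(2014, 6, 2), "accepted"),
    (date(2014, 11, 30), "refused"),
    (date(2015, 2, 14), "refused"),
]

def acceptance_rate_by_year(rows):
    """Map each arrival year to the share of cases that were accepted."""
    totals = defaultdict(int)
    accepted = defaultdict(int)
    for arrived, outcome in rows:
        totals[arrived.year] += 1
        if outcome == "accepted":
            accepted[arrived.year] += 1
    return {year: accepted[year] / totals[year] for year in sorted(totals)}

rates = acceptance_rate_by_year(cases)
for year, rate in rates.items():
    print(f"{year}: {rate:.0%} accepted")
```

Recent years may look worse simply because their procedures are still open, so it is worth counting only closed cases.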

One simple approach is to do very basic data analysis in Excel (like, calculating averages - simple enough - though it also has some more advanced statistical calculations), have it generate charts, and then send those charts to a designer to pretty up ... or use Google charting, which looks a little bit nicer than Excel's raw charts.
posted by entropone at 4:40 PM on December 1, 2015 [1 favorite]


We have just finished a project with a great volunteer group, DataKind, who looooove this kind of work and were, in our experience, really good to work with: organised, focused, relatively fast to answer and complete, and very good at explaining and working with us to find solutions that helped our project goals. They got us set up on a project that in 6-8 months will generate a bunch of interesting data, at which point they will come back and do a bunch of data analysis with/for us. Explore working with them or getting advice from them - data analysis is their jam. They have chapters in most major cities.
posted by dorothyisunderwood at 5:01 PM on December 1, 2015 [3 favorites]


I work for Tableau.

Tableau can be free for non-profits through the Tableau Foundation.

Also, Tableau Public is free to use for everyone, as long as you're willing to have your data and visualizations publicly accessible.
posted by ShooBoo at 5:50 PM on December 1, 2015 [2 favorites]


Hi, professional migration researcher here. (I should say I work on the transit side rather than the asylum side, but I have lots of professional contacts with US immigration lawyers who specialize in asylum claims from Latin America.)

I'm confused why you want to graph/make data visualizations with this information. Is it because those things seem sexy? I say this because most of the connections that seem most obviously useful don't make for good visuals: correlations of sex/nationality/outcome then put into age "buckets," for instance, would be very interesting as a first pass for the folks I know, but make for a crap graph.

The other important point that got left out is what format your data is in. The format actually makes a ton of difference in terms of what you'll be able to easily draw out of the data you have, since this seems very clearly like a side project. In other words, you probably don't have a lot of capacity for data cleaning. I would guess (?) that since you're a nonprofit we're talking something jury-rigged and being used apart from its best purpose, like an Excel spreadsheet, a Filemaker Pro database, MS Access, something like that.

If this *really is* just a side project, the best thing you can do is make cross-tabulations (sometimes called contingency tables). Folks who want you to do significance testing at this stage are, I'm fairly sure, not trained in survey methods; what you're looking for isn't causation but actually something more like demography. Again, this is *not* data vis. But cross-tabs allow you at a glance to see some patterns across who you're serving in more specific ways than are often obvious even to practitioners―e.g. you may realize it's a lot of Iraqi men, but not that the majority of successful cases are of 28-32 year-olds―and adjust your services accordingly. It may also let you go to your funders with more persuasive data for why they should give you money this week/month/year. But the data visualizations come after this, not before.
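For anyone who has not built one, a cross-tab is just a table of counts for each pair of categories. A minimal Python sketch, assuming made-up records and arbitrary age buckets (every name and value below is hypothetical):

```python
from collections import Counter

# Hypothetical records: (nationality, sex, age, outcome).
clients = [
    ("Iraq", "M", 29, "accepted"),
    ("Iraq", "M", 31, "accepted"),
    ("Iraq", "M", 45, "refused"),
    ("Iraq", "F", 30, "refused"),
    ("Syria", "F", 24, "accepted"),
    ("Syria", "M", 52, "refused"),
]

# Bucket edges are an arbitrary choice; changing them can change the picture.
AGE_BUCKETS = [(0, 17), (18, 27), (28, 32), (33, 120)]

def age_bucket(age):
    """Return the label of the bucket that contains this age."""
    for low, high in AGE_BUCKETS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age out of range: {age}")

def crosstab(rows, row_key, col_key):
    """Count rows for each (row label, column label) pair."""
    counts = Counter((row_key(r), col_key(r)) for r in rows)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    return row_labels, col_labels, counts

row_labels, col_labels, counts = crosstab(
    clients, row_key=lambda r: age_bucket(r[2]), col_key=lambda r: r[3]
)
print("age".ljust(8) + "".join(c.rjust(10) for c in col_labels))
for r in row_labels:
    print(r.ljust(8) + "".join(str(counts[(r, c)]).rjust(10) for c in col_labels))
```

Swapping the `row_key` and `col_key` functions gives nationality by outcome, sex by outcome, and so on. An Excel pivot table produces the same kind of table without code.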

Finally, the segmentations of the data that you make will *absolutely* affect your results. So be wary, and ground-truth the data with what you know from the everyday operations of the organization.
posted by migrantology at 9:22 PM on December 1, 2015 [2 favorites]


I will agree with almost everything migrantology said, especially the caveats about your choices of how to bucket affecting your results.

When you have a binary indicator (like acceptance/rejection), a heat map can be a useful way to quickly see patterns across a pair of variables, as well as outliers from these patterns. All this is really doing is converting the cross-tabs to a color scale, but for taking a first, super coarse-grained pass at things it's often a much quicker way to see things than comparing individual percentages to each other basically one by one. (They can be useful for scalar dependent variables as well, but oftentimes you spend more time getting the range of the color mapping to a useful place than it would take you to glean the same high-level stuff from the raw numbers.)

It won't be pretty, but it's quick to make a table of numbers into a heat map in Excel using conditional formatting.
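To show what "converting the cross-tabs to a color scale" amounts to, here is a rough Python sketch that uses text shading as a stand-in for colors. The rates, the labels, and the five-step scale are all assumptions for illustration.

```python
# Hypothetical acceptance rates by nationality (rows) and sex (columns).
rates = {
    "Nationality A": {"F": 0.82, "M": 0.35},
    "Nationality B": {"F": 0.55, "M": 0.10},
}

SHADES = " .:*#"  # lightest to darkest, five steps

def shade(rate):
    """Map a rate in [0, 1] to one of the shade characters."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"rate must be between 0 and 1: {rate}")
    index = min(int(rate * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]

def render(table):
    """Return one text line per row, with a shaded cell per column."""
    columns = sorted({col for row in table.values() for col in row})
    lines = ["".ljust(16) + " ".join(c.center(3) for c in columns)]
    for label, row in table.items():
        cells = " ".join(f"[{shade(row[c])}]" for c in columns)
        lines.append(label.ljust(16) + cells)
    return lines

for line in render(rates):
    print(line)
```

The darker the cell, the higher the rate. Excel's conditional formatting does the same mapping with real colors.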
posted by PMdixon at 1:16 AM on December 2, 2015


The DKAN Drupal distribution is specifically designed for open-data applications and has visualization tools built in. You can spin up a free instance at Pantheon.
posted by COD at 6:28 AM on December 2, 2015


The key first step for a project like this is figuring out your audience. As an ignorant outsider, I would guess there are a few major options:

  • Internal decision-makers, who have questions about the state of the org. They might be curious about time-of-year patterns or changing demographics over time, or kinds of applicants who are particularly high risk.
  • Existing donors, who you might want to demonstrate organizational impact to using data.
  • General audience. You might be able to get blog coverage with a particularly catchy and topical argument using your data that would attract attention for your organization.
  • Migration/asylum experts, à la migrantology.
Given an audience, your goals will shift pretty dramatically. I would start by figuring out which audience you're most interested in working for and adapting your analysis to match. Internal audiences or donors are probably the friendliest audiences, and the easiest to talk to and understand, so I might start there.

Good luck!
posted by heresiarch at 9:24 AM on December 2, 2015

