Graphical Display to Show Changes Over Time in Two-State Variables
June 8, 2018 1:43 PM

I have longitudinal data. I have a number of variables that are things that maybe you did at time 1 and maybe you didn't, and maybe you did at time 2 and maybe you didn't. I would like a nice way to visualize how people moved between these two states that would be easy to read and extract information from.

I'm picturing something with time 1 on the left, time 2 on the right and the vertical axis representing how many people?

I would like to be able to look at the diagram and see

1. How many people did the thing at time 1
2. How many people did the thing at time 2.
3. How many people who did the thing at time 1 still do it at time 2 vs stopped.
4. How many people who did the thing at time 2 were new and how many were already doing it.

I assume that goals 1 and 2 will require that the columns be sorted by who did and didn't at each time, and yet goals 3 and 4 seem to require that people hold their positions/line up horizontally, right?

Assume everyone is there at both time periods (no new people, no missing people). I have about 12 things people may or may not have done at each time period, so I assume I will need separate figures for each, but if you have some magical way of turning it all into one figure I'm open to that (seriously don't think it's possible though).

Colour is fine.

For bonus points: Any cool graphical way to show how people move from doing X things at time 1 to Y things at time 2, where X and Y are numbers? So how many people did 1 thing, 2 things, 3 things etc. at time 1, how many people did 1 thing, 2 things, 3 things etc. at time 2, and how many people who did 1 thing (2 things, 3 things) at time 1 do 1 thing, 2 things, 3 things at time 2?
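Whatever display ends up being used, the bonus question boils down to a table of counts: rows are the number of things done at time 1, columns are the number done at time 2. Here is a minimal sketch of that tabulation, assuming each person is stored as two arrays of true/false values, one per activity. The field names `t1` and `t2` and the function name are made up for illustration.

```javascript
// Hypothetical data layout: one object per person, with one boolean
// per activity at each time point.
function countTransitions(peopleWithActivities, maxThings) {
  // matrix[x][y] = number of people who did x things at time 1
  // and y things at time 2.
  const matrix = Array.from({ length: maxThings + 1 }, () =>
    new Array(maxThings + 1).fill(0)
  );
  for (const person of peopleWithActivities) {
    const x = person.t1.filter(Boolean).length; // things done at time 1
    const y = person.t2.filter(Boolean).length; // things done at time 2
    matrix[x][y] += 1;
  }
  return matrix;
}

// Tiny made-up example with 3 activities instead of 12.
const transitionMatrix = countTransitions(
  [
    { t1: [true, false, false], t2: [true, true, false] },
    { t1: [true, false, false], t2: [true, true, false] },
    { t1: [true, true, true], t2: [false, false, false] },
  ],
  3
);
console.log(transitionMatrix);
```

Row sums of the matrix give the time 1 totals and column sums give the time 2 totals, so the same table answers all three parts of the bonus question.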

Forgive any typos. I'm at work and leechblock is coming for me in 45 seconds.
posted by If only I had a penguin... to Grab Bag (8 answers total) 2 users marked this as a favorite
 
Response by poster: Just to be clear, though I'm picturing time 1 on the left time 2 on the right, etc. I'm open to other kinds of display (yes, I can work around leechblock by using Chrome. I probably should have been making the post from Chrome in the first place, I just didn't think of it and then panicked when I got the warning).

Oh, and the number of people is a few hundred, but I don't need to literally display every person, obviously.
posted by If only I had a penguin... at 1:51 PM on June 8, 2018


I'm not sure if this is helpful, but you can always think about this as having more than two states by redefining the states. So having done activity A in period 1 but not period 2 is state 1. Having done activity A in both periods 1 and 2 is state 2. Having done it in period 2 but not period 1 is state 3, and not doing it in either period is state 4.

Then goals 3 and 4 are immediate. Goal 1 is just adding up the number of people in (expanded) states 1 and 2, goal 2 is adding up states 2 and 3, etc. Do this for each activity.
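As a rough sketch of that recoding for a single activity, assuming each person is a record with two true/false fields (the names `t1` and `t2`, and the state labels, are invented here):

```javascript
// Tally the four expanded states for one activity.
function tallyStates(people) {
  const counts = { stopped: 0, continued: 0, started: 0, never: 0 };
  for (const p of people) {
    if (p.t1 && !p.t2) counts.stopped += 1;        // state 1: time 1 only
    else if (p.t1 && p.t2) counts.continued += 1;  // state 2: both times
    else if (!p.t1 && p.t2) counts.started += 1;   // state 3: time 2 only
    else counts.never += 1;                        // state 4: neither time
  }
  return counts;
}

// Derive the four goals from the state counts.
function summarizeGoals(counts) {
  return {
    didAtTime1: counts.stopped + counts.continued,          // goal 1
    didAtTime2: counts.continued + counts.started,          // goal 2
    continuedVsStopped: [counts.continued, counts.stopped], // goal 3
    newVsAlready: [counts.started, counts.continued],       // goal 4
  };
}

// Made-up sample of five people.
const stateCounts = tallyStates([
  { t1: true, t2: false },
  { t1: true, t2: true },
  { t1: true, t2: true },
  { t1: false, t2: true },
  { t1: false, t2: false },
]);
const goalSummary = summarizeGoals(stateCounts);
console.log(stateCounts, goalSummary);
```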
posted by dismas at 2:05 PM on June 8, 2018


Best answer: I can imagine this as a Sankey diagram. You can have that with crossovers represented with transparency.
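For what it's worth, Sankey tools generally want the data as a list of flows (from, to, how many). Here is a minimal sketch of building those rows for one activity, assuming per-person records with hypothetical `t1` and `t2` true/false fields. The node labels are also invented; they are kept different on the two sides because Google Charts' Sankey chart cannot draw data that loops back to the same node.

```javascript
// Turn per-person records into [from, to, weight] rows for a Sankey chart.
function toSankeyRows(people) {
  const flows = new Map(); // key: "from|to", value: [from, to, count]
  for (const p of people) {
    const from = p.t1 ? 'Did it (time 1)' : 'Did not (time 1)';
    const to = p.t2 ? 'Did it (time 2)' : 'Did not (time 2)';
    const key = from + '|' + to;
    const entry = flows.get(key) || [from, to, 0];
    entry[2] += 1;
    flows.set(key, entry);
  }
  return Array.from(flows.values());
}

// Made-up sample of four people.
const sankeyRows = toSankeyRows([
  { t1: true, t2: true },
  { t1: true, t2: true },
  { t1: true, t2: false },
  { t1: false, t2: true },
]);
console.log(sankeyRows);
```

With Google Charts, rows in this shape can be passed to `data.addRows(...)` on a DataTable whose columns are a string, a string, and a number.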
posted by Wrinkled Stumpskin at 2:11 PM on June 8, 2018 [1 favorite]


Seconding Sankey diagram. This is how most web analytics information is presented, and it's really similar to what you've described.

There are a bunch of entry pages (activities that may be done at Time 1).
You want to see how many people get to your site from each page, and how many of them go to each linked page (activities that may be done at Time 2).

In fact, you might be able to find some kind of open source analytics graphing tools that would expect data in a "user", "URL", "timestamp" format. You could massage your data to be in that format and just hit Go.
posted by hammurderer at 2:38 PM on June 8, 2018


Response by poster: Thank you, Sankey diagrams were exactly what I was picturing but I couldn't picture them clearly enough to describe them.
posted by If only I had a penguin... at 6:44 PM on June 8, 2018


Response by poster: If someone who knows a little more HTML than me could help me out...I'm trying to make these using Google charts. I have made a basic Sankey chart with my data, but I want to customize the colours and the titles. I don't understand where I put the customization code in my file. Do I just stick it wherever before the draw command or somewhere else?
posted by If only I had a penguin... at 7:55 PM on June 8, 2018


Near the middle of the first code example on the page you linked, it has this block:
// Sets chart options.
var options = {
width: 600,
};
You'd put the other customizations before that, as in the second example on that page (no need to bold it, I'm just highlighting it here):

var colors = ['#a6cee3', '#b2df8a', '#fb9a99', '#fdbf6f',
'#cab2d6', '#ffff99', '#1f78b4', '#33a02c'];


var options = { ... }; (put whatever else you want inside the braces, like the width and height, but make sure you have the same number of opening and closing {curly brackets} and a semicolon at the end of the options block.)
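Put together, the two pieces might look something like the sketch below. The `sankey.node.colors`, `sankey.link.colorMode`, and `sankey.link.colors` settings are Google Charts Sankey options; the specific width and the choice of a gradient are just example values, not something from your file.

```javascript
// Palette for the nodes and links (same colours as above).
var colors = ['#a6cee3', '#b2df8a', '#fb9a99', '#fdbf6f',
              '#cab2d6', '#ffff99', '#1f78b4', '#33a02c'];

// Sets chart options. Everything goes inside one pair of curly brackets.
var options = {
  width: 600,
  sankey: {
    node: {
      colors: colors          // colours used for the node bars
    },
    link: {
      colorMode: 'gradient',  // shade each link from source to target colour
      colors: colors          // colours used for the links
    }
  }
};

// In your page, the existing draw command then uses it, e.g.:
// chart.draw(data, options);
```

The only requirement for placement is that both `colors` and `options` are defined before the line that calls `draw`.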
posted by AFABulous at 8:54 AM on June 9, 2018


I have about 12 things people may or may not have done at each time period, so I assume I will need separate figures for each, but if you have some magical way of turning it all into one figure I'm open to that (seriously don't think it's possible though).

I've drawn a sketch showing one possible design. As there are 12 activities, the matrix is potentially very large, but in practice it might be sparse, with no people doing many combinations of activities.
posted by James Scott-Brown at 4:05 AM on June 10, 2018

