How do I create accurate Venn diagrams?
June 19, 2004 2:55 PM

I need to create Venn diagrams, in color, in sizes corresponding to data. I don't know what software to use. [MI2]

Say there are two websites, and I know the traffic for each website, and also how it overlaps. And I have to be able to show visually: "Look, the number of monthly visitors to site A is the blue circle. The number of monthly visitors to site B is the yellow circle. These circles are scaled so that their relative size suggests the difference in their numbers of monthly visitors, one from the other. Here in the green area, where the two circles overlap, we have the people who go to BOTH sites. And the size of the green area is, of course, also proportional to the numbers in question."

Ideally, I would create it in either a Linux application (I use Debian testing) or something for Windows XP that would be both free and possible to install without being an admin. Maybe there is a way to do this in Excel or PowerPoint (both of which I have use of), but I've never seen a Venn diagram in one of those.

If all else fails, I could also live with ideas of other ways to represent this kind of thing visually. But it has to be visual, it has to be simple, and it can't be just a bar graph.

Obviously, I could just draw circles in gimp, but I want them to be true to scale for the data I'm talking about.
posted by bingo to Computers & Internet (11 answers total)
 
If you're open to doing the math, just draw circles in gimp, after doing some calculations:

(1) calculate the ratio k between dataset A & B (k = B/A)

(2) Let's say B is bigger, with a radius of R (and A has a radius of r). So to get the radius of each circle, you're going to have:

A = pi*r*r and Ak = pi*R*R

(3) So now, solving, you'd get r & R by:

r = sqrt(A/pi) and R = sqrt(Ak/pi)

You could probably make things simpler at this point by picking some convenient value of A as a reference point (like, pi inches or something) since it's the scale/ratio that matters.

Then, just use some transparency on one of the circles.
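
The steps above can be sketched in Python. This is not from the original post: the function name circle_radii and its scale parameter are made up for illustration, and the traffic numbers are placeholders. It sizes each circle so that its area is proportional to its count:

```python
import math

def circle_radii(count_a, count_b, scale=1.0):
    """Return (r_a, r_b) so that each circle's area equals count / scale.

    'scale' is an arbitrary divisor (an assumption of this sketch) that only
    controls how large the drawing is; the ratio of the areas is unaffected.
    """
    if count_a <= 0 or count_b <= 0 or scale <= 0:
        raise ValueError("counts and scale must be positive")
    # area = pi * r^2  =>  r = sqrt(area / pi)
    r_a = math.sqrt(count_a / scale / math.pi)
    r_b = math.sqrt(count_b / scale / math.pi)
    return r_a, r_b

# Placeholder data: site A has 10000 visitors, site B has 50000.
r_a, r_b = circle_radii(10000, 50000, scale=1000)
print(round(r_a, 4), round(r_b, 4))
```

Note that the radii differ by a factor of sqrt(k), not k, which is why the circles look closer in size than the raw numbers might suggest.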
posted by weston at 4:14 PM on June 19, 2004


On preview: Ugh. Weston beat me to it.

I might suggest using OpenOffice Draw to do the artwork, since it's (a) free, (b) available for both Linux and PC, and (c) a bit simpler than the Gimp. You could also set up a simple OO Calc document to do the calculations for you.
posted by SPrintF at 4:20 PM on June 19, 2004


It's not quite so easy, weston; he also needs to know something like the distance between the centers of the circles, in order to get the correct area of intersection.

I can write out the math for you if you think it will be useful, bingo. Maybe you could work a little magic with GIMP's script-fu in order to autogenerate the images.
posted by Galvatron at 4:21 PM on June 19, 2004


Response by poster: Galvatron: If you go to the trouble of doing it, I will do my best to make use of what you give me.

In other news, I don't know anything about scripting Gimp, either.
posted by bingo at 4:40 PM on June 19, 2004


Fat tasty equations.

I've never played with php + gd before, so I'll have a look. Back in a bit.
posted by Flat Feet Pete at 5:08 PM on June 19, 2004


Here you go. Unless I'm missing a simplification, you'll need a numerical solver to get the distance between the circle centers; Octave has an 'fsolve' routine that should work.
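
For anyone without Octave, here is a rough Python sketch of the same idea: compute the overlap (lens) area of two circles from the standard circular-segment formula, then bisect on the center distance until the overlap matches the target. This is not the derivation or script linked above; the function names overlap_area and center_distance are hypothetical, and the inputs are example values in already-scaled units.

```python
import math

def overlap_area(r_a, r_b, d):
    """Area shared by two circles with radii r_a, r_b and centers d apart."""
    if d >= r_a + r_b:
        return 0.0  # circles do not touch
    if d <= abs(r_a - r_b):
        return math.pi * min(r_a, r_b) ** 2  # smaller circle fully inside

    def half_angle(r_self, r_other):
        # Half the angle subtended by the common chord at this circle's center.
        cos_val = (d * d + r_self * r_self - r_other * r_other) / (2 * d * r_self)
        return math.acos(max(-1.0, min(1.0, cos_val)))  # clamp rounding noise

    alpha = half_angle(r_a, r_b)
    beta = half_angle(r_b, r_a)
    # Each circle contributes one circular segment: r^2 * (angle - sin(2*angle)/2).
    seg_a = r_a * r_a * (alpha - math.sin(2 * alpha) / 2)
    seg_b = r_b * r_b * (beta - math.sin(2 * beta) / 2)
    return seg_a + seg_b

def center_distance(r_a, r_b, target, iterations=100):
    """Bisect for the center distance whose overlap area equals 'target'."""
    if not 0 <= target <= math.pi * min(r_a, r_b) ** 2:
        raise ValueError("target overlap cannot exceed the smaller circle's area")
    lo, hi = abs(r_a - r_b), r_a + r_b  # overlap shrinks as d grows from lo to hi
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if overlap_area(r_a, r_b, mid) > target:
            lo = mid  # too much overlap: move the centers apart
        else:
            hi = mid
    return (lo + hi) / 2

# Example: circle areas 10 and 50, shared area 5 (all in the same scaled units).
r_a = math.sqrt(10 / math.pi)
r_b = math.sqrt(50 / math.pi)
d = center_distance(r_a, r_b, 5.0)
print(d, overlap_area(r_a, r_b, d))
```

Bisection works here because the overlap area only ever decreases as the centers move apart, so there is exactly one distance to find between "one circle inside the other" and "just touching".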

Neat problem. Someone want to double-check the math?

As far as GIMP scripting goes, I wouldn't worry about it unless you want to autogenerate a huge number of images. Just do it by hand--GIMP has rulers in the margins, so you can measure out the proper distances.
posted by Galvatron at 5:38 PM on June 19, 2004


Here you go:
http://www.flatfeetpete.com/~peter/pie/pie.php?A=600&B=200&AB=30&S=10
Fiddle the url to get what you want. S is scale. Source here. My biggest issue was the version of php I had. It would draw arcs but not ellipses.

It's worth remembering there's a load of odd stuff about how people perceive areas, you might need to square the numbers.

Galvatron: I had a look at the pdf, and it seemed to make sense, although I wouldn't say I had the full on maths-fu to certify it correct.
posted by Flat Feet Pete at 6:59 PM on June 19, 2004


Response by poster: Galvatron, that's beautiful. Really. I'm very grateful.

However, I am a two-time liberal arts major. I can vaguely recognize all the elements of your equations from a business calculus class (or was it trig) that I took about fifteen years ago, but I have little hope of being able to use them now. Maybe I can persuade someone to put all that into a script of some sort (and I just mean the math, not the graphics)...or, I guess, I could use a scientific calculator, though I have a feeling I would screw it up somehow.

On preview: um, I'd better look at what Pete did now.
posted by bingo at 7:03 PM on June 19, 2004


Response by poster: Flat Feet Pete: You rock, but I have to be able to color in the circles and print it out. Probably, it would end up pasted into an excel or word file, but at the very least I have to be able to put labels on it. I guess I could do a screenshot and then gimp it, but is there some other way?

Coolness coolness coolness.
posted by bingo at 7:07 PM on June 19, 2004


Nifty little bisection solver, Pete.

Here's a MATLAB/Octave script. Save it somewhere, then run 'octave' in the same directory. If website A has 10000 hits, B has 50000 hits, and 5000 hits are shared, then do something like this:

octave:1> [ra, rb, dist] = venn(10000, 50000, 5000, 1000)

ra = 1.7841
rb = 3.9894
dist = 3.8433

The fourth argument to venn() is an arbitrary scaling parameter that determines the diagram size (in this example, each circle's area works out to its hit count divided by 1000). The resulting ra, rb, and dist are the radius of the first circle, the radius of the second circle, and the distance between the centers, respectively.
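
One way to get a colored, labeled picture out of those three numbers is to write a small SVG file, which can be opened in a web browser or a vector drawing program. A minimal Python sketch, assuming the values printed above; venn_svg, its parameters, and the label text are all hypothetical, and the blue and yellow fills at 50% opacity only approximate a green overlap:

```python
def venn_svg(r_a, r_b, dist, label_a="Site A", label_b="Site B",
             px_per_unit=40, margin=20):
    """Return SVG markup for two overlapping, semi-transparent circles."""
    # Place circle A far enough right that neither circle is clipped on the left.
    x_a = max(r_a, r_b - dist)
    x_b = x_a + dist
    right = max(x_a + r_a, x_b + r_b)
    top = max(r_a, r_b)  # both centers sit on the same horizontal line

    def px(units):
        # Convert diagram units to pixel coordinates, offset by the margin.
        return round(units * px_per_unit + margin, 1)

    width = px(right) + margin
    height = px(2 * top) + margin
    cy = px(top)
    rad_a = round(r_a * px_per_unit, 1)
    rad_b = round(r_b * px_per_unit, 1)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<circle cx="{px(x_a)}" cy="{cy}" r="{rad_a}" fill="blue" fill-opacity="0.5"/>',
        f'<circle cx="{px(x_b)}" cy="{cy}" r="{rad_b}" fill="yellow" fill-opacity="0.5"/>',
        # Labels sit just above the top edge of each circle.
        f'<text x="{px(x_a)}" y="{round(cy - rad_a - 5, 1)}" text-anchor="middle">{label_a}</text>',
        f'<text x="{px(x_b)}" y="{round(cy - rad_b - 5, 1)}" text-anchor="middle">{label_b}</text>',
        '</svg>',
    ]
    return "\n".join(parts)

# Radii and distance copied from the Octave output above.
svg = venn_svg(1.7841, 3.9894, 3.8433)
print(svg)
```

Redirecting the printed output to a file with an .svg extension gives something that can be recolored, relabeled, and printed without taking a screenshot.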
posted by Galvatron at 7:48 PM on June 19, 2004


Response by poster: Thanks Galvatron. I will definitely put all this to use. Expect a follow-up or an email at some point. Many thanks to all.
posted by bingo at 7:21 PM on June 23, 2004

