Analyzing a large corpus of email data
July 26, 2013 3:12 PM
I'm working on a project for class that involves looking at a large corpus of email data for patterns in Gephi. Any patterns we find are fine, so long as we can justify them and back them up with qualitative analysis. This is my first time doing analysis on this scale, and I'm not entirely sure where to start. I've run a few different layout algorithms on it and had the best results with Force Atlas 2. I've also filtered out the nodes with an out-degree of 1 and sized the nodes by betweenness centrality. The graph is directed, with edges sized according to their weight (determined by the number of emails sent), so a lot of the layout plugins I've been finding won't work, as they're tailored for undirected graphs. Is there anything obvious I'm missing that might make for a compelling visualization or show interesting connections in the network?
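For anyone who wants to reproduce the setup outside Gephi, here is a rough sketch of the preprocessing described above: collapsing individual mails into weighted directed edges and dropping the nodes with an out-degree of 1. The sample data and the function names (`weighted_edges`, `drop_out_degree_one`, `to_csv`) are made up for illustration, and the `Source,Target,Weight` header is my assumption about what an edge-list spreadsheet import expects, so treat this as a starting point rather than the actual pipeline.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: one (sender, recipient) pair per mail sent.
mails = [
    ("alice", "bob"), ("alice", "bob"), ("alice", "carol"),
    ("bob", "carol"), ("carol", "alice"), ("carol", "bob"),
    ("dave", "alice"),
]

def weighted_edges(pairs):
    """Collapse repeated (sender, recipient) pairs into weighted directed edges."""
    return Counter(pairs)

def out_degree(edges):
    """Count distinct recipients per sender (unweighted out-degree)."""
    degree = Counter()
    for sender, _recipient in edges:
        degree[sender] += 1
    return degree

def drop_out_degree_one(edges):
    """Remove every node whose out-degree is exactly 1, plus its edges.

    This is a single pass over the original graph: nodes whose out-degree
    falls to 1 only because a neighbour was removed are kept.
    """
    dropped = {node for node, d in out_degree(edges).items() if d == 1}
    return {
        (s, r): w for (s, r), w in edges.items()
        if s not in dropped and r not in dropped
    }

def to_csv(edges):
    """Serialize edges as CSV text with an assumed Source,Target,Weight header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (sender, recipient), weight in sorted(edges.items()):
        writer.writerow([sender, recipient, weight])
    return buffer.getvalue()

edges = weighted_edges(mails)
filtered = drop_out_degree_one(edges)
print(to_csv(filtered))
```

One thing this makes visible: filtering on out-degree alone also removes people who receive a lot of mail but only ever write to one person (`bob` in the sample), which may or may not be what you want.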