Ideas for tag cloud analysis
March 11, 2014 4:15 PM
I have 600 images tagged with keywords. In addition to nice pictures of tag clouds and frequency calculations, what sort of smart, insightful analysis can I do with these data that could reveal relationships between the tags and other (more formal) data attached to the images? Any advice on software tools (Windows, preferably FOSS) would be appreciated. I have some training in statistics and have done textual statistics before, but only briefly, so I'm not familiar with the current tools and methods.
Is there any context for these images?
We're trying to analyze how a certain topic is presented in textbooks. We collected the relevant pictures in the books and tagged them ourselves (e.g. "cat" if the image shows a cat). So there's no EXIF data, timestamp, or other automatic metadata (we did not even scan the pics), but there are formal metadata such as the publisher, the grade, and the main subject matter (geography, science...). Tag clouds and word frequencies are good enough, but I was wondering what sort of more sophisticated analysis could be done (notably about co-occurrences and relations with metadata). For instance, if we were interested in cats, we'd like to know how cats are pictured in grade X textbooks vs grade Y. Since posting this question I've played with WordyUp, which is quite fun to use but is a bit of a black box.
posted by elgilito at 9:00 AM on March 12, 2014
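One possible way to get started on the co-occurrence and metadata questions raised above, as a rough sketch only: the Python below uses nothing beyond the standard library. The `images` records, the field names (`tags`, `grade`, `subject`) and both helper functions are made up for illustration and are not from the actual dataset. It counts how often each pair of tags appears on the same image, then tallies which tags accompany "cat" within each grade.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: one dict per image, holding the hand-assigned tags
# and the formal metadata (field names are assumptions, not the real data).
images = [
    {"tags": {"cat", "house"}, "grade": 3, "subject": "science"},
    {"tags": {"cat", "tree"}, "grade": 3, "subject": "geography"},
    {"tags": {"cat", "house", "child"}, "grade": 5, "subject": "science"},
    {"tags": {"tree", "river"}, "grade": 5, "subject": "geography"},
]


def cooccurrence_counts(records):
    """Count how many images carry each unordered pair of tags."""
    pairs = Counter()
    for rec in records:
        # sorted() gives every pair one canonical order, so (a, b) == (b, a)
        pairs.update(combinations(sorted(rec["tags"]), 2))
    return pairs


def tags_by_group(records, field, focus_tag=None):
    """Tally tags per metadata value, optionally only for images with focus_tag."""
    table = {}
    for rec in records:
        if focus_tag is not None and focus_tag not in rec["tags"]:
            continue
        counter = table.setdefault(rec[field], Counter())
        # Leave the focus tag itself out of its own companion counts
        counter.update(rec["tags"] - {focus_tag})
    return table


pairs = cooccurrence_counts(images)
cat_by_grade = tags_by_group(images, "grade", focus_tag="cat")

print(pairs.most_common(3))
for grade, counts in sorted(cat_by_grade.items()):
    print(grade, dict(counts))
```

With 600 images the per-grade tallies form a contingency table (grade by companion tag). A table like that could then be handed to a chi-square test or a correspondence analysis in whatever statistics package is preferred, which may be less of a black box than a tag cloud tool.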
Are the photos geotagged? Could throw them on a map. And I assume they're timestamped, so you could have a timeline as well.
But beyond that -- did you just get 600 random images from Flickr and set yourself an exercise to do this? Is there any context for these images / this task? Who tagged these images: a machine, the image creators + friends, Mechanical Turk? Who is the audience for this analysis?
Are you interested at all in using the images themselves, or exclusively in the EXIF and tag metadata?
posted by batter_my_heart at 10:22 PM on March 11, 2014