Textual Analysis Primer/Examples?
June 20, 2008 9:12 PM
Looking for a general background/framework within which to define the more technical aspects of a textual analysis project, as well as interesting examples of computer-driven textual analysis.
I am participating in a research project on textual analysis of historical architectural magazine articles.
We'll be inputting the titles and maybe text of articles from architectural magazines from 1930 to 1960 into a database and looking for patterns.
I am wholly ignorant of textual or content analysis, and am sure there's a great deal of literature and interesting analyses that other, smarter people have already thought of.
I'm naively thinking of word frequency, adjacency, and correlation (e.g., what percentage of articles that mention 'le corbusier' also mention 'modernism' in a given year/magazine) to figure out links between articles. This could spin off into graph-theory measurements.
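To make the frequency and co-occurrence idea concrete, here is a minimal sketch in plain Python. The record layout (year, magazine, text), the sample articles, and the function names are all made up for illustration; nothing here reflects the project's actual data or database schema.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records: (year, magazine, text). Not real data.
articles = [
    (1935, "Magazine A", "Le Corbusier and modernism in housing"),
    (1935, "Magazine A", "Le Corbusier visits New York"),
    (1935, "Magazine A", "Modernism in furniture design"),
    (1950, "Magazine B", "Le Corbusier, modernism and the city"),
]

def word_frequencies(records):
    """Count how often each lowercase word appears across all texts."""
    counts = Counter()
    for _, _, text in records:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

def mentions(text, term):
    """Case-insensitive substring test, so multi-word terms work too."""
    return term.lower() in text.lower()

def cooccurrence_pct(records, term_a, term_b):
    """For each (year, magazine), the percentage of articles mentioning
    term_a that also mention term_b. Groups with no term_a article are omitted."""
    with_a = defaultdict(int)
    with_both = defaultdict(int)
    for year, magazine, text in records:
        if mentions(text, term_a):
            key = (year, magazine)
            with_a[key] += 1
            if mentions(text, term_b):
                with_both[key] += 1
    return {key: 100.0 * with_both[key] / n for key, n in with_a.items()}

freqs = word_frequencies(articles)
result = cooccurrence_pct(articles, "le corbusier", "modernism")
print(freqs.most_common(3))
print(result)
```

Substring matching is the crudest possible choice: it would also count 'modernism' inside 'postmodernism', so a real run would probably want tokenized or word-boundary matching. The resulting percentages could serve as edge weights if the analysis moves on to graph measurements.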
Our interest is in the actual content, keywords, etc., not so much in sentence structure, grammar, parsers, etc.
I program mostly in Python and am in charge of the 'hard' numerical part of the project (there are three architectural history geeks in charge of the soft conceptual part).
Books are nice, but something online might be quicker.
As we are architects, interesting visualizations are always welcome.
http://people.ucsc.edu/~wsack/ (WARNING: evil window resizing)
http://hybrid.ucsc.edu/SocialComputingLab/projects.htm (no window resizing)
posted by shortfuse at 6:23 AM on June 21, 2008