Article on the shortcomings of corpus analysis?
December 29, 2024 12:26 PM

I'm trying to find an article I read years ago on the shortcomings of corpus analysis as a way to gain insights about people/thinking in the past.

As I recall, it was a Web 1.0-style page. I remember it discussing examples of analyzing the work of specific authors and things like historical newspaper publications. So, specific bodies of work. The basic gist was that word frequency alone can't support meaningful conclusions about what people valued at that time.

And to be clear, I'm not talking about linguistic analysis.

I'm happy to get any articles on the topic, but the one I remember was written in very straightforward layperson's terms.

The reason I'm after it is to share it with a community that discusses a specific body of religious texts. People frequently fall back on arguments of "word frequency == importance".

And if it's appropriate, I'm more than happy to hear thoughts on the topic as well.
posted by Senescence to Writing & Language (1 answer total) 5 users marked this as a favorite
 
I don't have a link for you, but would "survivor bias" be a useful search keyword? It covers the skew caused by random luck preserving some examples, as well as by individual curation that kept the surviving texts aligned with the political or social mores of the times they passed through.
posted by k3ninho at 9:08 AM on December 31 [1 favorite]

