Assessing Political Leanings in Written Word
July 14, 2020 1:38 PM

Are there any available tools that analyze written content to measure the political leaning of the writing? As a general example, analyzing an op-ed and placing it on a spectrum of very conservative - conservative - middle - liberal - very liberal?
posted by NotMyselfRightNow to Computers & Internet (9 answers total) 2 users marked this as a favorite
 
This chart is helpful.

But even then many newspapers print op-eds from varying perspectives.

Mostly you need to familiarize yourself with the beliefs held by people in various parts of the spectrum. Here are some places to start.

Wikipedia

Information is Beautiful

The Conversation

I'm sure others will provide more.
posted by mareli at 2:08 PM on July 14, 2020


None of this is my thing and I can't remember cites, but:

Yeah, these tools exist, but probably not in the way that you mean. There are tools out there that can look at how different people or organizations offer different streams of words and analyze the patterns of who says what and who says other things to estimate ideal points in some indeterminate hyperspace. I know someone has done this with speeches in Congress, but I forget who. Anyway, if you had those tools, probably built for R, you could feed them your op-ed and, if it was long enough and the data gods more generally smiled on you, they might assign the author an ideal point, say, between Biden and Harris.
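To make the idea concrete, here is a rough sketch of one way word patterns can place a new text on a scale defined by reference texts. It loosely follows the published "Wordscores" approach, which is my assumption about the kind of tool meant here, not something named above. The function names and the positions assigned to reference texts (for example -1.0 and +1.0) are hypothetical, and real use would need long texts and careful validation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_freqs(text):
    """Return each word's share of the tokens in one text."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts):
    """reference_texts: list of (text, known_position) pairs.

    Each word's score is the average of the reference positions, weighted
    by how strongly the word is associated with each reference text.
    """
    freqs = [(relative_freqs(text), pos) for text, pos in reference_texts]
    vocab = set().union(*(f.keys() for f, _ in freqs))
    scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f, _ in freqs)
        scores[w] = sum(f.get(w, 0.0) / total * pos for f, pos in freqs)
    return scores


def score_text(text, scores):
    """Place a new text on the reference scale; None if it shares no words."""
    freqs = relative_freqs(text)
    known = {w: f for w, f in freqs.items() if w in scores}
    mass = sum(known.values())
    if mass == 0:
        return None
    return sum(f * scores[w] for w, f in known.items()) / mass
```

You would call `word_scores` on a handful of texts whose positions you are willing to assert, then `score_text` on the op-ed. The result is only as meaningful as the reference positions, and words that appear equally on both sides pull the estimate toward the middle.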

Are there tools that will look at an op-ed and actually analyze the meaning of the words to tell you that it's liberal? Oh, hell no, not for ages and ages. The software that did that could likely stake out a reasonable claim of being alive.
posted by GCU Sweet and Full of Grace at 2:29 PM on July 14, 2020 [2 favorites]


I am not a trained natural-language processing aficionado, but I believe this would be a subset of "sentiment analysis," and one of the things about that is that you have to set it up by tagging language with the attributes you want to analyze. For instance, if you wanted to analyze whether some piece of text was angry or happy, you'd have to have a list of sorts that categorized words into "happy" and "mad." Not every possible word, but "enough" (an exercise left for the practitioner) to be able to differentiate the moods in text. The software would then comb text for these words, apply the criteria you've given it, and give you a mad/glad score.

So, for political leanings I imagine this would be pretty hard. You'd have to have a lexicon of "words lefties tend to use" and vice versa for the right. You can get some help from partisans who seem to go out of their way to use their own terms for things (Fox News hosts), to the degree that I think some people actively shape their language to avoid terms used by the opposing party. Anyway, you gotta be able to tell from the words, just by reading, so that what you're doing is automating the stuff you might feel you know intuitively. You can probably also seed some AI-type process with a list of words that it then uses to extrapolate and expand the lexicon for the parties/sides/etc.
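A minimal sketch of that lexicon idea, just to show the mechanics: tag some terms, comb the text for them, and report a score. The terms and weights in `LEXICON` are hypothetical placeholders picked for illustration, not a validated list, and `lean_score` is a made-up name.

```python
import re

# Hypothetical toy lexicon: -1 for terms assumed to be left-coded,
# +1 for terms assumed to be right-coded. A usable lexicon would need
# far more entries and some way of checking it against real texts.
LEXICON = {
    "estate tax": -1,
    "inequality": -1,
    "living wage": -1,
    "death tax": +1,
    "big government": +1,
    "tax relief": +1,
}


def lean_score(text, lexicon=LEXICON):
    """Return (score, hits).

    score is the mean weight of all lexicon terms found, from -1 (only
    left-coded terms) to +1 (only right-coded terms); hits is how many
    term occurrences were found, so you can tell how thin the evidence is.
    """
    lowered = text.lower()
    total = 0
    hits = 0
    for phrase, weight in lexicon.items():
        # Whole-word match so "inequality" does not match inside other words.
        n = len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        total += weight * n
        hits += n
    return (total / hits if hits else 0.0), hits
```

Returning the hit count alongside the score matters: a score of 1.0 built on a single matched phrase says very little, and a text that quotes the other side's terms in order to criticize them will be scored wrongly.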
posted by rhizome at 2:44 PM on July 14, 2020 [1 favorite]


Response by poster: mareli, I appreciate the response, but you've misunderstood the question. I want to analyze individual pieces of content, not entire publications.

GCU and rhizome, you've both pointed out the challenges of this. My assumption is that no one has solved it - or even gotten in the ballpark of it - but I am hoping to be proven wrong!
posted by NotMyselfRightNow at 3:00 PM on July 14, 2020


Looks like this isn't a turnkey product, at least on the individual consumer level, but it's definitely been done - there are quite a few scholarly articles that come up for "political sentiment analysis", such as this one, which uses Twitter as a corpus.
posted by sagc at 3:08 PM on July 14, 2020


It’s likely any automated tool would be *very* unreliable for this, at any level of complexity, because the base assumptions are so difficult to establish. This exact task, analysing written material and assessing its political implications, is late high school to university level work and generally leads to discussions that aren’t resolved either way. As GCU said above, this is a task that’s beyond a lot of human readers.

The late Christopher Hitchens was a good example: he had extremely firm political opinions, but they swayed from left to right, and by the end of his life he had hardened into a pro-war reactionary. The language he couched them in didn’t alter, though, and he still wrote in terms of liberalism, equality, class, human rights, and so on.

The language around socialism in the US is an even better example of how difficult this is, when even quite intelligent people find it so hard to define socialism (is it labour camps, Stalin, and nationalised industry, or merely the fact of income tax?) and capitalism (is it the exact rapacious system of finance we live in now, or is it merely the acts of buying and selling?)
posted by Fiasco da Gama at 3:21 PM on July 14, 2020 [1 favorite]


I was just researching this for something similar, and I'm pretty sure there's no out-of-the-box tool to do this. I wouldn't be surprised if someone is working on a machine-learning based tool right now and hasn't published yet. Something like GPT-3 that is trained on a massive corpus of web content could probably be manipulated to do this, but as far as I know it hasn't been done.

One existing solution that might work well with some tweaks is LIWC, which is often used in psychology and is word-frequency based. I haven't used it myself but the manual actually has some political examples so it seems like something you could set up to be semi-automatic.
posted by JZig at 5:43 PM on July 14, 2020


DICTION is somewhat similar to LIWC, and is used a lot (and was created) by political communication scholars. It isn’t going to spit out a position on the political spectrum for the text, but there are plenty of published papers that associate DICTION categories with political speech, which you could use to match.
posted by DiscourseMarker at 8:47 PM on July 14, 2020


A colleague and friend of mine, Dorottya Demszky, has done some work with others that you may find interesting. If I'm remembering correctly, she trained a model on millions of tweets, classifying users based on which politicians they were following on Twitter, and then doing some novel NLP stuff to figure out how things were being framed linguistically. Her work ended up being covered by The Washington Post. You can find her research page here.
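For a sense of what "labeling by who people follow, then training on their text" could look like in the small, here is a toy sketch. It is not the method from that research: it swaps in a plain naive Bayes word-count classifier, and the account sets, labels, and function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_from_follows(followed, left_accounts, right_accounts):
    """Guess a user's label from followed accounts; None if it is a tie."""
    left = len(set(followed) & left_accounts)
    right = len(set(followed) & right_accounts)
    if left == right:
        return None
    return "left" if left > right else "right"


def train(labeled_texts):
    """labeled_texts: iterable of (text, label) pairs. Returns a model dict."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labeled_texts:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return {"words": word_counts, "docs": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    n_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_logp = None, -math.inf
    for label, n in model["docs"].items():
        counts = model["words"][label]
        total = sum(counts.values())
        logp = math.log(n / n_docs)
        for w in tokenize(text):
            if w in model["vocab"]:  # skip words never seen in training
                logp += math.log((counts[w] + 1) / (total + vocab_size))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label
```

The labels here come from behavior (follows) rather than from anyone reading the text, so the model learns what each group tends to say, not what is liberal or conservative in any deeper sense. A classifier trained on tweets may also carry over poorly to op-eds.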
posted by vocativecase at 3:44 PM on July 15, 2020

