How can I rate abstracts for relevance and quality?
November 5, 2009 10:03 AM

I'm looking for ways to rate abstracts for relevance and quality so that I can filter out the best ones.

I'm going to be doing a very large literature review. In preparation I need to set up a rules-based system for filtering through, or triaging, the abstracts I retrieve, in order to select a subset (those articles I'll actually retrieve and read). Ideally this system will allow me to rate abstracts for quality and/or how well they match the topic of concern. (This is all work-related, so I can't be more specific, but the topic is generally in the social sciences.) There might have to be some qualitative or subjective aspect to this ratings system, but there are a lot of ways to rate abstracts more objectively as well (breadth of study population, number of search keywords "hit", etc.).

I'm not formally trained in information science, but I suspect such systems have already been created and used, and I don't want to reinvent the wheel. I know there are specialists in information science here that might be able to keep me from doing so. Could you point me to projects where something similar has been done, or places where such systems have been reviewed, described, or codified? Can you suggest any other resources that might be of use?

If it matters, I think I will have around one or two thousand abstracts to filter through.
posted by Herkimer to Work & Money (6 answers total)
Information retrieval theory! They teach grad level classes in it 'cause it's hard! OK, and fun. For nerds. Such as myself.

Where did these abstracts come from? (Like, did someone hand you a pile of paper? Are they search results?) What form are your abstracts in? This is information retrieval theory in action! If these items exist in a database already, for example, then the structure of the database (keywords, thesauri, taxonomies, etc.) will affect your search strategies.

To start with, you may want to think about whether you want to err on the side of caution (get everything you want, maybe some stuff you don't want, or "high recall") or whether you want to cast care to the winds and only read stuff you do want (risking missing some things you might want to see, or "high precision"). Obviously you want to maximize both, but it may be a trade-off.
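
To make the trade-off concrete, here is a minimal Python sketch of how the two numbers are usually computed. The abstract IDs and the "broad" and "narrow" filters are made up for illustration; in practice you would only know the truly relevant set for a small hand-checked sample.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved IDs against the truly relevant IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: what fraction of what you kept was actually relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: what fraction of the relevant items you managed to keep.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical abstract IDs: one pile, two filters.
relevant = {1, 2, 3, 4}
broad = {1, 2, 3, 4, 5, 6, 7, 8}   # catches everything relevant, plus noise
narrow = {1, 2}                    # only sure things, but misses two

print(precision_recall(broad, relevant))   # (0.5, 1.0)
print(precision_recall(narrow, relevant))  # (1.0, 0.5)
```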

Let's pretend that you're doing anthropology, and you have a bunch of abstracts about, say, post-Colonial theory but you really only want stuff about Gayatri Spivak (WARNING: HERE BE HALF-REMEMBERED UNDERGRADUATE STUDIES) and the subaltern. So obviously you will want everything with "Spivak" and "subaltern", but maybe you also want things that touch on "strategic essentialism." Maybe you want things where "strategic" and "essentialism" are within a few words of each other, in order to include sentences like "Dude, that essentialism is totally strategic." The more you include, the more likely you are to get stuff that's less relevant, but you're more likely to see everything that is relevant.
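
A "within a few words of each other" test is easy to rough out yourself if your database doesn't offer a proximity operator. This is only a sketch: the function name `within_n_words` and the default window of five words are my own choices, and it matches exact word forms only (no stemming).

```python
import re

def within_n_words(text, term_a, term_b, n=5):
    """True if term_a and term_b occur within n words of each other, in either order."""
    words = re.findall(r"[a-z']+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    # Compare every occurrence of one term against every occurrence of the other.
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)

sentence = "Dude, that essentialism is totally strategic."
print(within_n_words(sentence, "strategic", "essentialism"))        # True
print(within_n_words(sentence, "strategic", "essentialism", n=2))   # False
```

Widening `n` pulls in more abstracts (more recall, less precision); narrowing it does the opposite.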

You can also use these concepts to prioritize - to say "these things I KNOW will be relevant", and deal with those first, and leave the less certain stuff for later.
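
One rough way to do that prioritizing, sketched in Python. The three-tier scheme, the `tier` function, and the sample abstracts are all invented for illustration, and the matching is plain substring search, so treat it as a starting point rather than a finished rule set.

```python
def tier(abstract, core_terms):
    """1 = every core term present, 2 = some present, 3 = none."""
    text = abstract.lower()
    found = sum(term in text for term in core_terms)
    if found == len(core_terms):
        return 1
    return 2 if found else 3

# Made-up abstracts standing in for real search results.
abstracts = [
    "A survey of post-Colonial theory in the 1990s.",
    "Spivak and the subaltern: a rereading.",
    "Voices of the subaltern in colonial archives.",
]
core = ["spivak", "subaltern"]

# Read tier 1 first, then tier 2, and leave tier 3 for later.
ranked = sorted(abstracts, key=lambda a: tier(a, core))
for a in ranked:
    print(tier(a, core), a)
```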

I'm rambling on because this is interesting to me, but I'm sure there are current social science librarians who can help you further. In any case, feel free to memail me.
posted by chesty_a_arthur at 10:59 AM on November 5, 2009

What is your time budget on this, or, how many hours do you have for the triaging, and how many hours do you have for the reading of the selected papers?
posted by the Real Dan at 11:00 AM on November 5, 2009

Response by poster: Chesty, thanks! The abstracts have come/will come from literature searches. There are a number of subtopics that we're interested in, so our searches will look more or less like: x AND y; x AND z; x AND q AND r; etc. And there are things like study population breadth that we're not specifically searching on (not really possible, I think) but that play a big role in how interesting an article looks.
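
For what it's worth, the "x AND y; x AND z" pattern could be roughed out like this in Python. The letters are just the placeholders from above, the `QUERIES` table and `matching_queries` function are hypothetical, and it only matches whole words exactly as written.

```python
import re

# Hypothetical subtopic queries; every term in a set must appear (AND).
QUERIES = {
    "x AND y": {"x", "y"},
    "x AND z": {"x", "z"},
    "x AND q AND r": {"x", "q", "r"},
}

def matching_queries(abstract, queries=QUERIES):
    """Return the names of the queries whose terms all occur in the abstract."""
    words = set(re.findall(r"\w+", abstract.lower()))
    return [name for name, terms in queries.items() if terms <= words]

hits = matching_queries("a study of x with y and q and r")
print(hits)  # ['x AND y', 'x AND q AND r']
```

The number of subtopics an abstract matches could then serve as one of the more objective inputs to a rating.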

So I can certainly think up ways to translate those search interests into algorithms. But, given I'm not a grad student in information science (sadly! if I had my grad degree to do over...) is there a source that could help me pick up some of the basics, or essentials, of the theory that would be most applicable to our situation?

The Real Dan, I don't know yet. But I don't think time will be a hugely limiting factor. However, just in the interests of making real use of these strategies, I'd like to have a reasonably efficient process.
posted by Herkimer at 11:09 AM on November 5, 2009

Papers is a Mac program specifically designed to help you organize and search through a large number of journal article PDFs you've downloaded, sorting by journal title, author, year, keywords, etc. There are also several options for arranging data into folders as you see fit, so you might be able to accommodate your rating system that way.

Sorry if this isn't quite what you were looking for, but I imagine someone else interested in this thread would find it helpful in any case.

From the rave reviews my friends have given the program, I certainly hope you are fortunate enough to be a Mac user in this case. If not, hopefully you can find something similar for the PC. Wish I could have used it before I finished my master's, gosh darn it...
posted by lizbunny at 3:10 PM on November 5, 2009

Sente is like the above-mentioned Papers (and also only for Mac), except that in addition to letting you rifle through your library in a number of interesting and different ways, it will also search a number of different databases for you (Google Scholar, PubMed, JSTOR, there are hundreds), and pull down abstracts (or even whole articles if you want) and stick 'em directly in your library. You can then run a ton of different searches on these articles, sorting 'em by whatever relevant criteria you like. You can even rank-order articles that you like and set it up to prioritize new ones based on your ranking system.

This program has seriously changed the way I conduct lit reviews and keep up with what's happening in my field in general. It is totally rad to the max, and you can try it for a month for free.

I swear that I am in no way affiliated with Third St. Software beyond my fanatical love for this particular product, overzealous as I may sound.
posted by solipsophistocracy at 4:10 PM on November 5, 2009

I'm not formally trained in information sciences either, but I have had to do similar projects before. A lot of people have published on the topic of rating full text articles - see this, for example - but it's definitely necessary to do some paring down at the abstract stage. What I usually do is work through my pile of abstracts (easiest for me to do this on hard copy) and make clear notes on each one about whether or not it fits my search criteria. This is usually not that hard.

For example, say I'm doing a lit review of cost-effectiveness of treatments for multiple sclerosis. Up front, I have a list of reasons for excluding articles altogether:
  • Not an economic analysis
  • Review/opinion article with no meta-analysis
  • Fewer than 30 subjects per study arm
  • Trial not conducted in a relevant country (often I'm looking specifically for US studies)
  • Study concerns a treatment that is no longer in use in the US
  • etc.
Once I have narrowed it down to just the abstracts for articles that might contain something relevant, I usually pull the full texts of all of them. Even if a particular article itself isn't useful, I still need to go through each bibliography to identify articles that I may have missed in my searches of lit databases.
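
If you'd rather keep those notes in a spreadsheet or script than on hard copy, the exclusion list above could be roughed out like this. This is only a sketch: the `Screened` record and its field names are hypothetical, and it assumes you have already coded each abstract by hand, since abstracts rarely state these things in a machine-readable way.

```python
from dataclasses import dataclass

@dataclass
class Screened:
    """Hand-coded notes for one abstract (all field names are hypothetical)."""
    title: str
    economic_analysis: bool
    review_without_meta_analysis: bool
    min_subjects_per_arm: int
    country: str
    treatment_in_use: bool

def exclusion_reasons(a, countries=("US",), min_arm=30):
    """Return every reason to exclude an abstract; an empty list means keep it."""
    reasons = []
    if not a.economic_analysis:
        reasons.append("Not an economic analysis")
    if a.review_without_meta_analysis:
        reasons.append("Review/opinion article with no meta-analysis")
    if a.min_subjects_per_arm < min_arm:
        reasons.append(f"Fewer than {min_arm} subjects per study arm")
    if a.country not in countries:
        reasons.append("Trial not conducted in a relevant country")
    if not a.treatment_in_use:
        reasons.append("Treatment no longer in use in the US")
    return reasons

small_trial = Screened("Hypothetical trial A", True, False, 25, "US", True)
print(exclusion_reasons(small_trial))  # ['Fewer than 30 subjects per study arm']
```

Recording the reason, not just a yes/no, makes it easier to revisit borderline calls later.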

Hope this helps. Feel free to mefi-mail me if you have further questions.
posted by acridrabbit at 10:50 AM on November 6, 2009
