MeSH term logical grouping?
May 23, 2017 11:10 AM

I have a list of MeSH terms and usage frequencies gleaned from our research institution. Is there a tool I can use to generate some sort of functional grouping of the terms, so that I can classify our researchers into areas of related interest?

The end goal would be to use this data to encourage collaboration or point out areas of strength. In a lot of cases the overlap is obvious - e.g. "brain concussion", "brain injuries" and "brain injuries, traumatic" - those groups should have something in common - but in other cases the overlap/connection is less clear. Plus, with nearly 400 identified terms of interest, I don't want to do this manually. Any suggestions? Google is no help here, because "MeSH" gives me too many unrelated results and "Medical Subject Headings" seems to be ignored...
posted by caution live frogs to Science & Nature (7 answers total) 2 users marked this as a favorite
MeSH terms are already organized into a hierarchy -- here is the tree view. Could you use the terms' place in the taxonomy to group them?

(Forgive me if you already know this, but the fact that you're Googling MeSH makes me think you're not too familiar with it.)
posted by rabbitrabbit at 11:20 AM on May 23, 2017 [1 favorite]

You may want to try pasting your lists into this tool:
It will suggest MeSH headings based on the text. It could help you find unusual pairings/overlaps.

You might want to look at all the tools here, actually:
posted by cosmicbandito at 11:26 AM on May 23, 2017 [2 favorites]

Let me throw more info at you.

I built a form that was distributed to the core group of investigators in our facility. One of the questions asked was to supply research interests, in MeSH keywords if possible. I took the output of this form and used MeSH On Demand to build a keyword list for all responders, then pooled the results to look at response frequencies.

The issue is that the keywords folks chose are sometimes too specific to be of much use in creating hierarchical groupings of personnel. For example, five people may have responded that they work on cancer research, but with different MeSH terms - sarcoma vs. neoplasm vs. tissue-specific neoplasms - so the fact that we have a core group of cancer researchers does not become evident from response frequency alone. What I am looking for is a tool that would let me dump the response list into it and have it spit out a "simplified" list showing only higher-level groupings rather than fine details.

In biology terms - I don't care if Researcher A is a ferret and Researcher B is a fisher and Researcher C is a mink - what matters is that we have a large group of weasels, so opportunities for weasels should be a priority for our facility.

Does such a thing exist? Googling "MeSH functional groupings" or "MeSH hierarchy" doesn't help - Google treats MeSH as too generic to give me relevant results, so I get computational modeling papers on mesh networks, not tools for grouping MeSH keywords.
posted by caution live frogs at 1:41 PM on May 23, 2017

Can you program, or do you know anyone who could program this for you? The MeSH structure is downloadable, and it shouldn't be too difficult to take a list of MeSH leaves and either condense them to nodes at a certain level, or propagate annotations up the tree and report all the nodes by number of annotations.
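
For what it's worth, here is a minimal Python sketch of the "propagate annotations up the tree" idea, assuming you already have each term's MeSH tree numbers. MeSH tree numbers are dotted strings, so a node's ancestors can be found by dropping trailing segments. The tree numbers and counts below are made-up placeholders in that dotted format, not real MeSH assignments, and the function names are just for illustration.

```python
from collections import Counter


def ancestors(tree_number):
    """Yield the tree number itself, then each ancestor up to the top level."""
    parts = tree_number.split(".")
    for depth in range(len(parts), 0, -1):
        yield ".".join(parts[:depth])


def propagate_counts(term_counts, term_to_tree_numbers):
    """Add each term's frequency to its own node(s) and to every ancestor node."""
    node_counts = Counter()
    for term, count in term_counts.items():
        # A term may appear at several places in the tree; collect the nodes
        # in a set so a shared ancestor is only credited once per term.
        nodes = set()
        for tree_number in term_to_tree_numbers.get(term, []):
            nodes.update(ancestors(tree_number))
        for node in nodes:
            node_counts[node] += count
    return node_counts


# Placeholder data: hypothetical tree numbers and survey frequencies.
term_to_tree_numbers = {
    "Neoplasms": ["X04"],
    "Sarcoma": ["X04.557.450"],
    "Brain Injuries": ["X10.228.140", "X26.915"],
    "Brain Concussion": ["X10.228.140.199"],
}
term_counts = {
    "Neoplasms": 2,
    "Sarcoma": 1,
    "Brain Injuries": 2,
    "Brain Concussion": 3,
}

node_counts = propagate_counts(term_counts, term_to_tree_numbers)

# Condensing to "a certain level" is then just a filter on depth;
# top-level nodes have no dots in their tree number.
top_level = {node: n for node, n in node_counts.items() if "." not in node}

for node, n in sorted(top_level.items(), key=lambda item: -item[1]):
    print(node, n)
```

With real data you would swap the placeholder dictionaries for the survey frequencies and a term-to-tree-number lookup built from the MeSH download, then map the surviving node numbers back to their headings.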

I don't know of a ready-made tool for it.
posted by grouse at 2:09 PM on May 23, 2017 [2 favorites]

Yes, what grouse describes is relatively straightforward for someone with the right kinds of skills. I am skeptical that such a specific ready-made tool already exists. If this were an academic unit at a research university, I'd be handing this list of MeSH leaves to someone in IT/computational resources support, or perhaps the right kind of grad student.

That is, the "logical grouping" or "functional grouping" (i.e. the taxonomic grouping) of MeSH terms already exists. You should not create a new or different one. If you did, they wouldn't really be MeSH terms; they would just happen to have the same characters. The MeSH system is available for download in XML format. If I had to do this, I would look for XML tools that I could use on your list of terms and the MeSH XML.
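
As a rough illustration of the XML route, here is a short Python sketch using only the standard library. It assumes the descriptor file uses `DescriptorRecord` elements containing `DescriptorName/String` and `TreeNumberList/TreeNumber`; check those element names against the file you actually download. The inline sample is a heavily abbreviated stand-in with placeholder tree numbers, and `load_tree_numbers` is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the MeSH descriptor XML (placeholder tree numbers).
SAMPLE_XML = """<DescriptorRecordSet>
  <DescriptorRecord>
    <DescriptorName><String>Sarcoma</String></DescriptorName>
    <TreeNumberList>
      <TreeNumber>X04.557.450</TreeNumber>
    </TreeNumberList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorName><String>Brain Injuries</String></DescriptorName>
    <TreeNumberList>
      <TreeNumber>X10.228.140</TreeNumber>
      <TreeNumber>X26.915</TreeNumber>
    </TreeNumberList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorName><String>Neoplasms</String></DescriptorName>
    <TreeNumberList>
      <TreeNumber>X04</TreeNumber>
    </TreeNumberList>
  </DescriptorRecord>
</DescriptorRecordSet>"""


def load_tree_numbers(source, wanted_terms):
    """Map each wanted heading to its tree numbers.

    `source` is a file path or a binary file object. Headings are matched
    case-insensitively against the descriptor name.
    """
    wanted = {term.lower() for term in wanted_terms}
    mapping = {}
    # iterparse streams the file, so a large download need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "DescriptorRecord":
            continue
        name = elem.findtext("DescriptorName/String", default="")
        if name.lower() in wanted:
            mapping[name] = [
                tn.text for tn in elem.findall("TreeNumberList/TreeNumber")
            ]
        elem.clear()  # release the record once it has been processed
    return mapping


mapping = load_tree_numbers(
    io.BytesIO(SAMPLE_XML.encode("utf-8")),
    ["sarcoma", "brain injuries"],
)
print(mapping)
```

For the real file you would pass its path instead of the `BytesIO` object, along with the roughly 400 terms from the survey.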
posted by SaltySalticid at 2:42 PM on May 23, 2017

Also: with 400 terms, I'd bet you could do the thing grouse is talking about manually faster than you could learn to do it effectively in code. So unless you can get help or someone does find an existing tool, I'd bite the bullet and just do it. See also this relevant XKCD, and also this one. If you do decide to spend the time on automation, here is a list of R tools for wrangling XML that might be helpful.
posted by SaltySalticid at 2:50 PM on May 23, 2017 [2 favorites]

Yeah, manually is probably better. Shame that there is no good tool for this. I've made decent use of the PubMed eUtils in the past to scrape data, but for this it isn't worth scripting anything. Thanks all.
posted by caution live frogs at 10:48 AM on May 25, 2017
