Topical connections between existing documents?
March 14, 2014 8:31 AM

Does anybody know of any software solutions for manually capturing connections between specified regions within and between documents?

(First post; already searched; please be gentle.)

Say I have a collection of large documents, and portions of some documents all deal with a common topic. I want to flag a passage within a document and define a connection to another passage elsewhere in the document, or a different document altogether. Upon seeing the connection and the text involved, a human would understand that the two passages are related.

The purpose of the notional software is to enable active reading and research on particular topics across various works. Importantly, I want to preserve the ability to quickly access the text surrounding each flagged passage to provide context. If I were dealing with paper books, it would be trivial to construct such a system using coloured flags and highlighters -- until I ran out of colours.

Granularity is important: defining links between files (with a tagged filesystem like Tagsistant, for example) is insufficient; I want something at the level of words, sentences, and paragraphs.

The document formats would likely be plain text, PDF, HTML, and OpenDocument Text. It is possible that the contents of the documents themselves would grow over time.
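
One way to picture the data model described above, as a minimal sketch rather than a description of any existing tool: each flagged passage is stored as quoted text plus a short preceding snippet instead of a fixed offset, so it can still be found after the document grows. Every name here (`Passage`, `Link`, `locate`, `context`) is hypothetical, and the sketch assumes the documents have already been reduced to plain text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    doc: str          # document identifier, e.g. a file path (hypothetical)
    quote: str        # exact text of the flagged passage
    prefix: str = ""  # text just before the quote, to disambiguate repeats


@dataclass(frozen=True)
class Link:
    source: Passage
    target: Passage
    note: str = ""    # optional free-text reason for the connection


def locate(passage: Passage, text: str) -> Optional[int]:
    """Return the start offset of the passage in text, or None if absent."""
    pos = text.find(passage.prefix + passage.quote)
    if pos == -1:
        return None
    return pos + len(passage.prefix)


def context(passage: Passage, text: str, width: int = 40) -> Optional[str]:
    """Return the passage with up to `width` characters on either side."""
    start = locate(passage, text)
    if start is None:
        return None
    end = start + len(passage.quote)
    return text[max(0, start - width):end + width]
```

Because a passage is re-located by searching for its text, inserting material earlier in the document does not break the link; editing the flagged passage itself would, and a real tool would need a fallback for that case.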

I would like to achieve this with a minimum of mental gymnastics such as remembering the names of tags or named links.

This is a tall and open-ended order, and it's entirely conceivable that I may need to build it myself. My point in asking is to see if anybody has thought about the same thing, or can suggest similar solutions that might lead in the right direction.
posted by Verg to Computers & Internet (6 answers total) 2 users marked this as a favorite
Best answer: Project Xanadu, "the longest-running vaporware story in the history of the computer industry," was and/or is, among other things, this.
posted by BungaDunga at 8:41 AM on March 14, 2014

Sounds like something that legal analysis software would do. Here's a NYT article about it.
posted by Sophont at 9:21 AM on March 14, 2014

Tagged links that go to specific regions of other files - aren't you just describing HTML here?

PDF can have embedded hyperlinks, and plain text is trivially wrapped into HTML. Not sure about OpenDocument. The only problem seems to be the ability to backtrack - you have to rely on the browser for that.
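
To make the "plain text wrapped into HTML" idea concrete, here is a minimal sketch using only the Python standard library. It assumes paragraphs are separated by blank lines and gives each one an `id`, so another page can link to, say, `notes.html#p2`. The function name `text_to_html` and the `p1`, `p2`, ... id scheme are illustrative choices, not part of any standard.

```python
import html


def text_to_html(text: str, title: str = "document") -> str:
    """Wrap plain text in HTML, giving each paragraph an id such as "p3"."""
    # Treat blank lines as paragraph breaks and drop empty chunks.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(
        f'<p id="p{i}">{html.escape(p)}</p>'
        for i, p in enumerate(paragraphs, start=1)
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n{body}\n</body></html>\n"
    )
```

This only gets down to paragraph granularity, and numbered ids shift if paragraphs are later inserted above them, so it would need refinement for documents that grow.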

I'd be wary of getting sucked into a huge monolithic software solution where the value grows as I use it but leaves me locked in further and further...
posted by RedOrGreen at 9:22 AM on March 14, 2014

NVivo is qualitative analysis software that claims it can "auto code" text, though I've never tried that feature myself. The software itself is robust and will do all the other things you want to do manually. Their auto-code feature could be worth a shot.

You can "flag" things with something called "Nodes" in NVivo - you can highlight text and "tag" it with nodes, and then view the nodes to see all the related content in one place. NVivo is super heavy and intense and will allow you to do a TON of things with your data.

They offer a 30-day free trial so you could mess around with it and see if it's what you want. I'd suggest getting a copy of the book Qualitative Data Analysis with NVivo, too - it's really helpful.
posted by k8lin at 9:29 AM on March 14, 2014 [1 favorite]

Best answer: Oh, sorry, I totally mis-read, and thought you wanted to automate this.

NVivo will do exactly what you want. Check it out. It's expensive, but it's a great piece of software for analyzing text. It might, however, be too good for what you're looking for - what you really want is the annotation system a colleague of mine has been developing for the past ten years, but that's still a ways off.
posted by k8lin at 9:33 AM on March 14, 2014

Response by poster: Thanks; the suggestion of NVivo eventually led to the Wikipedia article for "Computer-assisted qualitative data analysis software," which is related to what I am trying to do. It's a good start.
posted by Verg at 5:07 PM on March 16, 2014 [1 favorite]
