<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Search, cross-link references in PDF articles?</title>
	<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles/</link>
	<description>Comments on Ask MetaFilter post Search, cross-link references in PDF articles?</description>
	<pubDate>Sun, 20 Jun 2010 23:45:39 -0800</pubDate>
	<lastBuildDate>Sun, 20 Jun 2010 23:45:39 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Search, cross-link references in PDF articles?</title>
		<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles</link>	
		<description>Perhaps this is just a fantasy, but is there any application or online tool that could search through the references of an article I have saved as a PDF, in order to check whether I have those cited articles in my larger PDF library? It would be perfect if it could highlight, link, or somehow display cross-referenced relationships between all my articles. I am already familiar with several reference-management and PDF-organization programs such as &lt;a href=&quot;http://mekentosj.com/papers/&quot;&gt;Papers&lt;/a&gt; (Mekentosj), &lt;a href=&quot;http://www.thirdstreetsoftware.com/site/introduction.html&quot;&gt;Sente&lt;/a&gt;, and &lt;a href=&quot;http://www.devon-technologies.com/products/devonthink/&quot;&gt;Devonthink&lt;/a&gt;. I suppose what I have in mind is a similar tool, but with the additional power of something like the ISI citation indexes. I have a fairly large library (about 300 references that I&apos;m actively using, and more than 2000 total) and I&apos;m just trying to get some &quot;big-picture&quot; grasp of how all these sources relate to one another.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2010:site.157276</guid>
		<pubDate>Sun, 20 Jun 2010 20:44:01 -0800</pubDate>
		<dc:creator>samac</dc:creator>
		
			<category>research</category>
		
			<category>software</category>
		
			<category>cross-reference</category>
		
			<category>mac</category>
		
			<category>academia</category>
		
			<category>school</category>
		
			<category>pdf</category>
		
			<category>internet</category>
		
			<category>education</category>
		
			<category>writing</category>
		
	</item> <item>
		<title>By: stratastar</title>
		<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles#2254391</link>	
		<description>You may just have to brute-force it yourself. Maybe use mind-mapping software.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2010:site.157276-2254391</guid>
		<pubDate>Sun, 20 Jun 2010 23:45:39 -0800</pubDate>
		<dc:creator>stratastar</dc:creator>
	</item><item>
		<title>By: James Scott-Brown</title>
		<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles#2254429</link>	
		<description>I&apos;ve wondered this myself, and it seems that there is not. The main problem is getting details of which papers cite which others. There are proprietary databases (e.g. Web of Science), some free, domain-specific databases (e.g. CiteSeer), and some general databases (Google Scholar, but it is very incomplete), but there isn&apos;t a complete, freely accessible database of citations. &lt;br&gt;
&lt;br&gt;
So, the program would have to extract citation data from the PDFs themselves. This is hard because of the wide variety of citation formats, incomplete citations (for obscure journals that aren&apos;t indexed anywhere, disambiguating two incomplete citations may be impossible), &lt;a href=&quot;http://blogs.nature.com/thescepticalchymist/2008/01/journal_journeys_day_2_the_lon.html&quot;&gt;multiple abbreviations&lt;/a&gt; for the same journal, differences in how names are spelt, citation errors (I&apos;ve read papers in which the authors mis-cite their own previous work!), and abbreviations like &lt;i&gt;op. cit.&lt;/i&gt; and &lt;i&gt;ibid&lt;/i&gt;. For a more detailed discussion of the difficulties, search Google or CiteSeer for &quot;citation extraction&quot;.&lt;br&gt;
&lt;br&gt;
But if all the citations include DOIs, and you have tagged all your PDFs with their DOIs, the problem becomes &lt;i&gt;much, much easier&lt;/i&gt; and could be solved with a fairly simple Perl script. Unique numerical identifiers are the future!</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2010:site.157276-2254429</guid>
		<pubDate>Mon, 21 Jun 2010 02:16:59 -0800</pubDate>
		<dc:creator>James Scott-Brown</dc:creator>
	</item><item>
		<title>By: cromagnon</title>
		<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles#2254482</link>	
		<description>The biggest problem is access to the APIs for Web of Science and the other Thomson Reuters databases. This was pretty much impossible the last time I looked, but the following post suggests that things are changing slightly:&lt;br&gt;
&lt;br&gt;
&lt;a href=&quot;http://bibwild.wordpress.com/2009/04/13/cited-by-from-isi-and-scopus-in-the-link-resolver/&quot;&gt;http://bibwild.wordpress.com/2009/04/13/cited-by-from-isi-and-scopus-in-the-link-resolver/&lt;/a&gt;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2010:site.157276-2254482</guid>
		<pubDate>Mon, 21 Jun 2010 04:46:36 -0800</pubDate>
		<dc:creator>cromagnon</dc:creator>
	</item><item>
		<title>By: stratastar</title>
		<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles#2280275</link>	
		<description>Just to continue the conversation, Mendeley DOES pull citations from within the PDFs themselves.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2010:site.157276-2280275</guid>
		<pubDate>Fri, 09 Jul 2010 16:00:31 -0800</pubDate>
		<dc:creator>stratastar</dc:creator>
	</item><item>
		<title>By: James Scott-Brown</title>
		<link>http://ask.metafilter.com/157276/Search-crosslink-references-in-PDF-articles#2296874</link>	
		<description>stratastar, as far as I can tell, Mendeley just extracts the bibliographic metadata for a PDF you import (as do other programs, like Papers); I don&apos;t think it extracts bibliographic metadata for papers &lt;i&gt;mentioned/cited in&lt;/i&gt; a PDF. &lt;br&gt;
&lt;br&gt;
I think samac wants a program to do the latter, which is much harder.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2010:site.157276-2296874</guid>
		<pubDate>Wed, 21 Jul 2010 08:08:02 -0800</pubDate>
		<dc:creator>James Scott-Brown</dc:creator>
	</item>
	</channel>
</rss>
