<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: What are some ways I can improve a digital library?</title>
	<link>http://ask.metafilter.com/95418/What-are-some-ways-I-can-improve-a-digital-library/</link>
	<description>Comments on Ask MetaFilter post What are some ways I can improve a digital library?</description>
	<pubDate>Mon, 30 Jun 2008 20:43:48 -0800</pubDate>
	<lastBuildDate>Mon, 30 Jun 2008 20:43:48 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: What are some ways I can improve a digital library?</title>
		<link>http://ask.metafilter.com/95418/What-are-some-ways-I-can-improve-a-digital-library</link>	
		<description>I&apos;m in the process of revamping a digital library website and I have a few questions about how to link my resources into both Google Scholar and the academic/archival community at large. &lt;br /&gt;&lt;br /&gt; &lt;ol&gt;&lt;li&gt;I know that it&apos;s possible to allow Google Scholar to do a full search of the PDF documents while still restricting them for normal visitors. I want to do this so that Google Scholar can do a better search of our documents, while still allowing for a subscription model. I suppose I could offer either IP or user-agent based subscriber access to the website, but I know that Google often doesn&apos;t look kindly upon websites that serve it different content. Is there a sanctioned way to do this?*&lt;/li&gt;&lt;br&gt;
&lt;li&gt;Assume I want documents on this site to get &quot;linked in&quot; to the rest of the academic world. What are things I can do to make this easier and better? I&apos;ve already implemented OpenURL, kind of (is it really just as simple as making a page like &lt;tt&gt;/resolver?issn=blah&amp;amp;volume=blah&amp;amp;issue=blah&amp;amp;spage=blah&lt;/tt&gt; ?). What other standards would be good to support/implement?&lt;/li&gt;&lt;/ol&gt;&lt;br&gt;
If you&apos;re a frequent digital library user, I&apos;d also be interested in hearing about features that would make you revisit a digital library on a regular basis. Similarly, if anyone out there has developed a digital library in the past, are there any tools or programs (preferably Java-based) that you might recommend to speed up the document handling process?&lt;br&gt;
&lt;br&gt;
More details: this is for a non-profit educational organization that has around 10k (and growing) scholarly (peer-reviewed and published) papers. They&apos;re imported in standard PDF format, so thankfully OCR and conversion are not an issue, although I would be really interested in ways to pull out metadata or even things like references and citations.&lt;br&gt;
&lt;br&gt;
&lt;small&gt;* Yes, I realize there is a &lt;a href=&quot;http://www.google.com/support/scholar/bin/request.py&quot;&gt;contact page&lt;/a&gt; for this. When I submitted a request, I received a response something along the lines of &quot;Currently due to a huge number of requests you won&apos;t hear from us, like, ever&quot;&lt;/small&gt;</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2008:site.95418</guid>
		<pubDate>Mon, 30 Jun 2008 14:42:53 -0800</pubDate>
		<dc:creator>Deathalicious</dc:creator>
		
			<category>digitallibrary</category>
		
			<category>libraries</category>
		
			<category>googlescholar</category>
		
	</item> <item>
		<title>By: nev</title>
		<link>http://ask.metafilter.com/95418/What-are-some-ways-I-can-improve-a-digital-library#1393203</link>	
		<description>I build sites like this for a living, so one possibility is to hire me as a consultant!&lt;br&gt;
&lt;br&gt;
Barring that, my experience with Google Scholar was that only large academic publishers tend to get their attention.  If you drop me a MeMail I&apos;ll see what I can dig up from talking to colleagues who&apos;ve waded through this before.  (I know in at least one case we just allowed the Googlebot user-agent in with no repercussions, although note that all similar methods are easily spoofed).&lt;br&gt;
&lt;br&gt;
As far as metadata goes, it&apos;s a hard problem to extract anything structured from PDFs.  I could probably help, but I&apos;d have to see your content.&lt;br&gt;
&lt;br&gt;
Good luck!</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.95418-1393203</guid>
		<pubDate>Mon, 30 Jun 2008 20:43:48 -0800</pubDate>
		<dc:creator>nev</dc:creator>
	</item><item>
		<title>By: Mr. Gunn</title>
		<link>http://ask.metafilter.com/95418/What-are-some-ways-I-can-improve-a-digital-library#1393305</link>	
		<description>I know there are a lot of people in academic publishing on FriendFeed, so you could try asking there.  Maybe &lt;a href=&quot;http://friendfeed.com/billhooker&quot;&gt;Bill&lt;/a&gt; or &lt;a href=&quot;http://jdupuis.blogspot.com/&quot;&gt;this guy&lt;/a&gt; could steer you in the right direction.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.95418-1393305</guid>
		<pubDate>Tue, 01 Jul 2008 01:20:33 -0800</pubDate>
		<dc:creator>Mr. Gunn</dc:creator>
	</item><item>
		<title>By: Mike1024</title>
		<link>http://ask.metafilter.com/95418/What-are-some-ways-I-can-improve-a-digital-library#1393315</link>	
		<description>&lt;i&gt;If you&apos;re a frequent digital library user I&apos;d also be interested in hearing about features that would make you revisit a digital library on a regular basis,&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
1. Having quality papers relevant to what I&apos;m researching.&lt;br&gt;
2. Me being able to find them.&lt;br&gt;
3. Me being able to access them.&lt;br&gt;
4. Copies of documents being complete.&lt;br&gt;
&lt;br&gt;
By (1) I mean the obvious; if I&apos;m researching physics and your library is about psychology, we&apos;re unlikely to interact. If I have read papers and found them relevant, informative, clear, readable, and factually accurate, I am more likely to look at the other papers in that journal/issue/library.&lt;br&gt;
&lt;br&gt;
By (2) I mean the papers showing up in my searches (I like Google, Web of Knowledge, and ScienceDirect - but that&apos;s just me) and having a clear title and abstract.&lt;br&gt;
&lt;br&gt;
(3) is because putting in a document supply request can take two weeks. If my institution has access to your library, that&apos;s great. If I&apos;m going to have to wait two weeks to pay for that paper, that&apos;s two weeks to find another paper in a journal I *can* access instantly. Also, the whole point of giving references is that people can follow them; from this perspective a reference that leads to an $80 pay wall isn&apos;t a very good reference.&lt;br&gt;
&lt;br&gt;
By (4) I mean if you&apos;re offering a PDF copy of a book, and that book comes with a CD of example programs or suchlike, you should offer access to the CD alongside the PDF of the pages of the book.&lt;br&gt;
&lt;br&gt;
In summary, you&apos;re going in the right direction by getting indexed by Google. You also need good papers with clear titles and abstracts, and ideally you need major institutions to subscribe to your library so users can access it easily.&lt;br&gt;
&lt;br&gt;
Of course, now I summarise it, I probably haven&apos;t told you anything you don&apos;t already know!</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.95418-1393315</guid>
		<pubDate>Tue, 01 Jul 2008 02:26:56 -0800</pubDate>
		<dc:creator>Mike1024</dc:creator>
	</item>
	</channel>
</rss>
