<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Evaluating Lucene</title>
	<link>http://ask.metafilter.com/18053/Evaluating-Lucene/</link>
	<description>Comments on Ask MetaFilter post Evaluating Lucene</description>
	<pubDate>Wed, 27 Apr 2005 05:56:14 -0800</pubDate>
	<lastBuildDate>Wed, 27 Apr 2005 05:56:14 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Evaluating Lucene</title>
		<link>http://ask.metafilter.com/18053/Evaluating-Lucene</link>	
		<description>I&apos;ve been assigned to evaluate the feasibility of using &lt;a href=&quot;http://lucene.apache.org/java/docs/&quot;&gt;Lucene&lt;/a&gt; for our website, a large, high-use government site with rapidly-changing data. &lt;br /&gt;&lt;br /&gt; It&apos;s not the only search engine we&apos;re evaluating, but it&apos;s the one I&apos;m looking at. Anything I should know?&lt;br&gt;
&lt;br&gt;
I know this is a very general question, but I&apos;m not particularly technical, and I&apos;ve just started looking into it, so I don&apos;t even know what questions to ask.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2005:site.18053</guid>
		<pubDate>Wed, 27 Apr 2005 05:36:13 -0800</pubDate>
		<dc:creator>MrMoonPie</dc:creator>
		
			<category>software</category>
		
			<category>web</category>
		
			<category>computers</category>
		
			<category>internet</category>
		
			<category>programming</category>
		
			<category>searchengines</category>
		
	</item> <item>
		<title>By: orthogonality</title>
		<link>http://ask.metafilter.com/18053/Evaluating-Lucene#300262</link>	
		<description>MrMoonPie &lt;a href=&apos;http://ask.metafilter.com/mefi/18053&apos;&gt;posted&lt;/a&gt;  &lt;em&gt;&quot;I know this is a very general question, but I&apos;m not particularly technical, and I&apos;ve just started looking into it, so I don&apos;t even know what questions to ask. &quot;&lt;/em&gt;&lt;br&gt;
&lt;br&gt;
So test it. You need to see how it works for the end user, so you don&apos;t need to be that technical.&lt;br&gt;
&lt;br&gt;
Install a trial copy, and then ask your boss to give you ten GS-7s for an hour a day for two weeks to do searches (you want all the 7s at one time, as you&apos;re trying to test how the thing responds when multiple users hit it, among other things).&lt;br&gt;
&lt;br&gt;
Each day, give the 7s a list of thirty things to find (&lt;i&gt;e.g.&lt;/i&gt;, &quot;our last press release that mentions panda bears&quot;). Emphatically let the 7s know you&apos;re &lt;i&gt;not&lt;/i&gt; testing them or their speed; you&apos;re testing the search software. Spend forty minutes on this, then spend the remaining twenty minutes collecting the quantitative results -- how many of the thirty items were found by each 7, and how long each search took -- and the 7s&apos; subjective response. They&apos;ve used Google; ask them how Lucene compares.&lt;br&gt;
&lt;br&gt;
Then spend two hours with five GS-15s (try for a mix of lawyers and engineers) doing the same thing. Have the 15s come up with their own rather technical searches, and then have each one hand his list of searches to another one of the 15s.&lt;br&gt;
&lt;br&gt;
Finally, get someone who is technical to give you a run-down of the technical pros and cons of the software.&lt;br&gt;
&lt;br&gt;
At the end of the whole thing, ask your 7s and your 15s if Lucene is easy to use and makes your agency look good. Compile the quantitative results and qualitative results into a report for your boss.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2005:site.18053-300262</guid>
		<pubDate>Wed, 27 Apr 2005 05:56:14 -0800</pubDate>
		<dc:creator>orthogonality</dc:creator>
	</item><item>
		<title>By: Loser</title>
		<link>http://ask.metafilter.com/18053/Evaluating-Lucene#300376</link>	
		<description>I use Lucene pretty extensively. As a &lt;em&gt;developer&lt;/em&gt;, I appreciate how easy it is to write my own custom indexers and searchers. Plus there&apos;re neat tools like &lt;a href=&quot;http://www.getopt.org/luke/&quot;&gt;Luke&lt;/a&gt; that let me sift through the index and double check the data.&lt;br&gt;
&lt;br&gt;
Is your website J2EE-based? I found it pretty trivial to integrate Lucene into a Tomcat/Velocity environment, but it could be a different experience if you&apos;re running IIS.&lt;br&gt;
&lt;br&gt;
On preview, I&apos;d go straight to user testing with some GS-30s. I hear they&apos;re, like, twice as good as GS-15s.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2005:site.18053-300376</guid>
		<pubDate>Wed, 27 Apr 2005 09:29:28 -0800</pubDate>
		<dc:creator>Loser</dc:creator>
	</item><item>
		<title>By: Fezboy!</title>
		<link>http://ask.metafilter.com/18053/Evaluating-Lucene#300477</link>	
		<description>The other developer with my project has been doing some testing of Lucene and has been quite pleased with how easy it is to do indexing. She is also enamoured with the transparency of how result sets are generated. FWIW, our data is all METS/MODS XML so Lucene is ideal for our purposes. We have also been using Oracle&apos;s XDB but are looking to put together an entirely OSS version of our application.&lt;br&gt;
&lt;br&gt;
The biggest problem she has run across is indexing accent-free Unicode. Much of our data runs outside the core Latin-1, and we need to index with and without the special characters (i.e., &apos;resume&apos; will match &apos;resume&apos; and &apos;resum&#233;&apos;). And, on glancing at my inbox, it appears she has solved this problem now...&lt;br&gt;
&lt;br&gt;
Sorry, I can&apos;t be more specific than this. I&apos;m the interface/usability guy and a bit weak when it comes to how the back-end of our project is put together.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2005:site.18053-300477</guid>
		<pubDate>Wed, 27 Apr 2005 11:01:18 -0800</pubDate>
		<dc:creator>Fezboy!</dc:creator>
	</item>
	</channel>
</rss>
