Indexing the LAN server
May 11, 2011

I'm wondering how I can do contextual search and image search on a local server volume and have this be a service available to everyone on our network. Open source methods are preferred.

In my ongoing quest to find ways my company can share information more effectively, I'm looking at our local server volume and wishing I could get information off it more easily.

With a tool like OS X Spotlight, I can index a volume by name, creation date, context (meaning contents of the file) and other values. With a program like Picasa, I can index a volume for all images and create an index of them.

What I wish I had was those functions provided as a service on an internal web page. Is there an Open Source application that can provide search engine capability for a LAN server volume?

We have a mixed network of mostly PCs and some Apples, with a Linux-based server environment. Having everyone's machine index the server volume separately seems inefficient to me.

I would also hope to extend the service with saved searches of particularly useful files that people could browse, opening each file by clicking a link.

Thanks for any info.
posted by diode to Computers & Internet (6 answers total)
What is your comfort level with programming, and what languages are you comfortable with?

The reason I ask is that Apache Solr (tutorial link) may be up your alley, but in practice it takes more than a passing knowledge of Java to get by with it. Ideally you should need no Java knowledge to use Solr.
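
To give a rough idea of what "using Solr" looks like once someone has it running: searches are plain HTTP requests, so any language can build them. Here is a minimal Python sketch that only constructs the search URL. The base address (`http://localhost:8983/solr`) and the `/select` handler are assumptions based on Solr's stock example setup, and `solr_query_url` is just an illustrative name; newer installs may also need a core name in the path.

```python
from urllib.parse import urlencode


def solr_query_url(terms, base="http://localhost:8983/solr", rows=10):
    """Build a search URL for a Solr instance.

    Assumption: `base` points at a running Solr with the standard
    /select request handler; adjust it to match your own install.
    """
    params = {"q": terms, "rows": rows, "wt": "json"}  # wt=json asks for JSON results
    return base.rstrip("/") + "/select?" + urlencode(params)


print(solr_query_url("budget 2011"))
```

Getting the files on the server volume into the index in the first place is the harder part, and that is where the setup work comes in.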
posted by asymptotic at 9:14 AM on May 11, 2011

Response by poster: Probably beyond my abilities in this incarnation. I'm hoping for some kind of software that would drill through a server volume, then present the results on a searchable web page.
posted by diode at 5:06 PM on May 11, 2011

Best answer: I'm disappointed no one else pitched in. Have you considered using Google Desktop? Google Desktop is fantastic at indexing, but unfortunately its built-in web server only accepts connections from the local machine by default.

You can, however, override this behaviour:
So what can we do? I tried to find if there was a way to change some of the Google Desktop Search settings to allow for indexing network drives. According to the FAQ the tools will not index a network drive. But with some registry setting changes we can have the Google Desktop Search engine scanning mapped network drives...

However, how can I make this Google Desktop Search engine available to a team? What if I install it on a computer that is always on and could be used as a "Search Server"?

Google Desktop Search can be installed on any PC, but the built-in web server will only allow localhost connections. But even this can be changed.
The basic solution is to install a free HTTP proxy on the computer so that remote connections can reach the Google Desktop HTTP server. Your mileage may vary, but give it a shot. It's free, right?
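
If you can't find a ready-made proxy you like, the idea is simple enough to sketch in Python: listen on an address the LAN can reach and forward each GET request to the localhost-only server. This is a rough sketch, not a tested Google Desktop setup. The upstream address `http://127.0.0.1:4664` is an assumption (check what your install actually uses), and `upstream_url`, `ForwardingHandler` and `serve` are names made up for the example.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumption: the localhost-only search server listens here; verify on your machine.
UPSTREAM = "http://127.0.0.1:4664"


def upstream_url(path, upstream=UPSTREAM):
    """Map a path requested by a LAN client onto the localhost-only server."""
    return upstream.rstrip("/") + "/" + path.lstrip("/")


class ForwardingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Fetch the same path from the local server and relay the answer.
        try:
            with urllib.request.urlopen(upstream_url(self.path), timeout=10) as resp:
                body = resp.read()
                status = resp.status
                ctype = resp.headers.get("Content-Type", "text/html")
        except urllib.error.HTTPError as err:
            # The upstream answered with an error status; pass it along.
            body, status, ctype = err.read(), err.code, "text/html"
        except OSError as err:
            # The upstream could not be reached at all.
            body, status, ctype = str(err).encode(), 502, "text/plain"
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="0.0.0.0", port=8080):
    """Accept connections from the LAN and forward them until interrupted."""
    ThreadingHTTPServer((host, port), ForwardingHandler).serve_forever()
```

Calling `serve()` on the always-on "Search Server" would let other machines browse to port 8080 on that box. Links inside the returned pages may still point at 127.0.0.1, so a real proxy might also need to rewrite them, and anything exposed this way is readable by everyone on the network.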
posted by asymptotic at 2:08 AM on May 13, 2011

Response by poster: Thanks, that's a great tip.

posted by diode at 2:22 PM on May 13, 2011

Response by poster: One thought comes to mind about using Google Desktop. Does this expose your local data to Google? Would I be, in a sense, sending data to Google regarding every document on a server by setting up this search engine?
posted by diode at 3:46 PM on May 13, 2011

No, the actual content indexed doesn't get sent to Google. I'm pretty sure usage data gets anonymously sent to Google though, the content of which I don't know. I'm guessing it's stuff like number of searches per month, did it crash, how did it crash, etc. And I'm quite sure you can disable this.
posted by asymptotic at 1:36 PM on May 16, 2011

