PHP Search Script
April 7, 2005 9:41 AM

Can you recommend a good, free, PHP/MySQL search script I can add to a web site?

I've tried PhpDig, but it's a pain to configure.
posted by kirkaracha to Computers & Internet (10 answers total)
 
Swish-e sucks slightly less, or perhaps more accurately... it sucks differently.
posted by togdon at 11:14 AM on April 7, 2005


"PHP and MySQL" are sort of odd requirements for this sort of thing, for what it's worth. MySQL offers full-text indexing, but I doubt you're planning on keeping a copy of the full text of your website in MySQL; in that case, you're better off trusting the decisions of the people writing search software in terms of how their software stores its indices.

Similarly, full-text search isn't a straightforward problem so there isn't a whole lot of competition in that area, and the folks who are working in that area probably aren't using PHP to do it.

If there are some practical reasons why only solutions which are written in PHP and store data in MySQL are acceptable, then that's where you're stuck, but I think you're going to be very disappointed.

I used to use htdig for a previous employer (and it seems they're still offering it, although they favour Google now). It, too, sucked less and differently. Its crawler was particularly flexible and well-behaved, which is always a nice touch.

These days I think you really only want to implement local search when there's some reason that Google Site Search won't perform well.
posted by mendel at 11:29 AM on April 7, 2005


Well, the site's built in PHP, with the content in MySQL. PhpDig stores the content in a MySQL table, so I thought other options would work the same way. Google site search sounds promising.
posted by kirkaracha at 12:04 PM on April 7, 2005


You'll be much better off just writing something yourself. It'll be worth the effort.
posted by exhilaration at 1:13 PM on April 7, 2005


I can't stand blob content being stored in any database. It's unwieldy, and searching it costs far more in resources than the results are worth.

A simple solution in these cases is always the best. If you've got the power, save all your large text out to flat text files, and index those files in the MySQL database by keyword, subject, and some sort of ID.

Then, when you want to search the content, open a shell and use grep. Nothing's faster, and you can use regular expressions.
posted by thanotopsis at 1:41 PM on April 7, 2005


WTF? "Write it yourself"? "grep"? kirkaracha's looking for an easy-to-configure search script...

After spending about a day looking for acceptable PHP/MySQL solutions recently I gave up and went CGI. Perlfect is free, configurable and has an automatic web-based setup. Recommended.
posted by blag at 2:27 PM on April 7, 2005


Yes, "grep" is about as slow as full-text searching can get, since it doesn't maintain an index. Every time you search it has to look at every byte in every file. The only way you could get a less efficient search is to look at the same thing twice.
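
To make that cost concrete, here is a minimal Python sketch of what a grep-style search does on every query. The function name `scan` and the in-memory dict standing in for files on disk are made up for illustration; real grep is far better optimized, but it still has to read every file each time.

```python
def scan(docs, term):
    """Grep-style search over docs, a dict mapping file name -> file text.

    Returns (names of matching files, total characters in the files visited).
    There is no index, so every file is visited on every query.
    """
    term = term.lower()
    matches = []
    size_scanned = 0
    for name, text in docs.items():
        # Each file is read whether or not it ends up matching.
        size_scanned += len(text)
        if term in text.lower():
            matches.append(name)
    return matches, size_scanned
```

The second return value is the same for every query, hit or miss: the work grows with the size of the site, not with the number of results.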
posted by mendel at 5:30 PM on April 7, 2005


Yes, "grep" is about as slow as full-text searching can get, since it doesn't maintain an index. Every time you search it has to look at every byte in every file. The only way you could get a less efficient search is to look at the same thing twice.

Far be it from me to throw anecdotal evidence into the mix, but grep beats any database search (even those that allow the indexing of a blob field -- *shudder*) for response time, CPU time, and memory usage. There's no comparison. The method may not be as efficient, but the results certainly are.

If you've got a hierarchical structure for the textual data, it's even handy to ID your file, store it in a file structure that matches your hierarchy, and then use the resulting file handle as extended data identification.
posted by thanotopsis at 5:42 PM on April 7, 2005


Thanotopsis, I'd love to see your data -- not that I outright don't believe you, but instead, that I find it very hard to believe you.
posted by delfuego at 7:56 PM on April 7, 2005


Far be it from me to throw anecdotal evidence into the mix, but grep beats any database search (even those that allow the indexing of a blob field -- *shudder*) for response time, CPU time, and memory usage. There's no comparison.

This is incorrect. Stop thinking about blobs; this is not an RDBMS, this is a search engine. You're not storing a new copy of the text and performing searches on it. That's precisely what I said not to do in my first post when I said that MySQL was not relevant here.

Let's take a trivial example: every word in every document is stored in one alphabetically sorted array, along with the URL of the document it came from. (Remember, one goal of a search engine is to move the complexity to the indexing phase.) You can binary-search that array in O(log n) time, where n is the number of entries.

Using grep means that the complexity is not shifted to the indexing phase (there is no indexing phase!), so every search is O(n) in the total size of the text and therefore slower than even the naive index in the example above.
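
A rough Python sketch of that naive index, assuming words are simply split on whitespace, with no stemming, punctuation handling, or ranking. `build_index` and `search` are hypothetical names for illustration, not part of any real search package.

```python
from bisect import bisect_left


def build_index(docs):
    """Indexing phase: docs maps URL -> page text.

    Returns a sorted list of unique (word, url) pairs. All the expensive
    work (reading every page) happens here, once, not at query time.
    """
    pairs = {
        (word, url)
        for url, text in docs.items()
        for word in text.lower().split()
    }
    return sorted(pairs)


def search(index, word):
    """Query phase: binary-search the sorted pairs for the word."""
    word = word.lower()
    # (word, "") sorts before every (word, url), so bisect_left lands on
    # the first entry for this word, if there is one.
    i = bisect_left(index, (word, ""))
    urls = []
    while i < len(index) and index[i][0] == word:
        urls.append(index[i][1])
        i += 1
    return urls
```

Finding the first entry takes O(log n) comparisons; after that the loop only walks the entries that actually match, so a query never touches the text of pages that don't contain the word.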

(Since you seem to be familiar with RDBMSes, consider a table made up of simple datatypes, keeping blobs out of the picture -- would it be faster to search that table if it had indexes, or if it didn't? Using grep instead of a precompiled index is exactly like doing a full table scan instead of an index lookup.)
posted by mendel at 4:50 AM on April 8, 2005

