Oracle Text (CTX) - Front end for thesaurus queries
June 9, 2008 6:46 AM
My company's system has an application that uses Oracle Text to query large text fields in a natural query language, and I'm a big fan. Oracle Text also supports structured thesauri in queries. This works well, but I'm not sure of the best way to build a front end for it.
I can perform a query against a thesaurus hierarchy thus: $RELATION_OPERATOR($SEARCH_STRING, $THESAURUS_NAME, $DEPTH_OF_SEARCH)
E.g. NT(Felines,Animals,2) searches for text containing the word "felines" or any word up to two levels beneath "felines" in the hierarchy of the Animals thesaurus (tigers, lions, etc.).
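For anyone following along, the call shape above can be assembled programmatically. This is only a minimal Python sketch: it assumes the argument order exactly as written in this post (worth checking against the Oracle Text reference for your version), and `thesaurus_term` and `any_of` are made-up helper names, not part of Oracle Text.

```python
# Hypothetical helpers that build the expression strings shown in the post.
# Only the two operators used in this thread are accepted.
HIERARCHY_OPERATORS = {"NT", "BT"}


def thesaurus_term(relation, term, thesaurus, depth=1):
    """Return e.g. 'NT(Felines,Animals,2)' using the thread's argument order."""
    relation = relation.upper()
    if relation not in HIERARCHY_OPERATORS:
        raise ValueError(f"unsupported relation operator: {relation}")
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # No escaping is done here; real user input would need sanitising.
    return f"{relation}({term},{thesaurus},{depth})"


def any_of(*expressions):
    """Join several expressions with OR."""
    return " OR ".join(expressions)


query = any_of(
    thesaurus_term("NT", "Felines", "Animals", 2),
    thesaurus_term("BT", "Tabby", "Animals", 1),
)
print(query)
```

Run as is, this prints the same compound expression discussed in the question, NT(Felines,Animals,2) OR BT(Tabby,Animals,1).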
I'm mulling over the easiest way to present this functionality to users. I'm concerned that entering search terms such as NT(Felines,Animals,2) OR BT(Tabby,Animals,1) is overcomplicated for the user. But I think a GUI-based query builder would be even worse (and too restrictive).
Has anyone used Oracle Text thesauri and devised a simpler front end? Note that I need to be able to query using multiple thesauri, so I can't omit the thesaurus argument and rely on the default.
Thanks for the answer, southof40. As standalone suggestions they're very good, although I'm reluctant to go the route of dedicated controls/fields, since this would stop users from drawing on the full natural query language. E.g.
(Eleph% OR $Giraffe) NEAR NT(Felines,Animals,2)
It's not unworkable, and I initially mocked something up like that, but I'm hoping to go with a single text field (a la Google), perhaps with some clever parsing so the user can enter their terms in a simpler form... Still not sure!
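To make the "clever parsing" idea a bit more concrete, here is a rough Python sketch. It assumes an invented shorthand, `Term>N@Thesaurus` for narrower terms and `Term<N@Thesaurus` for broader terms; neither the shorthand nor `expand_shorthand` comes from Oracle Text, and the symbols would need checking for clashes with the real query operators. Anything that doesn't match the shorthand is passed through untouched, so the rest of the query language stays available.

```python
import re

# Invented shorthand: Term>N@Thesaurus (narrower), Term<N@Thesaurus (broader).
SHORTHAND = re.compile(
    r"(?P<term>\w+)(?P<direction>[<>])(?P<depth>\d+)@(?P<thesaurus>\w+)"
)


def expand_shorthand(query):
    """Rewrite shorthand terms into the NT/BT form used in this thread."""

    def replace(match):
        operator = "NT" if match.group("direction") == ">" else "BT"
        return "{}({},{},{})".format(
            operator,
            match.group("term"),
            match.group("thesaurus"),
            match.group("depth"),
        )

    # Text outside the shorthand pattern is left exactly as typed.
    return SHORTHAND.sub(replace, query)


print(expand_shorthand("(Eleph% OR $Giraffe) NEAR Felines>2@Animals"))
```

The example input expands to the query above, (Eleph% OR $Giraffe) NEAR NT(Felines,Animals,2), with the wildcard and stem operators left alone.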
Cheers.
posted by NailsTheCat at 2:29 AM on June 10, 2008
Option 1: For users who don't really know/care how they're getting their results I would be inclined to first search the thesaurus to find where in the hierarchy you find instances of 'felines'. Then step up the hierarchy some fixed distance (say 2 levels) and use that as your second argument. Allow the third argument to be determined by the user on the basis of 'Narrow/Wide focus' but default it to something reasonable. Of course the results presented to the user would consist of 0-n result sets (as 'felines' might appear in > 1 place in the thesaurus hierarchy) and you'd have to think about how to reveal (or hide) that fact from them.
Doing it this way, the user doesn't have to work out that the string they're looking for 'belongs' in Animals, and they still get a 'fuzzy' search around the idea of 'Felines'. Setting up the 'Fuzzy Search' then becomes a question of supplying a string and optionally providing a Narrow/Wide value (I visualise a single text box and a slider?).
Option 2: For users who are 'knowledge specialists', just provide three controls: a text box for the search string; a second text box for the 'Animals' type value; and a drop-down with a distance value. This seems fine (and better than Option 1) for users who are familiar with thesauri and are used to doing complex searches in other tools.
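A loose Python sketch of how Option 1 might hang together, with some simplifications flagged up front: the thesauri here are toy in-memory sets (the contents are made up; a real version would look the term up in the thesauri held in Oracle), the lookup picks the thesaurus name rather than stepping up the hierarchy, and `fuzzy_queries` plus the Narrow/Wide-to-depth mapping are assumptions for illustration.

```python
# Toy stand-in for the real thesauri: thesaurus name -> terms it contains.
THESAURI = {
    "Animals": {"mammals", "felines", "tigers", "lions", "tabby"},
    "Heraldry": {"charges", "lions", "eagles"},
}

# Assumed mapping from the Narrow/Wide slider to a search depth.
FOCUS_DEPTH = {"narrow": 1, "normal": 2, "wide": 3}


def fuzzy_queries(term, focus="normal"):
    """Return 0-n expressions, one per thesaurus in which the term appears."""
    depth = FOCUS_DEPTH[focus]
    key = term.lower()
    return [
        f"NT({term},{name},{depth})"
        for name, terms in THESAURI.items()
        if key in terms
    ]


print(fuzzy_queries("Felines"))        # term found in one thesaurus
print(fuzzy_queries("Lions", "wide"))  # term found in two thesauri
print(fuzzy_queries("Unicorns"))       # term found nowhere
```

The three calls show the 0-n cases mentioned above: one expression, two expressions (one per thesaurus containing 'lions'), and an empty list, which is where the decision about revealing or hiding multiple result sets comes in.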
posted by southof40 at 5:46 PM on June 9, 2008