Join 3,421 readers in helping fund MetaFilter (Hide)


SQL help, LIKE %(set of possible values)%
April 14, 2010 5:20 PM   Subscribe

SQL problem. I need to search one table based on the results of a search on another table. The trick? The field I'm comparing on the two tables will not be exactly equal, one table will have a field with single 'word', the other table will have a field with have a string of multiple 'words'. Look inside for an example.

Two tables, FOOD and PEOPLE.

What I want is to return the NAME of all PEOPLE that have a COMMENT that contains a FOOT of TYPE 'fruit'

Table FOOD
FOOD TYPE
apple fruit
banana fruit
carrot vegetable

Table PEOPLE
NAME COMMENT
zackary apples are my favourite food
yvonne monkeys enjoy bananas
xavier melons are delicious

In this example, what I need returned is {zackary, yvonne}

How can I do this without writing multiple individual LIKE statements?

Can I somehow do a LIKE %(set of possible values)% statement?

My actual data has hundreds of rows in the 'FOOD' table.
posted by thisisnotbruce to Computers & Internet (13 answers total)
 
What DBMS?
posted by miyabo at 5:26 PM on April 14, 2010


I've been out of the SQL game for a couple of years but I'm 99% positive that a LIKE will not work against multiple values or a dataset that returns more than one value. At least in Oracle it won't.

One cheaty and nasty way of generating the multiple LIKE statemenrs you need is to run

SELECT DISTINCT 'Where comment like '||food||' or' from food

and then copy and paste the result set into your query. Like i said it's rough as fuck, but if the food table is not going to change then you only need to do it once. Although, there may be a limit on the number of OR you can have!

Good luck!
posted by jontyjago at 5:41 PM on April 14, 2010


With MS SQL at least, you can use LIKE in a join condition. So what you do is write a query that returns one row for each possible value (so you don't need LIKE to match multiple values, just one at a time). Then grab the unique list of user names from that.
select distinct name
from people
join food
on people.comment like '%'+food.food+'%'
where food.type='fruit'
I don't know which other SQL servers you can do this with, but there are other variations of this that should work on just about anything.
posted by FishBike at 5:44 PM on April 14, 2010


You're kinda not supposed to do complicated text searches in SQL. It's be better to used an external script to extract the foods in each person's comment, or write a script that runs within your DBMS' scripting system. That said, if you've got to do it, you may have some luck with an SQL extension that includes regular expressions, or (in Postgres) the list data type. That's why I asked what DBMS you're using.
posted by miyabo at 5:47 PM on April 14, 2010


There's almost certainly a more elegant way (probably with correlated subqueries), but quick and dirty, something like this should do it:

SELECT distinct
PEOPLE.NAME
FROM
PEOPLE,
FOOD
WHERE
FOOD.TYPE = 'fruit' AND
charindex(FOOD.FOOD, PEOPLE.COMMENT) > 0

SQL Server syntax, YMMV, assumes FOOD.FOOD and PEOPLE.COMMENTS are both all lowercase
posted by superna at 5:55 PM on April 14, 2010


FishBike is what I used before moving to Lucene for queries like this. If you're running into a lot of these move to Lucene or Lucene.net as soon as you can. It makes these things so easy. It works pretty much right out of the box.
posted by geoff. at 5:55 PM on April 14, 2010


or, on preview, what FishBike said
posted by superna at 5:58 PM on April 14, 2010


This is easy in Postgres:

select people.name
from food join people on people.comment ~ food.name
where food.type = 'fruit';

posted by nicwolff at 6:06 PM on April 14, 2010


I think what might be helpful in your mystery DBMS is apply a FULLTEXT index (syntax will be different for each DBMS) to column COMMENT and then use whatever the native syntax is to search using an OR for the multiple fruit strings. String manipulation in SQL is painful at best, and the multiple wildcard LIKEs will be slow as molasses.

If you try to do basic SQL on this it will fail miserably due to stemming issues (e.g. LIKE '%cherry%' won't match 'I like cherries'.
posted by benzenedream at 6:08 PM on April 14, 2010


Thanks for everybody's suggestions!

We're running Oracle, sorry, I obviously should have specified that in the original post..

A little more context: this was done to troubleshoot the extent of an application issue we were having; how many and which records were impacted is what I was trying to find out.

What I ended up doing is exporting the two tables via SQL and massaging the data in Excel to get at the results (using a combination of FIND, INDEX, and VLOOKUP for those interested).

I was more curious as to whether there was a quicker way to do this using just SQL in case I needed to do something similar in the future.

Marking a whole of of best answers. Thanks again, everyone!
posted by thisisnotbruce at 7:20 PM on April 14, 2010


And yes, I agree that it is not ideal to run this type of SQL query on a regular basis. But as a one-time ad-hoc method of checking something? ... meh
posted by thisisnotbruce at 7:25 PM on April 14, 2010


Come on people, this is simple. There's no need for Lucerne or full text searches.

Basically, do what FishBike said, but do it all at once for all foods:
select distinct a.food, b.name
from food a
join people b on ( b.comment like '%'+a.food+'%');
This gives one row per ( food, person). The distinct prevents duplicate rows, where one person mentioned the same food in two or more comments.

If you want to make rows which contain one food and a list of people who mentioned it, there are some really stupid Oracle tricks to do this.

I suspect this seems hard if you think that joins must always be eqi-joins. A join can be predicate that evaluates to a boolean.
posted by orthogonality at 8:06 PM on April 14, 2010 [3 favorites]


orthogonality has it
posted by MatJ at 4:44 AM on April 15, 2010


« Older I have about 600, 11x17 pages ...   |  I'm looking for fun, poppy, ha... Newer »
This thread is closed to new comments.