Help solve my SQL mystery
February 19, 2009 11:32 AM   Subscribe

How can I get the IN function to work in my SQL?

There are two tables, CONTACT and QUEUE. CONTACT's columns are contactid and queuelist. QUEUE's columns are queueid and queuename.

Example CONTACT record
contactid = user1, queuelist = 'q1','q2'

Example QUEUE records
queueid = q1, queuename = Queue1
queueid = q2, queuename = Queue2
queueid = q3, queuename = Queue3

SQL
SELECT queuename
FROM QUEUE
WHERE queueid IN
(SELECT queuelist
FROM CONTACT
WHERE contactid = 'user1')

except this SQL doesn't produce Queue1 and Queue2 as the expected output and I can't figure out why. If I simply replace the select statement in the brackets with 'q1','q2' (the queuelist) it works.

Is there something fundamental I'm missing?
posted by urbanwhaleshark to Computers & Internet (12 answers total)
 
Bog-standard SQL doesn't have array values types (you're supposed to normalize your schema so you don't need them). What is the type of the queuelist column? And what database are you using?
posted by hattifattener at 11:36 AM on February 19, 2009


Response by poster: varchar2, Oracle 10g (I think)
posted by urbanwhaleshark at 11:37 AM on February 19, 2009


Best answer: IN doesn't work like that. When you write it as IN('q1','q2') that's two separate values. Your subquery returns only one column, so the parent query sees IN( "'q1','q2'" ), so it returns only rows where the queuename literally equals "'q2','q2'".

Do you have control over the schema? If so, you need an intersection table - like CONTACT_QUEUE_MEMBERSHIP with columns contactid and queueid.
posted by Electric Dragon at 11:50 AM on February 19, 2009


Best answer: Ah. (And now I see the Oracle in your tags.) The problem is that your sub-select is returning only a single object, and you're trying to find a queue whose name is 'q1','q2' (with the quotes and commas). I'd guess you could parse apart that string into multiple values using a stored procedure of some sort, but that's the hard way.

What I'd do is normalize CONTACT so that it only holds a single queue name per row. It'd contain, for example,

contactid = user1, queuelist = q1
contactid = user1, queuelist = q2

Then your sub-select will return two values, q1 and q2, and the IN will work as you expect.

Google tells me that Oracle does have an array column type, but doesn't tell me if there's an easy way to flatten a resultset-of-arrays-of-strings into a resultset-of-strings which is what you want.
posted by hattifattener at 11:50 AM on February 19, 2009


Best answer: 'IN' != 'LIKE'

The direct answer to what you want is (I think):

SELECT q.queuename
FROM QUEUE q
JOIN CONTACT c ON c.queuelist LIKE '%' + q.queueid + '%'


HOWEVER- What you're doing makes my RDBMS skin crawl. Storing CSV values in a single column defeats the whole reason you're using a relational database. You want a schema more or less like this:

CONTACT
======
ContactId


QUEUE
=====
QueueId


CONTACTQUEUE
==========
ContactId
QueueId

Then you just do JOINs to get the many-to-many relationships you need. MUCH more efficient.
posted by mkultra at 11:50 AM on February 19, 2009


this is suspect:

queuelist = 'q1','q2'

what does this mean? you have a string containing that text stored in a column in your contact record?
posted by stubby phillips at 11:52 AM on February 19, 2009


Response by poster: Hi everyone. Thanks for all your feedback. I kinda have a feeling I'm being lazy trying to store a list in a cell, but I was intrigued about the IN function because it seemed like it could save me a lot of time.

Here's the lowdown which might help put things in context.

I'm building a helpdesk application to allow people to faults. Ultimately when calls are logged with the system they are assigned a queue. A support contact can have access to n queues. I thought that by storing the queues to which they're allowed access as a list in the CONTACT table would be a neat shortcut.

I swear I'm not trying to use IN like LIKE. Honest :)

I'm pretty sure I came up with a schema similar to mkultra's sometime ago (before my workload became so intense I had to drop the project), but I've started anew and thought this would save me some work. But you know what they say about shortcuts creating more work than if you'd just sat down and thought about the route beforehand, before getting lost.

Again, thanks for the replies, sorry for making your skin crawl mkultra.
posted by urbanwhaleshark at 12:13 PM on February 19, 2009


If you don't have control over the schema you may have to write a table function to split the string into multiple rows.
posted by Electric Dragon at 12:16 PM on February 19, 2009


The list is interesting. The normal half-assed way to do this to guess the max # of queues one person could have, and then create that number of columns named queue1, queue2, queue3, etc. Then your contact would have queue1 = "q1" and queue2 = "q2".

Then your SQL would be:

SELECT queuename
FROM QUEUE
WHERE queueid IN
(SELECT queue1
FROM CONTACT
WHERE contactid = 'user1'
UNION
SELECT queue2
FROM CONTACT
WHERE contactid = 'user1'
)

But don't do it this way either.
posted by smackfu at 12:35 PM on February 19, 2009


I highly recommend the book MySQL Developer's Library by Paul DuBois.
posted by schrodycat at 6:17 PM on February 19, 2009


You're actually being the opposite of lazy, since parsing a list out of a cell will take more of your time and the database's time, even with indexing. Create a second table for queue items.
posted by mikeh at 6:55 PM on February 19, 2009


You are making a lot more work for yourself than you need to here. Since you're starting fresh there is no reason not to normalize the tables. Don't do this array-as-string stuff; you're just asking for trouble. I have some situations where it's unavoidable at work and it's the bane of my existence. Really, it sucks. If you do it this way, someone down the line is going to curse you for it.

What I would do is something like mkultra suggests; have a table that contains userids in one column and a queue ID in the other; then have another table that contains the queueids and queue names (if you really need to have the separate queue ID -> queue name abstraction).

If you do things this way, you can get all the queues that apply to a particular user ID with a simple SELECT query, and then you can get all the queue names that apply to that user with a nearly-as-simple JOIN or even just a subselect.* This doesn't require any string parsing at all, and it's all very basic SQL. You don't need to use any fancy array-column features; you could do it in any RDBMS you want, pretty much.
CONTACT
======
UserID    QueueID
--------  ---------
a12345    Q1
a12345    Q2
b34567    Q1

QUEUENAMES
==========
QueueID       QueueName
----------    ---------
Q1            SomeQueue
Q2            AnotherQueue
Quick and dirty way to get QueueNames for a UserID, with a sub-select:
SELECT QueueName FROM QUEUENAMES WHERE QueueID IN (SELECT DISTINCT QueueID FROM CONTACT WHERE UserID='a12345')

Very occasionally, storing an array as a string and stuffing it into a DB column is a necessary evil. This almost always occurs when you're shoehorning something into a database schema that was never designed to handle it, you don't have the ability to add tables or otherwise change the schema, and you have to get the job done yesterday. Even in those situations it's an ugly, sub-optimal hack. Unless there is some complication or subtlety that I'm just not seeing here, I don't know why you'd do it for new work.

* The JOIN method is probably the preferred approach if you were going to be doing this in volume, because I suspect it's more efficient. I've always been told that sub-selects should be avoided on large tables because they basically do the worst-case search method every time. This sort of performance stuff is getting out of my depth, though.
posted by Kadin2048 at 8:06 PM on February 19, 2009


« Older I'll give you something not to cry about   |   Preventative medicine for our Jeep? Newer »
This thread is closed to new comments.