SQL Query to find unique combinations
March 14, 2012 1:42 PM   Subscribe

SQLNoobFilter: I am trying to find all unique combinations on a table. Does my simple query do the trick or is there more to it?

For the purpose of this question, say I have a simple table with two variables, Letter and Name. I am trying to output the unique combinations of both Letter and Name so I can have one instance for every combination that is on the table.

My code is as follows: select distinct(letter),name from TableA

Will this get me to the output desired on the bottom?

Example TableA:
Letter Name
A Bob
A Larry
A Bob
B Bob
B Larry
C Larry
D Larry
D Larry

Output:
Letter Name
A Bob
A Larry
B Bob
B Larry
C Larry
D Larry

Thanks in advance -
posted by amazingstill to Computers & Internet (5 answers total)
 
I'd write it as follows:

SELECT DISTINCT Letter, Name FROM TableA , but yeah, you've got it.
posted by deezil at 1:48 PM on March 14, 2012 [1 favorite]


Seconding deezil,

select distinct letter, name from tablea
posted by teragram at 2:56 PM on March 14, 2012


Distinct will work. A better way which opens up more options requires you to have an identity/unique identifier on the field, call it id:

id letter name
1 A Bob
2 A Larry
3 A Bob
4 B Bob
5 B Larry
6 C Larry
7 D Larry
8 D Larry

Then make a subquery taking the min(id) grouped on whatever you want to dedupe on, and left join it back to the main table:

select a.*, b.minid
from TableA left join ( select min(id) minid, Letter, Name from TableA group by Letter, Name) MinTableA
on TableA.id = MinTableA.minid

Results will be:
1 A Bob 1
2 A Larry 2
3 A Bob NULL
4 B Bob 4
5 B Larry 5
6 C Larry 6
7 D Larry 7
8 D Larry NULL

If you just want the uniques, only pick where it matches to the min:

select a.*, b.minid
from TableA left join ( select min(id) minid, Letter, Name from TableA group by Letter, Name) MinTableA
on TableA.id = MinTableA.minid
where MinTableA.minid is not null

This approach lets you isolate the exact records that are duplicates if you want to clean them up or report on them, they are the records on TableA where the id is not equal to the MinID.
posted by spatula at 4:33 PM on March 14, 2012


A better way which opens up more options

I don't know... This takes longer to write, is harder to read, and performs (much) worse, and for what? It doesn't solve the problem at hand any better.

One thing I've learned writing SQL is to keep it simple: Solve the problem you set out to solve and move on. More just causes headaches. (In this particular case, for example, if you wanted to select or delete duplicates, there are better ways; laying the groundwork like this wouldn't pay off.)


What the OP wants here is simply, as stated above:
SELECT DISTINCT
    Letter,
    Name
FROM TableA;
The real reason I stopped to comment is to clarify what DISTINCT is doing here. It's modifying SELECT, not Letter, and this is important to understand: "select distinct(letter),name from TableA" suggests to me that this may not have been clear.

Also, if you want the output nicely sorted like in the question, use
SELECT DISTINCT
    Letter,
    Name
FROM TableA
ORDER BY Letter, Name;

posted by kprincehouse at 11:21 PM on March 14, 2012


Response by poster: Thanks everyone - and you are right kprincehouse, my thinking was that distinct modifies the variable after it. I am going to have to read some more on the net. Thanks.
posted by amazingstill at 5:39 AM on March 15, 2012


« Older Cheap sailing in Chicago   |   Forced out Newer »
This thread is closed to new comments.