MySQL joins
August 21, 2008 9:58 PM Subscribe
I have a problem joining three tables in MYSQL.
I have three tables I'd like to join and I can't quite get the SQL syntax right. Table 1 has two columns of numbers. I'd
like to join one of these columns with the same column in table 2, and the other column with its match in table 3.
So you could say there are two primary keys perhaps?
Could any SQL expert give me the syntax to return a list of all entries in table 1 with either a match in table 2 or 3?
Obviously I'm a bit of a beginner at this ;)
I have three tables I'd like to join and I can't quite get the SQL syntax right. Table 1 has two columns of numbers. I'd
like to join one of these columns with the same column in table 2, and the other column with its match in table 3.
So you could say there are two primary keys perhaps?
Could any SQL expert give me the syntax to return a list of all entries in table 1 with either a match in table 2 or 3?
Obviously I'm a bit of a beginner at this ;)
Assuming that you mean "either a match in table 2 or 3 or both":
select t1.x, t1.y from t1, t2, t3 where (t1.x = t2.x) or (t1.y = t3.y);
If you mean "... and not both", then:
select t1.x, t1.y from t1, t2, t3 where ((t1.x = t2.x) or (t1.y = t3.y)) and ((t1.x != t2.x) or (t1.y != t3.y));
I make no guarantees that these are optimally efficient or even near so.
posted by Flunkie at 10:20 PM on August 21, 2008
select t1.x, t1.y from t1, t2, t3 where (t1.x = t2.x) or (t1.y = t3.y);
If you mean "... and not both", then:
select t1.x, t1.y from t1, t2, t3 where ((t1.x = t2.x) or (t1.y = t3.y)) and ((t1.x != t2.x) or (t1.y != t3.y));
I make no guarantees that these are optimally efficient or even near so.
posted by Flunkie at 10:20 PM on August 21, 2008
I'd like to help, but I'm a bit confused. Can you describe the tables involved? For example, something like:
And then can you indicate the names of the columns you're trying to join (and under what conditions, like
posted by Blazecock Pileon at 10:24 PM on August 21, 2008
tableA
----------
id (INT, PRIMARY KEY)
values (INT)
...
tableB
----------
id (INT, PRIMARY KEY)
tableC_id (INT)
...
tableC
----------
id (INT, PRIMARY KEY)
...
And then can you indicate the names of the columns you're trying to join (and under what conditions, like
tableB.tableC_id = tableC.id
, for example)?posted by Blazecock Pileon at 10:24 PM on August 21, 2008
"all entries in table"
This means an outer join. People tend to do outer joins as left outer joins, because that's what most databases support, because people tend to do left outer joins, because.... So we'll do left outer joins, to table2 and to table3. I'll give the tables correlation names or aliases of "a", "b", and "c" to minimize my typing.
Please think of a join as an operator, because it is. Just as I can write 2 + 4 + 8, which means (2 + 4) + 8, I can write relation join relation join relation, which means (relation join relation) join relation, where relation is some table or view. Note that the second join is joining the result of the first join to the third relation.
"I'd like to join one of these columns with the same column in table 2, and the other column with its match in table 3. "
To do that, we will also qualify each join with a predicate, in the form on (predicate). By match, you mean an equality predicate, so we'll use equal. Of course you know that null == null is false, so if a column's value is null, we won't be matching on that.
All of that, to produce this quite simple query:
select * from
table1 a left outer join table2 b on (a.some_col = b.some_col)
left outer join table3 c on (a.some_other_col = c.some_other_col);
posted by orthogonality at 10:28 PM on August 21, 2008
This means an outer join. People tend to do outer joins as left outer joins, because that's what most databases support, because people tend to do left outer joins, because.... So we'll do left outer joins, to table2 and to table3. I'll give the tables correlation names or aliases of "a", "b", and "c" to minimize my typing.
Please think of a join as an operator, because it is. Just as I can write 2 + 4 + 8, which means (2 + 4) + 8, I can write relation join relation join relation, which means (relation join relation) join relation, where relation is some table or view. Note that the second join is joining the result of the first join to the third relation.
"I'd like to join one of these columns with the same column in table 2, and the other column with its match in table 3. "
To do that, we will also qualify each join with a predicate, in the form on (predicate). By match, you mean an equality predicate, so we'll use equal. Of course you know that null == null is false, so if a column's value is null, we won't be matching on that.
All of that, to produce this quite simple query:
select * from
table1 a left outer join table2 b on (a.some_col = b.some_col)
left outer join table3 c on (a.some_other_col = c.some_other_col);
posted by orthogonality at 10:28 PM on August 21, 2008
What your'e asking is confusing. I think you need to be more clear.
The most confusing part is that you want to get all items that have a match on EITHER table 2 OR table 3... I mean.. this is possible.. but why do you want to do this? Are you sure your database is designed correctly? A real world example would help.
From the "Teach a man to fish" department -- you should read about the difference between RIGHT JOIN, LEFT JOIN and INNER JOIN ... therein lies your answer...
posted by twiggy at 10:33 PM on August 21, 2008
The most confusing part is that you want to get all items that have a match on EITHER table 2 OR table 3... I mean.. this is possible.. but why do you want to do this? Are you sure your database is designed correctly? A real world example would help.
From the "Teach a man to fish" department -- you should read about the difference between RIGHT JOIN, LEFT JOIN and INNER JOIN ... therein lies your answer...
posted by twiggy at 10:33 PM on August 21, 2008
Oh. Either table2 or table3? That's a full join. Use that if your database supports it. If not, use a union (over table1) of the two joins I showed you.
select a.* from
table1 a left outer join table2 b on (a.some_col = b.some_col)
union
select a.* from
table1 left outer join table3 c on (a.some_other_col = c.some_other_col);
-- yes, I can legally reuse the correlation name "a", as the name's scope doesn't extend past the union.
And flunkie, take a closer look at your where clause.
posted by orthogonality at 10:35 PM on August 21, 2008
select a.* from
table1 a left outer join table2 b on (a.some_col = b.some_col)
union
select a.* from
table1 left outer join table3 c on (a.some_other_col = c.some_other_col);
-- yes, I can legally reuse the correlation name "a", as the name's scope doesn't extend past the union.
And flunkie, take a closer look at your where clause.
posted by orthogonality at 10:35 PM on August 21, 2008
And flunkie, take a closer look at your where clause.Which one, and what about it?
posted by Flunkie at 10:39 PM on August 21, 2008
Er, and I should re-read my last comment; in my second comment, take off the "left outer", to get this:
select a.* from
table1 a join table2 b on (a.some_col = b.some_col)
union
select a.* from
table1 join table3 c on (a.some_other_col = c.some_other_col);
MySQL doesn't do full joins, even if you're clever and try to do it like flunkie did by filtering a cross-join. (Yes, that should work (flunkie's incorrect predicate aside), but it's MySQL, and so it doesn't work, at least through MySQL 5.0.)
So in MySQL you have to do the union.
As Twiggy points out, this has the smell of bad table design.
posted by orthogonality at 10:43 PM on August 21, 2008
select a.* from
table1 a join table2 b on (a.some_col = b.some_col)
union
select a.* from
table1 join table3 c on (a.some_other_col = c.some_other_col);
MySQL doesn't do full joins, even if you're clever and try to do it like flunkie did by filtering a cross-join. (Yes, that should work (flunkie's incorrect predicate aside), but it's MySQL, and so it doesn't work, at least through MySQL 5.0.)
So in MySQL you have to do the union.
As Twiggy points out, this has the smell of bad table design.
posted by orthogonality at 10:43 PM on August 21, 2008
Could you please explicitly say what you believe is wrong with my answer?
posted by Flunkie at 10:44 PM on August 21, 2008
posted by Flunkie at 10:44 PM on August 21, 2008
Flunkie writes "Which one, and what about it?"
My mistake; it's late and I'm tired.
posted by orthogonality at 10:47 PM on August 21, 2008
My mistake; it's late and I'm tired.
posted by orthogonality at 10:47 PM on August 21, 2008
Flunkie writes "Could you please explicitly say what you believe is wrong with my answer?"
It's correct SQL but won't work under MySQL, because MySQL won't do full outer joins even "disguised" as filtered cross-joins.
posted by orthogonality at 10:49 PM on August 21, 2008
It's correct SQL but won't work under MySQL, because MySQL won't do full outer joins even "disguised" as filtered cross-joins.
posted by orthogonality at 10:49 PM on August 21, 2008
Response by poster: I'll try some of these this morning and report back.
To give a real world example of its application, table1 is a master list of webpages, each with a page ID (table1) and a version ID (table2). Table2 is a whole bunch of information about the versions, which of course match with the version ID, and table3 is just a bookmark on the page, which matches with the page ID.
I'm not sure if this is bad site design...it would seem to be a pretty common scenario?
Thank you very much for your detailed answers.
posted by dydecker at 5:34 AM on August 22, 2008
To give a real world example of its application, table1 is a master list of webpages, each with a page ID (table1) and a version ID (table2). Table2 is a whole bunch of information about the versions, which of course match with the version ID, and table3 is just a bookmark on the page, which matches with the page ID.
I'm not sure if this is bad site design...it would seem to be a pretty common scenario?
Thank you very much for your detailed answers.
posted by dydecker at 5:34 AM on August 22, 2008
Response by poster: Sorry, that should have read "each with a page ID (column1) and a version ID (column2)"...
posted by dydecker at 5:37 AM on August 22, 2008
posted by dydecker at 5:37 AM on August 22, 2008
OK, so what you have is three tables:
• "bookmarks" has the bookmark data with a primary key "bookmark_id" and a foreign key "page_id" into the "pages" table
• "pages" has Web page records, each with one primary key "page_id" and one foreign key "version_id" into the "versions" table
• "versions" has the primary key "version_id" version metadata - or, if you want to be able to see old page versions like in a Wiki, this is where the actual page HTML is
So if you are given a bookmark ID and want the page and version your query becomes:
posted by nicwolff at 7:34 AM on August 22, 2008
• "bookmarks" has the bookmark data with a primary key "bookmark_id" and a foreign key "page_id" into the "pages" table
• "pages" has Web page records, each with one primary key "page_id" and one foreign key "version_id" into the "versions" table
• "versions" has the primary key "version_id" version metadata - or, if you want to be able to see old page versions like in a Wiki, this is where the actual page HTML is
So if you are given a bookmark ID and want the page and version your query becomes:
select * from bookmarks join pages using (page_id) left join versions using (version_id) where bookmark_id = ?
. That will return an empty result table if the bookmark has no page but will return pages that have no associated version records.posted by nicwolff at 7:34 AM on August 22, 2008
Sorry, I meant:
"versions" has the primary key "version_id" and the version metadata
posted by nicwolff at 7:35 AM on August 22, 2008
"versions" has the primary key "version_id" and the version metadata
posted by nicwolff at 7:35 AM on August 22, 2008
(And here's one for free: if you use the same name for the primary and foreign keys - for example, if you call the primary key for the "pages" table "page_id" and then use the same name for the foreign key in the "bookmarks" table - you can join them with the abbreviated syntax
posted by nicwolff at 7:46 AM on August 22, 2008
select * from bookmarks natural join pages
. This is very elegant and handy but DO NOT USE IT. Later you or another developer will add a second column to one of those tables with the same name as one in the other, and the "natural join" will suddenly return no rows, which will cost someone hours of debugging. NATURAL JOIN is a bad monkey!)posted by nicwolff at 7:46 AM on August 22, 2008
Response by poster: Okay, after much investigation, the secret seems to be...two left joins.
SELECT table1.col_1, table1.col_2
FROM table1
LEFT JOIN table2 ON table2.col_1 = table.col_2
LEFT JOIN table3 ON table3.col_1 = table1.col_1
Translated into my case:
SELECT masterlist.pageID, masterlist.versionID
FROM masterlist
LEFT JOIN versionlist ON versionlist.versionID = masterlist.versionID
LEFT JOIN bookmark ON bookmark.pageID = masterlist.pageID
This seems to return all pages from the masterlist which have either a bookmark or are the latest version of a page...which is what I need.
Thanks to all
posted by dydecker at 7:48 AM on August 22, 2008
SELECT table1.col_1, table1.col_2
FROM table1
LEFT JOIN table2 ON table2.col_1 = table.col_2
LEFT JOIN table3 ON table3.col_1 = table1.col_1
Translated into my case:
SELECT masterlist.pageID, masterlist.versionID
FROM masterlist
LEFT JOIN versionlist ON versionlist.versionID = masterlist.versionID
LEFT JOIN bookmark ON bookmark.pageID = masterlist.pageID
This seems to return all pages from the masterlist which have either a bookmark or are the latest version of a page...which is what I need.
Thanks to all
posted by dydecker at 7:48 AM on August 22, 2008
For future reference with regards to the types of joins, this visual representation is a handy reference:
http://www.codinghorror.com/blog/archives/000976.html
I always find it by googling visual joins.
posted by ChefQuix at 8:14 AM on August 22, 2008
http://www.codinghorror.com/blog/archives/000976.html
I always find it by googling visual joins.
posted by ChefQuix at 8:14 AM on August 22, 2008
NATURAL JOIN is a bad monkey!
Jesus fuck that's some bad queryin'.
MySQL doesn't do full joins
One more reason to hate MySQL. I simply can't understand why anyone would use MySQL when there's a perfectly good open-source transactional DB available (rhymes with "ROAST PRESS").
all pages from the masterlist which have either a bookmark or are the latest version of a page
Just FYI: left joins are dangerous and, I dare say, bad. It seems to me that there should be a better way to do this query if the above criteria is what you're looking for. Without fully understanding the schema, it looks like you're really trying to do a UNION. Ortho is usually really good as these kinds of things--I think with a better explanation of your table structure you could get a crazy-fast query. Then again, if you're only talking about a couple thousand rows, maybe it doesn't matter. But just doing two left joins like above means you're getting duplicates (unless you group by masterlist.pageID).
posted by Civil_Disobedient at 11:35 PM on August 22, 2008
Jesus fuck that's some bad queryin'.
MySQL doesn't do full joins
One more reason to hate MySQL. I simply can't understand why anyone would use MySQL when there's a perfectly good open-source transactional DB available (rhymes with "ROAST PRESS").
all pages from the masterlist which have either a bookmark or are the latest version of a page
Just FYI: left joins are dangerous and, I dare say, bad. It seems to me that there should be a better way to do this query if the above criteria is what you're looking for. Without fully understanding the schema, it looks like you're really trying to do a UNION. Ortho is usually really good as these kinds of things--I think with a better explanation of your table structure you could get a crazy-fast query. Then again, if you're only talking about a couple thousand rows, maybe it doesn't matter. But just doing two left joins like above means you're getting duplicates (unless you group by masterlist.pageID).
posted by Civil_Disobedient at 11:35 PM on August 22, 2008
This thread is closed to new comments.
posted by dydecker at 9:59 PM on August 21, 2008