How can I merge only the relevant results in SQL
February 21, 2013 6:29 AM Subscribe
I have a table in my database.
It has a bunch of 4 digit codes representing a location (as well as other data).
The problem is that some of these locations need to be replaced with an updated code. I have a second table with the substitutions that need to be made, but it is only a small minority of codes that need updating.
How do I format my SQL statement to produce me a list of all records, with the new location code if it has a new code and the old code if there is no update needed.
*BONUS ROUND*
There are actually two columns using these location codes (origin and destination) can I update both these columns at the same time in the same way?
*BONUS ROUND*
There are actually two columns using these location codes (origin and destination) can I update both these columns at the same time in the same way?
Note: I'm an Oracle programmer. Oracle, MySQL and SQLServer all have slightly different ways of handling subqueries. So YMMV, depending on platform.
posted by mneekadon at 6:44 AM on February 21, 2013 [1 favorite]
posted by mneekadon at 6:44 AM on February 21, 2013 [1 favorite]
Adapting from here, you could try this:
UPDATE myTable m SET locationCode = (SELECT newLocationCode FROM changesTable WHERE oldLocationCode = m.locationCode);
Caveat: I'm pretty novice when it comes to complex SQL. You may be able to do this for two columns by using SELECT a=(b), c=(d) instead of SET a=(b).
posted by hanoixan at 6:47 AM on February 21, 2013 [1 favorite]
UPDATE myTable m SET locationCode = (SELECT newLocationCode FROM changesTable WHERE oldLocationCode = m.locationCode);
Caveat: I'm pretty novice when it comes to complex SQL. You may be able to do this for two columns by using SELECT a=(b), c=(d) instead of SET a=(b).
posted by hanoixan at 6:47 AM on February 21, 2013 [1 favorite]
Response by poster: Minor additional data:
It is MS SQL Server
I want to not modify the original table, (although I guess I could just copy it and play with that..)
Oh and by format my SQL I basically meant uh.. write it.
Sorry, I'll be quiet now.
Thanks all so far.
posted by Just this guy, y'know at 6:51 AM on February 21, 2013
It is MS SQL Server
I want to not modify the original table, (although I guess I could just copy it and play with that..)
Oh and by format my SQL I basically meant uh.. write it.
Sorry, I'll be quiet now.
Thanks all so far.
posted by Just this guy, y'know at 6:51 AM on February 21, 2013
How is it that you determine which rows in the first table need updating? Is it just based on which code is used, e.g. all rows with the value "Ind." need to be replaced with "IN" in the way that the US Post Office changed state abbreviations in 1987^, and the second table contains the mappings between the old and new codes?
Note that hanoixan's suggestion of using an UPDATE command is what you'd do to change the data in the table rather than just return a list from a query.
posted by XMLicious at 6:52 AM on February 21, 2013 [1 favorite]
Note that hanoixan's suggestion of using an UPDATE command is what you'd do to change the data in the table rather than just return a list from a query.
posted by XMLicious at 6:52 AM on February 21, 2013 [1 favorite]
Yes. The part inside the parens is the subquery. Have my instructor hat on when I should have my work hat on, sorry.
StackOverflow answers all tech questions, eventually.
posted by mneekadon at 6:52 AM on February 21, 2013 [1 favorite]
StackOverflow answers all tech questions, eventually.
posted by mneekadon at 6:52 AM on February 21, 2013 [1 favorite]
I did this yesterday! Normally, yes, MERGE is the way to go. I had to run this on SQL Server 2000, which doesn't exactly support MERGE, so here's my code in case it helps.
[Products] is the original table, [Updates] contains only the rows that need updating.
posted by Nonsteroidal Anti-Inflammatory Drug at 6:55 AM on February 21, 2013
[Products] is the original table, [Updates] contains only the rows that need updating.
update [PRODUCTS]
SET
[Species] = u.[Species]
,[Color] = u.[Color]
,[Store] = u.[Store]
,[About] = u.[About]
FROM [Updates] AS u
inner join [PRODUCTS]
ON [PRODUCTS].[sku] = u.[sku]
posted by Nonsteroidal Anti-Inflammatory Drug at 6:55 AM on February 21, 2013
On lack of preview...
I want to not modify the original table
Do you mean you just want to SELECT from the two tables and display the relevant information without changing anything?
posted by Nonsteroidal Anti-Inflammatory Drug at 6:58 AM on February 21, 2013
I want to not modify the original table
Do you mean you just want to SELECT from the two tables and display the relevant information without changing anything?
posted by Nonsteroidal Anti-Inflammatory Drug at 6:58 AM on February 21, 2013
Ah, missed the part about it being a query and not an update.
You do an outer join for that.
select [fields you want except the lookup] from original table, ifnull(replacement value, original value) (not sure the MS syntax for use this one if that one is null)
left outer join lookup table
on original table.field = lookuptable.field
something like that.
To do two outer joins (one for each field you're looking up) I think you have to alias the original table.
If I were doing this all the time, I'd just write a stored proc to look it up and call that once for each value.
posted by mneekadon at 7:04 AM on February 21, 2013
You do an outer join for that.
select [fields you want except the lookup] from original table, ifnull(replacement value, original value) (not sure the MS syntax for use this one if that one is null)
left outer join lookup table
on original table.field = lookuptable.field
something like that.
To do two outer joins (one for each field you're looking up) I think you have to alias the original table.
If I were doing this all the time, I'd just write a stored proc to look it up and call that once for each value.
posted by mneekadon at 7:04 AM on February 21, 2013
Response by poster: In answer to questions asked:
Some example date:
Main Table:
OriginCode | Destination Code | Other data
1234 2546 Blah
5489 1235 Blah
8479 5479 Blah
Other Table:
OLD | NEW
1234 1237
1235 5479
So I would merge the OriginCode with the Other table and where there was a 1234 it would update to 1237.
But the other two rows (not having entries in the other table) would remain the same.
Does that make sense?
Also, yes I want to leave the original table alone and display the relevant data back.
Again, all the thanks so far.
This is something I ought to know how to do, but I'm completely blanking on it.
posted by Just this guy, y'know at 7:04 AM on February 21, 2013
Some example date:
Main Table:
OriginCode | Destination Code | Other data
1234 2546 Blah
5489 1235 Blah
8479 5479 Blah
Other Table:
OLD | NEW
1234 1237
1235 5479
So I would merge the OriginCode with the Other table and where there was a 1234 it would update to 1237.
But the other two rows (not having entries in the other table) would remain the same.
Does that make sense?
Also, yes I want to leave the original table alone and display the relevant data back.
Again, all the thanks so far.
This is something I ought to know how to do, but I'm completely blanking on it.
posted by Just this guy, y'know at 7:04 AM on February 21, 2013
Best answer: Ah, gotcha.
Then you want something like mneekoadon said. I tested this out with your sample data on SQL Server 2000:
Basically saying, bring in the [Changes] data if it matches, and then show the original data if [Changes] doesn't have anything (is null).
That results in this output:
posted by Nonsteroidal Anti-Inflammatory Drug at 7:16 AM on February 21, 2013 [3 favorites]
Then you want something like mneekoadon said. I tested this out with your sample data on SQL Server 2000:
SELECT ISNULL([OChanges].[New],[Origin]) as [Origin]
,ISNULL([DChanges].[New],[Origin]) as [Destination]
,[Other]
FROM [Main]
Left Outer Join [Changes] as [OChanges]
on [Origin] = [OChanges].[Old]
Left Outer Join [Changes] as [DChanges]
on [Destination] = [DChanges].[Old]
Basically saying, bring in the [Changes] data if it matches, and then show the original data if [Changes] doesn't have anything (is null).
That results in this output:
Origin Destination Other
-----------------------------------------
1237 1234 Blah
5489 5479 Blah 2
8479 8479 Blah 3
posted by Nonsteroidal Anti-Inflammatory Drug at 7:16 AM on February 21, 2013 [3 favorites]
Response by poster: Hooray, that appears to have worked perfectly.
Thank you all so much.
Everyone gets a pony (If you want a pony).
posted by Just this guy, y'know at 7:51 AM on February 21, 2013
Thank you all so much.
Everyone gets a pony (If you want a pony).
posted by Just this guy, y'know at 7:51 AM on February 21, 2013
This thread is closed to new comments.
You could do both updates at the same time but it'll be quicker to use two statements than to do it in one. Just use the same statement and switch out the field name of the origin with the field name of the destination.
posted by mneekadon at 6:42 AM on February 21, 2013