Merging tables in mysql
February 2, 2009 11:36 AM   Subscribe

How do I merge three tables in mysql with duplicate entries for the primary key?

I want to merge tables 1,2, and 3.
Each one has a field called ID (the primary key) which is set to auto_increment.

I want to merge all three tables into one. However, I cannot merge because mysql coughs up a duplicate entry on the primary key.

I'm guessing that the solution would be to insert a new field into table two that is Total number of records in table 1 + ID (and then do a similar thing with table 3). After that I would delete the ID field and make the new field the primary key. How do I do this?

Is there an easier way?
posted by special-k to Computers & Internet (5 answers total) 2 users marked this as a favorite
 
Best answer: 1.) Select all the rows minus the current identity column into a new table.

create table foo (every, column, except, IDfield)

insert into foo ( every, column, except, idfield)

select every, column, except, idfield from
(
select ( every, column, except, IDfield) from table1 UNION ALL
select (every, column, except, idfield ) from table2 UNION ALL
select(every, column, except, idfield) from table3
) a

2.) add a new int identity column to the new table:

ALTER TABLE table_name ADD COLUMN id INTEGER AUTO_INCREMENT UNIQUE KEY
posted by spatula at 11:44 AM on February 2, 2009


Assuming you have no FKs or any other reasons you need your update to cascade to other tables...

Table T1 with Columns ID, C1, C2, C3.
Table T2 with Columns ID, C1, C2, C3.
Table T3 with Columns ID, C1, C2, C3.

You want to merge into T1.

INSERT INTO T1 (C1, C2, C3) SELECT C1, C2, C3 FROM T2
INSERT INTO T1 (C1, C2, C3) SELECT C1, C2, C3 FROM T3

That will just generate new keys for the records from T2 and T3 for you.

If you want to merge into a completely new table, just make it first and do 3 INSERT INTO ... SELECTs.
posted by cmm at 11:58 AM on February 2, 2009


If you want to retain the original identifier from your source tables, include an integer field that's non-identity, insert * from your source table into that.

If you need an identifier for the new table, well, then create that too. If you have to relate back to the source table, you've got the non-identity integer column. If you want to know which of the source tables it came from, I'd suggest shoving a label 'Table1','Table2','Table3' into another column called [SourceTable] on your insert.
posted by MatJ at 1:30 PM on February 2, 2009


Best answer: spatula's answer, where he specifies UNION ALL, is problematic.

If, disregarding the id, two rows from different source tales have exactly the same data in the same columns, should the merged table have one row for that data, or two rows? If the OP's answer is "one row", then "Union ALL" should be "Union distinct" (or just "union", as distinct is the default when neither is specified).

We can also make the process a bit simpler than spatula proposes:

create table merged like source_table_a;
-- merged now has the exact same structure as source_table_a, including an auto_increment id, but not data

insert into merged (cola, colb, colc)
-- cola, etc is simply the list of all columns in the table except the id
select cola, colb, colc from source_table_a
-- cola, etc is simply the list of all columns in the table except the id
union select cola, colb, colc from source_table_b
union select cola, colb, colc from source_table_c;

Note there's one problem here: we're losing the old id values, which are presumably used in other tables to link to the data in your three source tables. if that's the case, you have a larger pronlem here than merely merging.
posted by orthogonality at 3:24 PM on February 2, 2009 [1 favorite]


Since the OP was already talking about losing the old ids as a possible solution, I assumed it was fine to rekey the table.

If you copy all the records across you can always dedupe them later with a query like:

delete a
from mergetable a left join
( select min(id) id, mergecriteriacolumn1, mergecriteriacolumn2 ...
from mergetable
group by mergecriteriacolumn1, mergecriteriacolumn2... ) b
on a.id = b.id
where b.id is null



Remember you try and union distinct you could potentially lose information (it could be important to know if record FooBar was in both table 1 and table2)

Also the Union Distinct only dedupes across ALL contained fields. Using a query like above you can more finely tune how you dedupe the recordset based only on select columns, for example if you had

fname/lname/add1/zip

John Smith 123 rofl st 99999
John Smith 123 lollerskates ave 99999
John Smith 123 roflmao ct 99999
John Smith 123 omg blvd 99999

depending on your logic, you might only want 1 record per john smith, however if you relied on UNION DISTINCT since the add1s are different, you would get 4 records in your merge table.
posted by spatula at 5:11 PM on February 2, 2009


« Older Should have stuck to eating paste.   |   What song was this? Newer »
This thread is closed to new comments.