Best practice in implementing a two-part unique identifier?
May 9, 2011 9:52 AM   Subscribe

Best way to implement a two-part unique identifier in Access?

I've been tasked with building a solution that involves items with a two-part identifier that must be unique. Imagine the identifiers as letters and numbers. Each part can have multiple entries with the same values, but when put together, they must be unique.

AA01
AA02
BB01
BB02
etc.

I could make [LetterPart] and [NumberPart] together the primary key for the table. But I'd love to have a primary key that is just one entry, and still have the database reject any attempt to insert a non-unique [LetterPart] and [NumberPart] pair.

I'd also love for a way to have the database automatically increase the number part in a way that is aware of the letter part, so I don't have to do it in code. So I could tell it to Insert new AA, and it looks at the AA entries, sees the last one was 02, and creates AA03. Then when I tell it to insert CC, for example, it creats CC01.

Are these things possible? What are the known, sane approaches to this problem?
posted by jsturgill to Computers & Internet (14 answers total) 1 user marked this as a favorite
 
I suggest having a table that lists each letter part once and then lists the highest number used for that letter already. When new data is added, that table needs to be updated with the new highest number.

The CONCATENATE function will string both field values together into one value when you need them to be in the same column.
posted by soelo at 10:32 AM on May 9, 2011


Some light reading on Access and composite keys.
posted by rhizome at 10:51 AM on May 9, 2011


Is there any meaning to the letter prefix?

If there is then this is breaking a basic rule of relational design. If the prefix 'BB' means Electrical items, or items from Boston or items that get stored on the top shelf or whatever, then that information should be coded into a separate column, leaving a simple number as your primary key.

If there is no meaning to the prefix then why bother, just use a number.
posted by Lanark at 12:52 PM on May 9, 2011


Response by poster: Lanark, you're correct, that's essentially the reason for the letter prefix. The letters and numbers are going to reside in different columns.

The difficulty is that the numbers aren't sequential. This sequence is not the desired behavior:

AA1
AA2
AA3
BB4
AA5

This is the desired behavior:
AA1
AA2
BB1
etc.

If I put both columns down as a two-part primary key, the database will ensure there are never two AA1s, and yet not care if one or the other has duplicates on its own.

Right now I have an autoincrement, unique field which is not the primary key, but which I'll be using as a unique identifier internally, and the two-part primary key.

But I'm not sure if that's sensible, or if there's a much better approach. I'd also like to find a way to make the database increment the numbers paired with the letters sensibly, rather than doing so in code.
posted by jsturgill at 1:35 PM on May 9, 2011


So, what's the practical difference between:

AA1
AA2
BB1
BB2

and

AA1
AA2
BB3
BB4

It sounds like both parts have meaning, a category code and a sequence number within that category (which apparently means something...order of addition to the db?). You don't have to do it that way, and it doesn't have to be your primary key. Given my assumption you could select on a category and order them by created_at or something and get the same result, no composites necessary. Just spitballin'.
posted by rhizome at 1:51 PM on May 9, 2011


Response by poster: The [category][number] identifiers are for people, really, not the database, and definitely a requirement. (If the boss wants it, he gets it!)
posted by jsturgill at 2:11 PM on May 9, 2011


XY Problem
posted by rhizome at 4:06 PM on May 9, 2011


Response by poster: Let me try to restate things, then.

The thing I need to produce is a program built on an Access database that tracks items, each of which has a unique two-part identifier.

Those are requirements that cannot change. The items can have any other number of identifiers behind the scenes, but the two-part letter+number identifier must be something that can be either retrieved or calculated.

My current thought process:

I need to track things simply through the program, so I should give each item a regular autoincrement, unique ID field.

I need to ensure the letters+number identifiers are unique, so I should create a composite primary key, which will cause the database to enforce data entegrity.

What my thought process leads me to ask:

1. Is this a problem that people with more experience or formal training would resolve some other way behind the scenes?

2. Is there some database feature that would allow the database to intelligently increment the number part of the letters+number composite in the manner described previously (a different sequence for each unique letterpart) without having to do so in code?
posted by jsturgill at 4:29 PM on May 9, 2011


Is there some database feature that would allow the database to intelligently increment the number part of the letters+number composite in the manner described previously (a different sequence for each unique letterpart) without having to do so in code?

No, not in Access or any other relational database system I know of.

While its perfectly possible to code something the way you want, it is likely to have rather sluggish performance, particularly when you have multiple users and/or a lot of data.
posted by Lanark at 10:49 AM on May 10, 2011


Response by poster: It really adds that much overhead to use select max with a where clause to get the last number, add one to it, and then use that number to insert the new record?
posted by jsturgill at 4:51 PM on May 10, 2011


A Select Max(..) will do a full table scan, imagine your table holds 1 million rows and 10 people are all inserting new records, that means retrieving 10 million rows!

Worth mentioning that if you want a short readable alphanumeric rather than a long number, you could display it as base 36
posted by Lanark at 11:06 AM on May 11, 2011


Response by poster: Interesting! I think it will be a non-problem due to the size of the database (small) and the number of users (small), but I hadn't realized that was how the Max function worked. Is that the case even on fully indexed rows? The database isn't smart enough to do it any way other than looking at every single row and comparing them?
posted by jsturgill at 7:21 PM on May 11, 2011


Response by poster: And is that also the case for Select Top 1...[where clause]...Order by [numberfield]?
posted by jsturgill at 7:31 PM on May 11, 2011


Indexes will help obviously, but they arent a magic bullet. As you have multiple inserts happening then the indexes need to be updated too.

The tricky thing about database design is that *every* database starts out small. The successful ones always get bigger, sometimes a lot bigger, I always try to build things that retain the possibility to be scaled up if needed.
posted by Lanark at 2:21 PM on May 12, 2011


« Older Percentage to settle a debt?   |   Marvel vs Capcom 3 advice Newer »
This thread is closed to new comments.