Primary Keys vs GUIDs
Just thought I would post some code having read Savvy Duck’s comments about the above. Although the project I refer to below was greenfield and didn’t require any kind of database migration, I never encountered any of the performance problems mentioned in some of the related articles.
We were using an Oracle 10g database however, and I’ll report an update if I see any issues with my latest project which is on MySQL and seems to have a bit of a history in this department. But it really was never an issue in any way that we recognized.
The likelihood of a primary key clash was calculated as 1 in 2.27 x 10exp32 for every one million records.
Anyway, the code. We were mapping to Java using Hibernate (avoiding the primary key problem mentioned here) like so:
public abstract class BasePersistentObject implements Serializable
{
// will be overrriden with data read from database
@Id
private String id = UUID.createUUID();
}
I can provide the above UUID code if anyone wants to have a play with this implementation.
Instead of using the standard auto-increment Sequence then, the primary key was implemented with the following function:
/**
* Returns a UID string.
*
* This is string based on a randomly generated bit sequence just like in the app server.
*/
CREATE OR REPLACE FUNCTION f_get_uid RETURN VARCHAR2 IS
BEGIN
RETURN UTL_RAW.CAST_TO_VARCHAR2(
UTL_RAW.TRANSLATE( DBMS_CRYPTO.RANDOMBYTES(24),
uid_util.uid_from_set,
uid_util.uid_to_set ) );
END;
/
-- defines two constants used by f_get_uid() to generate UID strings;
CREATE OR REPLACE PACKAGE uid_util
IS
uid_from_set CONSTANT RAW(256) := UTL_RAW.XRANGE(hextoraw('00'), hextoraw('FF'));
uid_to_set CONSTANT RAW(256) := UTL_RAW.CONCAT(
UTL_RAW.CONCAT(
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('A'), UTL_RAW.CAST_TO_RAW('Z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('a'), UTL_RAW.CAST_TO_RAW('z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('0'), UTL_RAW.CAST_TO_RAW('9')),
UTL_RAW.CAST_TO_RAW('-'),
UTL_RAW.CAST_TO_RAW('_')
),
UTL_RAW.CONCAT(
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('A'), UTL_RAW.CAST_TO_RAW('Z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('a'), UTL_RAW.CAST_TO_RAW('z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('0'), UTL_RAW.CAST_TO_RAW('9')),
UTL_RAW.CAST_TO_RAW('-'),
UTL_RAW.CAST_TO_RAW('_')
),
UTL_RAW.CONCAT(
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('A'), UTL_RAW.CAST_TO_RAW('Z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('a'), UTL_RAW.CAST_TO_RAW('z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('0'), UTL_RAW.CAST_TO_RAW('9')),
UTL_RAW.CAST_TO_RAW('-'),
UTL_RAW.CAST_TO_RAW('_')
),
UTL_RAW.CONCAT(
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('A'), UTL_RAW.CAST_TO_RAW('Z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('a'), UTL_RAW.CAST_TO_RAW('z')),
UTL_RAW.XRANGE(UTL_RAW.CAST_TO_RAW('0'), UTL_RAW.CAST_TO_RAW('9')),
UTL_RAW.CAST_TO_RAW('-'),
UTL_RAW.CAST_TO_RAW('_')
)
);
END;
/
1 Comment to Primary Keys vs GUIDs
Leave a comment
You must be logged in to post a comment.
Agile CTO
- tech_startup: Puppet certificate issues http://t.co/oM6y53rx 4 days ago
- have invented a new UNIX tool for cutting the grass: sudo chmown aeells:aeells squid.conf.bkp sudo: chmown: command not found 5 days ago
- I support #wikipediablackout Show your support here http://t.co/UFN8O0gk 2 weeks ago
- reasonable man adapts himself to world; unreasonable man tries to adapt world to himself; => all progress depends on unreasonable man. 3 weeks ago
- @blinkdesign we could do that too!!!!? ;o) in reply to blinkdesign 3 weeks ago
- @blinkdesign we could do something similar on our tech blog maybe... in reply to blinkdesign 3 weeks ago
- ".....in all things, the supreme excellence is simplicity." Henry Wadsworth Longfellow 3 weeks ago
- good day bootstrapping #tomcat #memcached to #aws server via #puppet - looking forward to provisioning entire production replica in minutes! 3 weeks ago
- great (practical) example of how to do #continousdeployment and branching within teams http://t.co/ceeyvD0h courtesy @chacon @domfarr 2012-01-05
- hi @ruv, mind if i ask how you came by that statistic? cheers! in reply to ruv 2012-01-02
- More updates...



I’ll second the vote for UUIDs as primary keys.
Several years ago, I had a customer with to copies of a database – one “current”, and one where they ran another copy of the application to try to catch up on data entry for past years.
Of course, once they finished the “past years” data entry, they wanted to merge the two databases. Of course, the two databases both used sequences to generate primary keys, and both started the sequences at the same starting values.
Also, of course, I had to advise the customer that it wasn’t worth in terms of cost to do the merge, or the danger to the integrity of the data.
Ever since, I’ve used UUIDs whenever I can.
I’ve never seen a performance hit for doing this, but then, I’ve never worked on a VLDB