Re: pervasiveness of surrogate (also called synthetic) keys

From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: pervasiveness of surrogate (also called synthetic) keys
Date: 2011-04-28 21:07:42
Message-ID: 20110428210742.GN3783@hermes.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Apr 28, 2011 at 03:39:19PM -0500, Merlin Moncure wrote:

> They are fairly pervasive, and increasingly so, which I find to be
> really unfortunate. Personally I think rote use of surrogate keys is
> terrible and leads to bad table designs, especially if you don't
> identify the true natural key with, say, a unique constraint.

I was recently asked on this list why GNUmed uses all those
surrogate keys.

I should have added to my answer that we DO make *extensive*
use of all sorts of built-in constraints and custom triggers
to enforce "natural" keys. I must agree with a recent poster
that what appears to identify as a natural key often really
isn't or else becomes not so later on. It's vastly easier to
then deal with that by re-defining constraints without
having to touch primary keys.

> This
> pushes duplicate enforcement out of the database and into application
> code, or (even worse) the user. What everyone misses in the database
> is that natural keys force good database design...if you can't
> discover one, you probably have a design problem.

Sure but that doesn't mean you need to actually *use*
natural keys as primary keys - enforce them with all sorts
of constraints, sure - but don't assume you properly figured
out the perfect schema the first time around.

I've so far found it *good* to have duplicate enforcement:

- the database enforces what we agree on the final data
*should* look like

- the UI tries to lure the user into entering "valid" data

Now, those ("duplicative") database constraints have saved
our butt more than once preventing faulty patient data to be
inserted into medical records.

> A lot of bad arguments made against natural keys are made, for example:
> *) natural keys can change (so what? unless you are worried about cascades)

I find it is not so much that they can change:

Sure, it doesn't matter whether a certain code reads
"C03EB21" or "C03EB22".

but rather that they tend to go non-unique whenever the
whims of meatspace decide it's now more convenient to allow
dupes:

Suddenly there must be two records with code "C03EB21".

> *) SSN are reused! (SSN is obviously not a complete key if you want to
> identify a person)
> *) most tables don't have unique natural keys (let's see em)

Now, those two arguments are bogus, I agree.

Karsten
--
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Davis 2011-04-28 21:44:37 Re: New feature: skip row locks when table is locked.
Previous Message David Johnston 2011-04-28 21:00:54 Re: pervasiveness of surrogate (also called synthetic) keys