Re: pervasiveness of surrogate (also called synthetic) keys

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: pervasiveness of surrogate (also called synthetic) keys
Date: 2011-05-02 15:10:38
Message-ID: 4DBEC96E.8070901@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 05/01/2011 06:12 PM, Karsten Hilbert wrote:
> Good to know since I'm only a lowly medical doctor not
> having much schooling in database matters beyond this list,
> the PostgreSQL docs, and the Celko book.
>

This debate exists at all levels of experience, and the only thing that
changes as you get more experienced people involved is an increase in
anecdotes on each side. The sole time I ever found myself arguing with
Joe Celko is over an article he wrote recommending natural keys, using
an example from the automotive industry. Problem was, the specific
example he gave was flat out wrong. I was working in automotive MIS at
the time, and the thing he was saying would never change did, in fact,
change every year--in only a fraction of a percent of cases, in an
extremely subtle way that snuck up on people and wreaked much
confusion. That's typical for an early natural key design: you get it
working fine in V1.0, only to discover months or years down the road
there's a case you never considered you don't model correctly, and it
may take some sort of conversion to fix.

The reason why there's a strong preference for surrogate keys is that
they always work and you can avoid ever needing to come up with a better
design. if you just use them and forget about it. The position Merlin
has advocated here, that there should always be a natural key available
if you know the data well enough, may be true. But few people are good
enough designers to be sure they've made the decision correctly, and the
downsides of being wrong can be a long, painful conversion process.
Easier for most people to just eliminate the possibility of making a
mistake by using auto-generated surrogate keys, where the primary
problem you'll run into is merely using more space/resources than you
might otherwise need to have. It minimizes the worst-case--mistake make
in the model, expensive re-design--by adding overhead that makes the
average case more expensive. Software design usually has enough risks
that any time you can eliminate one just by throwing some resources at
it, that's normally the right thing to do.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marek Więckowski 2011-05-02 15:17:11 Re: auto-reconnect: temp schemas, sequences, transactions
Previous Message Raymond O'Donnell 2011-05-02 14:57:59 Re: wnat ot edit pg_hba.conf file from command prompt