Re: surrogate key or not?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: surrogate key or not?
Date: 2004-08-06 22:42:04
Message-ID: 200408061542.04336.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

David,

> But, once a surrogate key is assigned to a row, doesn't it become a
> "real" data? For example, I have a bunch of invoices/receipts and I
> write down a unique number on each of them. Doesn't the unique number
> become part of the information contained by the invoice/receipt itself
> (at least as long as I'm concerned)?

Sure ... *if* it's being used that way. If, however, your table has that
Invoice # *and* a seperate surrogate key that's redundant and can cause
problems.

> Change management IMO is perhaps the main reason of surrogate/artificial
> key. We often need a PK that _never_ needs to change (because it can be
> a royal PITA or downright impossibility to make this change; the PK
> might already be printed on a form/card/document, recorded on some
> permanent database, tattoed/embedded in someone's forehead, etc).

Sure. But surrogate keys don't fix this problem; only good change management
does. This is precisely why I say "use with caution"; all too often project
leaders regard surrogate keys as a substitute for good change management and
don't do any further work.

> Meanwhile, every other aspect of the data can change (e.g. a person can
> change his name, sex, age, email, address, even date & place of birth).
> Not to mention data entry mistakes. So it's impossible to use any
> "real"/natural key in this case.

Absolutely false. It's quite possible, it's just a performance/schema/data
management issue. This also applies to my comment above.

> Okay, so surrogate key makes it easy for stupid people to design a
> database that is prone to data duplication (because he doesn't install
> enough unique constraints to prevent this). But I don't see how a
> relation with a surrogate key is harder to "normalize" (that is, for the
> duplicates to be removed) than a relation with no key at all. Compare:

You're right here, both are equally hard to normalize. What I'm criticizing
is the tendency of a lot of beginning DBAs -- and even some books on database
design -- to say: "If you've created an integer key, you're done."

Had I my way, I would automatically issue a WARNING on any time you create a
table in PG without a key.

--
-Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2004-08-06 22:52:57 Re: Grouping by week
Previous Message Oliver Elphick 2004-08-06 22:08:31 Re: Grouping by week