Re: Basic Q on superfluous primary keys

From: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Basic Q on superfluous primary keys
Date: 2007-04-16 15:02:24
Message-ID: 46239000.7000708@modgraph-usa.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Merlin Moncure wrote:
> Using surrogate keys is dangerous and can lead to very bad design
> habits that are unfortunately so prevalent in the software industry
> they are virtually taught in schools. ... While there is
> nothing wrong with them in principle (you are exchanging one key for
> another as a performance optimization), they make it all too easy to
> create denormalized designs and tables with no real identifying
> criteria, etc,...

Wow, that's the opposite of everything I've ever been taught, and all my experience in the last few decades.

I can't recall ever seeing a "natural" key that was immutable. In my business (chemistry), we've seen several disasterous situations were companies picked keys they thought were natural and immutable, and years down the road they discovered (for example) that chemical compounds they thought were pure were in fact isotopic mixtures, or simply the wrong molecule (as analytical techniques improved). Or during a corporate takeover, they discovered that two companies using the same "natural" keys had as much as 10% differences in their multi-million-compound databases. These errors led to six-month to year-long delays, as each of the conflicting chemical record had to be examined by hand by a PhD chemist to reclassify it.

In other businesses, almost any natural identifier you pick is subject to simple typographical errors. When you discover the errors in a field you've used as a primary key, it can be quite hard to fix, particularly if you have distributed data across several systems and schemas.

We've always recommended to our customers that all primary keys be completely information free. They should be not based on any information or combination of information from the data records. Every time the customer has not followed this advice, they've later regretted it.

I'm sure there are situations where a natural key is appropriate, but I haven't seen it in my work.

Craig

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2007-04-16 15:55:29 Re: Basic Q on superfluous primary keys
Previous Message Merlin Moncure 2007-04-16 13:03:42 Re: Basic Q on superfluous primary keys