Re: Basic Q on superfluous primary keys

From: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Basic Q on superfluous primary keys
Date: 2007-04-18 16:05:13
Message-ID: 462641B9.5020102@modgraph-usa.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Merlin Moncure wrote:
>> Since the part number is NOT the primary key, duplicate part numbers
>> are not a problem. If you had instead used the part number as the
>> primary key, you'd be dead in the water.
>
> You are redefining the primary key to be (part_number,
> obsoletion_date). Now, if you had not anticipated that in the
> original design (likely enough), you do have to refactor queries that
> join on the table...so what? If that's too much work, you can use a
> view to take care of the problem (which may be a good idea anyways).
> *you have to refactor the system anyways because you are now allowing
> duplicate part numbers where previously (from the perspective of the
> user), they were unique *.
>
> The hidden advantage of pushing the full key through the database is
> it tends to expose holes in the application/business logic. Chances
> are some query is not properly distinguishing obsoleted parts and now
> the real problems come...surrogate keys do not remove complexity, they
> simply sweep it under the rug.

This really boils down to an object-oriented perspective. I have an object, a customer's catalog entry. It has properties such as catalog number, description, etc, and whether it's obsolete or not. Management of the object (its relation to other objects, its history, etc.) should NOT depend on the object's specific definition.

This is true whether the object is represented in Lisp, C++, Perl, or (in this case) an SQL schema. Good object oriented design abstracts the object and its behavior from management of the object. In C++, Perl, etc., we manage objects via a pointer or object reference. In SQL, we reference objects by an *arbitrary* integer that is effectively a pointer to the object.

What you're suggesting is that I should break the object-oriented encapsulation by pulling out specific fields of the object, exposing those internal object details to the applications, and spreading those details across the whole schema. And I argue that this is wrong, because it breaks encapsulation. By exposing the details of the object, if the details change, *all* of your relationships break, and all of your applications have to change. And I've never seen a system where breaking object-oriented encapsulation was a good long-term solution. Systems change, and object-oriented techniques were invented to help manage change.

This is one of the reasons the Postgres project was started way back when: To bring object-oriented techniques to the relational-database world.

Craig

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2007-04-18 17:19:59 Re: Shared buffers, db transactions commited, and write IO on Solaris
Previous Message Csaba Nagy 2007-04-18 15:36:37 Re: Foreign Key Deadlocking