Re: pervasiveness of surrogate (also called synthetic) keys

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Misa Simic <misa(dot)simic(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pervasiveness of surrogate (also called synthetic) keys
Date: 2011-05-04 13:51:51
Message-ID: BANLkTiks-z70b9HdyQr_eWDro4dUBLr=dQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, May 4, 2011 at 7:50 AM, Misa Simic <misa(dot)simic(at)gmail(dot)com> wrote:
> 2011/5/4 Merlin Moncure <mmoncure(at)gmail(dot)com>
>>
>> Most of the old school accounting systems maintained an invoice line
>> number.
>> > Invoice Line
>> >     -Invoice Number
>> >     -LineNo
>> >     -ItemID
>> >     -qty
>> >     -Price
>>
>> The line number started from 1 (the first line on the invoice) on
>> every unique invoice.  This has the added benefit of allowing the
>> primary key index (invoice_no, line_no) allowing you to pull up the
>> invoice line records in correct order without requiring an extra sort
>> and, if you don't need any fields from the invoice, a join.
>>
>> Compare the two queries pulling up invoice lines over a range of
>> invoice numbers.
>>
>> merlin
>
>
> Thanks, merlin,
> And in that case, what is "Natural" in LineNo? I would say, with adding
> LineNo we are creating syntethic/surrogate Key (just instead of 1 surrogate
> column - it will be Compound key with more columns...)? The same is with all
> other tables what are "parts" of an Entity, Serial Numbers, Accounting
> Distribution produced by Invoice...etc etc...

It's natural because it contains information that is not synthetic,
and unique/well defined. Specifically, the position of the line on
the invoice; you can't have to invoice lines at the second line of an
invoice for example. This information is visible and important to
users because presumably the invoice lines as entered into an
application are in the order you would want them printed out on a
physical invoice.

The invoice number itself is more interesting case because it's
generated and what it might represent (the 77th invoice for this
customer) isn't very interesting. An actual, 'true' natural key for
an invoice might be a combination of user, entry time, and customer,
but this key is fairly weak and not very useful for identification
purposes by human beings -- so we created the invoice number concept
as a proxy for that information. Also timestamps and floating point
values tend to suck for a primary key. This is what Jeff D was
talking about in detail largely upthread, to wit: if your users (man
or machine) really prefer to work with a generated identifier for some
reason or another, by all means make one, just make sure the record
has a rigorous definition.

A line number table would be a good candidate for adding an additional
surrogate key for purposes of relating on performance grounds if you
have foreign keys pointing to the record.. Any shifts in invoice line
position require renumbering the invoice which would annoyingly
cascade the updates.

The reason, by the way, that I *know* you aren't going to turn up many
interesting cases of not being able to define a key at least in the
accounting and manufacturing world is that I happen to have worked a
large ERP/CRM that dated from the pre-sql era. Like many systems of
the day, it was written in COBOL over an isam data store which didn't
have the technical capability to do what sequences or uuids do today
(or at least, it was more trouble than it was worth)...100% natural
keys. Somehow those guys seemed to manage just fine working under
hardware constraints much tighter than today's Although there were
some real headaches in that system, for example when say, part numbers
changed, it had an inner elegance that grew on me over time, and
taught me the value of rigorous definition of records and their keys.

merlin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Dickson S. Guedes 2011-05-04 14:20:25 Re: postgresql not updating the sequence
Previous Message Michael Graham 2011-05-04 13:44:49 Rearranging simple where clauses