Re: string primary key

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Mark Gibson <mark(at)gibsonsoftware(dot)com>
Cc: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: string primary key
Date: 2006-05-11 17:20:01
Message-ID: 1147368001.9755.79.camel@state.g2switchworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 2006-05-11 at 11:43, Mark Gibson wrote:
> Scott Marlowe wrote:
>
> >
> > If you need a unique constraint on the text field anyway, and it's a
> > natural key, you're generally better of using that field as the pk.
> >
> > However, if it's not a natually unique key, then it shouldn't be the pk,
> > and int is a perhaps better choice.
> >
> > There are two VERY oppositional schools of thought on natural versus
> > artificial keys out there, and neither side is likely to change their
> > minds.
> >
> > My preference is generally for artificial keys (i.e. sequence generated
> > ones) because I've had requirements change underfoot too many times to
> > rely on natural keys all the time.
> >
>
> Thanks for your answer. It sounds like your saying that in terms of
> performance, there is no difference between a character field pk and an
> integer pk.

Sort of. Generally, the int pk-fk relationship will be a tad faster.
However, the maintenance of the unique / primary key index is what
really costs you, and if you've gotta have one unique key (on the text)
the extra time spent mainaining another on an artificial key (in an int)
will lost you as much time as you gain from the faster joins on an
integer. Generally.

> I've got a followup - The primary key for the table in question consists
> of 2 varchar fields: picture 'state' and 'city' where city is guaranteed
> to be unique within a state, and (state, city) form a unique key. This
> sounds like a good candidate for a sequence key. Is there a difference
> in terms of performance in this case?

ahhh. You'll need the unique key anyway, right? Then if you're going
to do 99.999% selects, it will likely be faster to have an artificial
key (i.e. integers from a sequence) than using the natural key, since
the updates will be seldom, if ever.

However, the more updates you do (percentage wise) the more the second
index will cost you for maintenance, and eventually, you'll run slower,
on average, than if you had just the one index.

It's all about usage patterns. Some usage patterns favor one solution
or another. There are few, if any, absolutes. except always make sure
your key types match up.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2006-05-11 17:22:57 Re: Compiling on 8.1.3 on Openserver 5.05
Previous Message Carlos Oliva 2006-05-11 17:15:04 Re: Reindexdb