Re: Postgres-R: primary key patches

From: chris <cbbrowne(at)ca(dot)afilias(dot)info>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgres-R: primary key patches
Date: 2008-07-21 14:45:09
Message-ID: 87abgbuwi2.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

markus(at)bluegap(dot)ch (Markus Wanner) writes:
> chris wrote:
>> I agree with you that tables are *supposed* to have primary keys;
>> that's proper design, and if tables are missing them, then something
>> is definitely broken.
>
> Ah, I see, so you are not concerned about tables with a PRIMARY KEY
> for which one wants another REPLICATION KEY, but only about tables
> without a PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the
> first place.

"Doesn't want" is probably overstating the matter.

I'll describe a scenario to suggest where it might happen.

- A system is implemented, using the database, and, for some reason,
no PRIMARY KEY is defined for a table. Someone forgot; it got
misconfigured; a mistake was probably made.

- The system then goes into production, and runs for a while. The
table has data added to it, and starts to grow rather large.

- At THIS point, we decide to introduce replication, only to discover
that there isn't a PRIMARY KEY on the table.

Ideally, we'd take an outage and add the primary key. But suppose we
can't afford to do so?

The "add indexes concurrently" added in 8.3 (if memory serves) *would*
allow us to create a *candidate* primary key without forcing an
outage.

In theory, we'd like to have a true primary key. Sometimes
operational issues get in the way.

> However, that's a general limitation of replication at tuple level:
> you need to be able to uniquely identify tuples. (Unlike replication
> on storage level, which can use the storage location for that).

No disagreement; yes, we certainly do need a way to uniquely identify
tuples, otherwise we can't replicate UPDATE or DELETE.

>> Sometimes, unfortunately, people make errors in design, and we wind up
>> needing to accomodate situations that are "less than perfect."
>>
>> The "happy happenstance" is that, in modern versions of PostgreSQL, a
>> unique index may be added in the background so that this may be
>> rectified without outage if you can live with a "candidate primary
>> key" rather than a true PRIMARY KEY.
>
> I cannot see any reason for not wanting a PRIMARY KEY, but wanting
> replication, and therefore a REPLICATION KEY.
>
> Or are you saying we should add a hidden REPLICATION KEY for people
> who are afraid of schema changes and dislike a visible primary key?
> Would you want to hide the underlying index as well?

The scenario I outline above hopefully answers this. It's not a
matter that I expect people to specifically desire not to have a
primary key. Instead, I expect cases where mistakes compound with
operational issues to make them say "Ow - I can't do that now!"
--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not
feature exposed pipes. While they add to the gloomy atmosphere, they
are good conductors of vibrations and a lot of prisoners know Morse
code." <http://www.eviloverlord.com/>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Wanner 2008-07-21 14:55:10 Re: Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the
Previous Message Tom Lane 2008-07-21 14:44:12 Re: overlaps performance