Re: remove pg_class.relhaspkey

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: remove pg_class.relhaspkey
Date: 2018-03-10 12:52:56
Message-ID: 126bd0e4-ffe5-83d0-fd63-76bd76d5a2b0@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02/26/2018 07:23 AM, Michael Paquier wrote:
> On Mon, Feb 26, 2018 at 12:45:48AM -0500, Tom Lane wrote:
>> Michael Paquier <michael(at)paquier(dot)xyz> writes:
>>> On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote:
>>>> We've discussed that at least twice before, and not pulled the trigger
>>>> for fear of breaking client code.
>>
>>> Speaking of which, I have looked at where relhaspkey is being used. And
>>> there are a couple of things using it:
>>> - Greenplum has a consistency checker tool using it.
>>> - https://github.com/no0p/pgsampler
>>> - https://searchcode.com/codesearch/view/54937539/
>>> - http://codegist.net/code/postgres%20drop%20tables/
>>> - https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs
>>
>> Thanks for poking around. Did you happen to notice how many of these
>> clients are taking pains to deal with the existing inaccuracy of
>> relhaspkey (ie, that it remains set after the pkey has been dropped)?
>
> As far as I looked at things. Those clients rely on how optimistic
> relhaspkey is. In short, if it is set to true, there can be primary
> key definitions. If set to false, then it is sure that no primary key
> definition can be found. If the flag is true, then those clients just
> do an extra lookup on pg_index with indisprimary. I think that this
> just complicates the code involved though. If looking for primary keys
> it is way better to just scan directly pg_index.
>
>> I think there's possibly an argument that relhaspkey should be dropped
>> because it's an attractive nuisance, encouraging clients to assume
>> more than they should about what it means. But we don't have a lot
>> of evidence for such an argument right now.
>
> The only breakage I could imagine here is an application which
> thinks relhaspkey set to true implies that a primary key *has* to be
> present. I have to admit that I have not found such a case. Still I
> would not be surprised if there are such applications unaware of
> being broken, particularly plpgsql functions.

I agree with this sentiment - I don't think those flags are particularly
helpful for client applications, and would vote +1 for removal.

Even if the application handles them correctly (i.e. rechecks existence
when relhaspkey=true), the assumption is that this saves a measurable
amount of work. I'm not so sure about that, considering pretty much all
tables have both primary keys and indexes. OTOH it certainly does make
the code more complicated.

For internal usage it might have made a difference back when those flags
were introduced, but the relcache should deal with this efficiently now,
as pointed out in [1]. But as pointed out, we're not using relhaspkey
internally at all. So +1 to get rid of it.

For the other flags we would probably need to test what impact would it
have (e.g. table with no indexes, many indexes on other tables, and
something calling get_relation_info a lot). But this patch proposes to
remove only relhaspkey.

[1]
https://www.postgresql.org/message-id/CA%2BTgmoYJu24Y8uUAJ4zeQAhoYxLmFxcy%2B5Hdij9ehjoxKo3j3g%40mail.gmail.com

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Banck 2018-03-10 12:53:54 Re: Online enabling of checksums
Previous Message Amit Kapila 2018-03-10 11:55:11 Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key