Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, amul sul <sulamul(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Date: 2018-03-08 18:00:21
Message-ID: CA+TgmoaU=ZW1Ox9mz28kPEasOt6V-S7G9ZQw=Qxcho3scasnQQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 8, 2018 at 12:25 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> I think the question is: isn't there an alternate way to achieve the same
> result? One alternate way would be to do what I suggested above i.e. free up
> more bits and use one of those.

That's certainly possible, but TBH the CTID field seems like a pretty
good choice for this particular feature. I mean, we're essentially
trying to indicate that the CTID link is not valid, so using an
invalid value in the CTID field seems like a pretty natural choice.
We could use, say, an infomask bit to indicate that the CTID link is
not valid, but an infomask bit is more precious. Any two-valued
property can be represented by an infomask bit, but using the CTID
field is only possible for properties that can't be true at the same
time that the CTID field needs to be valid. So it makes sense that
this property, which can't be true at the same time the CTID field
needs to be valid, should try to use an otherwise-unused bit pattern
for the CTID field itself.
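
To make the trade-off above concrete, here is a minimal sketch in C. It is
not PostgreSQL's actual tuple header code: the Sketch* types, the sketch_*
functions, and the reserved bit pattern are all hypothetical stand-ins
chosen for illustration. It shows a "moved to another partition" state
being stored as a reserved value in the link field, so no flag bit in the
infomask is consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT PostgreSQL's real definitions. */
typedef struct
{
    uint32_t block;     /* block number of the successor tuple version */
    uint16_t offset;    /* line pointer offset within that block */
} SketchCtid;

typedef struct
{
    uint16_t   infomask;    /* general-purpose flag bits (the scarce resource) */
    SketchCtid ctid;        /* link to the successor version, if any */
} SketchTupleHeader;

/* Assumed reserved pattern: a block number no real tuple can live in. */
#define SKETCH_INVALID_BLOCK ((uint32_t) 0xFFFFFFFF)
#define SKETCH_MOVED_OFFSET  ((uint16_t) 0xFFFD)

/* Store an ordinary successor link; the reserved block is not allowed here. */
void
sketch_set_link(SketchTupleHeader *hdr, uint32_t block, uint16_t offset)
{
    assert(block != SKETCH_INVALID_BLOCK);
    hdr->ctid.block = block;
    hdr->ctid.offset = offset;
}

/* Record "the link is not valid" by overwriting the link itself. */
void
sketch_mark_moved(SketchTupleHeader *hdr)
{
    hdr->ctid.block = SKETCH_INVALID_BLOCK;
    hdr->ctid.offset = SKETCH_MOVED_OFFSET;
}

/* True if the link field holds the reserved pattern. */
bool
sketch_is_moved(const SketchTupleHeader *hdr)
{
    return hdr->ctid.block == SKETCH_INVALID_BLOCK &&
           hdr->ctid.offset == SKETCH_MOVED_OFFSET;
}

/* Copy the successor link into *next; return false if there is none to follow. */
bool
sketch_follow_link(const SketchTupleHeader *hdr, SketchCtid *next)
{
    if (sketch_is_moved(hdr))
        return false;
    *next = hdr->ctid;
    return true;
}
```

The two states are mutually exclusive by construction: a header either
carries a usable link or carries the marker, never both, which is why the
link field can hold the marker and the infomask stays untouched.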

> Another way would be to add a hidden column
> to the partition table, when it is created or when it is attached as a
> partition. This only penalises the partition tables, but keeps rest of the
> system out of it. Obviously, if this column is added when the table is
> attached as a partition, as against at table creation time, then the old
> tuple may not have room to store this additional field. Maybe we can handle
> that by double updating the tuple? That seems bad, but then it only impacts
> the case when a partition key is updated. And we can clearly document
> performance implications of that operation. I am not sure how common this
> case is going to be anyway. With this hidden column, we can even store a
> pointer to another partition and do something with that, if at all needed.

Sure, but that would mean that partitioned tables would get bigger as
compared with unpartitioned tables, it would break backward
compatibility with v10, and it would require a major redesign of the
system -- the list of "system" columns is deeply embedded in the
system design and previous proposals to add to it have not been met
with wild applause.

> That's just one idea. Of course, I haven't thought about it for more than
> 10 minutes, so most likely I may have missed out on details and it's probably a
> stupid idea after all. But there could be other ideas too. And even if we
> can't find one, my vote would be to settle for #1 instead of trying to do
> #2.

Fair enough. I don't really see a reason why we can't make #2 work.
Obviously, the patch touches the on-disk format and is therefore scary
-- that's why I thought it should be broken out of the main update
tuple routing patch -- but it's far less of a structural change than
Alvaro's multixact work or the WARM stuff, at least according to my
current understanding. Tom said he thinks it's riskier than the
multixact stuff but I don't see why that should be the case. That had
widespread impacts on vacuuming and checkpointing that are not at
issue here. Still, there's no question that it's a scary patch and if
the consensus is now that we don't need it -- so be it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
