Re: UPDATE of partition key

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE of partition key
Date: 2017-02-24 09:54:22
Message-ID: CANP8+jJp1S07PU_QbfiAqQTTO1r2SmfPGjgJFm-BBpriNqBSbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 24 February 2017 at 07:02, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Feb 20, 2017 at 2:58 PM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>> I am inclined to at least have some option for the user to decide the
>> behaviour. In the future we can even consider support for walking
>> through the ctid chain across multiple relfilenodes. But till then, we
>> need to decide what default behaviour to keep. My inclination is more
>> towards erroring out in an unfortunate even where there is an UPDATE
>> while the row-movement is happening. One option is to not get into
>> finding whether the DELETE was part of partition row-movement or it
>> was indeed a DELETE, and always error out the UPDATE when
>> heap_update() returns HeapTupleUpdated, but only if the table is a
>> leaf partition. But this obviously will cause annoyance because of
>> chances of getting such errors when there are concurrent updates and
>> deletes in the same partition. But we can keep a table-level option
>> for determining whether to error out or silently lose the UPDATE.
>
> I'm still a fan of the "do nothing and just document that this is a
> weirdness of partitioned tables" approach, because implementing
> something will be complicated, will ensure that this misses this
> release if not the next one, and may not be any better for users. But
> probably we need to get some more opinions from other people, since I
> can imagine people being pretty unhappy if the consensus happens to be
> at odds with my own preferences.

I'd give the view that we cannot silently ignore this issue, bearing
in mind the point that we're expecting partitioned tables to behave
exactly like normal tables.

In my understanding the issue is that UPDATEs will fail to update a
row when a valid row exists in the case where a row moved between
partitions; that behaviour will be different to a standard table.

It is of course very good that we have something ready for this
release and can make a choice of what to do.

Thoughts

1. Reuse the tuple state HEAP_MOVED_OFF which IIRC represent exactly
almost exactly the same thing. An UPDATE which gets to a
HEAP_MOVED_OFF tuple will know to re-find the tuple via the partition
metadata, or I might be persuaded that in-this-release it is
acceptable to fail when this occurs with an ERROR and a retryable
SQLCODE, since the UPDATE will succeed on next execution.

2. I know that DB2 handles this by having the user specify WITH ROW
MOVEMENT to explicitly indicate they accept the issue and want update
to work even with that. We could have an explicit option to allow
that. This appears to be the only way we could avoid silent errors for
foreign table partitions.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2017-02-24 10:00:07 Making clausesel.c Smarter
Previous Message Robert Haas 2017-02-24 09:53:32 Re: Patch: Write Amplification Reduction Method (WARM)