Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: amul sul <sulamul(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Date: 2018-01-26 06:28:35
Message-ID: CAA4eK1LrOf+ZGkjMbiL7GjvFXoQ_O+XPyXHTL-L5yhd+WU4NWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 24, 2018 at 12:44 PM, amul sul <sulamul(at)gmail(dot)com> wrote:
> On Tue, Jan 23, 2018 at 7:01 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Fri, Jan 12, 2018 at 11:43 AM, amul sul <sulamul(at)gmail(dot)com> wrote:
> [....]
>> I have asked to change the message "tuple to be updated .." after
>> heap_lock_tuple call not in nodeModifyTable.c, so please revert the
>> message in nodeModifyTable.c.
>>
>
> Understood, fixed in the attached version, sorry I'd missed it.
>
>> Have you verified the changes in execReplication.c in some way? It is
>> not clear to me how do you ensure to set the special value
>> (InvalidBlockNumber) in CTID for delete operation via logical
>> replication? The logical replication worker uses function
>> ExecSimpleRelationDelete to perform Delete and there is no way it can
>> pass the correct value of row_moved in heap_delete. Am I missing
>> something due to which we don't need to do this?
>>
>
> You are correct, from ExecSimpleRelationDelete, heap_delete will always
> receive row_moved = false and InvalidBlockNumber will never set.
>

So, this means that in case of logical replication, it won't generate
the error this patch is trying to introduce. I think if we want to
handle this we need some changes in WAL and logical decoding as well.

Robert, others, what do you think? I am not very comfortable leaving
this unaddressed, if we don't want to do anything about it, at least
we should document it.

> I didn't found any test case to hit changes in execReplication.c. I am not sure
> what should we suppose do here, and having LOG is how much worse either.
>

I think you can manually (via debugger) hit this by using
PUBLICATION/SUBSCRIPTION syntax for logical replication. I think what
you need to do is in node-1, create a partitioned table and subscribe
it on node-2. Now, perform an Update on node-1, then stop the logical
replication worker before it calls heap_lock_tuple. Now, in node-2,
update the same row such that it moves the row. Now, continue the
logical replication worker. I think it should hit your new code, if
not then we need to think of some other way.

> What do you think, should we add an assert like EvalPlanQualFetch() here or
> the current LOG message is fine?
>

I think first let's try to hit this case.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-01-26 06:30:50 Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Previous Message Fabien COELHO 2018-01-26 06:28:19 Re: General purpose hashing func in pgbench