Re: UPDATE of partition key

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE of partition key
Date: 2017-09-15 11:25:35
Message-ID: CAJ3gD9dH8fahxqic8vJELnkEtkHDRxTSmurfFe1DMWv2tDSqXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12 September 2017 at 12:39, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> On 12 September 2017 at 11:57, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>> On Tue, Sep 12, 2017 at 11:15 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>>
>>> But the statement level trigger function can refer to OLD TABLE and
>>> NEW TABLE, which will contain all the OLD rows and NEW rows
>>> respectively. So the updated rows of the partitions (including the
>>> moved ones) need to be captured. So for OLD TABLE, we need to capture
>>> the deleted row, and for NEW TABLE, we need to capture the inserted
>>> row.
>>
>> Yes, I agree. So in ExecDelete for OLD TABLE we only need to call
>> ExecARUpdateTriggers which will make the entry in OLD TABLE only if
>> transition table is there otherwise nothing and I guess this part
>> already exists in your patch. And, we are also calling
>> ExecARDeleteTriggers and I guess that is to fire the ROW-LEVEL delete
>> trigger and that is also fine. What I don't understand is that if
>> there is no "ROW- LEVEL delete trigger" and there is only a "statement
>> level delete trigger" with transition table still we are making the
>> entry in transition table of the delete trigger and that will never be
>> used.
>
> Hmm, ok, that might be happening, since we are calling
> ExecARDeleteTriggers() with mtstate->mt_transition_capture non-NULL,
> and so the deleted tuple gets captured even when there is no UPDATE
> statement trigger defined, which looks redundant. Will check this.
> Thanks.

I found out that, in case when there is a DELETE statement trigger
using transition tables, it's not only an issue of redundancy; it's a
correctness issue. Since for transition tables both DELETE and UPDATE
use the same old row tuplestore for capturing OLD table, that table
gets duplicate rows: one from ExecARDeleteTriggers() and another from
ExecARUpdateTriggers(). In presence of INSERT statement trigger using
transition tables, both INSERT and UPDATE events have separate
tuplestore, so duplicate rows don't show up in the UPDATE NEW table.
But, nevertheless, we need to prevent NEW rows to be collected in the
INSERT event tuplestore, and capture the NEW rows only in the UPDATE
event tuplestore.

In the attached patch, we first call ExecARUpdateTriggers(), and while
doing that, we first save the info that a NEW row is already captured
(mtstate->mt_transition_capture->tcs_update_old_table == true). If it
captured, we pass NULL transition_capture pointer to
ExecARDeleteTriggers() (and ExecARInsertTriggers) so that it does not
again capture an extra row.

Modified a testcase in update.sql by including DELETE statement
trigger that uses transition tables.

-------

After commit 77b6b5e9c, the order of leaf partitions returned by
RelationGetPartitionDispatchInfo() and the order of the UPDATE result
rels are in the same order. Earlier, because of different orders, I
had to use a hash table to search for the leaf partitions in the
update result rels, so that we could re-use the per-subplan UPDATE
ResultRelInfo's. Now since the order is same, in the attached patch, I
have removed the hash table method, and instead, iterate over the leaf
partition oids and at the same time keep shifting a position over the
per-subplan resultrels whenever the resultrel at the position is found
to be present in the leaf partitions list.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
update-partition-key_v18.patch application/octet-stream 121.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2017-09-15 11:26:39 Re: [PATCH] Off-by-one error in logical slot resource retention
Previous Message Thom Brown 2017-09-15 11:16:22 Re: SendRowDescriptionMessage() is slow for queries with a lot of columns