Re: [HACKERS] MERGE SQL Statement for PG11

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-03-20 14:08:00
Message-ID: CABOikdOyLz=BVwvPUUe1AwaNiWpdEkJdQS4q1AgqASDJcvGROQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 18, 2018 at 11:31 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
wrote:

>
>
> On Mon, Mar 12, 2018 at 5:43 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Sun, Mar 11, 2018 at 11:18 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>>
>>>
>>>
>>> As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in
>>> the works from Alvaro. In your explanation about that approach that
>>> you cited, you wondered what the trouble might have been with ON
>>> CONFLICT + partitioning, and supposed that the issues were similar
>>> there. Are they? Has that turned up much?
>>>
>>>
>> Well, I initially thought that ON CONFLICT DO UPDATE on partition table
>> may have the same challenges, but that's probably not the case. For INSERT
>> ON CONFLICT it's still just an INSERT path, with some special handling for
>> UPDATEs. Currently, for partition or inherited table, UPDATEs/DELETEs go
>> via inheritance_planner() thus expanding inheritance for the result
>> relation where as INSERTs go via simple grouping_planner().
>>
>> For MERGE, we do all three DMLs. That doesn't mean we could not
>> re-implement MERGE on the lines of INSERTs, but that would most likely mean
>> complete re-writing of the UPDATEs/DELETEs for partition/inheritance
>> tables. The challenges would just be the same in both cases.
>>
>>
> Having thought more about this in the last couple of days, I am actually
> inclined to try out rewrite the UPDATE handling of MERGE on the lines of
> what ON CONFLICT DO UPDATE patch is doing. This might help us to completely
> eliminate invoking inheritance_planner() for partition table and that will
> be a huge win for tables with several hundred partitions. The code might
> also look much cleaner that way. I am gonna give it a try for next couple
> of days and see if its doable.
>
>
So here is a version that completely avoids taking the
inheritance_planner() route and rather handles the MERGE UPDATE/DELETE
during execution, by translating tuples between child and root partition
and vice versa. To achieve this we no longer expand partition inheritance
for result relation, just like INSERT.

Apart from making the code look better, this also gives a very nice boost
to performance. In fact, v20 performed very poorly with 10 or more
partitions. Whereas this version even beats regular UPDATE when there are
10 or more partitions. This is because we are able to avoid calling
grouping_planner() repeatedly for all the partitions, by not expanding
partition tree for the result relation.

This patch is based on the on-going work on ON CONFLICT DO UPDATE for
partitioned table. I've attached the base patch that I am using from that
work and I will re-base MERGE patch once the other patch set takes a final
shape.

Apart from this major change, there are several other changes:

- moved some of the new code to two new source files,
src/backend/executor/nodeMerge.c and src/backend/parser/parse_merge.c. This
necessitated exporting ExecUpdate/ExecInsert/ExecDelete functions from
nodeModifyTable.c, but I think it's still a better change.
- changed the way slots were created per action-state. Instead, now we have
a single slot per result relation and we only change slot-descriptor while
working with specific action. We of course need to track tuple descriptor
per action though
- moved some merge-specific state, such as list of matched and not-matched
actions to result-relation. In case of partitioned table, if the partition
child'd schema differs from the root, then we setup new merge-state for the
partition and transform various members of the action-state to reflect the
child table's schema

Regarding few other things:
- I wrote code for deparsing MERGE, but don't know how to test that
- I could support the system column references to the target relation by
simply removing the blocking error, but it currently does not work for
source relation. I am not sure if we even want to support that. But we
should definitely throw a more user friendly error than what the patch does
today.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
merge_v23b_onconflict_work.patch application/octet-stream 18.7 KB
merge_v23b_main.patch application/octet-stream 309.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2018-03-20 14:09:32 Re: INOUT parameters in procedures
Previous Message Merlin Moncure 2018-03-20 14:05:53 Re: INOUT parameters in procedures