Quick Links

Re: [HACKERS] MERGE SQL Statement for PG11

From:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] MERGE SQL Statement for PG11
Date:	2018-02-06 06:06:36
Message-ID:	CABOikdM+c1vB_+3tYEjO=J6U2uNHzKU_b=U72tadD5-9xQcbHA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Feb 6, 2018 at 9:50 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Mon, Feb 5, 2018 at 7:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > I don't think you get to make a unilateral decision to exclude
> > features that work everywhere else from the scope of this patch. If
> > there is agreement that those features can be left out of scope, then
> > that is one thing, but so far all the commentary about the things that
> > you've chosen to exclude has been negative. Nor have you really given
> > any reason why they should be exempt. You've pointed out that
> > parallel query doesn't handle everything (which is certainly true, but
> > does not mean that any feature from now and the end of time is allowed
> > to exclude from scope whatever seems inconvenient regardless of
> > contrary community consensus) and you've pointed out here and
> > elsewhere that somebody could go add the features you omitted later
> > (which is also true, but misses the general point that we want
> > committed patches to be reasonably complete already, not have big gaps
> > that someone will have to fix later).
>
> For me, the concern is not really the omission of support for certain
> features as such. The concern is that those omissions hint that there
> is a problem with the design itself, particularly in the optimizer.
> Allowing subselects in the UPDATE part of a MERGE do not seem like
> they could be written as a neat adjunct to what Simon already came up
> with. If that was possible, Simon probably already would have done it.
>
>
As someone who's helping Simon with that part of the code, I must say that
omission of sub-selects in the UPDATE targetlist and WHEN quals is not
because of some known design problems. So while it may be true that we've
a design problem, it's also quite likely that we are missing some
planner/optimiser trick and once we add those missing pieces, it will start
working. Same is the case with RLS.

Partitioned table is something I am actively working on. I must say that
the very fact that INSERT and UPDATE/DELETE take completely different paths
in partitioned/inherited table, makes MERGE quite difficult because it has
to carry out both the operations and hence require all the required
machinery. If I understand correctly, INSERT ON CONFLICT must have faced
similar problems and hence DO UPDATE does not work with partitioned table.
I am not sure if that choice was made when INSERT ON CONFLICT was
implemented or when partitioned table support was added. But the challenges
look similar.

I first tried to treat MERGE similar to UPDATE/DELETE case and ensure that
the INSERTs go through the root partition. That mostly works, but the RIGHT
OUTER join between the child tables and the source relation ends up
emitting duplicate rows, if the partitioned table is the resultRelation and
when it gets expanded in inheritance_planner(). That's a blocker. So what I
am trying now is to push the join between the Append relation and the
source relation below the ModifyTable node, so that we get the final join
result. We can then look up the tableoid in the row returned from the join,
find the corresponding result relation and then carry out MERGE actions.
Note that unlike regular ExecModifyTable(), here we must execute just one
subplan as that will return all the required tuples.

Does anyone see a potential blocker with this approach, except that it may
not be the most elegant way? I think EvalPlanQual might need some treatment
because when the plan is re-executed, it will expect to the find the
updated tuple in the slot of the underlying query's RTE and not in the
resultRelation's RTE, which does not participate in the join at all.
Anything else I could be missing out completely?

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: [HACKERS] MERGE SQL Statement for PG11 at 2018-02-06 04:20:39 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Khandekar	2018-02-06 06:26:27	Re: Query running for very long time (server hanged) with parallel append
Previous Message	Amit Langote	2018-02-06 05:54:12	Re: update tuple routing and triggers