Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
Date: 2018-07-23 18:09:43
Message-ID: CA+TgmoYYTCQL=TUYbUz2b=CsQeB4o1L2+_=zHynxGVmyKUtH1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 23, 2018 at 3:49 AM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I have to admit that that is not a good idea. So, I'll update the patch so
> that we don't assume the projection capability of the subplan anymore, if we
> go this way.

Isn't that assumption fundamental to your whole approach?

>> Also, even today, this would fail if the subplan happened to be
>> a Sort, and it's not very obvious why that couldn't happen.
>
> You mean the MergeAppend case? In that case, the patch will adjust the
> subplan before adding a Sort above the subplan, so that could not happen.

That would only help if create_merge_append_plan() itself inserted a
Sort node. It wouldn't help if the Sort node came from a child path
manufactured by create_sort_path().

>> I think that's a bad idea. The target list affects lots
>> of things, like costing. If we don't insert a ConvertRowTypeExpr into
>> the child's target list, the costing will be wrong to the extent that
>> ConvertRowTypeExpr has any cost, which it does.
>
> Actually, this is not true at least currently, because set_append_rel_size
> doesn't do anything about the costing:

Why would it? Append can't project, so the cost of any expressions
that appear in its target list is irrelevant. What is affected is the
cost of the scans below the Append -- see e.g. cost_seqscan(), which
uses the data produced by set_pathtarget_cost_width().

> Some createplan.c routines already change the tlists of their nodes. For
> example, create_merge_append_plan would add sort columns to each of its
> subplans if necessary. I think it would be similar to that to add a
> ConvertRowtypeExpr above a child whole-row Var in the subplan's tlist.

You have a point, but I think that code is actually not a very good
idea, and I'd like to see us do less of that sort of thing, not more.
Any case in which we change the plan while creating it has many of the
same problems that I discussed in the previous email. For example,
create_merge_append_path() has to know that a Sort node might get
inserted and set the costing accordingly. If the callers guaranteed
that the necessary sort path had already been inserted, then we
wouldn't need that special handling.

Also, that code is adding additional columns, computed from the
columns we have available, so that we can sort. Those extra columns
then get discarded at the next level of the Plan tree. What you're
trying to do is different. Perhaps this is too harsh a judgement, but
it looks to me like you're using a deliberately-wrong representation
of the value that you ultimately want to produce and then patching it
up after the fact. That seems quite a bit worse than what the
existing code is doing.

> I reviewed his patch, and thought that that would fix the issue, but this is
> about the current design on the child tlist handling in partitionise join
> rather than that patch: it deviates from the assumption that we had for the
> scan/join planning till PG10 that "a rel's targetlist would only include
> Vars and PlaceHolderVars", and turns out that there are a lot of places
> where we need to modify code to suppress that deviation, as partly shown in
> that patch. So, ISTM that the current design in itself is not that
> localized.

I used to think that was the assumption, too, but it seems to me that
Tom told me a while back that it was never really true, and that
assumption was in my head. Unfortunately, I don't have a link to that
email handy. Either way, I think the solution is to get the tlist
right from the start and cope with the consequences downstream, not
start with the wrong thing and try to fix it later. To see an example
of why I think that's a valuable approach, see
11cf92f6e2e13c0a6e3f98be3e629e6bd90b74d5, especially the changes in
the regression test outputs. The old code discovered after it had
generated an Append that it had the wrong tlist, but since the Append
can't project, it had to add a Result node. With the new code, we get
the children of the Append to produce the right output from the start,
and then the Append just needs to concatenate all that output, so no
Result node is needed. As noted in the commit message, it also made
the costing more accurate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-07-23 18:16:50 Re: How can we submit code patches that implement our (pending) patents?
Previous Message Andrey Borodin 2018-07-23 18:04:57 Re: GiST VACUUM