Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
Date: 2018-07-24 11:51:36
Message-ID: 5B5712C8.9030404@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2018/07/24 3:09), Robert Haas wrote:
> On Mon, Jul 23, 2018 at 3:49 AM, Etsuro Fujita
> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> I have to admit that that is not a good idea. So, I'll update the patch so
>> that we don't assume the projection capability of the subplan anymore, if we
>> go this way.
>
> Isn't that assumption fundamental to your whole approach?

I don't think so. What I mean here is: currently the subplan would be a
scan/join node, but in future we might have eg, a Sort node atop the
scan/join node, so it would be better to update the patch to handle such
a case as well.

>>> Also, even today, this would fail if the subplan happened to be
>>> a Sort, and it's not very obvious why that couldn't happen.
>>
>> You mean the MergeAppend case? In that case, the patch will adjust the
>> subplan before adding a Sort above the subplan, so that could not happen.
>
> That would only help if create_merge_append_plan() itself inserted a
> Sort node. It wouldn't help if the Sort node came from a child path
> manufactured by create_sort_path().

As I mentioned above, I think we could handle such a case as well.

>>> I think that's a bad idea. The target list affects lots
>>> of things, like costing. If we don't insert a ConvertRowTypeExpr into
>>> the child's target list, the costing will be wrong to the extent that
>>> ConvertRowTypeExpr has any cost, which it does.
>>
>> Actually, this is not true at least currently, because set_append_rel_size
>> doesn't do anything about the costing:
>
> Why would it? Append can't project, so the cost of any expressions
> that appear in its target list is irrelevant. What is affected is the
> cost of the scans below the Append -- see e.g. cost_seqscan(), which
> uses the data produced by set_pathtarget_cost_width().

By set_rel_size()?

>> Some createplan.c routines already change the tlists of their nodes. For
>> example, create_merge_append_plan would add sort columns to each of its
>> subplans if necessary. I think it would be similar to that to add a
>> ConvertRowtypeExpr above a child whole-row Var in the subplan's tlist.
>
> You have a point, but I think that code is actually not a very good
> idea, and I'd like to see us do less of that sort of thing, not more.
> Any case in which we change the plan while creating it has many of the
> same problems that I discussed in the previous email. For example,
> create_merge_append_path() has to know that a Sort node might get
> inserted and set the costing accordingly. If the callers guaranteed
> that the necessary sort path had already been inserted, then we
> wouldn't need that special handling.

I'm not sure that's a good idea, because I think we have a trade-off
relation; the more we make create_plan simple, the more we need to make
earlier states of the planner complicated.

And it looks to me like the partitionwise join code is making earlier
(and later) stages of the planner too complicated, to make create_plan
simple.

> Also, that code is adding additional columns, computed from the
> columns we have available, so that we can sort. Those extra columns
> then get discarded at the next level of the Plan tree. What you're
> trying to do is different. Perhaps this is too harsh a judgement, but
> it looks to me like you're using a deliberately-wrong representation
> of the value that you ultimately want to produce and then patching it
> up after the fact.

When considering paritionwise joining, it would make things complicated
to have a ConvertRowtypeExpr in a partrel's targetlist, because as
discussed upthread, it deviates from the planner's assumption that a
rel's targetlist would only include Vars and PHVs. So, I propose to
include a child whole-row Var in the targetlist instead, in which case,
we need to fix the Var after the fact, but can avoid making many other
parts of the planner complicated.

> That seems quite a bit worse than what the
> existing code is doing.

One idea to address that concern would be to adjust the pathtarget of
each path created in generate_partitionwise_join_paths accordingly. I'm
missing something, though.

>> I reviewed his patch, and thought that that would fix the issue, but this is
>> about the current design on the child tlist handling in partitionise join
>> rather than that patch: it deviates from the assumption that we had for the
>> scan/join planning till PG10 that "a rel's targetlist would only include
>> Vars and PlaceHolderVars", and turns out that there are a lot of places
>> where we need to modify code to suppress that deviation, as partly shown in
>> that patch. So, ISTM that the current design in itself is not that
>> localized.
>
> I used to think that was the assumption, too, but it seems to me that
> Tom told me a while back that it was never really true, and that
> assumption was in my head. Unfortunately, I don't have a link to that
> email handy. Either way, I think the solution is to get the tlist
> right from the start and cope with the consequences downstream, not
> start with the wrong thing and try to fix it later.

I'm not fully convinced that's really the right solution, but I think
you know better about partitioning (as well as many other things) than
me, so I think it's better for you to decide what to do for this issue.

Thanks for working on this!

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message PG Bug reporting form 2018-07-24 12:13:34 BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events
Previous Message Masahiko Sawada 2018-07-24 11:38:30 Re: [WIP] [B-Tree] Retail IndexTuple deletion