Re: Question about some changes in 11.3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mat Arye <mat(at)timescale(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about some changes in 11.3
Date: 2019-06-03 20:07:28
Message-ID: 19587.1559592448@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Mat Arye <mat(at)timescale(dot)com> writes:
> On Fri, May 24, 2019 at 5:05 PM Mat Arye <mat(at)timescale(dot)com> wrote:
>> 11.3 included some change to partition table planning. Namely
>> commit 925f46f ("Fix handling of targetlist SRFs when scan/join relation is
>> known empty.") seems to redo all paths for partitioned tables
>> in apply_scanjoin_target_to_paths. It clears the paths in:
>>
>> ```
>> if (rel_is_partitioned)
>> rel->pathlist = NIL
>> ```
>>
>> Then the code rebuild the paths. However, the rebuilt append path never
>> gets the
>> set_rel_pathlist_hook called. Thus, the work that hook did previously gets
>> thrown away and the rebuilt append path can never be influenced by this
>> hook. Is this intended behavior? Am I missing something?

Hm. I'd say this was already broken by the invention of
apply_scanjoin_target_to_paths; perhaps 11-before-11.3 managed to
still work for you, but it's not hard to envision applications of
set_rel_pathlist_hook for which it would not have worked. The contract
for set_rel_pathlist_hook is supposed to be that it gets to editorialize
on all normal (non-Gather) paths created by the core code, and that's
no longer the case now that apply_scanjoin_target_to_paths can add more.

> I've attached a small patch to address this discrepancy for when the
> set_rel_pathlist_hook is called so that's it is called for actual paths
> used for partitioned rels. Please let me know if I am misunderstanding how
> this should be handled.

I'm not very happy with this patch either, as it makes the situation
even more confused, not less so. The best-case scenario is that the
set_rel_pathlist_hook runs twice and does useless work; the worst case
is that it gets confused completely by being called twice for the same
rel. I think we need to maintain the invariant that that hook is
called exactly once per baserel.

I wonder whether we could fix matters by postponing the
set_rel_pathlist_hook call till later in the same cases where
we postpone generate_gather_paths, ie, it's the only baserel.

That would make its name pretty misleading, though. Maybe we
should leave it alone and invent a separate hook to be called
by/after apply_scanjoin_target_to_paths? Although I don't
know if it'd be useful to add a new hook to v11 at this point.
Extensions would have a hard time knowing if they could use it.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2019-06-03 20:31:33 Re: Index Skip Scan
Previous Message Melanie Plageman 2019-06-03 19:23:33 Sort support for macaddr8