Re: [HACKERS] Runtime Partition Pruning

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Beena Emerson <memissemerson(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: [HACKERS] Runtime Partition Pruning
Date: 2020-10-07 09:05:36
Message-ID: CAKU4AWrzi7f1Y1J8q0xWO8fXOwf-BtdG+M9_7bVPnQyd5cLS0Q@mail.gmail.com
Lists: pgsql-hackers

On Sun, Oct 4, 2020 at 3:10 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:

>
>>
>> Now, in my experience, the current system for custom plans vs. generic
>> plans doesn't approach the problem in this way at all, and in my
>> experience that results in some pretty terrible behavior. It will do
>> things like form a custom plan every time because the estimated cost
>> of the custom plan is lower than the estimated cost of the generic
>> plan even though the two plans are structurally identical; only the
>> estimates differ. It will waste gobs of CPU cycles by replanning a
>> primary key lookup 5 times just on the off chance that a lookup on the
>> primary key index isn't the best option. But this patch isn't going
>> to fix any of that. The best we can probably do is try to adjust the
>> costing for Append paths in some way that reflects the costs and
>> benefits of pruning. I'm tentatively in favor of trying to do
>> something modest in that area, but I don't have a detailed proposal.
>>
>>
> I just realized this issue recently and reported it at [1], then Amit pointed
> me to this issue being discussed here, so I would like to continue this topic
> here.
>
> I think we can split the issue into 2 issues. One is the partition prune in
> initial partition prune, which maybe happen in custom plan case only and caused
> the above issue. The other one happens in the "Run-Time" partition prune,
> I admit that is an important issue to resolve as well, but looks harder. So I
> think we can fix the first one at first.
>
> ... When we count for the cost of a
> generic plan, we can reduce the cost based on such information.
>

This approach doesn't work: after the initial partition prune, it is not only
the cost of the Append node that should be reduced; the cost of other plans
should be reduced as well [1].

However, I think the issue can be fixed if we can use partition prune
information from a custom plan at the cost_append_path stage. If so, the idea
is similar to David's idea in [2]; however, Robert didn't agree with it [3].
Can anyone elaborate on this objection? For partkey > $1 or BETWEEN cases,
some real results from the past are probably better than some hard-coded
assumptions, IMO.
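
To make the "real results from the past" idea concrete, here is a minimal
sketch in plain C. It is not PostgreSQL code: PruneSample,
observed_surviving_fraction and scaled_append_cost are hypothetical names
invented for illustration, and the only assumption is that each earlier custom
plan could record how many partitions survived initial pruning. It covers the
Append-level scaling only; as noted above, reducing the Append cost alone is
not sufficient.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical record of one earlier custom plan: how many partitions the
 * table had and how many survived initial pruning.  Not a PostgreSQL struct.
 */
typedef struct PruneSample
{
    int total_parts;
    int surviving_parts;
} PruneSample;

/*
 * Average fraction of partitions that survived pruning in past custom plans.
 * Falls back to default_fraction (the "hard-coded assumption") when there is
 * no usable history.
 */
double
observed_surviving_fraction(const PruneSample *samples, size_t nsamples,
                            double default_fraction)
{
    double sum = 0.0;
    size_t used = 0;

    for (size_t i = 0; i < nsamples; i++)
    {
        if (samples[i].total_parts <= 0)
            continue;           /* skip unusable samples */
        sum += (double) samples[i].surviving_parts / samples[i].total_parts;
        used++;
    }
    return used > 0 ? sum / used : default_fraction;
}

/*
 * Scale the cost of an unpruned Append by the expected surviving fraction.
 * This assumes, for simplicity, that all partitions cost about the same.
 */
double
scaled_append_cost(double unpruned_cost, double fraction)
{
    assert(fraction >= 0.0 && fraction <= 1.0);
    return unpruned_cost * fraction;
}
```

With two samples in which 2 of 10 and 4 of 10 partitions survived, the
observed fraction is about 0.3, so an unpruned Append cost of 100 would be
scaled to roughly 30. With no history, the caller's default fraction is used.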

[1]
https://www.postgresql.org/message-id/CAKU4AWrWSCFO5fh01GTnN%2B1T8K8MyVAi4Gw-TvYC-Vhx3JohUw%40mail.gmail.com

[2]
https://www.postgresql.org/message-id/CAKJS1f8q_d7_Viweeivt1eS4Q8a0WAGFbrgeX38468mVgKseTA%40mail.gmail.com

[3]
https://www.postgresql.org/message-id/CA%2BTgmoZv8sd9cKyYtHwmd_13%2BBAjkVKo%3DECe7G98tBK5Ejwatw%40mail.gmail.com

--
Best Regards
Andy Fan
