Re: Improve choose_custom_plan for initial partition prune case

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improve choose_custom_plan for initial partition prune case
Date: 2020-10-07 11:39:43
Message-ID: CAExHW5tq+rgecxVLx=i+auqYX1tA2bye_kpTY8z8GdnjOHUjmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 7, 2020 at 11:20 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
> Hi Ashutosh:
>
> Thanks for coming.
>
> On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>>
>> On Thu, Oct 1, 2020 at 9:34 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>> >
>> > Given the plan example:
>> >
>> > CREATE TABLE measurement (
>> > city_id int not null,
>> > logdate date not null,
>> > peaktemp int,
>> > unitsales int
>> > ) PARTITION BY RANGE (logdate);
>> >
>> > CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>> > FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>> >
>> > CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>> > FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>> >
>> > prepare s as select * from measurement where logdate = $1;
>> > execute s('2006-02-01').
>> >
>> > The generic plan will probably not be chosen because it doesn't reduce the cost
>> > which can be reduced at initial_prune while the custom plan reduces such cost
>> > at planning time. which makes the cost comparison not fair. I'm thinking if we can
>> > get an estimated cost reduction of initial_prunne for generic plan based on the
>> > partition pruned at plan time from custom plan and then reducing
>> > such costs from the generic plan. I just went through the related code but
>> > didn't write anything now. I'd like to see if this is a correct direction to go.
>>
>> What happens when we
>> execute plans with values that have estimates similar to the generic
>> plan later when we moderate generic plan costs based on the custom
>> plans?
>>
>
> The example at the beginning of this thread, I used the exact same values
> every time, the custom plan will be chosen all the time, which is bad,
> The main reason is the custom plan knows the exact value in Execute
> message, so it run plan time partition prune, then the total cost is low, however
> for the generic plan the partition prune happens at Runtime initial_partition prune
> stage, so the cost of the partitions which can be pruned at that stage is still
> included the total cost, so generic plans can't be chosen. that would be the
> thing I want to fix.

Something is wrong in the generic plan costing then. IIUC, the
selectivity estimate for only a single partition should come out >= 1.
For all the other partitions, it should be 1 and that too because we
clamp the row counts. So the final costs for generic and custom plans
shouldn't be far off unless there's large deviation in the selectivity
of a partition key. I am assuming that there's an equality condition
on a partition key. That's what I meant by the paragraph below.

>
>> If the table has good distribution of a partition key, which also
>> results in good distribution of data across partitions, generic plan
>> cost will be similar to the custom plan costs. If not that's something
>> we want to fix.

Can you please investigate on these lines?

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message iwata.aya@fujitsu.com 2020-10-07 11:52:15 RE: libpq debug log
Previous Message Maksim Kita 2020-10-07 11:31:54 [PATCH] ecpg: fix progname memory leak