Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.
Date: 2020-05-29 00:16:02
Message-ID: 20200529001602.eu7vuiouuuiclpgb@development
Lists: pgsql-hackers

On Wed, May 27, 2020 at 09:58:04PM +0800, Andy Fan wrote:
>On Wed, May 27, 2020 at 8:01 PM Ashutosh Bapat <
>ashutosh(dot)bapat(at)2ndquadrant(dot)com> wrote:
>
>>
>>
>> On Wed, 27 May 2020 at 04:43, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>>
>>> You can use the attached sql to reproduce this issue, but I'm not
>>> sure you can get the above result on the first try; that is because
>>> when the optimizer thinks the 2 index scans have the same cost, it
>>> will choose the first one it finds, and the order depends on
>>> RelationGetIndexList. If so, you may try dropping and recreating
>>> the j1_i_im5 index.
>>>
>>> The idea behind this patch is that we still use the cost-based
>>> optimizer; only when we find that the 2 index scans have the same
>>> cost do we prefer the index which has more quals in its Index Cond.
>>> This is implemented by adjusting the qual cost on the index Filter
>>> slightly higher.
>>>
>>
>> Thanks for the example and the explanation.
>>
>> The execution time difference in your example is pretty high for just
>> executing the filter on so many rows. My guess is that this has to do
>> with the heap access: to apply the filter, the entire row needs to be
>> fetched from the heap. So we should investigate this case from that
>> angle. Another guess I have is that the statistics are not correct and
>> hence the cost is wrong.
>>
>>
>I believe this is a statistics issue, and hence the cost is wrong.

I think you're both right. Most of the time probably comes from the
heap accesses, but the database has no chance to account for that,
as there was no ANALYZE after inserting the data that causes it. So
it's very difficult to account for this when computing the cost.

>More characteristics of this issue are: 1). If a value is out of range
>in the old statistics, the optimizer will give a 1-row assumption. 2).
>Based on the 1-row assumption, for the query "col1 = out_of_range_val
>AND col2 = any_value", indexes (col1, col2) and (col1, col3) will have
>exactly the same cost under the current cost model. 3). If the
>statistics are wrong, (col1, col3) may be a very bad plan as shown
>above, but index (col1, col2) should always be better than, or no worse
>than, (col1, col3) in any case. 4). To extend the rule: for the query
>"col1 = out_of_range_val AND col2 = any_value AND col3 = any_val" and
>indexes (col1, col2, col_m) and (col1, col_m, col_n), the former index
>will always be better than or no worse than the latter one. 5). A
>statistics issue like this is not uncommon; for example, in a log-based
>application, creation_date very easily goes out of the statistics'
>range.
>

Right. There are many ways to cause issues like this.
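
To make the failure mode concrete, here is a toy sketch (plain Python,
not PostgreSQL's actual estimator; all numbers and names are
illustrative) of how an out-of-range value collapses to a 1-row
estimate:

```python
# Toy sketch of the out-of-range behaviour described above: a value that
# lies beyond the analyzed histogram bounds gets clamped to a 1-row
# estimate, regardless of how many matching rows were inserted later.
def estimate_rows(total_rows, hist_min, hist_max, value):
    """Crude stand-in for per-column row estimation."""
    if value < hist_min or value > hist_max:
        return 1  # out of the known range: assume a single matching row
    # inside the range: assume a uniform distribution of distinct values
    return max(1, total_rows // (hist_max - hist_min + 1))

# Stats were gathered while the column ranged over 1..50000; rows with
# value 200000 were inserted afterwards, but the estimate stays at 1.
print(estimate_rows(total_rows=100000, hist_min=1, hist_max=50000,
                    value=200000))  # -> 1
```

With a 1-row estimate, any additional filter qual is priced as nearly
free, which is why the (col1, col2) and (col1, col3) paths come out
with identical costs.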

>So we need to optimize the cost model for such cases; the method is
>the patch I mentioned above.

Making the planner more robust w.r.t. estimation errors is nice, but
I wouldn't go as far saying we should optimize for such cases. The stats
can be arbitrarily off, so should we expect the error to be 10%, 100% or
1000000%? We'd probably end up with plans that handle worst cases well,
but the average performance would end up being way worse :-(

Anyway, I kinda doubt that making the conditions 1.001x more expensive
is a way to make the planning more robust. I'm pretty sure we could
construct examples in the opposite direction, in which case this change
makes it more likely we use the wrong index.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
