Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.
Date: 2020-05-29 13:58:49
Message-ID: CAKU4AWre9=+iFdfhsJy-6wJn8OTdNACfbuWntj6J2vsG6VcHcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 29, 2020 at 9:37 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
wrote:

> On Fri, May 29, 2020 at 6:40 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> >
> >
> >>
> >> >so we need to optimize the cost model for such case, the method is the
> >> >patch I mentioned above.
> >>
> >> Making the planner more robust w.r.t. to estimation errors is nice, but
> >> I wouldn't go as far saying we should optimize for such cases. The stats
> >> can be arbitrarily off, so should we expect the error to be 10%, 100% or
> >> 1000000%?
> >
> >
> > I don't think my patch relay on anything like that. My patch doesn't
> fix the
> > statistics issue, just adding the extra cost on qual cost on Index
> Filter part.
> > Assume the query pattern are where col1= X and col2 = Y. The impacts are
> :
> > 1). Make the cost of (col1, other_column) is higher than (col1, col2)
> > 2). The relationship between seqscan and index scan on index (col1,
> other_column)
> > is changed, (this is something I don't want). However my cost
> difference between
> > index scan & seq scan usually very huge, so the change above should has
> > nearly no impact on that choice. 3). Make the cost higher index scan
> for
> > Index (col1) only. Overall I think nothing will make thing worse.
>
> When the statistics is almost correct (or better than what you have in
> your example), the index which does not cover all the columns in all
> the conditions will be expensive anyways because of extra cost to
> access heap for the extra rows not filtered by that index. An index
> covering all the conditions would have its scan cost cheaper since
> there will be fewer rows and hence fewer heap page accesses because of
> more filtering. So I don't think we need any change in the current

costing model.
>

Thank you for your reply. Looks you comments is based on the statistics
is almost correct (or better than what I have in my example), That is
true.
However my goal is to figure out a way which can generate better plan even
the statistics is not correct (the statistics with such issue is not very
uncommon,
I just run into one such case and spend 1 week to handle some
non-technology
stuff after that). I think the current issue is even my patch can make
the worst case
better, we need to make sure the average performance not worse.

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2020-05-29 14:11:52 Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.
Previous Message David G. Johnston 2020-05-29 13:56:36 Re: pg_dump fail to not dump public schema orders