Re: Use of additional index columns in rows filtering

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-08-08 21:03:58
Message-ID: CAH2-WzkM-9wdayo9vHta10QdZ1QuUuS5Gch7mtfBJtO_AeGStg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 8, 2023 at 1:49 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> So we expect 1250 rows. If that was accurate, the index scan would have
> to do 1250 heap fetches. It's just luck the index scan doesn't need to
> do that. I don't think there's a chance to improve this costing - if the
> inputs are this off, it can't do anything.

Well, that depends. If we can find a way to make the bitmap index scan
capable of doing something like the same trick through other means, in
some other patch, then this particular problem (involving a simple
inequality) just goes away. There may be other cases that look a
little similar, with a more complicated expression, where it just
isn't reasonable to expect a bitmap index scan to compete. Ideally,
bitmap index scans will only be at a huge disadvantage when it just
makes sense, due to the particulars of the expression.

I'm not trying to make this your problem. I'm just trying to establish
the general nature of the problem.

> Also, I think this is related to the earlier discussion about maybe
> costing it according to the worst case - i.e. as if we still needed to
> fetch the same number of heap tuples as before. Which will inevitably
> lead to similar issues, with worse plans looking cheaper.

Not in those cases where it just doesn't come up, because we can
totally avoid visibility checks. As I said, securing that guarantee
has the potential to make the costing a lot more reliable/easier to
implement.
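
To make the two positions concrete, here is a rough toy model of the expected number of heap fetches. It is only a sketch under stated assumptions, not PostgreSQL's cost code: the function name, the 10% filter selectivity, and the all-visible fraction are all hypothetical, and only the 1250-row estimate comes from the example quoted above.

```python
def expected_heap_fetches(index_matches, filter_selectivity,
                          all_visible_frac, needs_visibility_check=True):
    """Toy estimate of heap fetches for an index scan with an extra filter.

    Hypothetical model (an assumption, not the planner's real formula):
    - index_matches: index tuples matching the scan keys
    - filter_selectivity: fraction of those that pass the extra filter
    - all_visible_frac: fraction living on all-visible heap pages
    """
    if not needs_visibility_check:
        # The filter can always be applied to the index tuple first,
        # so only the survivors are ever fetched from the heap.
        return index_matches * filter_selectivity

    # Tuples on pages not known to be all-visible: the heap is visited
    # before the filter can be trusted, so every one costs a fetch.
    unfiltered = index_matches * (1.0 - all_visible_frac)
    # Tuples on all-visible pages: filter first, fetch only the survivors.
    filtered = index_matches * all_visible_frac * filter_selectivity
    return unfiltered + filtered


# Worst case: nothing is all-visible, so all 1250 estimated rows are fetched.
worst_case = expected_heap_fetches(1250, 0.1, all_visible_frac=0.0)
# With the guarantee, the result no longer depends on the visibility map.
guaranteed = expected_heap_fetches(1250, 0.1, all_visible_frac=0.0,
                                   needs_visibility_check=False)
```

In this model the worst-case figure equals the plain row estimate, while the guaranteed case depends only on the filter's selectivity, which is the sense in which the costing has fewer inputs to get wrong.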

> That is certainly true - I'm trying to keep the scope somewhat close to
> the original goal. Obviously, there may be additional things the patch
> really needs to consider, but I'm not sure this is one of those cases
> (perhaps I just don't understand what the issue is - the example seems
> like a run-of-the-mill case of poor estimate / costing).

I'm not trying to impose any particular interpretation here. It's
early in the cycle, and my questions are mostly exploratory. I'm still
trying to develop my own understanding of the trade-offs in this area.

--
Peter Geoghegan
