Re: Use of additional index columns in rows filtering

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-08-03 19:21:33
Message-ID: CAH2-WznE6QDHu0=NnXVfVF+wziv4O=57ueK2+hQkE2Q+KO9=Og@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 3, 2023 at 11:17 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> Not sure. I'm a bit confused about what exactly is so risky on the plan
> produced with the patch.

It's all about the worst case. In the scenarios that I'm concerned
about, we can be quite sure that the saving from not using a BitmapOr
will be fairly low -- the benefit of not having to repeat the same index
page accesses across several similar index scans is, at best, some
small multiple of the number of index scans that the would-be BitmapOr
plan uses. We can be certain that the possible benefits are fixed and
low. This is always true; presumably the would-be BitmapOr plan can
never have all that many index scans. And we always know up front how
many index scans a BitmapOr plan would use.

On the other hand, the possible downsides have no obvious limit. So
even if we're almost certain to win on average, we only have to be
unlucky once to lose all we gained before that point. As a general
rule, we want the index AM to have all the context required to
terminate its scan at the earliest possible opportunity. This is
enormously important in the worst case.
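
To put rough numbers on that asymmetry, here is a toy page-access model
in Python. It is only a sketch under stated assumptions, not the
planner's actual costing: the function names, the per-scan page count,
and the index sizes are all hypothetical, and it assumes the worst case
is a single scan that cannot stop early and reads every page of the
index.

```python
# Toy model of index page accesses. All names and numbers are hypothetical
# and are not taken from the PostgreSQL planner or from the patch.

def bitmap_or_pages(num_scans: int, pages_per_scan: int) -> int:
    """Pages read by a BitmapOr plan: every one of its index scans pays separately."""
    return num_scans * pages_per_scan

def best_case_saving(num_scans: int, pages_per_scan: int) -> int:
    """Upper bound on the saving: the single scan reads the shared pages only once."""
    return bitmap_or_pages(num_scans, pages_per_scan) - pages_per_scan

def worst_case_loss(num_scans: int, pages_per_scan: int, index_pages_total: int) -> int:
    """Extra pages read if the single scan has to walk the whole index."""
    return index_pages_total - bitmap_or_pages(num_scans, pages_per_scan)

def wins_to_offset(loss: int, saving: int) -> int:
    """Number of best-case wins needed to pay for one worst-case loss (ceiling division)."""
    return -(-loss // saving)

NUM_SCANS = 3        # known up front for any BitmapOr plan
PAGES_PER_SCAN = 4   # assumed descent plus leaf pages per index scan
INDEX_SIZES = (1_000, 100_000, 10_000_000)  # assumed total index pages

saving = best_case_saving(NUM_SCANS, PAGES_PER_SCAN)
losses = [worst_case_loss(NUM_SCANS, PAGES_PER_SCAN, n) for n in INDEX_SIZES]

for n, loss in zip(INDEX_SIZES, losses):
    print(f"index pages={n}: saving<={saving}, loss={loss}, "
          f"wins needed={wins_to_offset(loss, saving)}")
```

In this model the saving depends only on the number of index scans and
the pages each one touches, so it stays fixed, while the loss grows
with the size of the index.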

It's easier for me to make this argument because I know that we don't
really need to make any trade-off here. But even if that weren't the
case, I'd probably arrive at the same general conclusion.

Importantly, it isn't possible to make a similar argument that works
in the opposite direction -- IMV that's the difference between this
flavor of riskiness, and the inevitable riskiness that comes with any
planner change. In other words, your patch isn't going to win by an
unpredictably high amount. Not in the specific scenarios that I'm
focussed on here, with a BitmapOr + multiple index scans getting
displaced.

The certainty about the upside is just as important as the uncertainty
about the downside. The huge asymmetry matters, and is fairly
atypical. If, somehow, there was less certainty about the possible
upside, then my argument wouldn't really work.

--
Peter Geoghegan
