Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2023-11-28 23:52:31
Message-ID: CAH2-Wz=paSJk31VtMjYhHLAChCuD-Q=Qds-fTRjpC_L0dzOiXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 28, 2023 at 9:19 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I'm not convinced this is a problem we have to solve. It's possible it
> > only affects cases that are implausible in practice (the script forces a
> > particular scan type, and maybe it would not be picked in practice). But
> > maybe it's fixable ...
>
> I would expect the patch to do quite well (relative to what is
> actually possible) on cases like the two extremes that I've focussed
> on so far. It seems possible that it will do less well on cases that
> are somewhere in the middle (that also have lots of distinct values on
> each page).

Actually, I think that it's more likely that the problems that you saw
are related to low cardinality data, which seems like it might not be
a great fit for the heuristics that the patch uses to decide whether
to continue the ongoing primitive index scan, or start a new one
instead. I'm referring to the heuristics I describe here:

https://postgr.es/m/CAH2-WzmTHoCsOmSgLg=yyft9LoERtuCKXyG2GZn+28PzonFA_g@mail.gmail.com

The patch itself discusses these heuristics in a large comment block
after the point that _bt_checkkeys() calls _bt_advance_array_keys().

I hardly paid any attention to low cardinality data in my performance
validation work -- it was almost always indexes that had few or no
indexes (just pgbench_accounts if we're talking pure stress-tests),
just because those are more complicated, and so seemed more important.
I'm not quite prepared to say that there is definitely a problem here,
right this minute, but if there was then it wouldn't be terribly
surprising (the problems are usually wherever it is that I didn't look
for them).

Attached is a sample of my debug instrumentation for one such query,
based on running the test script that Tomas posted -- thanks for
writing this script, Tomas (I'll use it as the basis for some of my
own performance validation work going forward). I don't mind sharing
the patch that outputs this stuff if anybody is interested (it's kind
of a monstrosity, so I'm disinclined to post it with the patch until I
have a reason). Even without this instrumentation, you can get some
idea of the kinds of issues I'm talking about just by viewing EXPLAIN
ANALYZE output for a bitmap index scan -- that breaks out the index
page accesses separately, which is a number that we should expect to
remain lower than what the master branch shows in approximately all
cases.

While I still think that we need heuristics that apply speculative
criteria to decide whether or not going to the next page directly
(when we have a choice at all), that doesn't mean that the v7
heuristics can't be improved on, with a little more thought. It's a
bit tricky, since we're probably also benefiting from the same
heuristics all the time -- probably even for this same test case. We
do lose against the master branch, on balance, and by enough to
concern me, though. (I don't want to promise that it'll never happen
at all, but it should be very limited, which this wasn't.)

I didn't bother to ascertain how much longer it takes to execute the
query, since that question seems rather beside the point. The
important thing to me is whether or not this behavior actually makes
sense, all things considered, and what exactly can be done about it if
it doesn't make sense. I will need to think about this some more. This
is just a status update.

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
index_walk_dump.txt.bz2 application/x-bzip2 86.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-11-29 00:03:49 Re: Fix assertion in autovacuum worker
Previous Message Robert Haas 2023-11-28 23:22:26 Re: [HACKERS] Changing references of password encryption to hashing