Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2024-02-15 23:36:24
Message-ID: CAH2-Wz=4C+X+9NBxMitPD=9XMUDdrAa6JxkV5NVQMRfrDnPsfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 23, 2024 at 3:22 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I could include something less verbose, mentioning a theoretical risk
> to out-of-core amcanorder routines that coevolved with nbtree,
> inherited the same SAOP limitations, and then never got the same set
> of fixes.

Attached is v11, which now says something like that in the commit
message. Other changes:

* Fixed buggy sorting of arrays using cross-type ORDER procs, by
recognizing that we need to consistently use same-type ORDER procs for
sorting and merging the arrays during array preprocessing.

Obviously, when we sort, we compare array elements to other array
elements (all of the same type). This is true independent of whether
the query itself happens to use a cross type operator/ORDER proc, so
we will need to do two separate ORDER proc lookups in cross-type
scenarios.

* No longer subscript the ORDER proc used for array binary searches
using a scankey subscript. Now there is an additional indirection that
works even in the presence of multiple redundant scan keys that cannot
be detected as such due to a lack of appropriate cross-type support
within an opfamily.

This was subtly buggy before now. Requires a little more coordination
between array preprocessing and standard/primitive index scan
preprocessing, which isn't ideal but seems unavoidable.

* Lots of code polishing, especially within _bt_advance_array_keys().

While _bt_advance_array_keys() still works in pretty much exactly the
same way as it did back in v10, there are now better comments.
Including something about why its recursive call to itself is
guaranteed to use a low, fixed amount of stack space, verified using
an assertion. That addresses a concern held by Matthias.

Outlook
=======

This patch is approaching being committable now. Current plan is to
commit this within the next few weeks.

All that really remains now is to research how we might integrate this
work with the recently added continuescanPrechecked/haveFirstMatch
stuff from Alexander Korotkov, if at all. I've put that off until now
because it isn't particularly fundamental to what I'm doing here, and
seems optional.

I would also like to do more performance validation. Things like the
parallel index scan code could stand to be revisited once again. Plus
I should think about the overhead of array preprocessing when
btrescan() is called many times, from a nested loop join -- I should
have something to say about that concern (raised by Heikki at one
point) before too long.

--
Peter Geoghegan

Attachment Content-Type Size
v11-0001-Enhance-nbtree-ScalarArrayOp-execution.patch application/octet-stream 180.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-02-16 00:00:00 Re: POC, WIP: OR-clause support for indexes
Previous Message Nathan Bossart 2024-02-15 23:32:19 Re: glibc qsort() vulnerability