Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2024-02-15 23:01:50
Message-ID: CAH2-Wzmcaf=uNhTGBj4jqXFSwbVwMw895JEMXr0bzcBdc8=FRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 22, 2024 at 4:13 PM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> Attached 2 patches: v11.patch-a and v11.patch-b. Both are incremental
> on top of your earlier set, and both don't allocate additional memory
> in the merge operation in non-assertion builds.
>
> patch-a is a trivial and clean implementation of mergesort, which
> tends to reduce the total number of compare operations if both arrays
> are of similar size and value range. It writes data directly back into
> the main array on non-assertion builds, and with assertions it reuses
> your binary search join on scratch space for algorithm validation.

This patch fails some of my tests on non-assert builds only (assert
builds pass all my tests, though). I'm using the first patch on its
own here.

While I tend to be relatively in favor of complicated assertions (I
tend to think the risk of introducing side-effects is worth it), it
looks like you're only performing certain steps in release builds.
This is evident from just looking at the code (there is an #else block
just for the release build in the loop). Note also that
"Assert(_bt_compare_array_elements(&merged[merged_nelems++], orig,
&cxt) == 0)" has side-effects in assert-enabled builds only (it
increments merged_nelems). While it's true that you *also* increment
merged_nelems *outside* of the assertion (or in an #else block used
during non-assert builds), that is conditioned on some other thing (so
it's in no way equivalent to the debug #ifdef USE_ASSERT_CHECKING
case). It's also just really hard to understand what's going on here.

If I was going to do this kind of thing, I'd use two completely
separate loops, that were obviously completely separate (maybe even
two functions). I'd then memcmp() each array at the end.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-02-15 23:08:56 Re: Transaction timeout
Previous Message Maiquel Grassi 2024-02-15 22:16:33 RE: Psql meta-command conninfo+