Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2024-04-03 19:53:37
Message-ID: CAH2-Wz=vA9F1yS2gsc==EbV7rgJ21YyV0MEtV-Q_ELN9QPOD6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 1, 2024 at 6:33 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Note: v18 doesn't have any adjustments to the costing, as originally
> planned. I'll probably need to post a revised patch with improved (or
> at least polished) costing in the next few days, so that others will
> have the opportunity to comment before I commit the patch.

Attached is v19, which dealt with remaining concerns I had about the
costing in selfuncs.c. My current plan is to commit this on Saturday
morning (US Eastern time).

I ultimately concluded that selfuncs.c shouldn't be doing anything to
cap the estimated number of primitive index scans for an SAOP clause,
unless doing so is all but certain to make the estimate more accurate.
And so v19 makes the changes in selfuncs.c much closer to the approach
used to cost SOAP clauses on master. In short, v19 is a lot more
conservative in the changes made to selfuncs.c.

v19 does hold onto the idea of aggressively capping the estimate in
extreme cases -- cases where the estimate begins to approach the total
number of leaf pages in the index. This is clearly safe (it's
literally impossible to have more index descents than there are leaf
pages in the index), and probably necessary given that index paths
with SAOP filter quals (on indexed attributes) will no longer be
generated by the planner.

To recap, since nbtree more or less behaves as if scans with SAOPs are
one continuous index scan under the new design from patch, it was
tempting to make code in genericcostestimate/btcostestimate treat it
like that, too (the selfuncs.c changes from earlier versions tried to
do that). As I said already, I've now given up on that approach in
v19. This does mean that genericcostestimate continues to apply the
Mackert-Lohman formula for btree SAOP scans. This is a bit odd in a
world where nbtree specifically promises to not repeat any leaf page
accesses. I was a bit uneasy about that aspect for a while, but
ultimately concluded that it wasn't very important in the grand scheme
of things.

The problem with teaching code in genericcostestimate/btcostestimate
to treat nbtree scans with SAOPs are one continuous index scan is that
it hugely biases genericcostestimate's simplistic approach to
estimating numIndexPages. genericcostestimate does this by simply
prorating using numIndexTuples/index->pages/index->tuples. That naive
approach probably works alright with simple scalar operators, but it's
not going to work with SAOPs directly. I can't think of a good way of
estimating numIndexPages with SAOPs, and suspect that the required
information just isn't available. It seems best to treat it as a
possible area for improvement in a later release. The really important
thing for v17 is that we never wildly overestimate the number of
descents when multiple SAOPs are used, each with a medium or large
array.

v19 also makes sure that genericcostestimate doesn't allow a
non-boundary SAOP clause/non-required array scan key to affect
num_sa_scans -- it'll just use the num_sa_scans values used by
btcostestimate in v19. This makes sense because nbtree is guaranteed
to not start new primitive scans for these sorts of scan keys (plus
there's no need to calculate num_sa_scans twice, in two places, using
two slightly different approaches).

--
Peter Geoghegan

Attachment Content-Type Size
v19-0001-Enhance-nbtree-ScalarArrayOp-execution.patch application/octet-stream 209.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-04-03 20:02:47 Re: Statistics Import and Export
Previous Message Robert Haas 2024-04-03 19:49:09 Re: psql not responding to SIGINT upon db reconnection