From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Adding support for Row Compares to nbtree startikey optimization |
Date: | 2025-07-06 02:19:10 |
Message-ID: | CAH2-WznL6Z3H_GTQze9d8T_Ls=cYbnd-_9f-Jo7aYgTGRUD58g@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Postgres 18 commit 8a510275 taught nbtree to apply page-level context
to determine an offset into the so->keyData[] array
(pstate.ikey/startikey) that works for the duration of a call to
_bt_readpage. Affected _bt_readpage calls have _bt_checkkeys calls
that use fewer CPU cycles, since all the _bt_checkkeys calls will be
able to start from the scan key/offset.
Postgres 18 never got support for row compare keys here:
_bt_set_startikey isn't willing to continue past a row compare key.
This seems like a missed opportunity. Row compares are generally a
decent target for the optimization in practice. They're at least as
good a target as scalar inequality keys were.
Attached patch remedies the situation by teaching _bt_set_startikey
about row compare members. It's reasonably straightforward, and makes
_bt_set_startikey "feature complete".
Testing has shown that it's about as effective as you'd expect -- it
generally manages to avoid evaluating row compares on scans that need
to read more than a couple of leaf pages. For example, if I run the
following test query:
select a, b, c
from fuzz_skip_scan
where (a, b, c) >= (11, 1, 71) and (a, b, c) <= (18, 1, 1)
order by a desc, b desc, c desc, d desc;
Custom instrumentation shows that most individual pages will be able
to skip/avoid evaluating *both* row compares, on *most* individual
pages. The first page read will never call _bt_set_startikey to begin
with (nothing new about that), and the last page will only be able to
skip one out of the two keys. That seems worth having.
The only notable limitation is that the optimization is ineffective on
pages where the deciding attribute (the attribute that most/all calls
to _bt_check_rowcompare end on) happens to contain NULLs. This is a
natural consequence of how row compares work: NULLs make it possible
for a tuple to be (say) ">=" a row compare qual, without the tuple
satisfying the qual, *and* without the NULLs indicating the absolute
end of matching tuples in the current scan direction. This is possible
with lower-order row compare member keys, at least.
I'll submit this to the next open commitfest for Postgres 19.
--
Peter Geoghegan
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Teach-nbtree-to-set-pstate.ikey-beyond-a-row-comp.patch | application/octet-stream | 6.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Daniil Davydov | 2025-07-06 08:00:32 | Re: POC: Parallel processing of indexes in autovacuum |
Previous Message | Tom Lane | 2025-07-05 22:27:44 | Re: C11 / VS 2019 |