Re: Adding skip scan (including MDAM style range skip scan) to nbtree

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	BharatDB <bharatdbpg(at)gmail(dot)com>
Cc:	Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, rmt(at)lists(dot)postgresql(dot)org, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject:	Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Date:	2025-09-10 19:27:36
Message-ID:	CAH2-Wzmci2EeV=xDwMfyOs3uaXEqf5U6mpZ3e-boYe9g3q3kcw@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Sep 10, 2025 at 2:49 AM BharatDB <bharatdbpg(at)gmail(dot)com> wrote:
> As a follow-up to the skip scan regression discussion, I tested a small patch that introduces static allocation/caching of `IndexAmRoutine` objects in `amapi.c`, removing the malloc/free overhead.

I think that it's too late to be considering anything this invasive for 18.

> Test setup:
> - Baseline: PG17 (commit before skip scan)
> - After: PG18 build with skip scan (patched)
> - pgbench scale=1, 100 partitions
> - Query: `select count(*) from pgbench_accounts where bid = 0`
> - Clients: 1, 4, 32
> - Protocols: simple, prepared
>
> Results (tps, 10s runs):
>
> Mode      Clients  Before (PG17)  After (PG18 w/ static fix)
> --------  -------  -------------  --------------------------
> simple          1          23856  20332 (~15% lower)
> simple          4          55299  53184 (~4% lower)
> simple         32          79779  78347 (~2% lower)
> prepared        1          26364  26615 (no regression)
> prepared        4          55784  54437 (~2% lower)
> prepared       32          84687  80374 (~5% lower)
>
> This shows the static fix eliminates the severe ~50% regression previously observed by Tomas, leaving only a small residual slowdown (~2-15%).
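
For reference, the percentages in the quoted results can be re-derived from the raw tps figures. The following is only a minimal sketch of that arithmetic: the numbers are copied from the quoted table, and the names `RESULTS` and `pct_change` are illustrative, not part of any patch or tool in this thread.

```python
# tps figures copied from the quoted results:
# (protocol, clients, PG17 baseline, PG18 with the static fix)
RESULTS = [
    ("simple", 1, 23856, 20332),
    ("simple", 4, 55299, 53184),
    ("simple", 32, 79779, 78347),
    ("prepared", 1, 26364, 26615),
    ("prepared", 4, 55784, 54437),
    ("prepared", 32, 84687, 80374),
]


def pct_change(before, after):
    """Relative change from `before` to `after` in percent (negative = slower)."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for protocol, clients, before, after in RESULTS:
        change = pct_change(before, after)
        print(f"{protocol:<9}{clients:>3}  {change:+6.1f}%")
```

A negative value means the patched PG18 build was slower than the PG17 baseline for that row. With only 10s runs, differences of a percent or two may well be within run-to-run noise.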

The regression that Tomas reported is extreme and artificial. IIRC it
only affects queries against partitioned tables with a hundred or so
partitions, each with an index-only scan that always scans exactly 0
index tuples, from a pgbench_accounts that has the smallest possible
number of rows that pgbench will allow (these are the cheapest possible
index-only scans). Plain index scans are not affected at all, presumably
because it just so happens that we don't allocate a BLCKSZ*2 workspace
for plain index scans, which is enough to put us well under the critical
glibc allocation size threshold (the threshold that the introduction of
a new nbtree support function put us over).

I also couldn't see anything like the 50% regression that Tomas
reported. And I couldn't recreate any problem unless partitioning was
used.

--
Peter Geoghegan

In response to

Re: Adding skip scan (including MDAM style range skip scan) to nbtree at 2025-09-10 06:49:31 from BharatDB
