Re: Limiting overshoot in nbtree's parallel SAOP index scans

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Limiting overshoot in nbtree's parallel SAOP index scans
Date: 2025-05-14 06:35:16
Message-ID: CAA4eK1+XpHGMTmifVBm1N+S+FWvc7NVRniZJm-3F+ZSuOA7ZcA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 17, 2024 at 5:03 AM Matthias van de Meent
<boekewurm(at)gmail(dot)com> wrote:
>
> On Thu, 17 Oct 2024 at 00:33, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >
> > On Wed, Oct 16, 2024 at 5:48 PM Matthias van de Meent
> > <boekewurm(at)gmail(dot)com> wrote:
> > > In v17 and the master branch you'll note 16 buffer hits for the test
> > > query. However, when we use more expensive btree compare operations
> > > (e.g. by adding pg_usleep(1) to both btint8cmp and btint4cmp), the
> > > buffer access count starts to vary a lot and skyrockets to 30+ on my
> > > machine, in some cases reaching >100 buffer hits. After applying my
> > > patch, the buffer access count is capped to a much more agreeable
> > > 16-18 hits - it still shows signs of overshooting the serial bounds,
> > > but the number of buffers we overshoot our target is capped and thus
> > > significantly lower.
> >
> > It's not exactly capped, though. Since in any case you're always prone
> > to getting extra leaf page reads at the end of each primitive index
> > scan. That's not something that's new to Postgres 17, though.
>
> True, but the SAOP-enabled continued overshoot _is_ new: previously,
> each backend would only get up to one additional buffer access for
> every SAOP scan entry, while now it's only limited by outer SAOP
> bounds and index size.
>

IIUC, the problem you are trying to solve is that we can miss
opportunities to start primitive index scans during parallel index
scans. The problem can happen when a second parallel backend (say b-2)
seizes the scan after the first backend (say b-1) has released the
current page, but before b-1 could check whether it can schedule
another primitive index scan. And you want to fix it by delaying b-1's
release of the current page in some cases; if so, then that has some
downsides as well, as pointed out by Peter. Is my understanding
correct? If so, then we may also need to prove that releasing the page
at a later point doesn't harm any other cases.
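If that understanding is right, the race could be sketched with a toy
model like this (purely illustrative Python, not nbtree code; the page
layout, class, and function names are all invented for the example):

```python
# Toy model: leaf pages 0..9; the SAOP scan only needs keys that live on
# pages 0 and 8. After reading page 0, a primitive index scan would let the
# scan re-descend the tree and jump straight to page 8. In a parallel scan,
# the shared scan state must be "seized" by one backend at a time to advance.

class SharedScan:
    def __init__(self):
        self.next_page = 0  # shared: next leaf page to hand out

    def seize_and_advance(self):
        # Hand out the current page and advance to the right sibling.
        page = self.next_page
        self.next_page += 1
        return page

def run(interleaved):
    scan = SharedScan()
    pages_read = []
    # b-1 reads page 0, releases the scan, and then wants to schedule a
    # primitive scan that jumps to page 8.
    pages_read.append(scan.seize_and_advance())
    if interleaved:
        # b-2 seizes the scan first and advances it sequentially to page 1,
        # so b-1's jump to page 8 is lost: pages 1..7 get read as well.
        pages_read.append(scan.seize_and_advance())
        while scan.next_page < 8:
            pages_read.append(scan.seize_and_advance())
    else:
        # b-1 schedules the primitive scan in time: re-descend to page 8.
        scan.next_page = 8
    pages_read.append(scan.seize_and_advance())
    return pages_read
```

In this toy model, `run(False)` reads only pages 0 and 8, while
`run(True)` (b-2 winning the race) reads all of pages 0 through 8 —
which, if I follow the thread, is the kind of overshoot being discussed.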

I am not sure I have understood the problem completely, so could you
please explain in simpler terms how this works before PG17 and after
PG17?

BTW, I also want to clarify my understanding of primitive index scans
and how they have changed in PG17: is this related to how PG17
optimizes SAOP scans by reducing the number of leaf page scans?
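For my own understanding, the PG17 change could be sketched roughly
like this (a toy model of array-key handling, not the actual nbtree
logic; the page layout and function names are invented):

```python
# Toy model: each leaf page holds 100 consecutive keys, so page p covers
# keys [p*100, p*100 + 99].

def leaf_page(key):
    return key // 100

def descents_pre_pg17(saop_keys):
    # Before PG17 (roughly): one primitive index scan, i.e. one
    # root-to-leaf descent, per SAOP array element.
    return len(saop_keys)

def descents_pg17(saop_keys):
    # PG17 (roughly): advance the array keys within the current leaf page
    # where possible; start a new descent only when the next key falls on
    # a different leaf page.
    descents = 0
    current_page = None
    for key in sorted(saop_keys):
        if leaf_page(key) != current_page:
            descents += 1
            current_page = leaf_page(key)
    return descents
```

For example, with `IN (1, 2, 3, 250)` this toy model counts 4 descents
pre-PG17 but only 2 in PG17, since keys 1, 2, and 3 all land on the
same leaf page.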

--
With Regards,
Amit Kapila.
