From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-03-11 03:01:16
Message-ID: CA+hUKGKHb3i8Wy72VCKZGA2B5djoj7tAzYAzbeS=Gwr_SdhgRw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> On Wed, Mar 6, 2024 at 6:47 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > Performance results:
> >
> > The TL;DR of my performance results is that streaming read vacuum is
> > faster. However there is an issue with the interaction of the streaming
> > read code and the vacuum buffer access strategy which must be addressed.

Woo.

> I have investigated the interaction between
> maintenance_io_concurrency, streaming reads, and the vacuum buffer
> access strategy (BAS_VACUUM).
>
> The streaming read API limits max_pinned_buffers to a pinned buffer
> multiplier (currently 4) * maintenance_io_concurrency buffers with the
> goal of constructing reads of at least MAX_BUFFERS_PER_TRANSFER size.
>
> Since the BAS_VACUUM ring buffer is 256 kB, or 32 buffers at the
> default block size, that means that for a fully uncached vacuum in
> which all blocks must be vacuumed and will be dirtied, you'd have to
> set maintenance_io_concurrency to 8 or lower to see the same number of
> reuses (and shared buffer consumption) as master.
>
> Given that we allow users to specify BUFFER_USAGE_LIMIT to vacuum, it
> seems like we should force max_pinned_buffers to a value that
> guarantees the expected shared buffer usage by vacuum. But that means
> that maintenance_io_concurrency does not have a predictable impact on
> streaming read vacuum.
>
> What is the right thing to do here?
>
> At the least, the default size of the BAS_VACUUM ring buffer should be
> BLCKSZ * pinned_buffer_multiplier * default maintenance_io_concurrency
> (probably rounded up to the next power of two) bytes.
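
To spell out the arithmetic (assuming the default 8 kB BLCKSZ, the
pinned buffer multiplier of 4 you mention and the default
maintenance_io_concurrency of 10):

    BAS_VACUUM ring     = 256 kB / 8 kB                          = 32 buffers
    max_pinned_buffers  = 4 * maintenance_io_concurrency = 4 * 10 = 40 buffers

    40 > 32, so to keep the stream within the ring you need
    4 * maintenance_io_concurrency <= 32, i.e. maintenance_io_concurrency <= 8.

    Proposed ring default = 8 kB * 4 * 10 = 320 kB, rounded up to 512 kB.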

Hmm, does the v6 look-ahead distance control algorithm mitigate that
problem? Using the ABC classifications from the streaming read
thread, I think for A it should now pin only 1, for B 16, and for C it
depends on the size of the random 'chunks': if you have a lot of
size-1 random reads then it shouldn't go above 10 because of (default)
maintenance_io_concurrency. The only way to get up to very high
numbers would be to have a lot of random chunks triggering behaviour
C, but each made up of long runs of misses. For example, one can
contrive a BHS query that happens to read pages 0-15, then 20-35, then
40-55, and so on, so that we want lots of wide I/Os running
concurrently. Unless vacuum manages to do something like that, it
shouldn't be able to exceed 32 buffers very easily.
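
(Back-of-the-envelope, with the default maintenance_io_concurrency of 10
and ~16-buffer combined reads, that contrived pattern could want
something like 10 * 16 = 160 pinned buffers, which is the sort of very
high number I mean; the size-1 random reads are instead capped at the
10 in-flight I/Os.)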

I suspect that if we taught streaming_read.c to ask the
BufferAccessStrategy (if one is passed in) what its recommended pin
limit is (strategy->nbuffers?), we could just clamp max_pinned_buffers
to that. It would be hard to find a workload where the clamp makes a
difference, and we could think about more complicated logic later.
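
Roughly like this, just to illustrate what I mean (the accessor name is
invented, and since the strategy struct is private to freelist.c,
streaming_read.c would have to go through something like it rather than
touching strategy->nbuffers directly):

    /* freelist.c: expose the ring size as a recommended pin cap */
    int
    GetAccessStrategyPinLimit(BufferAccessStrategy strategy)
    {
        if (strategy == NULL)
            return NBuffers;        /* no ring, no extra cap */
        return strategy->nbuffers;  /* number of buffers in the ring */
    }

    /* streaming_read.c, while setting up the stream */
    if (strategy != NULL)
        max_pinned_buffers = Min(max_pinned_buffers,
                                 GetAccessStrategyPinLimit(strategy));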

In other words, I think/hope that the fix for your complaints about
excessive pinning in v5 WRT all-cached heap scans might have also
already improved this case by happy coincidence? I haven't tried it
out, though; I've just read your description of the problem...
