Re: Confine vacuum skip logic to lazy_scan_skip

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-03-11 21:02:57
Message-ID: CAAKRu_byDppRvNJ+p5kq-SmXZDQuL6L8N6D06pHbO6WnBnanWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 10, 2024 at 11:01 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > I have investigated the interaction between
> > maintenance_io_concurrency, streaming reads, and the vacuum buffer
> > access strategy (BAS_VACUUM).
> >
> > The streaming read API limits max_pinned_buffers to a pinned buffer
> > multiplier (currently 4) * maintenance_io_concurrency buffers with the
> > goal of constructing reads of at least MAX_BUFFERS_PER_TRANSFER size.
> >
> > Since the BAS_VACUUM ring buffer is size 256 kB or 32 buffers with
> > default block size, that means that for a fully uncached vacuum in
> > which all blocks must be vacuumed and will be dirtied, you'd have to
> > set maintenance_io_concurrency at 8 or lower to see the same number of
> > reuses (and shared buffer consumption) as master.
> >
> > Given that we allow users to specify BUFFER_USAGE_LIMIT to vacuum, it
> > seems like we should force max_pinned_buffers to a value that
> > guarantees the expected shared buffer usage by vacuum. But that means
> > that maintenance_io_concurrency does not have a predictable impact on
> > streaming read vacuum.
> >
> > What is the right thing to do here?
> >
> > At the least, the default size of the BAS_VACUUM ring buffer should be
> > BLCKSZ * pinned_buffer_multiplier * default maintenance_io_concurrency
> > (probably rounded up to the next power of two) bytes.
>
> Hmm, does the v6 look-ahead distance control algorithm mitigate that
> problem? Using the ABC classifications from the streaming read
> thread, I think for A it should now pin only 1, for B 16 and for C, it
> depends on the size of the random 'chunks': if you have a lot of size
> 1 random reads then it shouldn't go above 10 because of (default)
> maintenance_io_concurrency. The only way to get up to very high
> numbers would be to have a lot of random chunks triggering behaviour
> C, but each made up of long runs of misses. For example one can
> contrive a BHS query that happens to read pages 0-15 then 20-35 then
> 40-55 etc etc so that we want to get lots of wide I/Os running
> concurrently. Unless vacuum manages to do something like that, it
> shouldn't be able to exceed 32 buffers very easily.
>
> I suspect that if we taught streaming_read.c to ask the
> BufferAccessStrategy (if one is passed in) what its recommended pin
> limit is (strategy->nbuffers?), we could just clamp
> max_pinned_buffers, and it would be hard to find a workload where that
> makes a difference, and we could think about more complicated logic
> later.
>
> In other words, I think/hope your complaints about excessive pinning
> from v5 WRT all-cached heap scans might have also already improved
> this case by happy coincidence? I haven't tried it out though, I just
> read your description of the problem...

I've rebased the attached v10 over top of the changes to
lazy_scan_heap() Heikki just committed and over the v6 streaming read
patch set. I started testing them and see that you are right, we no
longer pin too many buffers. However, the uncached example below is
now slower with streaming read than on master -- it looks to be
because it is doing twice as many WAL writes and syncs. I'm still
investigating why that is.

psql \
-c "create table small (a int) with (autovacuum_enabled=false,
fillfactor=25);" \
-c "insert into small select generate_series(1,200000) % 3;" \
-c "update small set a = 6 where a = 1;"

pg_ctl stop
# drop caches
pg_ctl start

psql -c "\timing on" -c "vacuum (verbose) small"

- Melanie

Attachment Content-Type Size
v10-0001-Streaming-Read-API.patch text/x-patch 55.8 KB
v10-0003-Vacuum-second-pass-uses-Streaming-Read-interface.patch text/x-patch 8.5 KB
v10-0002-Vacuum-first-pass-uses-Streaming-Read-interface.patch text/x-patch 8.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-03-11 21:12:04 Re: Reports on obsolete Postgres versions
Previous Message Matthias van de Meent 2024-03-11 20:52:26 Re: Reducing output size of nodeToString