Re: Parallel heap vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Parallel heap vacuum
Date: 2025-06-13 20:05:55
Message-ID: CAD21AoB2zAeGjAARs5uExy66ZrYpfL5h_PQHQuCE6Dm=3rvSUw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 28, 2025 at 11:07 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Apr 5, 2025 at 1:17 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > On Fri, Apr 4, 2025 at 5:35 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > > I haven't looked closely at this version but I did notice that you do
> > > > not document that parallel vacuum disables eager scanning. Imagine you
> > > > are a user who has set the eager freeze related table storage option
> > > > (vacuum_max_eager_freeze_failure_rate) and you schedule a regular
> > > > parallel vacuum. Now that table storage option does nothing.
> > >
> > > Good point. That restriction should be mentioned in the documentation.
> > > I'll update the patch.
> >
> > Yea, I mean, to be honest, when I initially replied to the thread
> > saying I thought temporarily disabling eager scanning for parallel
> > heap vacuuming was viable, I hadn't looked at the patch yet and
> > thought that there was a separate way to enable the new parallel heap
> > vacuum (separate from the parallel option for the existing parallel
> > index vacuuming). I don't like that this disables functionality that
> > worked when I pushed the eager scanning feature.
>
> Thank you for sharing your opinion. I think this is one of the main
> points that we need to deal with to complete this patch.
>
> After considering various approaches to integrating parallel heap
> vacuum and eager freeze scanning, one viable solution would be to
> implement a dedicated parallel scan mechanism for parallel lazy scan,
> rather than relying on the table_block_parallelscan_xxx() facility.
> This approach would involve dividing the table into chunks of 4,096
> blocks, same as eager freeze scanning, where each parallel worker
> would perform eager freeze scanning while maintaining its own local
> failure count and a shared success count. This straightforward
> approach offers an additional advantage: since the chunk size remains
> constant, we can implement the SKIP_PAGES_THRESHOLD optimization
> consistently throughout the table, including its final sections.
>
> However, this approach does present certain challenges. First, we
> would need to maintain a separate implementation of lazy vacuum's
> parallel scan alongside the table_block_parallelscan_XXX() facility,
> potentially increasing maintenance overhead. Additionally, the fixed
> chunk size across the entire table might prove less efficient when
> processing blocks near the table's end compared to the dynamic
> approach used by table_block_parallelscan_nextpage().
>

I've attached the updated patches for parallel heap vacuum. This
version includes several updates:

Eager scanning can now be used during parallel heap vacuum as well.
The table is divided into fixed-size chunks (1024 blocks each), and
each chunk is assigned to a parallel vacuum worker. Because the eager
scanning region and the chunk differ in size, the per-region eager
scanning failure count is divided evenly among the chunks.
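
To illustrate the chunking described above, here is a minimal sketch in
Python. It is not code from the patches. It assumes the 4,096-block
eager scanning region mentioned earlier in this thread and the
1024-block chunk from this version; the names Chunk, split_into_chunks
and fails_per_region are hypothetical, and the use of integer division
for the per-chunk share is an assumption of the sketch.

```python
from dataclasses import dataclass

REGION_BLOCKS = 4096  # eager scanning region size quoted earlier in the thread
CHUNK_BLOCKS = 1024   # fixed chunk size used by this patch version


@dataclass
class Chunk:
    start_block: int      # first block of the chunk (inclusive)
    nblocks: int          # number of blocks in the chunk
    max_eager_fails: int  # hypothetical per-chunk eager freeze failure budget


def split_into_chunks(rel_blocks, fails_per_region):
    """Split a table of rel_blocks blocks into fixed-size chunks.

    Each chunk gets an even share of the per-region failure budget.
    Integer division is an assumption made for this sketch.
    """
    chunks_per_region = REGION_BLOCKS // CHUNK_BLOCKS
    fails_per_chunk = fails_per_region // chunks_per_region
    chunks = []
    for start in range(0, rel_blocks, CHUNK_BLOCKS):
        # The last chunk may be shorter than CHUNK_BLOCKS.
        nblocks = min(CHUNK_BLOCKS, rel_blocks - start)
        chunks.append(Chunk(start, nblocks, fails_per_chunk))
    return chunks
```

For example, split_into_chunks(5000, 120) yields five chunks: four of
1024 blocks and a final one of 904 blocks, each with a failure budget
of 30.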

The 0005 patch adds a new parallel heap vacuum test to improve
coverage. Specifically, it uses injection points to test the case
where the leader launches fewer parallel workers during multiple index
scans, so that the leader has to complete the unfinished scans at the
end of lazy scan heap.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment                                                       Content-Type              Size
v17-0001-Introduces-table-AM-APIs-for-parallel-table-vacu.patch  application/octet-stream  6.3 KB
v17-0005-Add-more-parallel-vacuum-tests.patch                    application/octet-stream  7.9 KB
v17-0002-vacuumparallel.c-Support-parallel-vacuuming-for-.patch  application/octet-stream  27.1 KB
v17-0003-Move-lazy-heap-scan-related-variables-to-new-str.patch  application/octet-stream  28.0 KB
v17-0004-Support-parallelism-for-collecting-dead-items-du.patch  application/octet-stream  63.9 KB
