Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-26 19:21:27
Message-ID: CA+fd4k5Gr94EWr+wGRgjiKMk7kgmsnZ2y7d5+Vp+Sxb2bXuA_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 26 Nov 2019 at 13:34, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Nov 25, 2019 at 5:36 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > 2.
> > lazy_parallel_vacuum_or_cleanup_indexes()
> > {
> > ..
> > ..
> > }
> >
> > Here, it seems that we can increment/decrement the
> > VacuumActiveNWorkers even when there is no work performed by the
> > leader backend. How about moving increment/decrement inside function
> > vacuum_or_cleanup_indexes_worker? In that case, we need to do it in
> > this function when we are actually doing an index vacuum or cleanup.
> > After doing that the other usage of increment/decrement of
> > VacuumActiveNWorkers in other function heap_parallel_vacuum_main can
> > be removed.

Yeah we can move it inside vacuum_or_cleanup_indexes_worker but we
still need to increment the count before processing the indexes that
have skipped parallel operations because some workers might still be
running yet.

> >
>
> One of my colleague Mahendra who was testing this patch found that
> stats for index reported by view pg_statio_all_tables are wrong for
> parallel vacuum. I debugged the issue and found that there were two
> problems in the stats related code.
> 1. The function get_indstats seem to be computing the wrong value of
> stats for the last index.
> 2. The function lazy_parallel_vacuum_or_cleanup_indexes() was not
> pointing to the computed stats when the parallel index scan is
> skipped.
>
> Find the above two fixes in the attached patch. This is on top of the
> patches I sent yesterday [1].

Thank you! During testing the current patch by myself I also found this bug.

>
> Some more comments on v33-0002-Add-parallel-option-to-VACUUM-command
> -------------------------------------------------------------------------------------------------------------
> 1. The code in function lazy_parallel_vacuum_or_cleanup_indexes()
> that processes the indexes that have skipped parallel processing can
> be moved to a separate function. Further, the newly added code by the
> attached patch can also be moved to a separate function as the same
> code is used in function vacuum_or_cleanup_indexes_worker().
>
> 2.
> +void
> +heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
> {
> ..
> + stats = (IndexBulkDeleteResult **)
> + palloc0(nindexes * sizeof(IndexBulkDeleteResult *));
> ..
> }
>
> It would be neat if we free this memory once it is used.
>
> 3.
> + /*
> + * Compute the number of indexes that can participate to parallel index
> + * vacuuming.
> + */
>
> /to/in
>
> 4. The function lazy_parallel_vacuum_or_cleanup_indexes() launches
> workers without checking whether it needs to do the same or not. For
> ex. in cleanup phase, it is possible that we don't need to launch any
> worker, so it will be waste. It might be that you are already
> planning to handle it based on the previous comments/discussion in
> which case you can ignore this.

I've incorporated the comments I got so far including the above and
the memory alignment issue. Therefore the attached v34 patch includes
that changes and changes in v33-0002-delta-amit.patch and
v33-0002-delta2-fix-stats-issue.patch. In this version I add an extra
argument to LaunchParallelWorkers function and make the leader process
launch the parallel workers as much as the particular phase needs.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v34-0002-Add-parallel-option-to-VACUUM-command.patch application/octet-stream 75.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-11-26 20:05:59 Re: pglz performance
Previous Message Peter Eisentraut 2019-11-26 19:17:13 Re: pglz performance