Re: parallel vacuum comments

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: parallel vacuum comments
Date: 2021-12-09 12:34:36
Message-ID: CAD21AoByWJ675-FLnY+rqk494zF66PeCo7Y14heY8=VRz--9QA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 9, 2021 at 7:44 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Dec 9, 2021 at 3:35 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Dec 6, 2021 at 10:17 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Dec 3, 2021 at 6:06 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > 2. The patch seems to be calling parallel_vacuum_should_skip_index
> > > > > thrice even before starting parallel vacuum. It has a call to find the
> > > > > number of blocks which has to be performed for each index. I
> > > > > understand it might not be too costly to call this but it seems better
> > > > > to remember this info like we are doing in the current code.
> > > >
> > > > Yes, we can bring will_vacuum_parallel array back to the code. That
> > > > way, we can remove the call to parallel_vacuum_should_skip_index() in
> > > > parallel_vacuum_begin().
> > > >
> > > > > We can
> > > > > probably set parallel_workers_can_process in parallel_vacuum_begin and
> > > > > then again update in parallel_vacuum_process_all_indexes. Won't doing
> > > > > something like that be better?
> > > >
> > > > parallel_workers_can_process can vary depending on bulk-deletion, the
> > > > first time cleanup, or the second time (or more) cleanup. What can we
> > > > set parallel_workers_can_process based on in parallel_vacuum_begin()?
> > > >
> > >
> > > I was thinking to set the results of will_vacuum_parallel in
> > > parallel_vacuum_begin().
> > >
> >
> > This point doesn't seem to be addressed in the latest version (v6). Is
> > there a reason for not doing it? If we do this, then we don't need to
> > call parallel_vacuum_should_skip_index() from
> > parallel_vacuum_index_is_parallel_safe().
> >
>
> Few minor comments on v6-0001
> ==========================
> 1.
> The array
> + * element is allocated for every index, even those indexes where
> + * parallel index vacuuming is unsafe or not worthwhile (i.g.,
> + * parallel_vacuum_should_skip_index() returns true).
>
> s/i.g./e.g./
>
> 2.
> static void update_index_statistics(LVRelState *vacrel);
> -static void begin_parallel_vacuum(LVRelState *vacrel, int nrequested);
> -static void end_parallel_vacuum(LVRelState *vacrel);
> -static LVSharedIndStats *parallel_stats_for_idx(LVShared *lvshared,
> int getidx);
> -static bool parallel_processing_is_safe(Relation indrel, LVShared *lvshared);
> +
> +static int parallel_vacuum_compute_workers(LVRelState *vacrel, int nrequested,
> + bool *will_parallel_vacuum);
>
> In the declarations, parallel_vacuum_compute_workers() comes after
> update_index_statistics(), but the two are later defined in the reverse
> order. I suggest making the order of the definitions match the order of
> the declarations. Similarly, the order of the definitions of
> parallel_vacuum_process_all_indexes(),
> parallel_vacuum_process_unsafe_indexes(),
> parallel_vacuum_process_safe_indexes(), and
> parallel_vacuum_process_one_index() doesn't match the order of their
> declarations. Can we change that as well?

I agree with both of the points above.

I've attached updated patches that incorporate the above comments as
well. Please review them.
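
For anyone following the thread without the patches at hand, the idea
discussed above (evaluate the per-index "skip" check once and remember
the result in a will_parallel_vacuum-style array, rather than calling
the check repeatedly) can be sketched roughly as below. This is a
standalone illustration only, not code from the attached patches: the
DemoIndex struct, the demo_* function names, the block threshold, and
the simplified worker calculation are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an index; not a PostgreSQL type. */
typedef struct DemoIndex
{
    unsigned nblocks;        /* index size in blocks */
    bool     parallel_safe;  /* does the index support parallel vacuum? */
} DemoIndex;

/* Hypothetical size below which parallel processing is not worthwhile. */
#define DEMO_MIN_PARALLEL_BLOCKS 64

/* Counts how often the per-index check runs, to show the effect of caching. */
static int demo_skip_check_calls = 0;

/* Stand-in for a per-index check that we would rather not repeat. */
static bool
demo_should_skip_index(const DemoIndex *idx)
{
    demo_skip_check_calls++;
    return !idx->parallel_safe || idx->nblocks < DEMO_MIN_PARALLEL_BLOCKS;
}

/*
 * Run the check once per index, remember the outcome in
 * will_parallel_vacuum[], and return a (simplified) worker count:
 * the number of eligible indexes, capped by nrequested when it is > 0.
 */
static int
demo_compute_workers(const DemoIndex *indexes, size_t nindexes,
                     int nrequested, bool *will_parallel_vacuum)
{
    int nparallel = 0;

    assert(nrequested >= 0);

    for (size_t i = 0; i < nindexes; i++)
    {
        will_parallel_vacuum[i] = !demo_should_skip_index(&indexes[i]);
        if (will_parallel_vacuum[i])
            nparallel++;
    }

    if (nrequested > 0 && nparallel > nrequested)
        nparallel = nrequested;

    return nparallel;
}

/* Later phases consult the remembered result instead of re-running the check. */
static bool
demo_index_uses_parallel(const bool *will_parallel_vacuum, size_t i)
{
    return will_parallel_vacuum[i];
}
```

The caller would allocate one bool per index, call
demo_compute_workers() once at setup, and use
demo_index_uses_parallel() afterwards, so the check runs exactly once
per index no matter how many passes follow.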

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachments:
  v7-0001-Refactor-parallel-vacuum-to-remove-bitmap-related.patch (application/octet-stream, 36.9 KB)
  v7-0002-Move-parallel-vacuum-code-to-vacuumparallel.c.patch (application/octet-stream, 91.7 KB)
