Re: parallel vacuum comments

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel vacuum comments
Date: 2021-11-04 19:42:37
Message-ID: CAH2-WzmU8+_Vpx76fWBVdfKFFwowQ81mmYsH5rDDYnbxpSjueQ@mail.gmail.com

On Wed, Nov 3, 2021 at 10:25 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> I've attached a draft patch. The patch incorporated all comments from
> Andres except for the last comment that moves parallel related code to
> another file. I'd like to discuss how we split vacuumlazy.c.

This looks great!

I wonder if this is okay, though:

> /* Process the indexes that can be processed by only leader process */
> - do_serial_processing_for_unsafe_indexes(vacrel, lps->lvshared);
> + lazy_serial_process_indexes(vacrel);
>
> /*
> - * Join as a parallel worker. The leader process alone processes all the
> - * indexes in the case where no workers are launched.
> + * Join as a parallel worker. The leader process alone processes all
> + * parallel-safe indexes in the case where no workers are launched.
> */
> - do_parallel_processing(vacrel, lps->lvshared);
> + lazy_parallel_process_indexes(vacrel, lps->lvshared, vacrel->lps->lvsharedindstats);
>
> /*
> * Next, accumulate buffer and WAL usage. (This must wait for the workers
> @@ -2786,6 +2734,18 @@ do_parallel_vacuum_or_cleanup(LVRelState *vacrel, int nworkers)
> InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]);
> }

Since "The leader process alone processes all parallel-safe indexes in
the case where no workers are launched" (no change there), I wonder:
how does the leader *know* that it's the leader (and so can always
process any indexes) inside its call to
lazy_parallel_process_indexes()? Or, does the leader actually process
all indexes inside lazy_serial_process_indexes() in this edge case?
(The edge case where a parallel VACUUM has no workers at all, because
they couldn't be launched by the core parallel infrastructure.)
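
To make the question concrete, here is the kind of check I have in
mind -- purely an illustrative sketch, not code from your patch.
IsParallelWorker() is the existing test from access/parallel.h; the
loop shape, the "indstats" array, and the parallel_safe test are just
stand-ins for whatever lazy_parallel_process_indexes() really does:

for (int idx = 0; idx < vacrel->nindexes; idx++)
{
    LVSharedIndStats *stats = &indstats[idx];

    /*
     * Workers must skip indexes that only the leader may process.  The
     * leader (for which IsParallelWorker() is false) can take any
     * index, which would also cover the edge case where no workers
     * could be launched.
     */
    if (IsParallelWorker() && !stats->parallel_safe)
        continue;

    /* ... vacuum or cleanup this index, filling in stats->istat ... */
}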

One small thing: the new "LVSharedIndStats.parallel_safe" field seems
to be slightly misnamed. Isn't it more like
"LVSharedIndStats.parallel_workers_can_process"? The index might
actually be parallel safe *in principle*, while nevertheless being
skipped over by workers due to a deliberate up-front choice made
earlier, in compute_parallel_vacuum_workers(). The most obvious
example of this is the choice to not use parallelism for a
smaller index (say a partial index whose size is <
min_parallel_index_scan_size).
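
For context, the up-front choice I'm talking about is the size test
in compute_parallel_vacuum_workers(), which looks roughly like this
(a simplified sketch, not the exact code):

/*
 * An index whose access method is perfectly parallel safe can still be
 * left to the leader, simply because the index is small.
 */
if (RelationGetNumberOfBlocks(indrel) < min_parallel_index_scan_size)
    continue;       /* don't hand this index to parallel workers */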

Another small thing, which is closely related to the last one:

> typedef struct LVSharedIndStats
> {
> - bool updated; /* are the stats updated? */
> + LVIndVacStatus status;
> +
> + /*
> + * True if both leader and worker can process the index, otherwise only
> + * leader can process it.
> + */
> + bool parallel_safe;
> +
> + bool istat_updated; /* are the stats updated? */
> IndexBulkDeleteResult istat;
> } LVSharedIndStats;

It would be nice if the new "LVSharedIndStats.parallel_safe" field
(or whatever we end up calling it) had comments pointing out that it
isn't a fixed thing for the
entire VACUUM operation -- it's only fixed for an individual call to
parallel_lazy_vacuum_or_cleanup_all_indexes() (i.e., it's only fixed
for the "ambulkdelete portion" or the "amvacuumcleanup portion" of the
entire VACUUM).

Alternatively, you could just have two booleans, I think. You know,
one for the "ambulkdelete portion", another for the "amvacuumcleanup
portion". As I've said before, it would be nice if we only called
parallel_vacuum_main() once per VACUUM operation (and not once per
"portion"), but that's a harder and more invasive change. But I don't
think you necessarily have to go that far for it to make sense to have
two bools. Having two might allow you to make both of them immutable,
which is useful.
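
Something like this is what I have in mind (the field names are made
up for illustration -- pick whatever you like):

typedef struct LVSharedIndStats
{
    LVIndVacStatus status;

    /* can workers process this index during the ambulkdelete portion? */
    bool        parallel_bulkdel_safe;

    /* can workers process this index during the amvacuumcleanup portion? */
    bool        parallel_cleanup_safe;

    bool        istat_updated;  /* are the stats updated? */
    IndexBulkDeleteResult istat;
} LVSharedIndStats;

Both flags could then be set once, up front, and remain fixed for the
rest of the VACUUM operation.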

> Regarding tests, I’d like to add tests to check if a vacuum with
> multiple index scans (i.e., due to small maintenance_work_mem) works
> fine. But a problem is that we need at least about 200,000 garbage
> tuples to perform index scan twice even with the minimum
> maintenance_work_mem. Which takes time to load tuples.

Maybe that's okay. Do you notice that it takes a lot longer now? I did
try to keep the runtime down when I committed the fixup for the
parallel-VACUUM-related bug.

--
Peter Geoghegan
