Re: Add index scan progress to pg_stat_progress_vacuum

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-10-14 20:05:41
Message-ID: C0DFF2A7-11AA-4613-8ECC-C4AECFF67911@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the feedback!

> While it seems to be a good idea to have the atomic counter for the
> number of indexes completed, I think we should not use the global
> variable referencing the counter as follow:

> +static pg_atomic_uint32 *index_vacuum_completed = NULL;
> :
> +void
> +parallel_vacuum_progress_report(void)
> +{
> + if (IsParallelWorker() || !report_parallel_vacuum_progress)
> + return;
> +
> + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
> + pg_atomic_read_u32(index_vacuum_completed));
> +}

> I think we can pass ParallelVacuumState (or PVIndStats) to the
> reporting function so that it can check the counter and report the
> progress.

Yes, that makes sense.

> ---
> I am not too sure that the idea of using the vacuum delay points is the best
> plan. I think we should try to avoid piggybacking on such general
> infrastructure if we can, and instead look for a way to tie this to
> something that is specific to parallel vacuum.
> ---

Adding the call to vacuum_delay_point made sense since it's
called in all major vacuum scans. While it is also called
by analyze, it will only do anything if the caller sets a flag
to report_parallel_vacuum_progress.

> Instead, I think we can add a boolean and the pointer for
> ParallelVacuumState to IndexVacuumInfo. If the boolean is true, an
> index AM can call the reporting function with the pointer to
> ParallelVacuumState while scanning index blocks, for example, for
> every 1GB. The boolean can be true only for the leader process.

Will doing this every 1GB be necessary? Considering the function
will not do much more than update progress from the value
stored in index_vacuum_completed?

> Agreed, but with the following change, the leader process waits in a
> busy loop, which should not be acceptable:

Good point.

> Also, I think it's better to check whether idx_completed_progress
equals to the number indexes instead.

Good point

> 5. Went back to the idea of adding a new view called pg_stat_progress_vacuum_index
> which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h

> Probably, we can devide the patch into two patches. One for adding the

Makes sense.

Thanks

Sami Imseih
Amazon Web Services (AWS)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2022-10-14 21:09:46 Re: New "single-call SRF" APIs are very confusingly named
Previous Message Bruce Momjian 2022-10-14 19:51:16 Re: New docs chapter on Transaction Management and related changes