Re: Add index scan progress to pg_stat_progress_vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-06-20 06:35:00
Message-ID: CAD21AoDnZrP-K6mC9oJ4cpoYNXeLOu09EET2zy564ke8u3RNrQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 6, 2022 at 11:42 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, May 26, 2022 at 11:43 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Another idea I came up with is that we can wait for all index vacuums
> > to finish while checking and updating the progress information, and
> > then call WaitForParallelWorkersToFinish after confirming that the
> > status of every index has become COMPLETED. That way, we don’t need to
> > change the parallel query infrastructure. What do you think?
>
> +1 from me. It doesn't seem to me that we should need to add something
> like parallel_vacuum_progress_callback in order to solve this problem,
> because the parallel index vacuum code could just do the waiting
> itself, as you propose here.
>
> The question Sami asks in his reply is a good one, though -- who is
> to say that the leader only needs to update progress at the end, once
> it's finished the index it's handling locally? There will need to be a
> callback system of some kind to allow the leader to update progress as
> other workers finish, even if the leader is still working. I am not
> too sure that the idea of using the vacuum delay points is the best
> plan. I think we should try to avoid piggybacking on such general
> infrastructure if we can, and instead look for a way to tie this to
> something that is specific to parallel vacuum. However, I haven't
> studied the problem so I'm not sure whether there's a reasonable way
> to do that.

One idea would be to add a flag, say report_parallel_vacuum_progress,
to the IndexVacuumInfo struct and expect the index AM to check and
update the parallel index vacuum progress periodically, say after every
1GB worth of blocks processed. The flag would be true only when the
leader process is vacuuming an index.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
