Re: Add index scan progress to pg_stat_progress_vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-05-26 15:43:12
Message-ID: CAD21AoDqVNhLyegNGGezXVhhkWnGHN84TJzeuRbqU077U_KRpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 6, 2022 at 4:26 AM Imseih (AWS), Sami <simseih(at)amazon(dot)com> wrote:
>
> Thank you for the feedback!
>
> > I think we can pass the progress update function to
> > WaitForParallelWorkersToFinish(), which seems simpler. And we can call
>
> Directly passing the callback to WaitForParallelWorkersToFinish
> will require us to modify the function signature.
>
> To me, it seemed simpler and touches less code to have
> the caller set the callback in the ParallelContext.

Okay, but if we do that, I think we should add comments about when
it's used. The callback is used only when
WaitForParallelWorkersToFinish(), but not when
WaitForParallelWorkersToExit().

Another idea I came up with is that we can wait for all index vacuums
to finish while checking and updating the progress information, and
then calls WaitForParallelWorkersToFinish after confirming all index
status became COMPLETED. That way, we don’t need to change the
parallel query infrastructure. What do you think?

>
> > the function after updating the index status to
> > PARALLEL_INDVAC_STATUS_COMPLETED.
>
> I also like this better. Will make the change.
>
> > BTW, currently we don't need a lock for touching index status since
> > each worker touches different indexes. But after this patch, the
> > leader will touch all index status, do we need a lock for that?
>
> I do not think locking is needed here. The leader and workers
> will continue to touch different indexes to update the status.
>
> However, if the process is a leader, it will call the function
> which will go through indstats and count how many
> Indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED.
> This value is then reported to the leaders backend only.

I was concerned that the leader process could report the wrong
progress if updating and checking index status happen concurrently.
But I think it should be fine since we can read PVIndVacStatus
atomically.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-05-26 16:20:37 Re: Add --{no-,}bypassrls flags to createuser
Previous Message Justin Pryzby 2022-05-26 15:43:11 Re: Remove support for Visual Studio 2013