Re: Add index scan progress to pg_stat_progress_vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2023-04-05 08:21:02
Message-ID: CAD21AoD-z3-qOPbv9ijVRKhXhhnfwziwLK=gin81CJUdEp5fqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 5, 2023 at 4:47 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Fri, Feb 24, 2023 at 03:16:10PM +0900, Masahiko Sawada wrote:
> > Thanks! It looks good to me so I've marked it as Ready for Committer.
>
> + case 'P': /* Parallel progress reporting */
> + {
> + /* Call the progress reporting callback */
> + Assert(pcxt->parallel_progress_callback);
> + pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);
> +
> + break;
> + }
>
> The key point of the patch is here. From what I understand based on
> the information of the thread, this is used as a way to make the
> progress reporting done by the leader more responsive so as we'd
> update the index counters each time the leader is poked at with a 'P'
> message by one of its workers, once a worker is done with the parallel
> cleanup of one of the indexes. That's appealing, because this design
> is responsive and cheap, while we'd rely on CFIs to make sure that the
> leader triggers its callback on a timely basis. Unfortunately,
> sticking a concept of "Parallel progress reporting" is rather
> confusing here? This stuff can be used for much more purposes than
> just progress reporting: the leader would execute the callback it has
> registered based on the timing given by one or more of its workers,
> these willing to push an event on the leader. Progress reporting is
> one application of that to force a refresh and make the information of
> the leader accurate. What about things like a chain of callbacks, for
> example? Could the leader want to register more than one callback and
> act on all of them with one single P message?

That seems a valid argument. I was thinking that such an asynchronous
state update mechanism would be a good infrastructure for progress
reporting of parallel operations. It might be worth considering to use
it in more general usage but since the current implementation is
minimal can we extend it in the future when we need it for other use
cases?

>
> Another question I have: could the reporting of each individual worker
> make sense on its own? The cleanup of the indexes depends on the
> order they are processed, their number, size and AM with their cleanup
> strategy, still there may be a point in getting information about how
> much work a single worker is doing rather than just have the
> aggregated information given to the leader?

It would also be useful information for users but I don't think it can
alternate the aggregated information. The aggregated information can
answer the question from the user like "how many indexes to vacuum are
remaining?", which helps estimate the remaining time to complete.

>
> Btw, Is an assertion really helpful here? If
> parallel_progress_callback is not set, we'd just crash one line
> after. It seems to me that it could be cleaner to do nothing if a
> leader gets a poke message from a worker if there are no callbacks
> registered.

Agreed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-04-05 08:57:18 Re: GUC for temporarily disabling event triggers
Previous Message Michael Paquier 2023-04-05 08:10:32 Re: GUC for temporarily disabling event triggers