Re: Add index scan progress to pg_stat_progress_vacuum

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2023-04-05 07:47:25
Message-ID: ZC0njb/GuPrHQnS3@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 24, 2023 at 03:16:10PM +0900, Masahiko Sawada wrote:
> Thanks! It looks good to me so I've marked it as Ready for Committer.

+ case 'P': /* Parallel progress reporting */
+ {
+ /* Call the progress reporting callback */
+ Assert(pcxt->parallel_progress_callback);
+ pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);
+
+ break;
+ }

The key point of the patch is here. From what I understand based on
the information of the thread, this is used as a way to make the
progress reporting done by the leader more responsive so as we'd
update the index counters each time the leader is poked at with a 'P'
message by one of its workers, once a worker is done with the parallel
cleanup of one of the indexes. That's appealing, because this design
is responsive and cheap, while we'd rely on CFIs to make sure that the
leader triggers its callback on a timely basis. Unfortunately,
sticking a concept of "Parallel progress reporting" is rather
confusing here? This stuff can be used for much more purposes than
just progress reporting: the leader would execute the callback it has
registered based on the timing given by one or more of its workers,
these willing to push an event on the leader. Progress reporting is
one application of that to force a refresh and make the information of
the leader accurate. What about things like a chain of callbacks, for
example? Could the leader want to register more than one callback and
act on all of them with one single P message?

Another question I have: could the reporting of each individual worker
make sense on its own? The cleanup of the indexes depends on the
order they are processed, their number, size and AM with their cleanup
strategy, still there may be a point in getting information about how
much work a single worker is doing rather than just have the
aggregated information given to the leader?

Btw, Is an assertion really helpful here? If
parallel_progress_callback is not set, we'd just crash one line
after. It seems to me that it could be cleaner to do nothing if a
leader gets a poke message from a worker if there are no callbacks
registered.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-04-05 07:49:31 Re: Commitfest 2023-03 starting tomorrow!
Previous Message Pavel Luzanov 2023-04-05 07:42:19 Re: psql: Add role's membership options to the \du+ command