Re: Add index scan progress to pg_stat_progress_vacuum

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2023-04-06 15:14:20
Message-ID: 3B4246CB-EECB-4A83-8FA1-8A4BB6970F2A@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> As one thing,
> for example, it introduces a dependency to parallel.h to do progress
> reporting without touching at backend_progress.h.

Containing the logic in backend_progress.h is a reasonable point
from a maintenance standpoint.

We can create a new function in backend_progress.h called
pgstat_progress_update_leader which is called from
vacuumparallel.c.

pgstat_progress_update_leader can then call pq_putmessage('P', NULL, 0)

> Is a callback
> approach combined with a counter in shared memory the best thing there
> could be?

It seems to be the best way.

The shared memory, ParallelVacuumState, is already tracking the
counters for the Parallel Vacuum.

Also, the callback in ParallelContext is the only way I can see
to let the 'P' message know what to do for updating progress
to the leader.

> Could it be worth thinking about a different design where
> the value incremented and the parameters of
> pgstat_progress_update_param() are passed through the 'P' message
> instead?

I am not sure how this is different than the approach suggested.
In the current design, the 'P' message is used to pass the
ParallelvacuumState to parallel_vacuum_update_progress which then
calls pgstat_progress_update_param.

> It strikes me that gathering data in the leader from a poke
> of the workers is something that could be useful in so much more areas
> than just the parallel index operations done in a vacuum because we do
> more and more things in parallel these days, so the API interface
> ought to have some attention.

We may need an interface that does more than progress
reporting, but I am not sure what those use cases are at
this point, besides progress reporting.

> There is also an argument where we could
> have each worker report their progress independently of the leader?

In this case, we don't need ParallelContext at all or to go through the
'P' message.

--
Regards,

Sami Imseih
Amazon Web Services (AWS)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-04-06 15:34:30 Re: zstd compression for pg_dump
Previous Message Karl O. Pinc 2023-04-06 15:05:52 Re: doc: add missing "id" attributes to extension packaging page