Re: Add index scan progress to pg_stat_progress_vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2023-02-16 08:27:43
Message-ID: CAD21AoCPD_iM7mr3h98B1q=ZZy6UW4A32tbUhQT0+7D6r_FR7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 8, 2023 at 11:03 AM Imseih (AWS), Sami <simseih(at)amazon(dot)com> wrote:
>
> Hi,
>
> Thanks for your reply!
>
> I addressed the latest comments in v23.
>
> 1/ cleaned up the asserts as discussed.
> 2/ used pq_putmessage to send the message on index scan completion.

Thank you for updating the patch! Here are some comments for v23 patch:

+ <row>
+ <entry><literal>ParallelVacuumFinish</literal></entry>
+ <entry>Waiting for parallel vacuum workers to finish index
vacuum.</entry>
+ </row>

This change is out-of-date.

---
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_total</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of indexes that will be vacuumed or cleaned up. This
value will be
+ <literal>0</literal> if the phase is not <literal>vacuuming
indexes</literal>
+ or <literal>cleaning up indexes</literal>,
<literal>INDEX_CLEANUP</literal>
+ is set to <literal>OFF</literal>, index vacuum is skipped due to very
+ few dead tuples in the table, or vacuum failsafe is triggered.
+ See <xref linkend="guc-vacuum-failsafe-age"/>
+ for more on vacuum failsafe.
+ </para></entry>
+ </row>

This explanation looks redundant: setting INDEX_CLEANUP to OFF
essentially means the vacuum doesn't enter the vacuuming indexes
phase. The same is true for the case of skipping index vacuum and
failsafe mode. How about the following?

Total number of indexes that will be vacuumed or cleaned up. This
number is reported as of the beginning of the vacuuming indexes phase
or the cleaning up indexes phase.

---
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_completed</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of indexes vacuumed in the current vacuum cycle when the
+ phase is <literal>vacuuming indexes</literal>, or the number
+ of indexes cleaned up during the <literal>cleaning up indexes</literal>
+ phase.
+ </para></entry>
+ </row>

How about simplfyy the description to something like:

Number of indexes processed. This counter only advances when the phase
is vacuuming indexes or cleaning up indexes.

Also, index_processed sounds accurate to me. What do you think?

---
+ pcxt->parallel_progress_callback = NULL;
+ pcxt->parallel_progress_callback_arg = NULL;

I think these settings are not necessary since the pcxt is palloc0'ed.

---
+void
+parallel_vacuum_update_progress(void *arg)
+{
+ ParallelVacuumState *pvs = (ParallelVacuumState *)arg;
+
+ Assert(!IsParallelWorker());
+ Assert(pvs->pcxt->parallel_progress_callback_arg);
+
+ if (pvs)
+ pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+
pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1));
+}

Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me.
If 'arg' is NULL, a SEGV happens.

I think it's better to update pvs->shared->nindexes_completed by both
leader and worker processes who processed the index.

---
+/* progress callback definition */
+typedef void (*ParallelProgressCallback) (void
*parallel_progress_callback_state);
+
typedef void (*parallel_worker_main_type) (dsm_segment *seg, shm_toc *toc);

I think it's better to make the function type consistent with the
existing parallel_worker_main_type. How about
parallel_progress_callback_type?

I've attached a patch that incorporates the above comments and has
some suggestions of updating comments etc.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
fix_v23_masahiko.patch application/octet-stream 14.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-02-16 08:28:40 Re: Support logical replication of global object commands
Previous Message shiy.fnst@fujitsu.com 2023-02-16 08:26:07 RE: run pgindent on a regular basis / scripted manner