Re: Add index scan progress to pg_stat_progress_vacuum

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-02-21 19:09:10
Message-ID: 20220221190910.GA3740885@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 21, 2022 at 07:03:39PM +0000, Imseih (AWS), Sami wrote:
> Sending again with patch files renamed to ensure correct apply order.

I haven't had a chance to test this too much, but I did look through the
patch set and have a couple of small comments.

+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_total</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The number of indexes to be processed in the
+ <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase
+ of the vacuum.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_processed</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The number of indexes processed in the
+ <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase.
+ At the start of an index vacuum cycle, this value is set to <literal>0</literal>.
+ </para></entry>
+ </row>

Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
turned off?

+typedef struct VacWorkerProgressInfo
+{
+ int num_vacuums; /* number of active VACUUMS with parallel workers */
+ int max_vacuums; /* max number of VACUUMS with parallel workers */
+ slock_t mutex;
+ VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

max_vacuums appears to just be a local copy of MaxBackends. Does this
information really need to be stored here? Also, is there a strong reason
for using a spinlock instead of an LWLock?

+void
+vacuum_worker_end(int leader_pid)
+{
+ SpinLockAcquire(&vacworkerprogress->mutex);
+ for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+ {
+ VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+ if (vac->leader_pid == leader_pid)
+ {
+ *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+ vacworkerprogress->num_vacuums--;
+ SpinLockRelease(&vacworkerprogress->mutex);
+ break;
+ }
+ }
+ SpinLockRelease(&vacworkerprogress->mutex);
+}

I see this loop pattern in a couple of places, and it makes me wonder if
this information would fit more naturally in a hash table.

+ if (callback)
+ callback(values, 3);

Why does this need to be set up as a callback function? Could we just call
the function if cmdtype == PROGRESS_COMMAND_VACUUM? ISTM that is pretty
much all this is doing.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-02-21 19:12:04 Re: Emit a warning if the extension's GUC is set incorrectly
Previous Message Imseih (AWS), Sami 2022-02-21 19:03:39 Re: Add index scan progress to pg_stat_progress_vacuum