Re: Add index scan progress to pg_stat_progress_vacuum

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "Peter Geoghegan" <pg(at)bowt(dot)ie>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-02-23 18:02:08
Message-ID: 28F7536A-3E62-4A6B-9705-A2377A2D03AE@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_total</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The number of indexes to be processed in the
+ <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase
+ of the vacuum.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_processed</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The number of indexes processed in the
+ <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase.
+ At the start of an index vacuum cycle, this value is set to <literal>0</literal>.
+ </para></entry>
+ </row>

> Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
> turned off?

If the failsafe kicks in midway through a vacuum, the number indexes_total will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will be 0 at the start of the vacuum.

+typedef struct VacWorkerProgressInfo
+{
+ int num_vacuums; /* number of active VACUUMS with parallel workers */
+ int max_vacuums; /* max number of VACUUMS with parallel workers */
+ slock_t mutex;
+ VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

> max_vacuums appears to just be a local copy of MaxBackends. Does this
> information really need to be stored here? Also, is there a strong reason
> for using a spinlock instead of an LWLock?

First, The BTVacInfo code in backend/access/nbtree/nbtutils.c inspired this, so I wanted to follow this pattern. With that said, I do see max_vacuums being redundant here, and I am inclined to replace it with a MaxBackends() call.

Second, There is no strong reason to use spinlock here except I incorrectly assumed it will be better for this case. After reading more about this and reading up src/backend/storage/lmgr/README, an LWLock will be better.

+void
+vacuum_worker_end(int leader_pid)
+{
+ SpinLockAcquire(&vacworkerprogress->mutex);
+ for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+ {
+ VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+ if (vac->leader_pid == leader_pid)
+ {
+ *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+ vacworkerprogress->num_vacuums--;
+ SpinLockRelease(&vacworkerprogress->mutex);
+ break;
+ }
+ }
+ SpinLockRelease(&vacworkerprogress->mutex);
+}

> I see this loop pattern in a couple of places, and it makes me wonder if
> this information would fit more naturally in a hash table.

Followed the pattern in backend/access/nbtree/nbtutils.c for this as well. Using dynahash may make sense here if it simplifies the code. Will look.

+ if (callback)
+ callback(values, 3);

> Why does this need to be set up as a callback function? Could we just call
> the function if cmdtype == PROGRESS_COMMAND_VACUUM? ISTM that is pretty
> much all this is doing.

The intention will be for the caller to set the callback early on in the function using the existing " if (pg_strcasecmp(cmd, "VACUUM") == 0), etc." statement. This way we avoid having to add another if/else block before tuplestore_putvalues is called.

--
Sami Imseih
Amazon Web Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2022-02-23 18:38:08 [PATCH] Enable SSL library detection via PQsslAttribute
Previous Message Nathan Bossart 2022-02-23 17:44:58 Re: Allow file inclusion in pg_hba and pg_ident files