Re: Add index scan progress to pg_stat_progress_vacuum

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2021-12-29 16:44:31
Message-ID: 04D9F80C-1A7C-4C23-9156-6799D974DC72@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached is the latest revision of the patch.

In "pg_stat_progress_vacuum", introduce 2 columns:

* total_index_vacuum : This is the # of indexes that will be vacuumed. Keep in mind that if failsafe mode kicks in mid-flight to the vacuum, Postgres may choose to forgo index scans. This value will be adjusted accordingly.
* max_index_vacuum_cycle_time : The total elapsed time for a index vacuum cycle is calculated and this value will be updated to reflect the longest vacuum cycle. Until the first cycle completes, this value will be 0. The purpose of this column is to give the user an idea of how long an index vacuum cycle takes to complete.

postgres=# \d pg_stat_progress_vacuum
View "pg_catalog.pg_stat_progress_vacuum"
Column | Type | Collation | Nullable | Default
-----------------------------+---------+-----------+----------+---------
pid | integer | | |
datid | oid | | |
datname | name | | |
relid | oid | | |
phase | text | | |
heap_blks_total | bigint | | |
heap_blks_scanned | bigint | | |
heap_blks_vacuumed | bigint | | |
index_vacuum_count | bigint | | |
max_dead_tuples | bigint | | |
num_dead_tuples | bigint | | |
total_index_vacuum | bigint | | |
max_index_vacuum_cycle_time | bigint | | |

Introduce a new view called "pg_stat_progress_vacuum_index". This view will track the progress of a worker ( or leader PID ) while it's vacuuming an index. It will expose some key columns:

* pid: The PID of the worker process

* leader_pid: The PID of the leader process. This is the column that can be joined with "pg_stat_progress_vacuum". leader_pid and pid can have the same value as a leader can also perform an index vacuum.

* indrelid: The relid of the index currently being vacuumed

* vacuum_cycle_ordinal_position: The processing position of the index being vacuumed. This can be useful to determine how many indexes out of the total indexes ( pg_stat_progress_vacuum.total_index_vacuum ) have been vacuumed

* index_tuples_vacuumed: This is the number of index tuples vacuumed for the index overall. This is useful to show that the vacuum is actually doing work, as the # of tuples keeps increasing.

postgres=# \d pg_stat_progress_vacuum_index
View "pg_catalog.pg_stat_progress_vacuum_index"
Column | Type | Collation | Nullable | Default
-------------------------------+---------+-----------+----------+---------
pid | integer | | |
leader_pid | bigint | | |
indrelid | bigint | | |
vacuum_cycle_ordinal_position | bigint | | |
index_tuples_vacuumed | bigint | | |

On 12/27/21, 6:12 PM, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> wrote:

I do agree that tracking progress by # of blocks scanned is not deterministic for all index types.

Based on this feedback, I went back to the drawing board on this.

Something like below may make more sense.

In pg_stat_progress_vacuum, introduce 2 new columns:

1. total_index_vacuum - total # of indexes to vacuum
2. max_cycle_time - the time in seconds of the longest index cycle.

Introduce another view called pg_stat_progress_vacuum_index_cycle:

postgres=# \d pg_stat_progress_vacuum_index_cycle
View "public.pg_stat_progress_vacuum_worker"
Column | Type | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
pid | integer | | | <<<-- the PID of the vacuum worker ( or leader if it's doing index vacuuming )
leader_pid | bigint | | | <<<-- the leader PID to allow this view to be joined back to pg_stat_progress_vacuum
indrelid | bigint | | | <<<- the index relid of the index being vacuumed
ordinal_position | bigint | | | <<<- the processing position, which will give an idea of the processing position of the index being vacuumed.
dead_tuples_removed | bigint | | <<<- the number of dead rows removed in the current cycle for the index.

Having this information, one can

1. Determine which index is being vacuumed. For monitoring tools, this can help identify the index that accounts for most of the index vacuuming time.
2. Having the processing order of the current index will allow the user to determine how many of the total indexes has been completed in the current cycle.
3. dead_tuples_removed will show progress on the index vacuum in the current cycle.
4. the max_cycle_time will give an idea on how long the longest index cycle took for the current vacuum operation.

On 12/23/21, 2:46 AM, "Masahiko Sawada" <sawada(dot)mshk(at)gmail(dot)com> wrote:

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.

On Tue, Dec 21, 2021 at 3:37 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Wed, Dec 15, 2021 at 2:10 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
> > nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed? IMO it
> > is more analogous to heap_blks_vacuumed.
>
> +1.
>
> > This will tell us which indexes are currently being vacuumed and the
> > current progress of those operations, but it doesn't tell us which
> > indexes have already been vacuumed or which ones are pending vacuum.
>
> VACUUM will process a table's indexes in pg_class OID order (outside
> of parallel VACUUM, I suppose). See comments about sort order above
> RelationGetIndexList().

Right.

>
> Anyway, it might be useful to add ordinal numbers to each index, that
> line up with this processing/OID order. It would also be reasonable to
> display the same number in log_autovacuum* (and VACUUM VERBOSE)
> per-index output, to reinforce the idea. Note that we don't
> necessarily display a distinct line for each distinct index in this
> log output, which is why including the ordinal number there makes
> sense.

An alternative idea would be to show the number of indexes on the
table and the number of indexes that have been processed in the
leader's entry of pg_stat_progress_vacuum. Even in parallel vacuum
cases, since we have index vacuum status for each index it would not
be hard for the leader process to count how many indexes have been
processed.

Regarding the details of the progress of index vacuum, I'm not sure
this progress information can fit for pg_stat_progress_vacuum. As
Peter already mentioned, the behavior quite varies depending on index
AM.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
0001-Add-index-scan-progress-to-pg_stat_progress_vacuum.patch application/octet-stream 16.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sadhuprasad Patro 2021-12-29 17:08:11 Per-table storage parameters for TableAM/IndexAM extensions
Previous Message Justin Pryzby 2021-12-29 16:30:15 Re: PublicationActions - use bit flags.