Re: Add index scan progress to pg_stat_progress_vacuum

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-11-28 23:57:14
Message-ID: 3FC1BD50-1640-45DC-A354-B6183B0966F9@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I think that indexes_total should be 0 also when INDEX_CLEANUP is off.

Patch updated for handling of INDEX_CLEANUP = off, with an update to
the documentation as well.

> I think we don't need to reset it at the end of index vacuuming. There
> is a small window before switching to the next phase. If we reset this
> value while showing "index vacuuming" phase, the user might get
> confused. Instead, we can reset it at the beginning of the index
> vacuuming.

No, I think the way it's currently done is correct. We want to reset the number
of indexes completed before we increase the num_index_scans ( index vacuum cycle ).
This ensures that when we enter a new index cycle, the number of indexes completed
are already reset. The 2 fields that matter here is how many indexes are vacuumed in the
currently cycle and this is called out in the documentation as such.

> We should add CHECK_FOR_INTERRUPTS() at the beginning of the loop to
> make the wait interruptible.

Done.

> I think it would be better to update the counter only when the value
> has been increased.

Done. Did so by checking the progress value from the beentry directly
in the function. We can do a more generalized

> I think we should set a timeout, say 1 sec, to WaitLatch so that it
> can periodically check the progress.

Done.

> Probably it's better to have a new wait event for this wait in order
> to distinguish wait for workers to complete index vacuum from the wait
> for workers to exit.

I was trying to avoid introducing a new wait event, but thinking about it,
I agree with your suggestion.

I created a new ParallelVacuumFinish wait event and documentation
for the wait event.

> I think this doesn't work, since ivinfo.idx_completed is in the
> backend-local memory. Instead, I think we can have a function in
> vacuumparallel.c that updates the progress. Then we can have index AM
> call this function.

Yeah, you're absolutely correct.

To make this work correctly, I did something similar to VacuumActiveNWorkers
and introduced an extern variable called ParallelVacuumProgress.
This variable points to pvs->shared->idx_completed_progress.

The index AMs then call parallel_vacuum_update_progress which
Is responsible for updating the progress with the current value
of ParallelVacuumProgress.

ParallelVacuumProgress is also set to NULL at the end of every index cycle.

> ivinfo.report_parallel_progress = !IsParallelWorker();

Done

Regards,

Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
v16-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch application/octet-stream 25.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-11-29 00:22:22 Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas
Previous Message Michael Paquier 2022-11-28 23:54:06 Re: Fix comment in SnapBuildFindSnapshot