Re: Add index scan progress to pg_stat_progress_vacuum

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-12-14 05:09:46
Message-ID: 20F7ED07-4D32-4231-A81C-432EB111F0B2@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> First of all, I don't think we need to declare ParallelVacuumProgress
> in vacuum.c since it's set and used only in vacuumparallel.c. But I
> don't even think it's a good idea to declare it in vacuumparallel.c as
> a static variable. The primary reason is that it adds things we need
> to care about. For example, what if we raise an error during index
> vacuum? The transaction aborts but ParallelVacuumProgress still refers
> to something old. Suppose further that the next parallel vacuum
> doesn't launch any workers, the leader process would still end up
> accessing the old value pointed by ParallelVacuumProgress, which
> causes a SEGV. So we need to reset it anyway at the beginning of the
> parallel vacuum. It's easy to fix at this time but once the parallel
> vacuum code gets more complex, it could forget to care about it.

> IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different
> story. They are set in vacuumparallel.c and are used in vacuum.c for
> vacuum delay. If they weren't global variables, we would need to pass
> them to every function that could eventually call the vacuum delay
> function. So it makes sense to me to have them as global variables.On
> the other hand, for ParallelVacuumProgress, it's a common pattern that
> ambulkdelete(), amvacuumcleanup() or a common index scan routine like
> btvacuumscan() checks the progress. I don't think index AM needs to
> pass the value down to many of its functions. So it makes sense to me
> to pass it via IndexVacuumInfo.

Thanks for the detailed explanation and especially clearing up
my understanding of VacuumSharedCostBalance and VacuumActiveNWorker.

I do now think that passing ParallelVacuumState in IndexVacuumInfo is
a more optimal choice.

Attached version addresses the above and the previous comments.

Thanks

Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch application/octet-stream 22.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2022-12-14 05:20:22 Re: Schema variables - new implementation for Postgres 15 (typo)
Previous Message Pavel Stehule 2022-12-14 04:54:48 Re: Schema variables - new implementation for Postgres 15