Re: Add index scan progress to pg_stat_progress_vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date: 2022-03-25 14:54:25
Message-ID: CAD21AoA9Phn9uPtyq_V+J9ctNhkgChnmm_hFT2sA2qR_KcKqYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 23, 2022 at 6:57 AM Imseih (AWS), Sami <simseih(at)amazon(dot)com> wrote:
>
> > Can the leader pass a callback that checks PVIndStats to ambulkdelete
> > an amvacuumcleanup callbacks? I think that in the passed callback, the
> > leader checks if the number of processed indexes and updates its
> > progress information if the current progress needs to be updated.
>
> Thanks for the suggestion.
>
> I looked at this option a but today and found that passing the callback
> will also require signature changes to the ambulkdelete and
> amvacuumcleanup routines.

I think it would not be a critical problem since it's a new feature.

>
> This will also require us to check after x pages have been
> scanned inside vacuumscan and vacuumcleanup. After x pages
> the callback can then update the leaders progress.
> I am not sure if adding additional complexity to the scan/cleanup path
> is justified for what this patch is attempting to do.
>
> There will also be a lag of the leader updating the progress as it
> must scan x amount of pages before updating. Obviously, the more
> Pages to the scan, the longer the lag will be.

Fair points.

On the other hand, the approach of the current patch requires more
memory for progress tracking, which could fail, e.g., due to running
out of hashtable entries. I think that it would be worse that the
parallel operation failed to start due to not being able to track the
progress than the above concerns you mentioned such as introducing
additional complexity and a possible lag of progress updates. So if we
go with the current approach, I think we need to make sure enough (and
not too many) hash table entries.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-03-25 15:04:21 Re: psql - add SHOW_ALL_RESULTS option
Previous Message Robert Haas 2022-03-25 14:49:22 Re: Corruption during WAL replay