From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add index scan progress to pg_stat_progress_vacuum |
Date: | 2022-03-09 03:15:26 |
Message-ID: | CAD21AoDf4pf89i7fgzrHDdJueNHLm4x92xHjcDeYaV+pdhem-Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 9, 2022 at 11:35 AM Imseih (AWS), Sami <simseih(at)amazon(dot)com> wrote:
>
> > Indeed.
>
> > It might have already been discussed but other than using a new shmem
> > hash for parallel vacuum, I wonder if we can allow workers to change
> > the leader’s progress information. It would break the assumption that
> > the backend status entry is modified by its own backend, though. But
> > it might help for progress updates of other parallel operations too.
> > This essentially does the same thing as what the current patch does
> > but it doesn't require a new shmem hash.
>
> I experimented with this idea, but it did not work. The idea would have been to create a pgstat_progress_update function that takes the leader pid, however infrastructure does not exist to allow one backend to manipulate another backends backend status array.
> pgstat_fetch_stat_beentry returns a local copy only.
I think if it's a better approach we can do that including adding a
new infrastructure for it.
>
> > Another idea I come up with is that the parallel vacuum leader checks
> > PVIndStats.status and updates how many indexes are processed to its
> > progress information. The leader can check it and update the progress
> > information before and after index vacuuming. And possibly we can add
> > a callback to the main loop of index AM's bulkdelete and vacuumcleanup
> > so that the leader can periodically make it up-to-date.
>
> > Regards,
>
> The PVIndStats idea is also one I experimented with but it did not work. The reason being the backend checking the progress needs to do a shm_toc_lookup to access the data, but they are not prepared to do so.
What I imagined is that the leader checks how many PVIndStats.status
is PARALLEL_INDVAC_STATUS_COMPLETED and updates the result to its
progress information as indexes_processed. That way, the backend
checking the progress can see it.
>
> I have not considered the callback in the index AM's bulkdelete and vacuumcleanup, but I can imagine this is not possible since a leader could be busy vacuuming rather than updating counters, but I may be misunderstanding the suggestion.
Checking PVIndStats.status values is cheap. Probably the leader can
check it every 1GB index block, for example.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2022-03-09 03:17:06 | Re: Naming of the different stats systems / "stats collector" |
Previous Message | Imseih (AWS), Sami | 2022-03-09 02:57:02 | Re: Add index scan progress to pg_stat_progress_vacuum |