Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date: 2022-02-15 12:15:26
Message-ID: CAMm1aWZYY=cHSJChN7rJ-SbbJ1phsEYNP4YybiG84a+ta2k87w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Apart from above fields, I am planning to add few more fields to the
> view in the next patch. That is, process ID of the backend process
> which triggered a CHECKPOINT command, checkpoint start location, filed
> to indicate whether it is a checkpoint or restartpoint and elapsed
> time of the checkpoint operation. Please share your thoughts. I would
> be happy to add any other information that contributes to showing the
> progress of checkpoint.

The progress reporting mechanism of postgres uses the
'st_progress_param' array of 'PgBackendStatus' structure to hold the
information related to the progress. There is a function
'pgstat_progress_update_param()' which takes 'index' and 'val' as
arguments and updates the 'val' to corresponding 'index' in the
'st_progress_param' array. This mechanism works fine when all the
progress information is of type integer as the data type of
'st_progress_param' is of type integer. If the progress data is of
different type than integer, then there is no easy way to do so. In my
understanding, define a new structure with additional fields. Add this
as part of the 'PgBackendStatus' structure and support the necessary
function to update and fetch the data from this structure. This
becomes very ugly as it will not match the existing mechanism of
progress reporting. Kindly let me know if there is any better way to
handle this. If there are any changes to the existing mechanism to
make it generic to support basic data types, I would like to discuss
this in the new thread.

On Thu, Feb 10, 2022 at 12:22 PM Nitin Jadhav
<nitinjadhavpostgres(at)gmail(dot)com> wrote:
>
> > > We need at least a trace of the number of buffers to sync (num_to_scan) before the checkpoint start, instead of just emitting the stats at the end.
> > >
> > > Bharat, it would be good to show the buffers synced counter and the total buffers to sync, checkpointer pid, substep it is running, whether it is on target for completion, checkpoint_Reason
> > > (manual/times/forced). BufferSync has several variables tracking the sync progress locally, and we may need some refactoring here.
> >
> > I agree to provide above mentioned information as part of showing the
> > progress of current checkpoint operation. I am currently looking into
> > the code to know if any other information can be added.
>
> Here is the initial patch to show the progress of checkpoint through
> pg_stat_progress_checkpoint view. Please find the attachment.
>
> The information added to this view are pid - process ID of a
> CHECKPOINTER process, kind - kind of checkpoint indicates the reason
> for checkpoint (values can be wal, time or force), phase - indicates
> the current phase of checkpoint operation, total_buffer_writes - total
> number of buffers to be written, buffers_processed - number of buffers
> processed, buffers_written - number of buffers written,
> total_file_syncs - total number of files to be synced, files_synced -
> number of files synced.
>
> There are many operations happen as part of checkpoint. For each of
> the operation I am updating the phase field of
> pg_stat_progress_checkpoint view. The values supported for this field
> are initializing, checkpointing replication slots, checkpointing
> snapshots, checkpointing logical rewrite mappings, checkpointing CLOG
> pages, checkpointing CommitTs pages, checkpointing SUBTRANS pages,
> checkpointing MULTIXACT pages, checkpointing SLRU pages, checkpointing
> buffers, performing sync requests, performing two phase checkpoint,
> recycling old XLOG files and Finalizing. In case of checkpointing
> buffers phase, the fields total_buffer_writes, buffers_processed and
> buffers_written shows the detailed progress of writing buffers. In
> case of performing sync requests phase, the fields total_file_syncs
> and files_synced shows the detailed progress of syncing files. In
> other phases, only the phase field is getting updated and it is
> difficult to show the progress because we do not get the total number
> of files count without traversing the directory. It is not worth to
> calculate that as it affects the performance of the checkpoint. I also
> gave a thought to just mention the number of files processed, but this
> wont give a meaningful progress information (It can be treated as
> statistics). Hence just updating the phase field in those scenarios.
>
> Apart from above fields, I am planning to add few more fields to the
> view in the next patch. That is, process ID of the backend process
> which triggered a CHECKPOINT command, checkpoint start location, filed
> to indicate whether it is a checkpoint or restartpoint and elapsed
> time of the checkpoint operation. Please share your thoughts. I would
> be happy to add any other information that contributes to showing the
> progress of checkpoint.
>
> As per the discussion in this thread, there should be some mechanism
> to show the progress of checkpoint during shutdown and end-of-recovery
> cases as we cannot access pg_stat_progress_checkpoint in those cases.
> I am working on this to use log_startup_progress_interval mechanism to
> log the progress in the server logs.
>
> Kindly review the patch and share your thoughts.
>
>
> On Fri, Jan 28, 2022 at 12:24 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Fri, Jan 21, 2022 at 11:07 AM Nitin Jadhav
> > <nitinjadhavpostgres(at)gmail(dot)com> wrote:
> > >
> > > > I think the right choice to solve the *general* problem is the
> > > > mentioned pg_stat_progress_checkpoints.
> > > >
> > > > We may want to *additionally* have the ability to log the progress
> > > > specifically for the special cases when we're not able to use that
> > > > view. And in those case, we can perhaps just use the existing
> > > > log_startup_progress_interval parameter for this as well -- at least
> > > > for the startup checkpoint.
> > >
> > > +1
> > >
> > > > We need at least a trace of the number of buffers to sync (num_to_scan) before the checkpoint start, instead of just emitting the stats at the end.
> > > >
> > > > Bharat, it would be good to show the buffers synced counter and the total buffers to sync, checkpointer pid, substep it is running, whether it is on target for completion, checkpoint_Reason
> > > > (manual/times/forced). BufferSync has several variables tracking the sync progress locally, and we may need some refactoring here.
> > >
> > > I agree to provide above mentioned information as part of showing the
> > > progress of current checkpoint operation. I am currently looking into
> > > the code to know if any other information can be added.
> >
> > As suggested in the other thread by Julien, I'm changing the subject
> > of this thread to reflect the discussion.
> >
> > Regards,
> > Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2022-02-15 12:17:34 [PATCH] Fix out-of-bouds access (src/common/wchar.c)
Previous Message Böszörményi Zoltán 2022-02-15 12:12:23 Re: [PATCH] Add UPDATE WHERE OFFSET IN clause