Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date: 2022-02-18 14:37:05
Message-ID: CAMm1aWadyUhb89DEfgrs2j5i2u--O06x6oTp0EEhgGHtnw-SXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > Okay. I feel this can be added as additional field but it will not
> > replace backend_pid field as this represents the pid of the backend
> > which triggered the current checkpoint.
>
> I don't think that's true. Requesting a checkpoint means telling the
> checkpointer that it should wake up and start a checkpoint (or restore point)
> if it's not already doing so, so the pid will always be the checkpointer pid.
> The only exception is a standalone backend, but in that case you won't be able
> to query that view anyway.

Yes. I agree that the checkpoint will always be performed by the
checkpointer process. So the pid in the pg_stat_progress_checkpoint
view will always correspond to the checkpointer pid only. Checkpoints
get triggered in many scenarios. One of the cases is the CHECKPOINT
command issued explicitly by the backend. In this scenario I would
like to know the backend pid which triggered the checkpoint. Hence I
would like to add a backend_pid field. So the
pg_stat_progress_checkpoint view contains pid fields as well as
backend_pid fields. The backend_pid contains a valid value only during
the CHECKPOINT command issued by the backend explicitly, otherwise the
value will be 0. We may have to add an additional field to
'CheckpointerShmemStruct' to hold the backend pid. The backend
requesting the checkpoint will update its pid to this structure.
Kindly let me know if you still feel the backend_pid field is not
necessary.

> And also while looking at the patch I see there's the same problem that I
> mentioned in the previous thread, which is that the effective flags can be
> updated once the checkpoint started, and as-is the view won't reflect that. It
> also means that you can't simply display one of wal, time or force but a
> possible combination of the flags (including the one not handled in v1).

If I understand the above comment properly, it has 2 points. First is
to display the combination of flags rather than just displaying wal,
time or force - The idea behind this is to just let the user know the
reason for checkpointing. That is, the checkpoint is started because
max_wal_size is reached or checkpoint_timeout expired or explicitly
issued CHECKPOINT command. The other flags like CHECKPOINT_IMMEDIATE,
CHECKPOINT_WAIT or CHECKPOINT_FLUSH_ALL indicate how the checkpoint
has to be performed. Hence I have not included those in the view. If
it is really required, I would like to modify the code to include
other flags and display the combination. Second point is to reflect
the updated flags in the view. AFAIK, there is a possibility that the
flags get updated during the on-going checkpoint but the reason for
checkpoint (wal, time or force) will remain same for the current
checkpoint. There might be a change in how checkpoint has to be
performed if CHECKPOINT_IMMEDIATE flag is set. If we go with
displaying the combination of flags in the view, then probably we may
have to reflect this in the view.

> > Probably a new field named 'processes_wiating' or 'events_waiting' can be
> > added for this purpose.
>
> Maybe num_process_waiting?

I feel 'processes_wiating' aligns more with the naming conventions of
the fields of the existing progres views.

> > Probably writing of buffers or syncing files may complete before
> > pg_is_in_recovery() returns false. But there are some cleanup
> > operations happen as part of the checkpoint. During this scenario, we
> > may get false value for pg_is_in_recovery(). Please refer following
> > piece of code which is present in CreateRestartpoint().
> >
> > if (!RecoveryInProgress())
> > replayTLI = XLogCtl->InsertTimeLineID;
>
> Then maybe we could store the timeline rather then then kind of checkpoint?
> You should still be able to compute the information while giving a bit more
> information for the same memory usage.

Can you please describe more about how checkpoint/restartpoint can be
confirmed using the timeline id.

Thanks & Regards,
Nitin Jadhav

On Fri, Feb 18, 2022 at 1:13 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Fri, Feb 18, 2022 at 12:20:26PM +0530, Nitin Jadhav wrote:
> > >
> > > If there's a checkpoint timed triggered and then someone calls
> > > pg_start_backup() which then wait for the end of the current checkpoint
> > > (possibly after changing the flags), I think the view should reflect that in
> > > some way. Maybe storing an array of (pid, flags) is too much, but at least a
> > > counter with the number of processes actively waiting for the end of the
> > > checkpoint.
> >
> > Okay. I feel this can be added as additional field but it will not
> > replace backend_pid field as this represents the pid of the backend
> > which triggered the current checkpoint.
>
> I don't think that's true. Requesting a checkpoint means telling the
> checkpointer that it should wake up and start a checkpoint (or restore point)
> if it's not already doing so, so the pid will always be the checkpointer pid.
> The only exception is a standalone backend, but in that case you won't be able
> to query that view anyway.
>
> And also while looking at the patch I see there's the same problem that I
> mentioned in the previous thread, which is that the effective flags can be
> updated once the checkpoint started, and as-is the view won't reflect that. It
> also means that you can't simply display one of wal, time or force but a
> possible combination of the flags (including the one not handled in v1).
>
> > Probably a new field named 'processes_wiating' or 'events_waiting' can be
> > added for this purpose.
>
> Maybe num_process_waiting?
>
> > > > > > 'checkpoint or restartpoint?'
> > > > >
> > > > > Do you actually need to store that? Can't it be inferred from
> > > > > pg_is_in_recovery()?
> > > >
> > > > AFAIK we cannot use pg_is_in_recovery() to predict whether it is a
> > > > checkpoint or restartpoint because if the system exits from recovery
> > > > mode during restartpoint then any query to pg_stat_progress_checkpoint
> > > > view will return it as a checkpoint which is ideally not correct. Please
> > > > correct me if I am wrong.
> > >
> > > Recovery ends with an end-of-recovery checkpoint that has to finish before the
> > > promotion can happen, so I don't think that a restart can still be in progress
> > > if pg_is_in_recovery() returns false.
> >
> > Probably writing of buffers or syncing files may complete before
> > pg_is_in_recovery() returns false. But there are some cleanup
> > operations happen as part of the checkpoint. During this scenario, we
> > may get false value for pg_is_in_recovery(). Please refer following
> > piece of code which is present in CreateRestartpoint().
> >
> > if (!RecoveryInProgress())
> > replayTLI = XLogCtl->InsertTimeLineID;
>
> Then maybe we could store the timeline rather then then kind of checkpoint?
> You should still be able to compute the information while giving a bit more
> information for the same memory usage.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo NAGATA 2022-02-18 14:45:49 Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Previous Message Robert Haas 2022-02-18 14:24:41 Re: adding 'zstd' as a compression algorithm