Re: please update ps display for recovery checkpoint

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: k(dot)jamison(at)fujitsu(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: please update ps display for recovery checkpoint
Date: 2020-09-19 16:00:31
Message-ID: 20200919160031.GC30557@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 10, 2020 at 01:37:10PM +0900, Michael Paquier wrote:
> On Wed, Sep 09, 2020 at 09:00:50PM -0500, Justin Pryzby wrote:
> > What would you want the checkpointer's ps to say ?
> >
> > Normally it just says:
> > postgres 3468 3151 0 Aug27 ? 00:20:57 postgres: checkpointer
>
> Note that CreateCheckPoint() can also be called from the startup
> process if the bgwriter has not been launched once recovery finishes.
>
> > Or do you mean do the same thing as now, but one layer lower, like:
> >
> > @@ -8728,6 +8725,9 @@ CreateCheckPoint(int flags)
> > + if (flags & CHECKPOINT_END_OF_RECOVERY)
> > + set_ps_display("recovery checkpoint");
>
> For the use-case discussed here, that would be fine. Now the
> difficult point is how much information we can actually display
> without bloating ps while still have something meaningful. Showing
> all the information from LogCheckpointStart() would bloat the output a
> lot for example. So, thinking about that, my take would be to have ps
> display the following at the beginning of CreateCheckpoint() and
> CreateRestartPoint():
> - restartpoint or checkpoint
> - shutdown
> - end-of-recovery
>
> The output also needs to be cleared once the routines finish or if
> there is a skip, of course.

In my inital request, I *only* care about the startup process' recovery
checkpoint. AFAIK, this exits when it's done, so there may be no need to
"revert" to the previous "ps". However, one could argue that it's currently a
bug that the "recovering NNN" portion isn't updated after finishing the WAL
files.

StartupXLOG -> xlogreader -> XLogPageRead -> WaitForWALToBecomeAvailable -> XLogFileReadAnyTLI -> XLogFileRead
-> CreateCheckPoint

Maybe it's a bad idea if the checkpointer is continuously changing its display.
I don't see the utility in it, since log_checkpoints does more than ps could
ever do. I'm concerned that would break things for someone using something
like pgrep.
|$ ps -wwf `pgrep -f 'checkpointer *$'`
|UID PID PPID C STIME TTY STAT TIME CMD
|postgres 9434 9418 0 Aug20 ? Ss 214:25 postgres: checkpointer

|pryzbyj 23010 23007 0 10:33 ? 00:00:00 postgres: checkpointer checkpoint

I think this one is by far the most common, but somewhat confusing, since it's
only one word. This led me to put parenthesis around it:

|pryzbyj 26810 26809 82 10:53 ? 00:00:12 postgres: startup (end-of-recovery checkpoint)

Related: I have always thought that this message meant "recovery will complete
Real Soon", but I now understand it to mean "beginning the recovery checkpoint,
which is flagged CHECKPOINT_IMMEDIATE" (and may take a long time).

|2020-09-19 10:53:26.345 CDT [26810] LOG: checkpoint starting: end-of-recovery immediate

--
Justin

Attachment Content-Type Size
v2-0001-Clear-PS-display-after-recovery.patch text/x-diff 724 bytes
v2-0002-Update-PS-display-following-replay-of-last-xlog.patch text/x-diff 1.8 KB
v2-0003-Avoid-ambiguous-that.patch text/x-diff 2.5 KB
v2-0004-Document-that-checkpoint_flush_after-applies-to-e.patch text/x-diff 1017 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2020-09-19 16:18:58 Re: XversionUpgrade tests broken by postfix operator removal
Previous Message Denis Gantsev 2020-09-19 14:53:24 Feature proposal for psql