Re: Background writer and checkpointer in crash recovery

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Background writer and checkpointer in crash recovery
Date: 2021-07-30 14:16:44
Message-ID: CA+Tgmoa9TsiRDQnTm20fMJjWD3W9Lm3kSS8VJJRXeFnoRw6k9g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 30, 2021 at 4:42 AM Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
> v2-0001 and v2-0002 look fine, but I don't like much the idea of
> introducing a new GUC in v2-0003. It's for very specific needs, which
> most of the users, I believe, don't care about. I suggest dealing with
> v2-0001 and v2-0002 first and then maybe submit and discuss v2-0003 as
> a separate CF entry.

Hi!

Thanks for bumping this thread; I had forgotten all about this effort,
but having just spent a bunch of time struggling with the thicket of
cases in StartupXLOG(), I'm now feeling highly motivated to make some
more progress in simplifying things over there. I am still of the
opinion that 0001 is a good idea, and I don't have any suggestions for
how it could be improved, except perhaps that the call to
PublishStartupProcessInformation() could maybe have a one-line
comment. Thomas, are you planning to press forward with committing
this soon? If not, do you mind if I do?

Regarding Simon's 0002, I wonder why it's useful to print this
information out at the end of crash recovery but not at the end of
archive recovery. It seems to me that if the information is useful
enough to be worth printing, it's probably good to print it in both
cases. In fact, rather than adding a separate message for this
information, I think we should just change the existing "redo done at"
message to print the details Simon proposes rather than what it does
now. Currently, we get output like this:

2021-07-30 09:13:05.319 EDT [48380] LOG: redo starts at 0/23A6E18
2021-07-30 09:13:05.612 EDT [48380] LOG: redo done at 0/D0D9CE8
    system usage: CPU: user: 0.13 s, system: 0.12 s, elapsed: 0.29 s

With Simon's patch, I get something like this:

2021-07-30 09:39:43.579 EDT [63702] LOG: redo starts at 0/14A2F48
2021-07-30 09:39:44.129 EDT [63702] LOG: redo done at 0/15F48230
    system usage: CPU: user: 0.25 s, system: 0.25 s, elapsed: 0.55 s
2021-07-30 09:39:44.129 EDT [63702] LOG: crash recovery complete:
    wrote 36517 buffers (222.9%); dirtied 52985 buffers; read 7 buffers

Now I really think that information on the number of buffers touched
and how long it took is way more useful than user and system time.
Knowing how much user and system time were spent doesn't really tell
you anything, but a count of buffers touched gives you some meaningful
idea of how much work recovery did, and whether I/O was slow. Elapsed
time you can figure out yourself from the timestamp. However, I don't
really like printing the percentage here; unlike the checkpoint case,
it can very easily be way more than a hundred percent, and I think
that will confuse people. It could be tens of thousands of percent,
really, or even more.
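
To illustrate how the percentage gets past a hundred, here is a minimal
sketch of the arithmetic. It assumes the percentage is buffers written
divided by the size of the buffer pool, and that the pool holds 16384
buffers (the default 128MB of shared_buffers at 8kB per page). Neither
assumption is stated in the log above, but 36517 / 16384 does come out
to the 222.9% shown there. The function name is hypothetical.

```c
#include <assert.h>

/*
 * Buffers written, as a percentage of the buffer pool size.
 * "pool_buffers" is an assumed stand-in for the server's shared
 * buffer count; it is not taken from the patch.
 */
double
buffers_written_pct(long written, long pool_buffers)
{
    assert(pool_buffers > 0);
    return (double) written * 100.0 / (double) pool_buffers;
}
```

Since the same buffer can be written many times during a long redo,
nothing bounds the numerator by the pool size, which is why the figure
is not capped at 100 the way a reader might expect.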

So my proposal is:

redo done at %X/%X: wrote %ld buffers (%0.3f ms); dirtied %ld buffers;
read %ld buffers (%0.3f ms)
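
As a rough, standalone sketch of what that format string would produce,
the following formats the proposed message with plain snprintf(). The
function name and parameter names are hypothetical, and reading the two
"%0.3f ms" fields as time spent writing and reading is an assumption on
my part. The LSN is split into its high and low 32-bit halves to match
the %X/%X style of the existing "redo done at" line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the proposed "redo done at" message into buf.
 * Returns what snprintf() returns: the length the full message needs.
 * All names here are illustrative, not PostgreSQL identifiers.
 */
int
format_redo_done(char *buf, size_t buflen, uint64_t lsn,
                 long written, double write_ms,
                 long dirtied, long nread, double read_ms)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen,
                    "redo done at %X/%X: wrote %ld buffers (%0.3f ms); "
                    "dirtied %ld buffers; read %ld buffers (%0.3f ms)",
                    (unsigned int) (lsn >> 32),  /* high half of the LSN */
                    (unsigned int) lsn,          /* low half of the LSN */
                    written, write_ms, dirtied, nread, read_ms);
}
```

Passing an LSN of 0x15F48230 yields the "0/15F48230" form seen in the
log excerpts above; the timing arguments would have to come from
whatever I/O timing the server already collects.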

Regarding 0003, I agree with Aleksander's comment that a GUC doesn't
seem particularly appropriate, but I also think that the approach may
not be quite right. In the promotion case, we emit an end-of-recovery
record and then later in the code we trigger a checkpoint. In your
patch, there's no end-of-recovery checkpoint -- you just trigger a
checkpoint instead of waiting for it. I think it's probably better to
make those two cases work the same. The end-of-recovery record isn't
needed to change the TLI as it is in the promotion case, but (1) it
seems better to have fewer code paths and (2) it might be good for
debuggability.
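
The difference between the two cases can be modelled with a toy sketch.
This is only a model of the description above, not code from
StartupXLOG() or from the patch, and every name in it is hypothetical.
With "unified" false it follows the behaviour as described for 0003;
with "unified" true it follows the suggestion that crash recovery take
the same path as promotion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step labels; these are not PostgreSQL identifiers. */
typedef enum
{
    STEP_WRITE_END_OF_RECOVERY_RECORD,
    STEP_REQUEST_CHECKPOINT_NO_WAIT
} RecoveryEndStep;

/*
 * Fill "steps" with the actions taken when recovery ends and return
 * how many there are (at most two).
 */
size_t
recovery_end_steps(bool promoted, bool unified, RecoveryEndStep steps[2])
{
    size_t n = 0;

    assert(steps != NULL);

    /* Promotion writes the record; the unified path always does. */
    if (promoted || unified)
        steps[n++] = STEP_WRITE_END_OF_RECOVERY_RECORD;

    /* In every case the checkpoint is requested but not waited for. */
    steps[n++] = STEP_REQUEST_CHECKPOINT_NO_WAIT;
    return n;
}
```

In the unified variant the "promoted" flag no longer changes the
result, which is the "fewer code paths" point: one sequence to reason
about, and an end-of-recovery record in the WAL either way.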

--
Robert Haas
EDB: http://www.enterprisedb.com
