Background writer and checkpointer in crash recovery

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
Subject: Background writer and checkpointer in crash recovery
Date: 2020-08-30 00:12:15
Message-ID: CA+hUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(Forking this thread from the SLRU fsync one[1] to allow for a
separate CF entry; it's unrelated, except for being another case of
moving work off the recovery process's plate.)

Hello hackers,

Currently we don't run the bgwriter process during crash recovery.
I've CCed Simon and Heikki who established this in commit cdd46c76.
Based on that commit message, I think the bar to clear to change the
policy is to show that it's useful, and that it doesn't make crash
recovery less robust. See the other thread for some initial evidence
of usefulness from Jakub Wartak. I think it also just makes intuitive
sense that it's got to help bigger-than-buffer-pool crash recovery if
you can shift buffer eviction out of the recovery loop. As for
robustness, I suppose we could provide the option to turn it off just
in case that turns out to be useful in some scenarios, but I'm
wondering why we would expect something that we routinely run in
archive/streaming recovery to reduce robustness in only slightly
different circumstances.

Here's an experiment-grade patch, comments welcome, though at this
stage it's primarily thoughts about the concept that I'm hoping to
solicit.

One question raised by Jakub that I don't have a great answer to right
now is whether you'd want different bgwriter settings in this scenario
for best results. I don't know, but I suspect that the answer is to
make bgwriter more adaptive rather than more configurable, and that's
an orthogonal project.

Once we had the checkpointer running, we could also consider making
the end-of-recovery checkpoint optional, or at least the wait for it
to complete. I haven't shown that in this patch, but it's just
different checkpoint request flags and possibly an end-of-recovery
record. What problems do you foresee with that? Why should we have
"fast" promotion but not "fast" crash recovery?

[1] https://www.postgresql.org/message-id/flat/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ(at)mail(dot)gmail(dot)com

Attachment Content-Type Size
0001-Run-checkpointer-and-bgworker-in-crash-recovery.patch text/x-patch 8.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-08-30 00:26:20 Re: Disk-based hash aggregate's cost model
Previous Message Tom Lane 2020-08-30 00:06:18 Re: PROC_IN_ANALYZE stillborn 13 years ago