Re: Background writer and checkpointer in crash recovery

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
Subject: Re: Background writer and checkpointer in crash recovery
Date: 2021-02-02 22:11:48
Message-ID: CA+TgmobTw_skjNY4Pq1EF_jhEJFupJmSWGqkxTq79K=6eb2dqQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 29, 2020 at 8:13 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Currently we don't run the bgwriter process during crash recovery.
> I've CCed Simon and Heikki who established this in commit cdd46c76.
> Based on that commit message, I think the bar to clear to change the
> policy is to show that it's useful, and that it doesn't make crash
> recovery less robust. See the other thread for some initial evidence
> of usefulness from Jakub Wartak. I think it also just makes intuitive
> sense that it's got to help bigger-than-buffer-pool crash recovery if
> you can shift buffer eviction out of the recovery loop. As for
> robustness, I suppose we could provide the option to turn it off just
> in case that turns out to be useful in some scenarios, but I'm
> wondering why we would expect something that we routinely run in
> archive/streaming recovery to reduce robustness in only slightly
> different circumstances.
>
> Here's an experiment-grade patch, comments welcome, though at this
> stage it's primarily thoughts about the concept that I'm hoping to
> solicit.

I think the way it works right now is stupid and the proposed change
is going in the right direction. We have ample evidence already that
handing off fsyncs to a background process is a good idea, and there's
no reason why that shouldn't be beneficial during crash recovery just
as it is at other times. But even if it somehow failed to improve
performance during recovery, there's another good reason to do this,
which is that it would make the code simpler. Having the pendingOps
stuff in the startup process in some recovery situations and in the
checkpointer in other recovery situations makes this harder to reason
about. As Tom said, the system state where bgwriter and checkpointer
are not running is an uncommon one, and is probably more likely to
have (or grow) bugs than the state where they are running.
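
To make the pendingOps point concrete, here is a toy sketch in C, not PostgreSQL's actual sync code. It assumes a process either owns a local pending-fsync table and remembers requests itself, or has none and forwards them to the checkpointer's table. The names PendingOps, remember_sync_request and register_sync_request, and the use of plain segment ids, are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 64

/* Toy stand-in for a pending-fsync table: a set of segment ids awaiting fsync. */
typedef struct
{
    int    segs[MAX_PENDING];
    size_t nsegs;
} PendingOps;

/* Remember one request, dropping duplicates. Returns false if the table is full. */
static bool
remember_sync_request(PendingOps *ops, int seg)
{
    for (size_t i = 0; i < ops->nsegs; i++)
        if (ops->segs[i] == seg)
            return true;        /* already queued, nothing to add */
    if (ops->nsegs == MAX_PENDING)
        return false;
    ops->segs[ops->nsegs++] = seg;
    return true;
}

/*
 * Route a request. A process that owns a table (local != NULL) remembers the
 * request itself; any other process hands it to the checkpointer's table.
 * Which process owns the table is exactly the thing that varies between
 * recovery situations in this toy model.
 */
static bool
register_sync_request(PendingOps *local, PendingOps *checkpointer, int seg)
{
    PendingOps *target = (local != NULL) ? local : checkpointer;

    return remember_sync_request(target, seg);
}
```

If the checkpointer always runs during recovery, the startup process would always pass NULL for its local table in this model, so only the forwarding path remains to reason about.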

The rat's-nest of logic introduced by the comment "Perform a
checkpoint to update all our recovery activity to disk." inside
StartupXLOG() could really do with some simplification. Right now we
have three cases: CreateEndOfRecoveryRecord(), RequestCheckpoint(),
and CreateCheckpoint(). Maybe with this change we could get it down to
just two, since RequestCheckpoint() already knows what to do about
!IsUnderPostmaster.
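
The three-to-two reduction can be sketched as a toy decision function in C. This is a simplified model, not the real StartupXLOG() code. The inputs checkpointer_running and promoted are assumptions about what drives the choice, and the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical labels for the three end-of-recovery actions named above. */
typedef enum
{
    ACTION_END_OF_RECOVERY_RECORD,  /* CreateEndOfRecoveryRecord() */
    ACTION_REQUEST_CHECKPOINT,      /* RequestCheckpoint() */
    ACTION_CREATE_CHECKPOINT        /* CreateCheckpoint() in the startup process */
} EndOfRecoveryAction;

/*
 * Modeled current behavior: the startup process must know whether a
 * checkpointer exists, which yields three distinct outcomes.
 */
static EndOfRecoveryAction
choose_action_three_way(bool checkpointer_running, bool promoted)
{
    if (!checkpointer_running)
        return ACTION_CREATE_CHECKPOINT;
    if (promoted)
        return ACTION_END_OF_RECOVERY_RECORD;
    return ACTION_REQUEST_CHECKPOINT;
}

/*
 * Modeled behavior once the checkpointer also runs in crash recovery: the
 * caller no longer asks whether a checkpointer exists. The standalone
 * (!IsUnderPostmaster) case is left to RequestCheckpoint() itself.
 */
static EndOfRecoveryAction
choose_action_two_way(bool promoted)
{
    return promoted ? ACTION_END_OF_RECOVERY_RECORD
                    : ACTION_REQUEST_CHECKPOINT;
}
```

In this model, whenever a checkpointer is running the two functions agree, so the only branch that disappears is the one where the startup process performs the checkpoint itself.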

--
Robert Haas
EDB: http://www.enterprisedb.com
