Re: BUG #17744: Fail Assert while recoverying from pg_basebackup

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, andres(at)anarazel(dot)de, zxwsbg12138(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17744: Fail Assert while recoverying from pg_basebackup
Date: 2023-02-24 00:36:50
Message-ID: Y/gGotKY2V4/pEJv@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, Feb 22, 2023 at 11:27:03PM +1300, Thomas Munro wrote:
> I started by wanting to have the bgwriter active. For that, the
> checkpointer must also be active, to have a shared fsync request queue
> (since both the startup process and the bgwriter create fsync
> requests, it wouldn't work if the startup process had a local
> "pendingOps"). Looking into this.

I was thinking about that, and you may be fine as long as you skip
some parts of the restartpoint logic. The case reported of this
thread does not cause crash recovery, actually, because startup
switches to +archive+ recovery any time it sees a backup_label file.
One thing I did not remember here is that we also set minRecoveryPoint
at a much earlier LSN than it should be (see 6c4f666). However, we
rely heavily on backupEndRequired in the control file to make sure
that we've replayed up the end-of-backup record to decide if the
system is consistent or not.

This is the type of checks done by CheckRecoveryConsistency() for the
startup process. And what we want here is to be able to apply the
same kind of things to the checkpointer, as the checkpointer needs to
decide which part of a restartpoint is fine to be run without
consistency being reached yet.

The case of the OP actually switches InArchiveRecovery to true, hence
enforcing SharedRecoveryState = RECOVERY_STATE_ARCHIVE from the start
of the redo loop, so you cannot rely only on that in the
checkpointer as I initially thought. Neither can you just rely only
on minRecoveryPoint as we could expect an end-of-backup record. So,
you need to check for a combination of SharedRecoveryState,
minRecoveryPoint and endBackupRecord, roughly. Note that as
ReachedEndOfBackup() tells, minRecoveryPoint could be as well ahead of
the end-of-backup record, meaning that we'd still need to replay more
than the end-of-backup record before reaching consistency.

There are a few things that we can do here that I can think of, in
conservative order:
- Delay the checkpointer startup until the startup process is OK with
a consistent state, which could be the end of a backup. No need to
let the checkpointer know when consistency has been reached, though it
basically invalidates your previous change to make the bgwriter and
the checkpointer more aggressive.
- Delay completely restart points, which is pretty much the same as
the last one, except that the checkpointer is started earlier, and it
needs to know about the consistent state.
- Skip some parts of the restartpoint logic. TruncateSUBTRANS() is
one, as proved here, still we have other parts of CheckPointGuts()
that equally rely on a consistent system? For example, I am not sure
about things like CheckPointSnapBuild(),
CheckPointLogicalRewriteHeap() or CheckPointReplicationOrigin(), to
name three. Most of these involve SLRU flushes, these days, but we'd
better be careful in the choices made.
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2023-02-24 01:54:17 Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash
Previous Message Andres Freund 2023-02-23 23:36:36 Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash