Re: BUG #17744: Fail Assert while recoverying from pg_basebackup

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, andres(at)anarazel(dot)de, zxwsbg12138(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17744: Fail Assert while recoverying from pg_basebackup
Date: 2023-02-24 11:01:54
Message-ID: Y/iZIh/NdsYqazmZ@paquier.xyz
Lists: pgsql-bugs

On Fri, Feb 24, 2023 at 09:36:50AM +0900, Michael Paquier wrote:
> I was thinking about that, and you may be fine as long as you skip
> some parts of the restartpoint logic. The case reported of this
> thread does not cause crash recovery, actually, because startup
> switches to +archive+ recovery any time it sees a backup_label file.
> One thing I did not remember here is that we also set minRecoveryPoint
> at a much earlier LSN than it should be (see 6c4f666). However, we
> rely heavily on backupEndRequired in the control file to make sure
> that we've replayed up to the end-of-backup record to decide if the
> system is consistent or not.

I have spent more time on this to see if I was missing something.
Reproducing the issue is rather easy: run pgbench and stop it with a
SIGINT, so that restartpoints can see transactions still running in
the code path that triggers the assertion. A cheap regression test
should be possible, actually, though for now the only thing I have
been able to rely on is a hack that forces checkpoint_timeout to 1s to
make the failure more frequent.

Anyway, with this simple method (plus more short, interrupted pgbench
runs to increase the chance of a hit), a bisect points at 7ff23c6 :/
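
For illustration, the "short pgbench runs that are interrupted" part
could be driven by something like the Python sketch below. This is not
the script used for the bisect, only a guess at its shape. The helper
names, the pgbench options (-c 4, -T 60), and the database name
"postgres" are assumptions, and the sketch covers neither the base
backup and recovery setup nor the checkpoint_timeout hack.

```python
import signal
import subprocess

# Hypothetical pgbench invocation: 4 clients, up to 60 seconds, against a
# database named "postgres" that was already initialized with "pgbench -i".
PGBENCH_CMD = ["pgbench", "-c", "4", "-T", "60", "postgres"]


def run_and_interrupt(cmd, run_seconds):
    """Start cmd, let it run for run_seconds, then stop it with SIGINT.

    Returns the exit status of the process. If the command finishes on
    its own before run_seconds elapse, no signal is sent.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=run_seconds)
    except subprocess.TimeoutExpired:
        # Interrupt the run so that its transactions are cut short.
        proc.send_signal(signal.SIGINT)
        proc.wait()
    return proc.returncode


def interrupted_runs(cmd, rounds, run_seconds):
    """Repeat run_and_interrupt() and collect the exit statuses."""
    return [run_and_interrupt(cmd, run_seconds) for _ in range(rounds)]
```

Calling interrupted_runs(PGBENCH_CMD, rounds=20, run_seconds=2.0) would
then produce twenty short, interrupted runs. The round count and the
duration are arbitrary here and would need tuning to hit the assertion.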
--
Michael
