Re: BUG #17744: Fail Assert while recoverying from pg_basebackup

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, andres(at)anarazel(dot)de, zxwsbg12138(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17744: Fail Assert while recoverying from pg_basebackup
Date: 2023-02-24 19:56:18
Message-ID: CA+hUKGKVjfmdBd3je31o1_yW9j=4DRaZDdZUAnTZjqAbirCurA@mail.gmail.com
Lists: pgsql-bugs

On Sat, Feb 25, 2023 at 12:02 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Fri, Feb 24, 2023 at 09:36:50AM +0900, Michael Paquier wrote:
> > I was thinking about that, and you may be fine as long as you skip
> > some parts of the restartpoint logic. The case reported of this
> > thread does not cause crash recovery, actually, because startup
> > switches to +archive+ recovery any time it sees a backup_label file.
> > One thing I did not remember here is that we also set minRecoveryPoint
> > at a much earlier LSN than it should be (see 6c4f666). However, we
> > rely heavily on backupEndRequired in the control file to make sure
> > that we've replayed up to the end-of-backup record to decide if the
> > system is consistent or not.
>
> I have been spending more time on that to see if I was missing
> something, and reproducing the issue is rather easy by using pgbench
> that gets stopped with a SIGINT, so that restartpoints are able to
> see transactions still running in the code path triggering the assert.
> A cheap regression test should be possible, actually, though for now
> the only thing I have been able to rely on is a hack to force
> checkpoint_timeout at 1s to make the failure rate more aggressive.
>
> Anyway, with this simple method (and an increase of short pgbench runs
> that are interrupted to increase the chance of hits), a bisect points
> at 7ff23c6 :/

Thanks. I've been thinking about how to make a deterministic test
script to study this and possible fixes, too. Unfortunately I came
down with a nasty cold and stopped computing for a couple of days, so
sorry for the slow response on this thread, but I seem to have
rebooted now. Looking.
