Re: BUG #17744: Fail Assert while recoverying from pg_basebackup

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, andres(at)anarazel(dot)de, zxwsbg12138(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17744: Fail Assert while recoverying from pg_basebackup
Date: 2023-02-22 10:27:03
Message-ID: CA+hUKGKnUfBQ4wEOf7Ggkm=8CoN=kVU9O=TqeqSU-p349rugDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Feb 21, 2023 at 8:01 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Tue, Feb 21, 2023 at 03:22:58PM +0900, Kyotaro Horiguchi wrote:
> > At Tue, 21 Feb 2023 12:51:51 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> >> commit 7ff23c6d277d1d90478a51f0dd81414d343f3850
> >> Author: Thomas Munro <tmunro(at)postgresql(dot)org>
> >> Date: Mon Aug 2 17:32:20 2021 +1200
> >>
> >> Run checkpointer and bgwriter in crash recovery.
> >>
> >> Start up the checkpointer and bgwriter during crash recovery (except in
> >> --single mode), as we do for replication. This wasn't done back in
> >> commit cdd46c76 out of caution. Now it seems like a better idea to make
> >> the environment as similar as possible in both cases. There may also be
> >> some performance advantages.
> >
> > Wouldn't just previnting restartpoints during crash recovery mean
> > making this change inneffective? If something wrong actually happens
> > when restartpoints happen during crash recovery, shouldn't we identify
> > the cause and let restartpoints run safely during crash recovery?
>
> I will let Thomas comment on that (now added in CC of this thread),
> because, honestly, I am not sure.

Thanks, catching up with this thread.

> Now, it seems to me that the original intent of starting the
> checkpointer this early is to be able to unify the code path for the
> end-of-recovery checkpoint, primarily, which is a fine goal in itself
> because the startup process can query a checkpoint rather than doing
> things itself. It is the fact of starting early the bgwriter that
> showed most of the improvements in crash recovery with some aggressive
> tuning setups, under specific workloads like heavy INSERTs.
>
> This report is telling us that restartpoints during crash recovery may
> not be a good idea without more careful consideration, even if the
> original intent of 7ff23c6 is to allow them in the crash recovery
> case. I am not sure about this last point. (Restartpoint is
> mentioned once in the thread of this commit, FWIW.)

I started by wanting to have the bgwriter active. For that, the
checkpointer must also be active, to have a shared fsync request queue
(since both the startup process and the bgwriter create fsync
requests, it wouldn't work if the startup process had a local
"pendingOps"). Looking into this.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alexander Lakhin 2023-02-22 12:00:00 Re: BUG #17803: Rule "ALSO INSERT ... SELECT ..." fails to substitute default values
Previous Message Richard Guo 2023-02-22 10:24:11 Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware)