Re: needless complexity in StartupXLOG

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: sfrost(at)snowman(dot)net
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: needless complexity in StartupXLOG
Date: 2021-07-27 13:17:08
Message-ID: 20210727.221708.2169466705889004531.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 26 Jul 2021 16:15:23 -0400, Stephen Frost <sfrost(at)snowman(dot)net> wrote in
> Greetings,
>
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > On Mon, Jul 26, 2021 at 1:32 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > > Yeah, tend to agree with this too ... but something I find a bit curious
> > > is the comment:
> > >
> > > * Insert a special WAL record to mark the end of
> > > * recovery, since we aren't doing a checkpoint.
> > >
> > > ... immediately after setting promoted = true, and then at the end of
> > > StartupXLOG() having:
> > >
> > > if (promoted)
> > > RequestCheckpoint(CHECKPOINT_FORCE);
> > >
> > > maybe I'm missing something, but seems like that comment isn't being
> > > terribly clear. Perhaps we aren't doing a full checkpoint *there*, but
> > > sure looks like we're going to do one moments later regardless of
> > > anything else since we've set promoted to true..?
> >
> > Yep. So it's a question of whether we allow operations that might
> > write WAL in the meantime. When we write the checkpoint record right
> > here, there can't be any WAL from the new server lifetime until the
> > checkpoint completes. When we write an end-of-recovery record, there
> > can. And there could actually be quite a bit, because if we do the
> > checkpoint right in this section of code, it will be a fast
> > checkpoint, whereas in the code you quoted above, it's a spread
> > checkpoint, which takes a lot longer. So the question is whether it's
> > reasonable to give the checkpoint some time to complete or whether it
> > needs to be completed right now.
>
> All I was really trying to point out above was that the comment might be
> improved upon, just so someone understands that we aren't doing a
> checkpoint at this particular place, but one will be done later due to
> the promotion. Maybe I'm being a bit extra with that, but that was my
> thought when reading the code and the use of the promoted flag variable.

(I feel we don't need to check for last-checkpoint, too.)

FWIW, by the way, I complained that the variable name "promoted" is a
bit confusing. It's old name was fast_promoted, which means that fast
promotion is being *requsted* and ongoing. On the other hand the
current name "promoted" still means "(fast=non-fallback) promotion is
ongoing" so there was a conversation as the follows.

https://www.postgresql.org/message-id/9fdd994d-a531-a52b-7906-e1cc22701310%40oss.nttdata.com

>> How about changing it to fallback_promotion, or some names with more
>> behavior-specific name like immediate_checkpoint_needed?
>
> I like doEndOfRecoveryCkpt or something, but I have no strong opinion
> about that flag naming. So I'm ok with immediate_checkpoint_needed
> if others also like that, too.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-07-27 13:43:54 Re: Slim down integer formatting
Previous Message Bruce Momjian 2021-07-27 12:50:02 Re: [UNVERIFIED SENDER] Re: pg_upgrade can result in early wraparound on databases with high transaction load