Re: Use of backup_label not noted in log

From: Andres Freund <andres(at)anarazel(dot)de>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: Use of backup_label not noted in log
Date: 2023-11-20 19:03:28
Message-ID: 20231120190328.kggv7h24auenqg5g@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-11-20 11:35:15 +0100, Laurenz Albe wrote:
> On Mon, 2023-11-20 at 17:30 +0900, Michael Paquier wrote:
> > + if (ControlFile->backupStartPoint != InvalidXLogRecPtr)
> > + ereport(LOG,
> > + (errmsg("continuing to start from base backup with redo LSN %X/%X",
> > + LSN_FORMAT_ARGS(ControlFile->backupStartPoint))));
> >
> > "Continuing to start" sounds a bit weird to me, though, considering
> > that there are a few LOGs that say "starting" when there is a signal
> > file, but I don't have a better idea on top of my mind. So that
> > sounds OK here.
>
> We can only reach that message in recovery or standby mode, right?
> So why not write "continuing to recover from base backup"?

It can be reached without either too, albeit much less commonly.

> If we add a message for starting with "backup_label", shouldn't
> we also add a corresponding message for starting from a checkpoint
> found in the control file? If you see that in a problem report,
> you immediately know what is going on.

Maybe - the reason I hesitate on that is that emitting an additional log
message when starting from a base backup just adds something "once once the
lifetime of a node". Whereas emitting something every start obviously doesn't
impose any limit.

You also can extrapolate from the messages absence that we started up without
backup_label, it's not like there would be a lot of messages inbetween
"database system was interrupted; last ..."
and
"starting backup recovery ..."
(commonly there would be no messages)

We can do more on HEAD of course, but we should be wary of just spamming the
log unnecessarily as well.

I guess we could add this message at the same time, including in the back
branches. Initially I thought that might be unwise, because replacing
elog(DEBUG1, "end of backup reached");
with a different message could theoretically cause issues, even if unlikely,
given that it's a DEBUG1 message.

But I think we actually want to emit the message a bit later, just *after* we
updated the control file, as that's the actually relevant piece after which we
won't go back to the "backup recovery" state. I am somewhat agnostic about
whether we should add that in the back branches or not.

Here's the state with my updated patch, when starting up from a base backup:

LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc-14.0.0, 64-bit
LOG: listening on IPv6 address "::1", port 5441
LOG: listening on IPv4 address "127.0.0.1", port 5441
LOG: listening on Unix socket "/tmp/.s.PGSQL.5441"
LOG: database system was interrupted; last known up at 2023-11-20 10:55:49 PST
LOG: starting recovery from base backup with redo LSN E/AFF07F20, checkpoint LSN E/B01B17F0, on timeline ID 1
LOG: entering standby mode
LOG: redo starts at E/AFF07F20
LOG: completed recovery from base backup with redo LSN E/AFF07F20
LOG: consistent recovery state reached at E/B420FC80

Besides the phrasing and the additional log message (I have no opinion about
whether it should be backpatched or not), I used %u for TimelineID as
appropriate, and added a comma before "on timeline".

Greetings,

Andres Freund

Attachment Content-Type Size
v2-0001-WIP-Log-when-starting-up-from-a-base-backup.patch text/x-diff 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2023-11-20 19:08:15 Re: Use of backup_label not noted in log
Previous Message Alvaro Herrera 2023-11-20 19:03:01 Re: trying again to get incremental backup