Re: Possible missing segments in archiving on standby

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Possible missing segments in archiving on standby
Date: 2021-09-08 07:01:22
Message-ID: abdf9b3a-7782-7119-50f9-223146765c3a@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/09/08 10:45, Kyotaro Horiguchi wrote:
> Anyway there's no guarantee on the archive ordering. As discussed in
> the nearby thread [1], newer segment is often archived earlier. I
> agree that that happens mainly on busy servers, though. The archiver
> is designed to handle such "gaps" and/or out-of-order segment
> notifications. We could impose a strict ordering on archiving but I
> think we would take total performance than such strictness.

Yes, there are other cases causing newer WAL file to be archived eariler.
The issue can happen if XLogArchiveNotify() fails to create .ready file,
for example. Fixing only the case that we're discussing here is not enough.
If *general* fix is discussed at the thread you told, it's better to
do nothing here for the issue and to just make the startup process call
XLogArchiveCheckDone() if it finds the WAL file including XLOG_SWITCH record.

> At least currently, recovery fetches segments by a single process and
> every file is marked immediately after being filled-up, so all files
> other than the latest one in pg_wal including history files should
> have been marked for sure unless file system gets into a trouble.

You can reproduce that situation easily by starting the server with
archive_mode=off, generating WAL files, sometimes running pg_switch_wal(),
causing the server to crash, and then restarting the server with
archive_mode=on. At the beginning of recovery, all the WAL files in pg_wal
don't have their archive notification files at all. Then, with the patch,
only WAL files including XLOG_SWITCH are notified for WAL archiving
during recovery. The other WAL files will be notified at the subsequent
checkpoint.

> I'm not sure I like that XLogWalRcvClose hides the segment-switch
> condition. If we do that check in the function, I'd like to name the
> function XLogWalRcvCloseIfSwitched or something indicates the
> condition. I'd like to invert the condition to reduce indentation,
> too.

We can move the condition-check out of the function like the attached patch.

> Why don't we call it just after writing data, as my first proposal
> did? There's no difference in functionality between doing that and the
> patch. If we do so, recvFile>=0 is always true and that condition can
> be removed but that would be optional. Anyway, by doing that, no
> longer need to call the function twice or we can even live without the
> new function.

I think that it's better and more robust to confirm that the currently-opened
WAL file is valid target one to write WAL *before* actually writing any data
into it.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment Content-Type Size
walreceiver_notify_archive_soon_v4.patch text/plain 4.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-09-08 07:03:46 Re: prevent immature WAL streaming
Previous Message Michael Paquier 2021-09-08 06:58:57 Re: Add jsonlog log_destination for JSON server logs