Re: Possible missing segments in archiving on standby

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Possible missing segments in archiving on standby
Date: 2021-08-30 16:54:36
Message-ID: a8eac442-9a85-7d63-a1c3-d34ee81ed87f@oss.nttdata.com
Lists: pgsql-hackers

On 2020/06/30 16:55, Kyotaro Horiguchi wrote:
> Hello.
>
> While looking a patch, I found that a standby with archive_mode=always
> fails to archive segments under certain conditions.

I encountered this issue, too.

> 1. v1-0001-Make-sure-standby-archives-all-segments.patch:
> Fix for A and B.
>
> 2. v1-0001-Make-sure-standby-archives-all-segments-immediate.patch:
> Fix for A, B and C.

You proposed two patches. Should the second one be reviewed first,
since it addresses all the issues (i.e., A, B and C) that you reported?

+ * If we are starting streaming at the beginning of a segment,
+ * there may be the case where the previous segment have not been
+ * archived yet. Make sure it is archived.

Could you clarify why the archive notification file of the previous
WAL segment needs to be checked?

As far as I read the code, the cause of the issue seems to be that
XLogWalRcvWrite() exits without creating an archive notification file
even if the current WAL segment is fully written in the last cycle of
XLogWalRcvWrite()'s loop. That is, creation of the notification file
and WAL archiving of that completed segment are delayed
until some data in the next segment is received and written (by the next
call to XLogWalRcvWrite()). Furthermore, in that case, if walreceiver exits
before receiving any data for the next segment, the completed current
segment fails to be archived, as Horiguchi-san reported.

Therefore, IMO the simple approach to fix the issue is to create
an archive notification file, if possible, at the end of XLogWalRcvWrite().
I implemented this idea. Patch attached.
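
To illustrate the timing problem described above, here is a minimal toy
model in C. It is not the attached patch and not PostgreSQL code: the
names ToyReceiver, toy_write(), notify_archive() and the 16-byte SEG_SIZE
are all hypothetical, chosen only to mimic the "notify when switching
segments" behavior versus "also notify at the end of the write when the
segment is complete".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEG_SIZE 16u  /* toy segment size in bytes (hypothetical) */
#define MAX_SEGS 8    /* toy limit on the number of segments tracked */

typedef struct
{
    int64_t open_seg;            /* segment open for writing, -1 if none */
    bool    notified[MAX_SEGS];  /* true once a segment is marked for archiving */
} ToyReceiver;

/* Stand-in for creating the archive notification file of a segment. */
static void
notify_archive(ToyReceiver *r, int64_t seg)
{
    assert(seg >= 0 && seg < MAX_SEGS);
    r->notified[seg] = true;
}

/*
 * Write nbytes starting at byte position start. With notify_at_end = false,
 * a segment is notified only when a later write moves on to the next segment.
 * With notify_at_end = true, a segment that this write filled exactly is
 * notified before returning.
 */
static void
toy_write(ToyReceiver *r, uint64_t start, uint64_t nbytes, bool notify_at_end)
{
    uint64_t pos = start;

    while (nbytes > 0)
    {
        int64_t  seg = (int64_t) (pos / SEG_SIZE);
        uint64_t room;
        uint64_t n;

        if (r->open_seg != seg)
        {
            /* Switching segments: the previous one is complete. */
            if (r->open_seg >= 0)
                notify_archive(r, r->open_seg);
            r->open_seg = seg;
        }

        room = SEG_SIZE - pos % SEG_SIZE;
        n = nbytes < room ? nbytes : room;
        pos += n;
        nbytes -= n;
    }

    /* Sketch of the idea: notify now if the open segment is fully written. */
    if (notify_at_end && r->open_seg >= 0 &&
        pos == (uint64_t) (r->open_seg + 1) * SEG_SIZE)
    {
        notify_archive(r, r->open_seg);
        r->open_seg = -1;
    }
}

/* Fill segment 0 exactly, then report whether it has been notified. */
static bool
seg0_notified_after_exact_fill(bool notify_at_end)
{
    ToyReceiver r = { .open_seg = -1 };

    toy_write(&r, 0, SEG_SIZE, notify_at_end);
    return r.notified[0];
}
```

In this model, seg0_notified_after_exact_fill(false) leaves segment 0
unnotified until some later write touches segment 1, so a receiver that
stops in between never notifies it. With notify_at_end = true the segment
is notified as soon as it is complete.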

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment: walreceiver_notify_archive_soon.patch (text/plain, 4.0 KB)
