Re: Infinite loop in XLogPageRead() on standby

From: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org, thomas(dot)munro(at)gmail(dot)com
Subject: Re: Infinite loop in XLogPageRead() on standby
Date: 2024-02-29 16:44:25
Message-ID: CAFh8B==zUj1+asN5REAvqJccgUZFgOh5Ze9c=mOrGypRuTEm=g@mail.gmail.com
Lists: pgsql-hackers

Hi Kyotaro,

On Thu, 29 Feb 2024 at 08:18, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
wrote:

> In the first place, it's important to note that we do not guarantee
> that an async standby can always switch its replication connection to
> the old primary or another sibling standby. This is due to the
> variations in replication lag among standbys. pg_rewind is required to
> adjust such discrepancies.
>

Sure, I know. But in this case the async standby received and flushed
absolutely the same amount of WAL as the promoted one.

>
> I might be overlooking something, but I don't understand how this
> occurs without purposefully tweaking WAL files. The repro script
> pushes an incomplete WAL file to the archive as a non-partial
> segment. This shouldn't happen in the real world.
>

It easily happens if the primary crashes before the standbys have received
the next page, the one that holds the continuation of the record.

> In the repro script, the replication connection of the second standby
> is switched from the old primary to the first standby after its
> promotion. After the switching, replication is expected to continue
> from the beginning of the last replayed segment.

Well, maybe, but apparently the standby is busy trying to decode a record
that spans multiple pages, and it just waits indefinitely for the next
page to arrive. Also, a restart "fixes" the problem, because then it does
read the file from the beginning.
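
To make the failure mode concrete, here is a toy model in Python. It is not
PostgreSQL code: read_record, pages and max_retries are made-up names, and
the retry cap exists only so that the sketch terminates. The behaviour I am
describing corresponds to the same wait with no cap at all.

```python
# Toy model only: none of these names exist in PostgreSQL.

def read_record(pages, start_page, record_len, max_retries=3):
    """Try to assemble a record of record_len bytes starting at start_page.

    pages maps a page number to the bytes the standby has for that page.
    Returns (record_bytes, retries); record_bytes is None when a needed
    page never shows up within max_retries attempts.
    """
    buf = b""
    page_no = start_page
    retries = 0
    while len(buf) < record_len:
        page = pages.get(page_no)
        if page is None:
            # The continuation page has not arrived: retry the same page.
            # Without the cap below this loop would never exit.
            retries += 1
            if retries >= max_retries:
                return None, retries
            continue
        # Take only as many bytes as the record still needs.
        buf += page[: record_len - len(buf)]
        page_no += 1
    return buf, retries


# The record needs 12 bytes, but only the first 8-byte page was received.
incomplete = {0: b"AAAAAAAA"}
# Both pages are present, so the record can be assembled.
complete = {0: b"AAAAAAAA", 1: b"BBBBBBBB"}

stuck = read_record(incomplete, 0, 12)
ok = read_record(complete, 0, 12)
```

With the incomplete input the reader keeps asking for page 1 and never makes
progress; with both pages present it returns the record immediately.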

> But with the script,
> the second standby copies the intentionally broken file, which differs
> from the data that should be received via streaming.

As I already said, this is a simple way to emulate a primary crash while
the standbys are receiving WAL.
It could easily happen that a record spanning multiple pages is not fully
received and flushed.

--
Regards,
--
Alexander Kukushkin
