Re: Unnecessary delay in streaming replication due to replay lag

From: sunil s <sunilfeb26(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: soumyadeep2007(at)gmail(dot)com, bharath(dot)rupireddyforpostgres(at)gmail(dot)com, daniel(at)yesql(dot)se, michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org, lchch1990(at)sina(dot)cn, masahiko(dot)sawada(at)2ndquadrant(dot)com, hawu(at)pivotal(dot)io, a(dot)lubennikova(at)postgrespro(dot)ru, ashwinstar(at)gmail(dot)com
Subject: Re: Unnecessary delay in streaming replication due to replay lag
Date: 2025-07-10 06:35:07
Message-ID: CAOG6S49YhpfALy2SsYLxiNR8zWxyPmhGc7hZZtYLsZHAxNcn4A@mail.gmail.com
Lists: pgsql-hackers

I have added the patch to the upcoming commitfest:
https://commitfest.postgresql.org/patch/5908/

Thanks & Regards,
Sunil S

On Wed, Jul 9, 2025 at 12:01 AM sunil s <sunilfeb26(at)gmail(dot)com> wrote:

> Hello Hackers,
>
> I recently had the opportunity to continue the effort originally led by a
> valued contributor.
> I’ve addressed most of the previously reported feedback and issues, and
> would like to share the updated patch with the community.
>
> IMHO, starting the WAL receiver eagerly offers significant advantages,
> for the following reasons:
>
> 1. If recovery_min_apply_delay is set high (for various operational
>    reasons) and the primary crashes, the mirror can recover quickly, thereby
>    improving overall High Availability.
> 2. For setups without archive-based recovery, restore and recovery
>    operations complete faster.
> 3. When synchronous_commit is enabled, faster mirror recovery reduces
>    offline time and helps avoid prolonged commit/query wait times during
>    failover/recovery.
> 4. This approach also improves resilience by limiting the impact of
>    network interruptions on replication.
>
>
> > In common cases, I believe archive recovery is faster than
> > replication. If a segment is available from archive, we don't need to
> > prefetch it via stream.
>
> I completely agree — restoring from the archive is significantly faster
> than streaming.
> Attempting to stream from the last available WAL in the archive would
> introduce complexity and risk.
> Therefore, we can limit this feature to crash recovery scenarios and skip
> it when archiving is enabled.
>
> > The "FATAL: could not open file" message from walreceiver means that
> > the walreceiver was operationally prohibited from installing a new WAL
> > segment at the time.
>
> This was caused by an additional fix added upstream to address a race
> condition between the archiver and the checkpointer.
> It has been resolved in the latest patch, which also includes a TAP test
> to verify the fix. Thanks for testing and bringing this to our attention.
> For now, we will skip the early start of the WAL receiver, since enabling
> write access for the WAL receiver would reintroduce the bug that
> commit cc2c7d65fc27e877c9f407587b0b92d46cd6dd16
> <https://github.com/postgres/postgres/commit/cc2c7d65fc27e877c9f407587b0b92d46cd6dd16>
> fixed previously.
>
>
> I've attached the rebased patch with the necessary fix.
>
> Thanks & Regards,
> Sunil S (Broadcom)
>
>
> On Tue, Jul 8, 2025 at 11:01 AM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
> wrote:
>
>> At Wed, 15 Dec 2021 17:01:24 -0800, Soumyadeep Chakraborty <
>> soumyadeep2007(at)gmail(dot)com> wrote in
>> > Sure, that makes more sense. Fixed.
>>
>> I played with this briefly. I started a standby from a backup that
>> has access to an archive, and I steadily got the following log lines.
>>
>>
>> [139535:postmaster] LOG: database system is ready to accept read-only connections
>> [139542:walreceiver] LOG: started streaming WAL from primary at 0/2000000 on timeline 1
>> cp: cannot stat '/home/horiguti/data/arc_work/000000010000000000000003': No such file or directory
>> [139542:walreceiver] FATAL: could not open file "pg_wal/000000010000000000000003": No such file or directory
>> cp: cannot stat '/home/horiguti/data/arc_work/00000002.history': No such file or directory
>> cp: cannot stat '/home/horiguti/data/arc_work/000000010000000000000003': No such file or directory
>> [139548:walreceiver] LOG: started streaming WAL from primary at 0/3000000 on timeline 1
>>
>> The "FATAL: could not open file" message from walreceiver means that
>> the walreceiver was operationally prohibited from installing a new WAL
>> segment at the time. Thus the walreceiver exited as soon as it started.
>> In short, the eager replication is not working at all.
>>
>>
>> I have a comment on the behavior and objective of this feature.
>>
>> In the case where archive recovery is started from a backup, this
>> feature lets walreceiver start while the archive recovery is ongoing.
>> If walreceiver (or the eager replication) worked as expected, it would
>> write wal files while archive recovery writes the same set of WAL
>> segments to the same directory. I don't think that is sane behavior.
>> Or, to put it more modestly, it is unintended behavior.
>>
>> In common cases, I believe archive recovery is faster than
>> replication. If a segment is available from archive, we don't need to
>> prefetch it via stream.
>>
>> If this feature is intended to be used only for crash recovery of a
>> standby, it should fire only when it is needed.
>>
>> If not, that is, if it is intended to work also for archive recovery,
>> I think the eager replication should start from the segment after
>> the last WAL segment in the archive, but that would invite more complex
>> problems.
>>
>> regards.
>>
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
>>
>>
>>
>>
>>
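
For readers following the quoted log: the walreceiver restarts streaming at
0/3000000 on timeline 1, and the file it fails to open is
000000010000000000000003. The sketch below shows how such an LSN maps to a
WAL segment file name. It assumes the default 16 MB wal_segment_size, and
the helper names (parse_lsn, wal_file_name) are hypothetical, chosen only
for illustration. It is not code from the patch.

```python
# Assumption: default wal_segment_size of 16 MB; a cluster initialized with
# a different segment size would need a different value here.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'hi/lo' (hexadecimal halves) to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(timeline: int, lsn: str, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-character WAL segment file name containing the given LSN."""
    segno = parse_lsn(lsn) // seg_size
    # Number of segments that fit in one 4 GB "xlogid" range.
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


if __name__ == "__main__":
    # The restart position from the quoted log.
    print(wal_file_name(1, "0/3000000"))  # 000000010000000000000003
```

Under that assumption, the restart position 0/3000000 falls in exactly the
segment named in the FATAL message.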
