Re: WAL replay is too slow on secondary server

From: OMPRAKASH SAHU <sahuop2121(at)gmail(dot)com>
To: Shubhang Joshi <shubhangjoshi2405(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: WAL replay is too slow on secondary server
Date: 2025-10-31 07:47:48
Message-ID: CAOZWJqNR3dxnwn+HGPszQB8BY67_E=eoa7SzArL=t=PMOtUAMQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi Everyone,

Thankyou for the suggestions.

I have changed few things from DB side on secondary only till yesterday it
seems fine I will be monitoring it further

Below are the changes:

wal_decode_buffer_size
maintenance_io_concurrency
bgwriter_delay

I checked with AWS support as well if micro bursting had happening but
allocation is enough as per them.

Regards,
OM

On Fri, 31 Oct 2025, 09:54 Shubhang Joshi, <shubhangjoshi2405(at)gmail(dot)com>
wrote:

> Hi OM,
> Hi Laurenz,
>
> Thank you for your insights.
>
> I apologize for my previous suggestion regarding network speed; upon
> further review, it was not the correct cause in this scenario.
>
> Based on the current observations and system metrics, the accumulation of
> WAL on the standby server points to disk I/O limitations during replay—not
> network speed. CPU and RAM usage remain low, and WAL traffic is reaching
> the replica without delay, but replay/apply on disk is slow.
>
> The root cause appears to be disk subsystem performance and the
> single-threaded nature of WAL replay in PostgreSQL recovery. Optimizing
> disk throughput or reconfiguring memory may help, but network latency does
> not seem to be affecting this scenario.
>
> Regards,
> Shubhang
>
> On Thu, 30 Oct 2025 at 17:45, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
>
>> On Thu, 2025-10-30 at 17:08 +0530, Shubhang Joshi wrote:
>> > On Thu, 30 Oct, 2025, 10:07 am OMPRAKASH SAHU, <sahuop2121(at)gmail(dot)com>
>> wrote:
>> > > We have a postgresql cluster setup using patroni.
>> > > The DB is being used for heavy transactional application, now the
>> problem is that on replica server WAL replay is too slow.
>> > > We have increased the IOPS to 6k and Throughput to 600 on nvme EBS
>> volume of wal directory and 10k &800 on data directory.
>> > >
>> > > but the WAL is being accumulated on the replica as usual and applying
>> wal is having no improvement.
>> >
>> > Please check the network speed — we faced a similar issue earlier, and
>> it turned out to be related to network performance.
>> > Kindly verify the network latency with your network team as well.
>>
>> If WAL is piling up on the standby, how can network speed be the problem?
>>
>> Yours,
>> Laurenz Albe
>>
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Shardul Borhade 2025-10-31 08:17:22 Question on pg_stat_io showing zero reads/writes for I/O workers
Previous Message Shubhang Joshi 2025-10-31 04:24:40 Re: WAL replay is too slow on secondary server