| From: | OMPRAKASH SAHU <sahuop2121(at)gmail(dot)com> |
|---|---|
| To: | Shubhang Joshi <shubhangjoshi2405(at)gmail(dot)com> |
| Cc: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-admin(at)lists(dot)postgresql(dot)org |
| Subject: | Re: WAL replay is too slow on secondary server |
| Date: | 2025-10-31 07:47:48 |
| Message-ID: | CAOZWJqNR3dxnwn+HGPszQB8BY67_E=eoa7SzArL=t=PMOtUAMQ@mail.gmail.com |
| Lists: | pgsql-admin |
Hi Everyone,
Thank you for the suggestions.
I have changed a few settings on the database side, on the secondary only. As of
yesterday it seems fine; I will keep monitoring it.
Below are the parameters I changed:
wal_decode_buffer_size
maintenance_io_concurrency
bgwriter_delay
I also checked with AWS support whether micro-bursting was happening, but
according to them the allocation is sufficient.
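
For the monitoring, here is a minimal sketch of how the gap between received and
replayed WAL can be computed from two LSN strings, such as those returned by
pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() on the standby. The
helper names and the sample LSN values are hypothetical, chosen only for
illustration; they are not taken from this cluster.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_gap_bytes(receive_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL received by the standby but not yet replayed."""
    return lsn_to_int(receive_lsn) - lsn_to_int(replay_lsn)


# Hypothetical sample values, for illustration only.
gap = replay_gap_bytes("2/10000000", "1/F0000000")
print(f"replay gap: {gap / (1024 * 1024):.0f} MiB")
```

If this number keeps growing while the receive position advances normally, the
bottleneck is on the replay side rather than in the network. The same
difference can be computed in SQL with pg_wal_lsn_diff().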
Regards,
OM
On Fri, 31 Oct 2025, 09:54 Shubhang Joshi, <shubhangjoshi2405(at)gmail(dot)com>
wrote:
> Hi OM,
> Hi Laurenz,
>
> Thank you for your insights.
>
> I apologize for my previous suggestion regarding network speed; upon
> further review, it was not the correct cause in this scenario.
>
> Based on the current observations and system metrics, the accumulation of
> WAL on the standby server points to disk I/O limitations during replay—not
> network speed. CPU and RAM usage remain low, and WAL traffic is reaching
> the replica without delay, but replay/apply on disk is slow.
>
> The root cause appears to be disk subsystem performance and the
> single-threaded nature of WAL replay in PostgreSQL recovery. Optimizing
> disk throughput or reconfiguring memory may help, but network latency does
> not seem to be affecting this scenario.
>
> Regards,
> Shubhang
>
> On Thu, 30 Oct 2025 at 17:45, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
>
>> On Thu, 2025-10-30 at 17:08 +0530, Shubhang Joshi wrote:
>> > On Thu, 30 Oct, 2025, 10:07 am OMPRAKASH SAHU, <sahuop2121(at)gmail(dot)com> wrote:
>> > > We have a postgresql cluster setup using patroni.
>> > > The DB is being used for heavy transactional application, now the problem is that on replica server WAL replay is too slow.
>> > > We have increased the IOPS to 6k and Throughput to 600 on nvme EBS volume of wal directory and 10k & 800 on data directory.
>> > >
>> > > but the WAL is being accumulated on the replica as usual and applying wal is having no improvement.
>> >
>> > Please check the network speed; we faced a similar issue earlier, and it turned out to be related to network performance.
>> > Kindly verify the network latency with your network team as well.
>>
>> If WAL is piling up on the standby, how can network speed be the problem?
>>
>> Yours,
>> Laurenz Albe
>>
>