Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes
Date: 2021-12-28 00:40:28
Message-ID: CAHg+QDf8RPOiAb2Qs0f1y6DCG3xchu39bSg6GwXUUbC12+4aMQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 25, 2021 at 9:25 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

> On Sun, Dec 26, 2021 at 10:36 AM SATYANARAYANA NARLAPURAM <
> satyanarlapuram(at)gmail(dot)com> wrote:
>
>>
>>> Actually all the WAL insertions are done under a critical section
>>> (with a few exceptions), that means if you see all the references of
>>> XLogInsert(), it is always called under the critical section and that is my
>>> main worry about hooking at XLogInsert level.
>>>
>>
>> Got it, understood the concern. But can we document the limitations of
>> the hook and let the hook take care of it? I don't expect an error to be
>> thrown here since we are not planning to allocate memory or make file
>> system calls but instead look at the shared memory state and add delays
>> when required.
>>
>>
> Yet another problem is that if we are in XLogInsert() that means we are
> holding the buffer locks on all the pages we have modified, so if we add a
> hook at that level which can make it wait then we would also block any of
> the read operations needed to read from those buffers. I haven't thought
> about what could be a better way to do this but this is certainly not good.
>

Yes, this is a problem. Would adding the hook at XLogWrite/XLogFlush
instead be an option? All the other backends would then wait behind
WALWriteLock, while the process performing the write enters a busy loop
with small delays until the criteria are met. The inability to process
interrupts inside a critical section is a challenge in both approaches.
Any other thoughts?
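
As a rough illustration of the write-side idea above, the wait loop might
look something like the following C sketch. It is not PostgreSQL code and
not a proposed patch: throttle_wal_write(), the probe/nap callbacks and the
SimStandby struct are all hypothetical names, and the upper bound on the
total wait is an assumption added here because interrupts cannot be
serviced while the loop runs. The loop only reads state and sleeps, in line
with the "no allocation, no file system calls" constraint discussed
upthread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
typedef uint64_t LagBytes;
typedef LagBytes (*lag_probe_fn) (void *arg);   /* reads shared state only */
typedef void (*nap_fn) (long usec, void *arg);  /* short sleep */

typedef struct ThrottleResult
{
    bool caught_up;     /* lag fell to or below the limit */
    long waited_us;     /* total time spent napping */
} ThrottleResult;

/*
 * Poll the standby lag with small delays until it is within max_lag_bytes.
 * Assumption: give up after max_wait_us so that a stalled standby cannot
 * hold the writer (and everyone queued behind it) forever.
 */
static ThrottleResult
throttle_wal_write(lag_probe_fn probe, nap_fn nap, void *arg,
                   LagBytes max_lag_bytes, long nap_us, long max_wait_us)
{
    ThrottleResult r = {false, 0};

    assert(nap_us > 0);
    for (;;)
    {
        if (probe(arg) <= max_lag_bytes)
        {
            r.caught_up = true;
            break;
        }
        if (r.waited_us >= max_wait_us)
            break;              /* bounded wait exhausted */
        nap(nap_us, arg);
        r.waited_us += nap_us;
    }
    return r;
}

/* Toy standby for experimenting: replays a fixed number of bytes per nap. */
typedef struct SimStandby
{
    LagBytes lag;
    LagBytes replay_per_nap;
    int      naps;
} SimStandby;

static LagBytes
sim_probe(void *arg)
{
    return ((SimStandby *) arg)->lag;
}

static void
sim_nap(long usec, void *arg)
{
    SimStandby *s = (SimStandby *) arg;

    (void) usec;                /* simulated time only, no real sleep */
    s->naps++;
    s->lag = (s->lag > s->replay_per_nap) ? s->lag - s->replay_per_nap : 0;
}
```

Passing the probe and the sleep as callbacks keeps the loop itself free of
any dependency, so the same function can be driven by the toy SimStandby
here or by whatever shared-memory lookup a real hook would use.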

>
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>
