Re: walsender performance regression due to logical decoding on standby changes

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Jeff Davis <pgsql(at)j-davis(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: walsender performance regression due to logical decoding on standby changes
Date: 2023-05-12 11:58:25
Message-ID: CALj2ACVDj+D-nPkzu0f06fMwigBJHwx03FEpfsz427G6AasKWA@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 10, 2023 at 3:23 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> >> My current guess is that mis-using a condition variable is the best bet. I
> >> think it should work to use ConditionVariablePrepareToSleep() before a
> >> WalSndWait(), and then ConditionVariableCancelSleep(). I.e. to never use
> >> ConditionVariableSleep(). The latch set from ConditionVariableBroadcast()
> >> would still cause the necessary wakeup.
> >
> > How about something like the attached? Recovery and subscription tests
> > don't complain with the patch.
>
> I launched a full Cirrus CI test with it but it failed on one environment (did not look in details,
> just sharing this here): https://cirrus-ci.com/task/6570140767092736

Yeah, in v1 the ConditionVariableInit() call was placed such that the CV
got initialized by every backend, as opposed to just once after the WAL
sender shmem was created.
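
To illustrate the difference, here is a standalone toy model of the
"initialize only when the shared struct is first created" idiom. It
mimics the found-flag convention of ShmemInitStruct(), but every Toy*
and toy_* name below is a hypothetical stand-in made up for this
sketch; it is not the v2 patch and not PostgreSQL's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins (assumptions): they only count how often init runs. */
typedef struct ToyCV { int init_count; } ToyCV;
typedef struct ToyWalSndCtl { ToyCV cv; } ToyWalSndCtl;

static ToyWalSndCtl toy_segment;            /* pretend shared memory */
static bool toy_segment_exists = false;

/* Hypothetical analogue of ShmemInitStruct(): *found reports whether
 * the struct already existed before this call. */
static ToyWalSndCtl *
toy_shmem_init_struct(bool *found)
{
    *found = toy_segment_exists;
    toy_segment_exists = true;
    return &toy_segment;
}

static void
toy_cv_init(ToyCV *cv)
{
    cv->init_count++;
}

/* Every "backend" calls this on attach; only the creator initializes. */
static ToyWalSndCtl *
toy_walsnd_shmem_init(void)
{
    bool          found;
    ToyWalSndCtl *ctl = toy_shmem_init_struct(&found);

    assert(ctl != NULL);
    if (!found)
    {
        /* First time through: zero the struct and set up the CV once. */
        memset(ctl, 0, sizeof(*ctl));
        toy_cv_init(&ctl->cv);
    }
    return ctl;
}
```

Calling toy_walsnd_shmem_init() from several simulated backends leaves
init_count at 1, whereas an unconditional toy_cv_init() call would bump
it on every attach, which is the kind of problem v1 had.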

> Also I have a few comments:

Indeed, v1 was a WIP patch. Please have a look at the attached v2
patch, which has comments and passing CI runs on all platforms -
https://github.com/BRupireddy/postgres/tree/optimize_walsender_wakeup_logic_v2.

On Wed, May 10, 2023 at 3:41 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> if (AllowCascadeReplication())
> - WalSndWakeup(switchedTLI, true);
> + ConditionVariableBroadcast(&WalSndCtl->cv);
>
> After the change, we wakeup physical walsender regardless of switchedTLI flag.
> Is this intentional ? if so, I think It would be better to update the comments above this.

That's not the case with the attached v2 patch. Please have a look.

On Thu, May 11, 2023 at 10:27 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> We can have two condition variables for
> logical and physical walsenders, and selectively wake up walsenders
> sleeping on the condition variables. It should work, it seems like
> much of a hack, though.

Andres put it rightly: this is 'mis-using' the CV infrastructure. It is
simple, it works, and it keeps WalSndWakeup() straightforward while
solving the performance regression.
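
For anyone skimming the thread, a rough standalone model of the idea
discussed above: walsenders register on a CV before waiting, a
broadcast only sets the latches of registered waiters, and separate
CVs for physical and logical walsenders allow selective wakeups. All
names here (ToyProc, ToyCV, the toy_* functions, the two CV fields)
are hypothetical, the model is single-threaded, and it makes no claim
about what the attached patch actually contains.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_WAITERS 8

/* Toy process with a latch flag (assumption, not PGPROC). */
typedef struct ToyProc { bool latch_set; } ToyProc;

/* Toy CV: just an array of registered waiters. */
typedef struct ToyCV
{
    ToyProc *waiters[TOY_MAX_WAITERS];
    int      nwaiters;
} ToyCV;

/* Register on the CV before going into the (latch-based) wait. */
static void
toy_cv_prepare_to_sleep(ToyCV *cv, ToyProc *me)
{
    assert(cv->nwaiters < TOY_MAX_WAITERS);
    cv->waiters[cv->nwaiters++] = me;
}

/* Deregister after the wait; a no-op if a broadcast already removed us. */
static void
toy_cv_cancel_sleep(ToyCV *cv, ToyProc *me)
{
    for (int i = 0; i < cv->nwaiters; i++)
    {
        if (cv->waiters[i] == me)
        {
            cv->waiters[i] = cv->waiters[--cv->nwaiters];
            return;
        }
    }
}

/* Set the latch of every registered waiter; returns how many were woken. */
static int
toy_cv_broadcast(ToyCV *cv)
{
    int woken = cv->nwaiters;

    for (int i = 0; i < cv->nwaiters; i++)
        cv->waiters[i]->latch_set = true;
    cv->nwaiters = 0;
    return woken;
}

typedef struct ToyWalSndCtl
{
    ToyCV physical_cv;          /* hypothetical field names */
    ToyCV logical_cv;
} ToyWalSndCtl;

/* Wake only the requested kinds of walsenders. */
static void
toy_walsnd_wakeup(ToyWalSndCtl *ctl, bool physical, bool logical)
{
    if (physical)
        toy_cv_broadcast(&ctl->physical_cv);
    if (logical)
        toy_cv_broadcast(&ctl->logical_cv);
}
```

In this model the wakeup cost depends on the number of registered
waiters rather than on the total number of walsender slots, and a
walsender that was never woken simply cancels its registration after
its wait returns.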

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v2-0001-Optimize-walsender-wake-up-logic-with-Conditional.patch application/x-patch 6.3 KB
