From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Subject: | RE: walsender performance regression due to logical decoding on standby changes |
Date: | 2023-05-15 02:19:53 |
Message-ID: | OS0PR01MB571699A82CFFACA41FC1B71294789@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Friday, May 12, 2023 7:58 PM Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Wed, May 10, 2023 at 3:23 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > >> My current guess is that mis-using a condition variable is the best
> > >> bet. I think it should work to use
> > >> ConditionVariablePrepareToSleep() before a WalSndWait(), and then
> > >> ConditionVariableCancelSleep(). I.e. to never use
> > >> ConditionVariableSleep(). The latch set from
> ConditionVariableBroadcast() would still cause the necessary wakeup.
> > >
> > > How about something like the attached? Recovery and subscription
> > > tests don't complain with the patch.
> >
> > I launched a full Cirrus CI test with it but it failed on one
> > environment (did not look in details, just sharing this here):
> > https://cirrus-ci.com/task/6570140767092736
>
> Yeah, v1 had ConditionVariableInit() such that the CV was getting initialized for
> every backend as opposed to just once after the WAL sender shmem was
> created.
>
> > Also I have a few comments:
>
> Indeed, v1 was a WIP patch. Please have a look at the attached v2 patch, which
> has comments and passing CI runs on all platforms -
> https://github.com/BRupireddy/postgres/tree/optimize_walsender_wakeup_
> logic_v2.
>
> On Wed, May 10, 2023 at 3:41 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> wrote:
> >
> > if (AllowCascadeReplication())
> > - WalSndWakeup(switchedTLI, true);
> > + ConditionVariableBroadcast(&WalSndCtl->cv);
> >
> > After the change, we wakeup physical walsender regardless of switchedTLI
> flag.
> > Is this intentional ? if so, I think It would be better to update the comments
> above this.
>
> That's not the case with the attached v2 patch. Please have a look.
Thanks for updating the patch. I did some simple primary->standby replication test for the
patch and can see the degradation doesn't happen in the replication after applying it[1].
One nitpick in the comment:
+ * walsenders. It makes WalSndWakeup() callers life easy.
callers life easy => callers' life easy.
[1]
max_wal_senders = 100
before regression(ms) after regression(ms) v2 patch(ms)
13394.4013 14141.2615 13455.2543
Compared with before regression 5.58% 0.45%
max_wal_senders = 200
before regression(ms) after regression(ms) v2 patch(ms)
13280.8507 14597.1173 13632.0606
Compared with before regression 9.91% 1.64%
max_wal_senders = 300
before regression(ms) after regression(ms) v2 patch(ms)
13535.0232 16735.7379 13705.7135
Compared with before regression 23.65% 1.26%
Best Regards,
Hou zj
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Chavez | 2023-05-15 02:50:45 | Re: 'converts internal representation to "..."' comment is confusing |
Previous Message | Thomas Munro | 2023-05-15 02:08:32 | Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~ |