Re: heavily contended lwlocks with long wait queues scale badly

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: heavily contended lwlocks with long wait queues scale badly
Date: 2022-11-09 10:24:16
Message-ID: CALj2ACXyZRY=Bisy=HR1NUCf1MHQZy59Mo4V9ny0FTfji+yfHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 1, 2022 at 12:46 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> > Updated patch attached.
>
> Thanks. It looks good to me. However, some minor comments on the v3 patch:
>
> 1.
> - if (MyProc->lwWaiting)
> + if (MyProc->lwWaiting != LW_WS_NOT_WAITING)
> elog(PANIC, "queueing for lock while waiting on another one");
>
> Can the above condition be MyProc->lwWaiting == LW_WS_WAITING ||
> MyProc->lwWaiting == LW_WS_PENDING_WAKEUP for better readability?
>
> Or add an assertion Assert(MyProc->lwWaiting != LW_WS_WAITING &&
> MyProc->lwWaiting != LW_WS_PENDING_WAKEUP); before setting
> LW_WS_WAITING?
>
> 2.
> /* Awaken any waiters I removed from the queue. */
> proclist_foreach_modify(iter, &wakeup, lwWaitLink)
> {
>
> @@ -1044,7 +1052,7 @@ LWLockWakeup(LWLock *lock)
> * another lock.
> */
> pg_write_barrier();
> - waiter->lwWaiting = false;
> + waiter->lwWaiting = LW_WS_NOT_WAITING;
> PGSemaphoreUnlock(waiter->sem);
> }
>
> /*
> * Awaken any waiters I removed from the queue.
> */
> proclist_foreach_modify(iter, &wakeup, lwWaitLink)
> {
> PGPROC *waiter = GetPGProcByNumber(iter.cur);
>
> proclist_delete(&wakeup, iter.cur, lwWaitLink);
> /* check comment in LWLockWakeup() about this barrier */
> pg_write_barrier();
> waiter->lwWaiting = LW_WS_NOT_WAITING;
>
> Can we add an assertion Assert(waiter->lwWaiting ==
> LW_WS_PENDING_WAKEUP) in the above two places? We prepare the wakeup
> list and set the LW_WS_NOT_WAITING flag in above loops, having an
> assertion is better here IMO.
>
> Below are test results with v3 patch. +1 for back-patching it.
>
> HEAD PATCHED
> 1 34142 34289
> 2 72760 69720
> 4 136300 131848
> 8 210809 210192
> 16 240718 242744
> 32 297587 297354
> 64 341939 343036
> 128 383615 383801
> 256 342094 337680
> 512 263194 288629
> 768 145526 261553
> 1024 107267 241811
> 2048 35716 188389
> 4096 12415 120300
>
> PG15 PATCHED
> 1 34503 34078
> 2 73708 72054
> 4 139415 133321
> 8 212396 211390
> 16 242227 242584
> 32 303441 309288
> 64 362680 339211
> 128 378645 344291
> 256 340016 344291
> 512 290044 293337
> 768 140277 264618
> 1024 96191 247636
> 2048 35158 181488
> 4096 12164 118610

I looked at the v3 patch again today and ran some performance tests.
The results look impressive as they were earlier. Andres, any plans to
get this in?

pgbench with SELECT txid_current();:
Clients HEAD PATCHED
1 34613 33611
2 72634 70546
4 137885 136911
8 216470 216076
16 242535 245392
32 299952 304740
64 329788 347401
128 378296 386873
256 344939 343832
512 292196 295839
768 144212 260102
1024 101525 250263
2048 35594 185878
4096 11842 104227

pgbench with insert into pgbench_accounts table:
Clients HEAD PATCHED
1 1660 1600
2 1848 1746
4 3547 3395
8 7330 6754
16 13103 13613
32 26011 26372
64 52331 52594
128 93313 95526
256 127373 126182
512 126712 127857
768 116765 119227
1024 111464 112499
2048 58838 92756
4096 26066 60543

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-11-09 10:25:25 Re: Reviving lost replication slots
Previous Message Amit Kapila 2022-11-09 10:23:23 Re: Reviving lost replication slots