Re: heavily contended lwlocks with long wait queues scale badly

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: heavily contended lwlocks with long wait queues scale badly
Date: 2022-11-03 18:21:18
Message-ID: 8680af2d-44ca-8d5e-d246-0db590b00a02@postgresql.org
Lists: pgsql-hackers

On 11/1/22 1:41 PM, Andres Freund wrote:

>> Andres: when you suggested backpatching, were you thinking of the Nov 2022
>> release or the Feb 2023 release?
>
> I wasn't thinking that concretely. Even if we decide to backpatch, I'd be very
> hesitant to do it in a few days.

Yeah this was my thinking (and also why I took a few days to reply given
the lack of urgency for this release). It would at least give some more
time for others to test it to feel confident that we're not introducing
noticeable regressions.

> <goes and runs test while in meeting>
>
>
> I tested with browser etc running, so this is plenty noisy. I used the best of
> the two pgbench -T21 -P5 tps, after ignoring the first two periods (they're
> too noisy). I used an ok-ish NVMe SSD, rather than the expensive one that
> has "free" fsync.
>
> synchronous_commit=on:
>
> clients     master        fix
>      16       6196       6202
>      64      25716      25545
>     256      90131      90240
>    1024     128556     151487
>    2048      59417     157050
>    4096      32252     178823
>
>
> synchronous_commit=off:
>
> clients     master        fix
>      16     409828     409016
>      64     454257     455804
>     256     304175     452160
>    1024     135081     334979
>    2048      66124     291582
>    4096      27019     245701
>
>
> Hm. That's a bigger effect than I anticipated. I guess sc=off isn't actually
> required, due to the level of concurrency making group commit very
> effective.
>
> This is without an index, serial column or anything. But a quick comparison
> for just 4096 clients shows that to still be a big difference if I create a
> serial primary key:
> master: 26172
> fix: 155813
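
To put the quoted numbers in relative terms, here is a minimal sketch that computes the fix/master tps ratio per client count. The tps figures are copied from the tables above; the names `SYNC_ON`, `SYNC_OFF`, `speedup` and `report` are made up for illustration and are not part of pgbench or the patch.

```python
# tps figures copied from the quoted tables: clients -> (master, fix)
SYNC_ON = {
    16: (6196, 6202),
    64: (25716, 25545),
    256: (90131, 90240),
    1024: (128556, 151487),
    2048: (59417, 157050),
    4096: (32252, 178823),
}
SYNC_OFF = {
    16: (409828, 409016),
    64: (454257, 455804),
    256: (304175, 452160),
    1024: (135081, 334979),
    2048: (66124, 291582),
    4096: (27019, 245701),
}


def speedup(master_tps, fix_tps):
    """Return the throughput ratio of the patched build over master."""
    return fix_tps / master_tps


def report(label, results):
    """Print one line per client count with the fix/master ratio."""
    print(label)
    for clients, (master, fix) in results.items():
        print(f"{clients:>7} {master:>8} {fix:>8} {speedup(master, fix):6.2f}x")


report("synchronous_commit=on", SYNC_ON)
report("synchronous_commit=off", SYNC_OFF)
```

At 4096 clients this works out to roughly 5.5x with synchronous_commit=on and roughly 9.1x with synchronous_commit=off; these are single noisy runs, so the ratios are only indicative.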

🤯 (seeing if my exploding head makes it into the archives).

Given the lack of ABI changes (hesitant to say low-risk until after more
testing, but seemingly low-risk), I can get behind backpatching esp if
we're targeting Feb 2023 so we can test some more.

With my advocacy hat on, it bums me out that we may not get as much buzz
about this change given it's not in a major release, but 1/ it'll fix an
issue, which will help users with high-concurrency workloads, and 2/ users
would be able to perform a simpler update to get the change.

Thanks,

Jonathan
