From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Is the unfair lwlock behavior intended? |
Date: | 2016-05-24 22:50:15 |
Message-ID: | 20160524225015.4zj5hkdizuzedcjj@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2016-05-24 15:34:31 -0700, Peter Geoghegan wrote:
> On Tue, May 24, 2016 at 1:38 PM, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee> wrote:
> >> I've already observed such behavior, see [1]. I think that now there is no
> >> consensus on how to fix that. For instance, Andres express opinion that
> >> this shouldn't be fixed from LWLock side [2].
> >> FYI, I'm planning to pickup work on CSN patch [3] for 10.0. CSN should fix
> >> various scalability issues including high ProcArrayLock contention.
> >
> > Some amount of non-fairness is ok, but degrading to the point of
> > complete denial of service is not very graceful. I don't think it's
> > realistic to hope that all lwlock contention issues will be fixed any
> > time soon. Some fallback mechanism would be extremely nice until then.
>
> Jim Gray's paper on the "Convoy phenomenon" remains relevant, decades later:
>
> http://www.msr-waypoint.com/en-us/um/people/gray/papers/Convoy%20Phenomenon%20RJ%202516.pdf
>
> I could believe that there's a case to be made for per-LWLock fairness
> characteristics, which may be roughly what Andres meant.
The problem is that half-way fair locks, which are frequently acquired
both in shared and exclusive mode, have really bad throughput
characteristics on modern multi-socket systems. We mostly get away with
fair locking on object level (after considerable work re fast-path
locking), because nearly all access are non-conflicting. But
prohibiting any snapshot acquisitions when there's a single LW_EXCLUSIVE
ProcArrayLock waiter, can reduce throughput dramatically.
I don't think randomly processing the wait queue - which is what the
quoted paper essentially describes - is really useful here. We
intentionally *ignore* the wait queue entirely if a lock is not
conflicting, and that's what can prohibit exclusive locks from ever
succeeding, because you essentially can get repetitions of:
S1: acq(SHARED) -> shared = 1
S2: acq(EXCLUSIVE) -> shared = 1, waiters = 1 <block>
...
S3: acq(SHARED) -> shared = 2
S1: rel(SHARED) -> shared = 1
S1: acq(SHARED) -> shared = 2
S3: rel(SHARED) -> shared = 1
...
Now we potentially could mark individual lwlocks as being fair
locks. But which ones would those be? Certainly not ProcArrayLock, it's
way too heavily contended.
Regards,
Andres
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-05-24 22:52:19 | Re: statistics for shared catalogs not updated when autovacuum is off |
Previous Message | Peter Geoghegan | 2016-05-24 22:34:31 | Re: Is the unfair lwlock behavior intended? |