Re: LogwrtResult contended spinlock

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>
Subject: Re: LogwrtResult contended spinlock
Date: 2021-02-02 23:19:19
Message-ID: 20210202231919.GA23288@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

So I addressed about half of your comments in this version merely by
fixing silly bugs. The problem I had which I described as
"synchronization fails" was one of those silly bugs.

So in further thinking, it seems simpler to make only LogwrtResult
atomic, and leave LogwrtRqst as currently, using the spinlock. This
should solve the contention problem we saw at the customer (but I've
asked Jaime very nicely to do a test run, if possible, to confirm).

For things like logical replication, which call GetFlushRecPtr() very
frequently (causing the spinlock issue we saw) it is good, because we're
no longer hitting the spinlock at all in that case.

I have another (pretty mechanical) patch that renames LogwrtResult.Write
to LogWriteResult and LogwrtResult.Flush to LogFlushResult. That more
clearly shows that we're no longer updating them on unison. Didn't want
to attach here because I didn't rebase on current one. But it seems
logical: there's no longer any point in doing struct assignment, which
is the only thing that stuff was good for.

On 2021-Jan-29, Andres Freund wrote:

> > @@ -1169,6 +1169,8 @@ XLogInsertRecord(XLogRecData *rdata,
> > {
> > /* advance global request to include new block(s)
> > */
> > pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, EndPos);
> > + pg_memory_barrier();
>
> That's not really useful - the path that actually updates already
> implies a barrier. It'd probably be better to add a barrier to a "never
> executed cmpxchg" fastpath.

Got it. Do you mean in pg_atomic_monotonic_advance_u64() itself? I'm
not sure which is the nicer semantics. (If it's got to be at the
caller, then we'll need to return a boolean from there, which sounds
worse.)

> > @@ -2905,6 +2909,7 @@ XLogFlush(XLogRecPtr record)
> > WriteRqstPtr = pg_atomic_read_u64(&XLogCtl->LogwrtRqst.Write);
> > LogwrtResult.Write = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Write);
> > LogwrtResult.Flush = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Flush);
> > + pg_read_barrier();
>
> I'm not sure you really can get away with just a read barrier in these
> places. We can't e.g. have later updates to other shared memory
> variables be "moved" to before the barrier (which a read barrier
> allows).

Ah, that makes sense.

I have not really studied the barrier locations terribly closely in this
version of the patch. It probably misses some (eg. in GetFlushRecPtr
and GetXLogWriteRecPtr). It is passing the tests for me, but alone
that's probably not enough. I'm gonna try and study the generated
assembly and see if I can make sense of things ...

--
Álvaro Herrera 39°49'30"S 73°17'W

Attachment Content-Type Size
0001-add-pg_atomic_monotonic_advance_u64.patch text/x-diff 1.1 KB
0002-make-LogwrtResult-atomic.patch text/x-diff 8.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-02-02 23:26:15 Re: Recording foreign key relationships for the system catalogs
Previous Message Mark Dilger 2021-02-02 23:10:34 Re: new heapcheck contrib module