Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()

From:	Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()
Date:	2018-07-10 09:33:20
Message-ID:	7d6863ed-cf3f-b144-b6c9-101995ca9bd9@2ndquadrant.com
Lists:	pgsql-hackers

On 09.07.18 15:49, Heikki Linnakangas wrote:
> The way
> forward is to test if we can get the same performance benefit from
> switching to CMPXCHG16B, and keep the WAL format unchanged.

I have implemented this. I didn't see any performance benefit using the
benchmark that Alexander Kuzmenkov outlined earlier in the thread. (But
I also didn't see a great benefit for the originally proposed patch, so
maybe I'm not doing this right or this particular hardware doesn't
benefit from it as much.)

I'm attaching the patches and scripts here in case someone else wants to
do more testing.

The first patch adds a zoo of 128-bit atomics support. It's consistent
with (a.k.a. copy-and-pasted from) the existing 32- and 64-bit set, but
it's not the complete set, only as much as was necessary for this exercise.

The second patch then makes use of that in the WAL code under discussion.
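To illustrate the idea only (this is not the code from the attached patch): the two insert positions can be packed into one 128-bit word and advanced together with a compare-and-swap loop instead of under a spinlock. The names InsertState, insert_state_init and reserve_insert_location are hypothetical. The sketch falls back to a simple spinlock when the compiler does not advertise an inline 16-byte compare-and-swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SIZEOF_INT128__) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
#define HAVE_NATIVE_CAS128 1
typedef unsigned __int128 u128;
#endif

/* Hypothetical stand-in for the pair of positions that must change together. */
typedef struct
{
#ifdef HAVE_NATIVE_CAS128
	u128		packed;		/* high 64 bits: prev position, low 64 bits: curr position */
#else
	volatile int lock;		/* fallback: simple test-and-set spinlock */
	uint64_t	curr_pos;
	uint64_t	prev_pos;
#endif
} InsertState;

static void
insert_state_init(InsertState *st, uint64_t start)
{
#ifdef HAVE_NATIVE_CAS128
	st->packed = (u128) start;	/* prev = 0, curr = start */
#else
	st->lock = 0;
	st->curr_pos = start;
	st->prev_pos = 0;
#endif
}

/*
 * Reserve "size" bytes. Returns the start of the reserved range and stores
 * the start of the previously reserved range in *prev_out.
 */
static uint64_t
reserve_insert_location(InsertState *st, uint64_t size, uint64_t *prev_out)
{
	assert(size > 0 && prev_out != NULL);
#ifdef HAVE_NATIVE_CAS128
	u128		old = st->packed;	/* may be torn; the CAS below catches that */

	for (;;)
	{
		uint64_t	curr = (uint64_t) old;
		u128		desired = ((u128) curr << 64) | (u128) (curr + size);
		u128		seen = __sync_val_compare_and_swap(&st->packed, old, desired);

		if (seen == old)
		{
			*prev_out = (uint64_t) (old >> 64);
			return curr;
		}
		old = seen;			/* somebody else won the race; retry with their value */
	}
#else
	uint64_t	curr;

	while (__sync_lock_test_and_set(&st->lock, 1))
		;					/* spin until the lock is free */
	curr = st->curr_pos;
	*prev_out = st->prev_pos;
	st->curr_pos = curr + size;
	st->prev_pos = curr;
	__sync_lock_release(&st->lock);
	return curr;
#endif
}
```

Both paths hand out the same positions; only the synchronization differs, which is the part the benchmark is meant to measure.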

pgbench invocations were:

pgbench -i -I t bench
pgbench -n -c $N -j $N -M prepared -f pgbench-wal-cas.sql -T 60 bench

for N from 1 to 32.

Note: With gcc (at least versions 7 and 8) you need to use some
non-default -march setting to get 128-bit atomics to work. (Otherwise
the configure test fails and the fallback implementation is used.) I
have found the minimum to be -march=nocona. But different -march
settings actually affect the benchmark performance, so be sure to test
the baseline with the same -march setting. Recommended configure
invocation: ./configure ... CC='gcc -march=whatever'
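For reference, a small compile-time probe along the following lines can show which path a given compiler and -march setting ends up on. This is only a sketch, not the configure test from the patch. The function names are made up, and it assumes the compiler predefines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 when an inline 16-byte compare-and-swap is available, as gcc and clang do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* 1 if the compiler advertises an inline 16-byte __sync compare-and-swap. */
static int
have_inline_sync_cas16(void)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
	return 1;
#else
	return 0;
#endif
}

/* 1 if 16-byte __atomic operations are guaranteed lock-free on this target. */
static int
atomic16_always_lock_free(void)
{
	return __atomic_always_lock_free(16, 0) ? 1 : 0;
}

/* Print both answers, e.g. to compare different -march settings. */
static void
report_cas128_support(FILE *out)
{
	assert(out != NULL);
	fprintf(out, "inline __sync CAS16: %d\n", have_inline_sync_cas16());
	fprintf(out, "__atomic 16-byte always lock-free: %d\n",
			atomic16_always_lock_free());
}
```

Building this once with the default flags and once with CC='gcc -march=nocona' should make it visible whether the setting changes what the compiler offers.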

clang appears to work out of the box.

Also, curiously my gcc installations provide 128-bit
__sync_val_compare_and_swap() but not 128-bit
__atomic_compare_exchange_n() in spite of what the documentation indicates.

So independent of whether this approach actually provides any benefit,
the 128-bit atomics support seems a bit wobbly.

(I'm also wondering why we are using __sync_val_compare_and_swap()
rather than __sync_bool_compare_and_swap(), since all we're doing with
the return value is emulating the bool version anyway.)
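A sketch of the two variants side by side, using 64-bit values so that it builds without any special -march setting. The function names are hypothetical. The first function assumes an interface that returns success and also writes the value it found back into *expected, which is the one thing the bool builtin cannot supply by itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Built on the "val" builtin: the returned old value is compared with
 * *expected to derive the bool result, and is also reported to the caller.
 */
static bool
cas_u64_via_val(volatile uint64_t *ptr, uint64_t *expected, uint64_t newval)
{
	uint64_t	current;
	bool		ret;

	assert(ptr != NULL && expected != NULL);
	current = __sync_val_compare_and_swap(ptr, *expected, newval);
	ret = (current == *expected);
	*expected = current;	/* on failure the caller sees the actual value */
	return ret;
}

/*
 * Built on the "bool" builtin: simpler, but on failure the caller does not
 * learn what the current value was.
 */
static bool
cas_u64_via_bool(volatile uint64_t *ptr, uint64_t expected, uint64_t newval)
{
	assert(ptr != NULL);
	return __sync_bool_compare_and_swap(ptr, expected, newval);
}
```

If callers never look at the updated *expected, the two are interchangeable. If they retry in a loop using the value they got back, the val form saves a separate re-read.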

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment	Content-Type	Size
0001-Add-some-int128-atomics-support.patch	text/plain	16.7 KB
0002-Reduce-WAL-spinlock-contention.patch	text/plain	9.0 KB
pgbench-wal-cas.sql	text/plain	2.3 KB

In response to

Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation() at 2018-07-09 13:49:36 from Heikki Linnakangas