Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Дмитрий Дегтярёв <degtyaryov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Date: 2013-10-01 00:51:50
Message-ID: CA+CSw_uTn-KmKCQvGGWQXJ=FR4_43QwjX4xLFLNX8X5Hd9H8yg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 28, 2013 at 12:53 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> What confuses me is that pg_read_barrier() is just a compiler barrier on
> x86[-64] in barrier.h. According to my knowledge it needs to be an
> lfence or the full barrier?
> The linked papers from Paul McKenney - which are a great read - seem to
> agree?

x86 provides a pretty strong memory model, the only (relevant) allowed
reordering is that stores may be delayed beyond subsequent loads. A
simple and functionally accurate model of this is to ignore all caches
and think of the system as a bunch of CPU's accessing the main memory
through a shared bus, but each CPU has a small buffer that stores
writes done by that CPU. Intel and AMD have performed feats of black
magic in the memory pipelines that this isn't a good performance
model, but for correctness it is valid. The exceptions are few
unordered memory instructions that you have to specifically ask for
and non-writeback memory that concerns the kernel. (See section 8.2 in
[1] for details) The note by McKenney stating that lfence is required
for a read barrier is for those unordered instructions. I don't think
it's necessary in PostgreSQL.

As for the specific patch being discussed here. The read barrier is in
the wrong place and with the wrong comment, and the write side is
assuming that SpinLockAcquire() is a write barrier, which it isn't
documented to do at the moment. The correct way to think of this is
that StartupXLOG() does a bunch of state modifications and then
advertises the fact that it's done by setting
xlogctl->SharedRecoveryInProgress = false; The state modifications
should better be visible to anyone seeing that last write, so you need
one write barrier between the state modifications and setting the
flag. On the other side, after reading the flag as false in
RecoveryInProgress() the state modified by StartupXLOG() may be
investigated, those loads better not happen before we read the flag.
So we need a read barrier somewhere *after* reading the flag in
RecoveryInProgress() and reading the shared memory structures, and in
theory a full barrier if we are going to be writing data. In practice
x86 is covered thanks to it's memory model, Power is covered thanks to
the control dependency and ARM would need a read barrier, but I don't
think any real ARM CPU does speculative stores as that would be
insane.

[1] http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2013-10-01 01:11:36 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Previous Message Noah Misch 2013-09-30 23:33:51 Re: dynamic shared memory