Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Дмитрий Дегтярёв <degtyaryov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Date: 2013-10-02 14:45:07
Message-ID: CA+CSw_uB5U8Wm1-t_GNHm382y7i+bPd=CasMErTGqN_u+j3KqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 2, 2013 at 4:39 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Mon, Sep 30, 2013 at 7:51 PM, Ants Aasma <ants(at)cybertec(dot)at> wrote:
>> So we need a read barrier somewhere *after* reading the flag in
>> RecoveryInProgress() and reading the shared memory structures, and in
>> theory a full barrier if we are going to be writing data.
>
> wow -- thanks for your review and provided detail. Considering there
> are no examples of barrier instructions to date, I think some of your
> commentary should be included in the in-source documentation.
>
> In this particular case, a read barrier should be sufficient? By
> 'writing data', do you mean to the xlog control structure? This
> routine only sets a backend local flag so that should be safe?

I haven't reviewed the code in as much detail to say if there is an
actual race here, I tend to think there's probably not, but the
specific pattern that I had in mind is that with the following actual
code:

Process A:
globalVar = 1;
write_barrier();
recoveryInProgress = false;

Process B:
if (!recoveryInProgress) {
globalVar = 2;
doWork();
}

If process B speculatively executes line 2 before reading the flag for
line 1, then it's possible that the store in process B is executed
before the store in process A, making globalVar move backwards. The
barriers as they are defined don't make this scenario impossible. That
said, I don't know of any hardware that would make speculatively
executed stores visible to non-speculative state, as I said, that
would be completely insane. However currently compilers consider it
completely legal to rewrite the code into the following form:

tmp = globalVar;
globalVar = 2;
if (!recoveryInProgress) {
doWork();
} else {
globalVar = tmp;
}

That still exhibits the same problem. An abstract read barrier would
not be enough here, as this requires a LoadStore barrier. However, the
control dependency is enough for the hardware and PostgreSQL
pg_read_barrier() is a full compiler barrier, so in practice a simple
pg_read_barrier() is enough.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-10-02 14:56:38 Re: logical changeset generation v6.1
Previous Message Magnus Hagander 2013-10-02 14:31:49 Prevent pg_basebackup -Fp -D -?