Quick Links

Re: Race conditions with WAL sender PID lookups

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Race conditions with WAL sender PID lookups
Date:	2017-06-08 04:15:02
Message-ID:	CAB7nPqTfu2P1QCZB22LJG+P4ET-pS0yW+qSTWO9JcVcCtaPZAA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think if you're going to fix it so that we take spinlocks on
> MyWalSnd in a bunch of places that we didn't take them before, it
> would make sense to fix all the places where we're accessing those
> fields without a spinlock instead of only some of them, unless there
> are good reasons why we only need it in some cases and not others, in
> which case I think the patch should add comments to each place where
> the lock was not taken explaining why it's OK. It didn't take me and
> my copy of vi very long to find instances that you didn't change.

I take that you are referring to the two lookups in
WalSndWaitForWal(), one in exec_replication_command(), one in
WalSndLoop() and one in WalSndDone(). First let's remember that all
the fields protected by the spinlock are only updated in a WAL sender
process, so reading them directly is safe in this context: we know
that no other processes will write them. And all those functions are
called only in the context of a WAL sender process. So the copies of
MyWalSnd that you are referring to here don't need a spinlock. Is
there something I am missing?

Actually, perhaps it would make sense to add some Assert(am_walsender)
in this code?

> I also think that should provide some analysis regarding the question
> Thomas asked - are these actually bugs, or are they OK for some
> reason? We shouldn't just plaster the code with spinlock acquisitions
> unless it's really necessary. It seems at least possible that these
> changes could cause performance regressions, and it doesn't look like
> you've done any testing or provided any analysis indicating whether
> that's likely to happen or not.

Now taking a step back on the patch, v3 that I sent 3 weeks ago.
Additional spinlocks are taken in:
1) SyncRepReleaseWaiters(), which is called in a WAL sender. So indeed
no spinlocks are needed here. The last patch is incorrect here.
2) SyncRepGetSyncStandbysPriority() and SyncRepGetSyncStandbysQuorum()
as of HEAD via SyncRepGetSyncStandbys(), something that's not good if
done lock-less as this gets used for pg_stat_replication for
pg_stat_get_wal_senders(). This can return a garbage list of sync
standbys to the client as WAL senders update their flush, write and
apply positions in parallel to that, and don't care about SyncRepLock
as this gets calculated with the standby feedback messages. The
problem here is the gap between walsnd->sync_standby_priority and
walsnd->flush:
-- With a primary, and more or more sync standbys, take a checkpoint
in SyncRepGetSyncStandbysPriority() after the first check is passed.
-- Shutdown the standby and restart it.
-- The standby reconnects, initializes the WAL sender slot with
InitWalSenderSlot().
-- Let the process of SyncRepGetSyncStandbysPriority() continue, the
sync standby does not show up.
That would be really unlikely to happen, but the code is written in
such a way that it could happen. One could also say that this gives a
closer idea that the standby disconnected but it looks wrong to me to
do this value lookup without a required lock.
3) In walreceiver.c's pg_stat_get_wal_receiver's:
- Launch pg_stat_get_wal_receiver and take a checkpoint on it.
- Pass the lookups of pid and ready_to_display.
- Make the WAL receiver stop.
- The view reports inconsistent data. If the WAL receiver data was
just initialized, the caller would get back PID values or similar 0.
Particularly a WAL receiver marked with !ready_to_display could show
data to the caller. That's not cool.
4) KeepFileRestoredFromArchive() needs a lock, as this is called from
the startup process. That's harmless, the worst that can happen is
that needreload is switched to true for a process that has been marked
with a PID of 0 because of a WAL sender restart (slot taken again and
initialized). But that will just process an extra reload in a worst
case.

> Basically, I don't think this patch is ready to commit. We have
> neither evidence that it is necessary nor evidence that it is
> complete. I don't want to be unfair, here, but it seems to me you
> ought to do a little more legwork before throwing this over the wall
> to the pool of committers.

Fair enough. Is the analysis written above more convincing?
--
Michael

Attachment	Content-Type	Size
walsnd-pid-races-v4.patch	application/octet-stream	5.6 KB

In response to

Re: Race conditions with WAL sender PID lookups at 2017-06-07 18:31:13 from Robert Haas

Responses

Re: Race conditions with WAL sender PID lookups at 2017-06-13 02:16:54 from Michael Paquier
Re: Race conditions with WAL sender PID lookups at 2017-06-15 07:51:00 from Kyotaro HORIGUCHI
Re: Race conditions with WAL sender PID lookups at 2017-06-28 22:52:14 from Alvaro Herrera

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Mengxing Liu	2017-06-08 05:07:01	Re: Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Previous Message	Tom Lane	2017-06-08 03:28:02	Does pg_upgrade really support "make installcheck"?