Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andreas Seltenreich <seltenreich(at)gmx(dot)de>, pgsql-hackers(at)postgresql(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM
Date: 2017-10-03 16:58:23
Message-ID: 6526.1507049903@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> So that's trouble waiting to happen, for sure. At the very least we
> need to do a single fetch of WalRcv->latch, not two. I wonder whether
> even that is sufficient, though: this coding requires an atomic fetch of
> a pointer, which is something we generally don't assume to be safe.
>
> I'm inclined to think that it'd be a good idea to move the set and
> clear of the latch field into the nearby spinlock critical sections,
> and then change WalRcvForceReply to look like ...

Concretely, as per the attached.

I reordered the WalRcvData fields to show that the "latch" field is now
treated as protected by the spinlock. In the back branches, we shouldn't
do that, just in case some external code is touching the mutex field.
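
For readers without the attachment at hand, the pattern described above
can be sketched roughly as follows. This is not the patch itself: the
Demo* types and demo_* functions are hypothetical stand-ins (a C11
atomic_flag plays the role of the spinlock, and a bool plays the role of
the latch), chosen only so the fragment compiles on its own. What it
shows is the shape of the fix: the latch pointer is set and cleared only
inside the spinlock critical section, and the reader fetches it exactly
once under the lock before using the local copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a latch; not PostgreSQL's Latch. */
typedef struct DemoLatch
{
    bool        is_set;
} DemoLatch;

/* Hypothetical stand-in for the shared WalRcvData struct. */
typedef struct DemoWalRcvData
{
    atomic_flag mutex;          /* stand-in for the spinlock */
    DemoLatch  *latch;          /* protected by mutex */
} DemoWalRcvData;

static void
demo_walrcv_init(DemoWalRcvData *w)
{
    assert(w != NULL);
    atomic_flag_clear(&w->mutex);
    w->latch = NULL;
}

static void
demo_spin_acquire(atomic_flag *lock)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;
}

static void
demo_spin_release(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Set (or clear, with NULL) the latch pointer inside the critical section. */
static void
demo_publish_latch(DemoWalRcvData *w, DemoLatch *latch)
{
    assert(w != NULL);
    demo_spin_acquire(&w->mutex);
    w->latch = latch;
    demo_spin_release(&w->mutex);
}

/*
 * Reader side: fetch the pointer once, under the lock, and then act on
 * the local copy only. Returns true if a latch was found and set.
 */
static bool
demo_force_reply(DemoWalRcvData *w)
{
    DemoLatch  *latch;

    assert(w != NULL);
    demo_spin_acquire(&w->mutex);
    latch = w->latch;           /* the single fetch */
    demo_spin_release(&w->mutex);

    if (latch == NULL)
        return false;
    latch->is_set = true;       /* stand-in for setting the latch */
    return true;
}
```

Because the reader tests and uses the same local copy, a concurrent clear
of the field cannot slip in between the NULL check and the use, and the
code no longer depends on pointer fetches being atomic. The sketch says
nothing about the lifetime of the latch itself, which is a separate
question.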

regards, tom lane

Attachment: fix-unsafe-accesses-to-WalRcv-latch.patch (text/x-diff, 3.8 KB)
