Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Andreas Seltenreich <seltenreich(at)gmx(dot)de>, pgsql-hackers(at)postgresql(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject:	Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM
Date:	2017-10-03 16:58:23
Message-ID:	6526.1507049903@sss.pgh.pa.us
Lists:	pgsql-hackers

I wrote:
> So that's trouble waiting to happen, for sure. At the very least we
> need to do a single fetch of WalRcv->latch, not two. I wonder whether
> even that is sufficient, though: this coding requires an atomic fetch of
> a pointer, which is something we generally don't assume to be safe.
>
> I'm inclined to think that it'd be a good idea to move the set and
> clear of the latch field into the nearby spinlock critical sections,
> and then change WalRcvForceReply to look like ...

Concretely, as per the attached.

I reordered the WalRcvData fields to show that the "latch" field is now
treated as protected by the spinlock. In the back branches, we shouldn't
do that, just in case some external code is touching the mutex field.
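
For readers without the attachment at hand, the pattern described above can be sketched roughly as follows. This is not the attached patch: it is a self-contained illustration in which every name (DemoWalRcv, DemoLatch, demo_force_reply and so on) is hypothetical, a C11 atomic flag stands in for PostgreSQL's spinlock, and the latch is reduced to a boolean. It assumes the latch object itself outlives any pointer to it, so that setting a latch fetched a moment earlier is harmless.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a latch: it only records whether it was set. */
typedef struct DemoLatch
{
    atomic_bool is_set;
} DemoLatch;

/* Hypothetical stand-in for the shared walreceiver state. */
typedef struct DemoWalRcv
{
    atomic_bool locked;       /* stand-in spinlock */
    DemoLatch  *latch;        /* protected by the spinlock */
    atomic_bool force_reply;  /* flag written without the spinlock */
} DemoWalRcv;

static void
demo_latch_init(DemoLatch *latch)
{
    assert(latch != NULL);
    atomic_init(&latch->is_set, false);
}

static void
demo_walrcv_init(DemoWalRcv *w)
{
    assert(w != NULL);
    atomic_init(&w->locked, false);
    atomic_init(&w->force_reply, false);
    w->latch = NULL;
}

static void
demo_spin_acquire(DemoWalRcv *w)
{
    /* Busy-wait until we are the one who flips the flag to true. */
    while (atomic_exchange_explicit(&w->locked, true, memory_order_acquire))
        ;
}

static void
demo_spin_release(DemoWalRcv *w)
{
    atomic_store_explicit(&w->locked, false, memory_order_release);
}

/* Owner side: publish the latch pointer inside the critical section. */
static void
demo_walrcv_attach_latch(DemoWalRcv *w, DemoLatch *latch)
{
    demo_spin_acquire(w);
    w->latch = latch;
    demo_spin_release(w);
}

/* Owner side: clear the latch pointer inside the critical section. */
static void
demo_walrcv_detach_latch(DemoWalRcv *w)
{
    demo_spin_acquire(w);
    w->latch = NULL;
    demo_spin_release(w);
}

/*
 * Waker side: fetch the pointer exactly once, under the lock, into a
 * local variable, then test and use only the local copy.  Returns true
 * if a latch was found and set.
 */
static bool
demo_force_reply(DemoWalRcv *w)
{
    DemoLatch  *latch;

    atomic_store(&w->force_reply, true);

    demo_spin_acquire(w);
    latch = w->latch;
    demo_spin_release(w);

    if (latch != NULL)
    {
        atomic_store(&latch->is_set, true);
        return true;
    }
    return false;
}
```

The point of the local variable is that the NULL test and the later use see the same value, which is what the "single fetch, not two" remark is about; taking the lock around the fetch removes the dependence on pointer loads being atomic.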

regards, tom lane

Attachment	Content-Type	Size
fix-unsafe-accesses-to-WalRcv-latch.patch	text/x-diff	3.8 KB

In response to

Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM at 2017-10-03 16:30:02 from Tom Lane

Responses

Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM at 2017-10-03 17:44:12 from Tom Lane
