From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Andreas Seltenreich <seltenreich(at)gmx(dot)de>, pgsql-hackers(at)postgresql(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com> |
Subject: | Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM |
Date: | 2017-10-03 17:44:12 |
Message-ID: | 23454.1507052652@sss.pgh.pa.us |
Lists: | pgsql-hackers |
I wrote:
>> So that's trouble waiting to happen, for sure. At the very least we
>> need to do a single fetch of WalRcv->latch, not two. I wonder whether
>> even that is sufficient, though: this coding requires an atomic fetch of
>> a pointer, which is something we generally don't assume to be safe.
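The "single fetch" point can be illustrated with a minimal sketch. This is not the PostgreSQL code: `Latch`, `SharedWalRcv`, `set_latch` and `wake_walreceiver` are hypothetical stand-ins, and a pthread mutex is assumed in place of the spinlock that protects the shared struct. The pointer is copied once while the lock is held, so the NULL test and the later use see the same value and no unlocked pointer read is needed.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
typedef struct Latch
{
    bool        is_set;
} Latch;

typedef struct SharedWalRcv
{
    pthread_mutex_t mutex;      /* assumed stand-in for the spinlock */
    Latch      *latch;          /* NULL while no walreceiver is attached */
} SharedWalRcv;

static void
set_latch(Latch *latch)
{
    latch->is_set = true;
}

/*
 * The racy pattern reads the shared pointer twice:
 *
 *     if (shared->latch)
 *         set_latch(shared->latch);
 *
 * Another process may reset the pointer to NULL between the two reads.
 * Here the pointer is fetched once, under the lock, into a local copy.
 */
static bool
wake_walreceiver(SharedWalRcv *shared)
{
    Latch      *latch;

    pthread_mutex_lock(&shared->mutex);
    latch = shared->latch;      /* the only fetch of the shared pointer */
    pthread_mutex_unlock(&shared->mutex);

    if (latch == NULL)
        return false;           /* nobody to wake */

    set_latch(latch);           /* uses the local copy, never re-reads */
    return true;
}
```

The local copy can still be out of date by the time it is used; the sketch does not address that, it only guarantees that the test and the use agree.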
BTW, I had supposed that this bug was of long standing, but actually it's
new in v10, dating to 597a87ccc9a6fa8af7f3cf280b1e24e41807d555. Before
that walreceiver start/stop just changed the owner of a long-lived shared
latch, and there was no question of stale pointers.
I considered reverting that decision, but the reason for it seems to have
been to let libpqwalreceiver.c manipulate MyProc->procLatch rather than
having to know about a custom latch. That's probably a sufficient reason
to justify some squishiness in the wakeup logic. Still, we might want to
revisit it if we find any other problems here.
regards, tom lane