|From:||Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>|
|To:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>|
|Cc:||Andres Freund <andres(at)anarazel(dot)de>, Andreas Seltenreich <seltenreich(at)gmx(dot)de>, pgsql-hackers(at)postgresql(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>|
|Subject:||Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Fixed the pstrdup problem by replacing with strlcpy() to stack-allocated
variables (rather than palloc + memcpy as proposed in Michael's patch).
About the other problems:
Tom Lane wrote:
> I took a quick look through walreceiver.c, and there are a couple of
> obvious problems of the same ilk in WalReceiverMain, which is doing this:
> walrcv->lastMsgSendTime = walrcv->lastMsgReceiptTime = walrcv->latestWalEndTime = GetCurrentTimestamp();
> (ie, a potential kernel call) inside a spinlock. But there seems no
> real problem with just collecting the timestamp before we enter that
> critical section.
Done that way.
> I also don't especially like the fact that just above there it reaches
> elog(PANIC) with the lock still held, though at least that's a case that
> should never happen.
Fixed by releasing spinlock just before elog.
> Further down, it's doing a pfree() inside the spinlock, apparently
> for no other reason than to save one "if (tmp_conninfo)".
> I don't especially like the Asserts inside spinlocks, either. Personally,
> I think if those conditions are worth testing then they're worth testing
> for real (in production). Variables that are manipulated by multiple
> processes are way more likely to assume unexpected states than local
I didn't change these. It doesn't look to me that these asserts are
worth very much as production code.
> I'm also rather befuddled by the fact that this code sets and clears
> walrcv->latch outside the critical sections. If that field is used
> by any other process, surely that's completely unsafe. If it isn't,
> why is it being kept in shared memory?
I think the latch is only used locally. Seems that it was only put in
shmem to avoid a separate variable ...
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
|Next Message||Sokolov Yura||2017-10-03 14:30:03||Two pass CheckDeadlock in contentent case|
|Previous Message||Tom Lane||2017-10-03 14:03:25||Re: [PATCH] Improve geometric types|