Re: Some 9.5beta2 backend processes not terminating properly?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Shay Rojansky <roji(at)roji(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some 9.5beta2 backend processes not terminating properly?
Date: 2015-12-30 17:37:34
Message-ID: 20151230173734.hx7jj2fnwyljfqek@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-12-30 12:30:43 -0500, Tom Lane wrote:
> Nor OS X. Ugh. My first thought was that ac1d7945f broke this, but
> that's only in HEAD not 9.5, so some earlier change must be responsible.

The backtrace in
http://archives.postgresql.org/message-id/CADT4RqBo79_0Vx%3D-%2By%3DnFv3zdnm_-CgGzbtSv9LhxrFEoYMVFg%40mail.gmail.com
seems to indicate that it's really WaitLatchOrSocket() not noticing the
socket is closed.

For a moment I had the theory that Port->sock might be invalid because
it somehow got closed. That'd then remove the socket from the waited-on
events, which would explain the behaviour. But afaics that's really only
possible via pq_init()'s on_proc_exit(socket_close, 0); And I can't see
how that could be reached.

FWIW, the
if (sock == PGINVALID_SOCKET)
wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);
block in both latch implementations looks like a problem waiting to happen.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shay Rojansky 2015-12-30 17:38:23 Re: Some 9.5beta2 backend processes not terminating properly?
Previous Message Tom Lane 2015-12-30 17:37:15 Re: Some 9.5beta2 backend processes not terminating properly?