Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Date: 2022-01-09 19:06:19
Message-ID: CA+hUKGKwiqnkmuj6tJ6E+wBDDDhB3d6_Lzmm3O=ZhRb9P8ETAA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 10, 2022 at 12:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> Going down through the call chain, I see that at the end of it
> WaitForMultipleObjects() hangs while waiting for the primary connection
> socket event. So it looks like the socket, that is closed by the
> primary, can get into a state unsuitable for WaitForMultipleObjects().

I wonder if FD_CLOSE is edge-triggered, and it has already been
reported to us once, so it won't be reported again.  I think that's
what these Python Twisted guys are saying:

https://stackoverflow.com/questions/7598936/how-can-a-disconnected-tcp-socket-be-reliably-detected-using-msgwaitformultipleo

> I tried to check the socket state with the WSAPoll() function and
> discovered that it returns POLLHUP for the "problematic" socket.

Good discovery.  I guess if the above theory is right, the closed
state is remembered somewhere, which makes WSAPoll() level-triggered,
as users of poll() expect.
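
To make the edge-triggered vs. level-triggered distinction concrete, here is a minimal sketch using Linux epoll as an analogy. It is not the Windows code path under discussion and says nothing definite about how FD_CLOSE or WSAPoll() behave. The helper name readiness_reports is hypothetical. It registers one end of a socket pair, closes the other end, and records whether each successive non-blocking poll reports anything.

```python
import select
import socket


def readiness_reports(edge_triggered, polls=3):
    """Return, for each successive poll, whether the peer's close was reported.

    Hypothetical helper for illustration only (Linux epoll, not Winsock).
    """
    ours, peer = socket.socketpair()
    ep = select.epoll()
    try:
        mask = select.EPOLLIN | select.EPOLLRDHUP
        if edge_triggered:
            mask |= select.EPOLLET  # report a state change only once
        ep.register(ours.fileno(), mask)

        peer.close()  # the other side hangs up, like the primary closing

        results = []
        for _ in range(polls):
            events = ep.poll(0)  # timeout 0: return immediately
            results.append(bool(events))
        return results
    finally:
        ep.close()
        ours.close()
        peer.close()  # closing an already closed socket is harmless


if __name__ == "__main__":
    print("edge-triggered: ", readiness_reports(True))
    print("level-triggered:", readiness_reports(False))
```

In the edge-triggered case only the first poll sees the hangup, so a waiter that misses or discards that one report would then block forever, which is the failure mode the theory above would predict. In the level-triggered case every poll keeps reporting the condition for as long as it holds.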
