Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Date: 2023-11-10 03:31:27
Message-ID: CA+hUKGL0bikWSC2XW-zUgFWNVEpD_gEWXndi2PE5tWqmApkpZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here is a new attempt to fix this mess. Disclaimer: this based
entirely on reading the manual and vicariously hacking a computer I
don't have via CI.

The two basic ideas are:

* keep per-socket event handles in a hash table
* add our own level-triggered event memory

The socket table entries are reference counted, and exist as long as
the socket is currently in at least one WaitEventSet. When creating a
new entry, extra polling logic re-checks the initial level-triggered
state (an overhead that we had in an ad-hoc way already, and that can
be avoided by more widespread use of long lived WaitEventSet). You
are not allowed to close a socket while it's in a WaitEventSet,
because then a new socket could be allocated with the same number and
chaos would ensue. For example, if we revive the idea of hooking
libpq connections up to long-lived WaitEventSets, we'll probably need
to invent a libpq event callback that says 'I am going to close socket
X!', so you have a chance to remove the socket from any WaitEventSet
*before* it's closed, to maintain that invariant. Other lazier ideas
are possible, but probably become impossible in a hypothetical
multi-threaded future.

With these changes, AFAIK it should be safe to reinstate graceful
socket shutdowns, to fix the field complaints about FATAL error
messages being eaten by a grue and the annoying random CI/BF failures.

Here are some other ideas that I considered but rejected for now:

1. We could throw the WAIT_USE_WIN32 code away, and hack
WAIT_USE_POLL to use WSAPoll() on Windows; we could create a
'self-pipe' using a pair of connected AF_UNIX sockets to implement
latches and fake signals. It seems like a lot of work, and makes
latches a bit worse (instead of "everything is an event!" we have
"everything is a socket!" with a helper thread, and we don't even have
socketpair() on this OS). Blah.

2. We could figure out how to do fancy asynchronous sockets and IOCP.
That's how NT really wants to talk to the world, it doesn't really
want to pretend to be Unix. I expect that is where we'll get to
eventually but it's a much bigger cross-platform R&D job.

3. Maybe there is a kind of partial step towards idea 2 that Andres
mentioned on another thread somewhere: one could use an IOCP, and then
use event callbacks that run on system threads to post IOCP messages
(a bit like we do for our fake waitpid()).

What I have here is the simplest way I could see to patch up what we
already have, with the idea that in the fullness of time we'll
eventually get around to idea 2, once someone is ready to do the
press-ups.

Review/poking-with-a-stick/trying-to-break-it most welcome.

Attachment Content-Type Size
v2-0001-simplehash-Allow-raw-memory-to-be-freed.patch text/x-patch 1.6 KB
v2-0002-simplehash-Allow-raw-allocation-to-fail.patch text/x-patch 2.9 KB
v2-0003-Redesign-Windows-socket-event-management.patch text/x-patch 22.1 KB
v2-0004-Remove-pgwin32_select.patch text/x-patch 7.5 KB
v2-0005-Refactor-pgwin32_waitforsinglesocket-to-share-eve.patch text/x-patch 4.8 KB
v2-0006-Reinstate-graceful-shutdown-changes-for-Windows.patch text/x-patch 2.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message torikoshia 2023-11-10 03:33:50 Re: Add new option 'all' to pg_stat_reset_shared()
Previous Message shveta malik 2023-11-10 03:31:07 Re: Synchronizing slots from primary to standby