Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Date: 2022-02-01 20:15:41
Message-ID: CA+hUKGKUcgaB4hOrcMLWYQchBHZVqp=kwerc-8uH6ZW3ZfvX+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 2, 2022 at 6:38 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2022-02-01 18:02:34 +1300, Thomas Munro wrote:
> > 1. "pgsocket" could become a pointer to a heap-allocated wrapper
> > object containing { socket, event, flags } on Windows, or something
> > like that, but that seems a bit invasive and tangled up with public
> > APIs like libpq, which put me off trying that. I'm willing to explore
> > it if people object to my other idea.
>
> I'm not sure if the libpq aspect really is a problem. We're not going to have
> to do that conversion repeatedly, I think.

Alright, I'm prototyping that variant today.

> > Provide a way to get a callback when a socket is created or closed.
> >
> > XXX TODO handle callback failure
> > XXX TODO investigate overheads/other implications of having a callback
> > installed
>
> What do we need this for? I still don't understand what kind of reconnects we
> need to automatically need to detect.

libpq makes new sockets in various cases like when trying multiple
hosts/ports (the easiest test to set up) or in some SSL and GSSAPI
cases. In the model shown in the most recent patch where there is a
hash table holding ExtraSocketState objects that libpq doesn't even
know about, we have to do SocketTableDrop(old socket),
SocketTableAdd(new socket) at those times, which is why I introduced
that callback.

If we switch to the model where a socket is really a pointer to a
wrapper struct (which I'm about to prototype), the need for all that
bookkeeping goes away, no callbacks, no hash table, but now libpq has
to participate knowingly in a socket wrapping scheme to help the
backend while also somehow providing unwrapped SOCKET for client API
stability. Trying some ideas, more on that soon.

> > +#if !defined(FRONTEND)
> > +struct ExtraSocketState
> > +{
> > +#ifdef WIN32
> > + HANDLE event_handle; /* one event for the life of the socket */
> > + int flags; /* most recent WSAEventSelect() flags */
> > + bool seen_fd_close; /* has FD_CLOSE been received? */
> > +#else
> > + int dummy; /* none of this is needed for Unix */
> > +#endif
> > +};
>
> Seems like we might want to track more readiness events than just close? If we
> e.g. started tracking whether we've seen writes blocking / write readiness,
> we could get rid of cruft like
>
> /*
> * Windows does not guarantee to log an FD_WRITE network event
> * indicating that more data can be sent unless the previous send()
> * failed with WSAEWOULDBLOCK. While our caller might well have made
> * such a call, we cannot assume that here. Therefore, if waiting for
> * write-ready, force the issue by doing a dummy send(). If the dummy
> * send() succeeds, assume that the socket is in fact write-ready, and
> * return immediately. Also, if it fails with something other than
> * WSAEWOULDBLOCK, return a write-ready indication to let our caller
> * deal with the error condition.
> */
>
> that seems likely to just make bugs less likely, rather than actually fix them...

Yeah. Unlike FD_CLOSE, FD_WRITE is a non-terminal condition so would
also need to be cleared, which requires seeing all
send()/sendto()/write() calls with wrapper functions, but we already
do stuff like that. Looking into it...

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Imseih (AWS), Sami 2022-02-01 20:33:16 Re: Add index scan progress to pg_stat_progress_vacuum
Previous Message Stephen Frost 2022-02-01 20:12:28 Re: Support for NSS as a libpq TLS backend