Re: pg_basebackup WAL streamer shutdown is bogus - leading to slow tests

From: Andres Freund <andres(at)anarazel(dot)de>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_basebackup WAL streamer shutdown is bogus - leading to slow tests
Date: 2022-01-16 23:28:00
Message-ID: 20220116232800.wawflyaal6q45e4y@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-01-16 17:39:11 +0100, Magnus Hagander wrote:
> On Sun, Jan 16, 2022 at 5:36 PM Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> >
> > On Sun, Jan 16, 2022 at 5:34 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >
> > > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > > I don't immediately see a solution for this, other than to add
> > > > StreamCtl->stop_event (mirroring ->stop_socket) and then convert
> > > > CopyStreamPoll() to use WaitForMultipleObjects(). Microsoft's select()
> > > > doesn't support pipes and there's no socketpair().
> > > > Any more straightforward ideas?
> > > > From a cursory look at history, it used to be that pg_basebackup had this
> > > > behaviour on all platforms, but it got fixed for other platforms in
> > > > 7834d20b57a by Tom (assuming the problem wasn't present there).
> > >
> > > Hmm --- I see that I thought Windows was unaffected, but I didn't
> > > consider this angle.
> > >
> > > Can we send the child process a signal to kick it off its wait?
> >
> > No. (1) on Windows it's not a child process, it's a thread. And (2)
> > Windows doesn't have signals. We emulate those *in the backend* for
> > win32, but this problem is in the frontend where that emulation layer
> > doesn't exist.
>
> [...] which I think brings us back to the original suggestion of
> WSAEventSelect().

I hacked that up last night, and a fix or two later it seems to be
working. What I'd missed at first is that the event needs to be reset in
reached_end_position(); otherwise we'll busy-loop.
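
To illustrate the busy loop, here is a minimal portable model, not the actual patch. It assumes the stop event behaves like a Win32 manual-reset event, which stays signaled until it is explicitly reset. ModelEvent, model_set(), model_reset() and count_wakeups() are hypothetical names made up for this sketch; none of them exist in pg_basebackup or in the Windows API.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a manual-reset event: once set, it stays
 * signaled until someone resets it explicitly.
 */
typedef struct ModelEvent
{
    bool signaled;
} ModelEvent;

static void
model_set(ModelEvent *ev)
{
    ev->signaled = true;
}

static void
model_reset(ModelEvent *ev)
{
    ev->signaled = false;
}

/*
 * Simulate a poll loop after the stop event has been set exactly once.
 * Each iteration stands for one wait call. If the event is still
 * signaled, the wait returns immediately and counts as a wakeup; if it
 * is not signaled, a real wait would block, so the simulation stops.
 *
 * reset_in_callback models whether the stop-check callback resets the
 * event (assumed here to find that the end position is not reached yet).
 */
static int
count_wakeups(bool reset_in_callback, int max_iterations)
{
    ModelEvent stop_event = {false};
    int        wakeups = 0;

    assert(max_iterations >= 0);

    model_set(&stop_event);     /* the main thread requests a stop once */

    for (int i = 0; i < max_iterations; i++)
    {
        if (!stop_event.signaled)
            break;              /* a real wait would block here */

        wakeups++;

        if (reset_in_callback)
            model_reset(&stop_event);
    }

    return wakeups;
}
```

In this model, count_wakeups(false, 1000) spins through all 1000 iterations, while count_wakeups(true, 1000) wakes up once and then goes back to waiting.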

I wonder whether using a short-lived event handle would also risk missing
FD_CLOSE here? It'd probably be worth avoiding that risk by creating the
event just once.

I just wasn't immediately sure where to stash it. Probably just by adding a
field to StreamCtl that ReceiveXlogStream() then sets? So far StreamCtl is
constant once passed to ReceiveXlogStream(), but I don't really see a reason
why it'd need to stay that way.

Greetings,

Andres Freund
