Re: Windows: Wrong error message at connection termination

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Lars Kanis <lars(at)greiz-reinsdorf(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows: Wrong error message at connection termination
Date: 2021-11-21 22:33:08
Message-ID: CA+hUKG+HGEb=ED9RHZ6UCwOFNffk9NqkRJRGVy+AA-T4yGYodg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 22, 2021 at 10:42 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Hmm, maybe it's still not enough. Now that I have coffee, I thought
> > about the well-known failure of idle_in_transaction_session_timeout
> > to report errors on Windows[1].
>
> Yeah, I think that may well be a manifestation of the same problem:
> once the backend exits, Winsock issues RST which prevents the client
> from reading the queued data. We had been analyzing that under the
> assumption that Windows obeys the TCP RFCs ... but having now been
> disabused of that optimism, it seems to match up pretty well.
> It'd be useful to check if Lars' patch cures that symptom.

Yeah, it sounds like it might solve at least the server-side problem.
Let's call that weird behaviour #1: RST on process exit. (I wonder if
my keep-the-socket-open-in-another-process thought experiment is
theoretically better: a lingering socket should be capable of
resending data that hasn't been ack'd yet in FIN-WAIT-1 state after
close, which I suspect might not happen if the TCP stack nukes the
socket. If close() avoids the proactive RST but still doesn't really
follow the shutdown protocol then it's papering over a crack in the
wall, but I'm not planning to argue about that...)
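
For reference, the "shutdown protocol" mentioned above can be sketched roughly as follows. This is a minimal illustration in Python, not the patch under discussion: graceful_close(), demo(), the timeout value and the message text are all made-up names for the example, and it only shows the orderly sequence (send, half-close, drain, close) as it behaves on a stack that follows the usual rules. It says nothing about what Winsock does at process exit.

```python
import socket
import threading


def graceful_close(conn: socket.socket, farewell: bytes, timeout: float = 2.0) -> None:
    """Send a final message, then close in an orderly way (hypothetical helper)."""
    conn.sendall(farewell)
    # Half-close: a FIN is queued after the data, our receive side stays open.
    conn.shutdown(socket.SHUT_WR)
    conn.settimeout(timeout)
    try:
        # Drain until the peer closes its side, so that no unread data is
        # left in our receive buffer when we close (closing with unread
        # data commonly provokes an RST).
        while conn.recv(4096):
            pass
    except OSError:
        pass  # timeout or reset: stop waiting and close anyway
    finally:
        conn.close()


def demo() -> bytes:
    """Loopback demo: return everything the client could read before EOF."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]

    def serve() -> None:
        conn, _ = listener.accept()
        graceful_close(conn, b"FATAL: terminating connection\n")

    t = threading.Thread(target=serve)
    t.start()

    received = b""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        while chunk := client.recv(4096):
            received += chunk
    t.join()
    listener.close()
    return received
```

In this sketch the client sees the whole message followed by a clean EOF, because the server keeps the socket alive until the peer has closed its side.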

IIUC we'd still have weird behaviour #2 on the client side: TCP stack
drops buffered received data on the floor on receipt of RST.

So yeah, it'd be interesting to know if by avoiding/hiding weird
behaviour #1, idle_in_transaction_session_timeout works as desired most
of the time by tilting the race in favour of eager clients and favourable
scheduling. If a client sends a new query and then immediately begins
to read the response, there's a good chance it'll be able to read the
already-buffered error message before the query->RST ping pong...
Which I now understand is exactly what Lars was explaining: that sync
APIs (like the psql command shown in that other thread) might have a
good chance of winning that race, but for async APIs, the author of
the async API has no idea what its client is going to do.
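
To make the "eager client" case concrete, here is a rough Python sketch of a sync-style client that sends a query and starts reading straight away. It is an illustration under assumptions, not libpq or psql code: send_then_read(), demo(), the query and the error text are placeholders, and the toy server half-closes politely, so this only shows the ordering on a well-behaved stack. Whether the read wins against an incoming RST on Windows is exactly the race described above, and this sketch does not settle it.

```python
import socket
import threading


def send_then_read(sock: socket.socket, query: bytes) -> bytes:
    """Write a query and read at once, tolerating a reset at either step."""
    try:
        sock.sendall(query)
    except OSError:
        pass  # the write may fail if the peer is gone; still try to read
    reply = b""
    try:
        while chunk := sock.recv(4096):
            reply += chunk
    except OSError:
        pass  # a reset or timeout can cut the read short; keep what we got
    return reply


def demo() -> bytes:
    """Loopback demo: the server has already queued an error before the query."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    error_sent = threading.Event()

    def serve() -> None:
        conn, _ = listener.accept()
        with conn:
            # Placeholder text standing in for the timeout error.
            conn.sendall(b"FATAL: terminating connection due to timeout\n")
            conn.shutdown(socket.SHUT_WR)
            error_sent.set()
            # Keep reading until the client closes, so no RST is provoked.
            while conn.recv(4096):
                pass

    t = threading.Thread(target=serve)
    t.start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        error_sent.wait()  # the client has been idle; the error is already sent
        reply = send_then_read(client, b"SELECT 1;\n")
    t.join()
    listener.close()
    return reply
```

An async API cannot assume its caller will follow the write with an immediate read like this, which is why the ordering matters more there.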
