Re: Windows: Wrong error message at connection termination

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Lars Kanis <lars(at)greiz-reinsdorf(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows: Wrong error message at connection termination
Date: 2021-11-17 22:26:57
Message-ID: CA+hUKG+nzfwuKYKXd43jjG2txuc-po4GOyZFA3z6s9oy3P=4kA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 18, 2021 at 10:13 AM Lars Kanis <lars(at)greiz-reinsdorf(dot)de> wrote:
> Unfortunately each connection is closed hard by a Windows PostgreSQL server with TCP flag RST. That in turn is another Winsock API behavior, that is that every socket, that wasn't closed by the application is closed hard with the RST flag at process termination. I didn't find any official documentation about this behavior.

Interesting discovery. I think you might get the same behaviour from
a Unix system if you set SO_LINGER to 0 before you exit[1]. I suppose
if a TCP implementation is partially in user space (I have no idea if
this is true for Windows, I never use it, but I recall that Winsock
was at some point a DLL) and can't handle the existence of any socket
state after the process is gone, you might want to nuke everything and
tell the peer immediately that you're doing so on exit?
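
To make the SO_LINGER comparison concrete, here is a minimal sketch, assuming a Unix-like system and a loopback connection. The helper name close_and_observe is made up for illustration, and the struct layout ("ii", two C ints) matches struct linger on Linux and the BSDs but is an assumption elsewhere. It reports what a client sees from recv() after an ordinary close() versus a close() with SO_LINGER set to 0.

```python
import socket
import struct


def close_and_observe(abortive: bool) -> str:
    """Return what a loopback client sees when the server side closes.

    Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        if abortive:
            # l_onoff=1, l_linger=0: close() drops the connection state and
            # sends RST instead of going through the normal FIN handshake.
            # Assumes struct linger is two C ints (Unix-like systems).
            server.setsockopt(
                socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
            )
        server.close()
        client.settimeout(5)
        try:
            data = client.recv(1024)
        except ConnectionResetError:
            return "reset"  # the peer sent RST
        return "eof" if data == b"" else "data"  # b"" means an orderly FIN
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print("normal close:", close_and_observe(abortive=False))
    print("SO_LINGER 0 :", close_and_observe(abortive=True))
```

The abortive case is only an approximation of what Lars describes: here the RST is requested explicitly before close(), whereas the Windows server sends it because the process exited with the socket still open.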

I realise now that the experiments we did a while back to try to
understand this across a few different operating systems[2] had missed
this subtlety, because that Python script had an explicit close()
call, whereas PostgreSQL exits. It still revealed that the client
isn't allowed to read any data after its write failed, which is a
known source of error messages being eaten. What I missed is that the
client doesn't only get an RST, and enter this
no-you-can't-have-the-error-message-I-have-received state, in response
to data it has sent (the usual way you expect to get an RST), as in
that test. It also gets one proactively when the server process
exits, as you've explained. In other words, the client doesn't need to
try to write in order to reach this error-eating state.
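
A rough way to probe for that state without any client write is sketched below, again assuming a Unix-like struct linger layout and using SO_LINGER 0 as a stand-in for a server process that exits without closing its socket. The helper name read_after_abort and the message text are made up for illustration. The server queues a final message and then resets the connection; the client only reads. Whether the message survives depends on the client's TCP stack, so the sketch just reports the outcome rather than predicting it.

```python
import socket
import struct
import time


def read_after_abort(message: bytes = b"FATAL: terminating connection\n") -> str:
    """Report whether a client can still read data sent just before an RST.

    Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        server.sendall(message)  # the "error message" the client should see
        # Stand-in for process exit with an open socket: force an RST on
        # close(). Assumes struct linger is two C ints (Unix-like systems).
        server.setsockopt(
            socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
        )
        server.close()
        time.sleep(0.2)  # give both the data and the RST time to arrive
        client.settimeout(5)
        try:
            data = client.recv(1024)  # note: the client never writes
        except ConnectionError:
            return "lost"  # the reset hid the already-sent message
        return "delivered" if data == message else "partial"
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print("message after RST:", read_after_abort())
```

A result of "lost" would correspond to the error-eating state described above; "delivered" means the stack still hands over data that was received before the reset.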

[1] https://stackoverflow.com/questions/3757289/when-is-tcp-option-so-linger-0-required
[2] https://www.postgresql.org/message-id/flat/20190306030706.GA3967%40f01898859afd.ant.amazon.com#32f9f16f9be8da5ee5c3b405d6d1829c
