Re: Windows: Wrong error message at connection termination

From: Lars Kanis <lars(at)greiz-reinsdorf(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows: Wrong error message at connection termination
Date: 2021-11-27 11:39:00
Message-ID: cfd25a75-7856-0dc4-bba0-49216cb135a9@greiz-reinsdorf.de
Lists: pgsql-hackers

On 22.11.21 at 00:04, Tom Lane wrote:
> Do we know that that actually happens in an arm's-length connection
> (ie two separate machines)? I wonder if the data loss is strictly
> an artifact of a localhost connection. There'd be a lot more pressure
> on them to make cross-machine TCP work per spec, one would think.
> But in any case, if we can avoid sending RST in this situation,
> it seems mostly moot for our usage.

Sorry it took some days to get a setup to check this!

The result is as expected:

1. Windows client to Linux server works without dropping the error message
2. Linux client to Windows server works without dropping the error message
3. Windows client to remote Windows server drops the error message,
depending on the timing of the event loop

In 1. the Linux server doesn't end the connection with a RST packet, so
the Windows client enqueues the error message properly and doesn't
drop it.

In 2. the Linux client doesn't care about the RST packet of the Windows
server and properly enqueues and raises the error message.

In 3. the combination of the bad RST behavior of client and server leads
to data loss. It depends on the network timing. A delay of 0.5 ms in the
event loop was enough to trigger the loss in a localhost setup as well as
in a LAN setup. In contrast, over a slower WLAN connection a delay of less
than 15 ms did not lose data, but higher delays still did.

The idea of running a second process, passing the socket handle to it,
having it observe the parent process and close the socket once the parent
has exited, could work, but I guess it's overly complicated and creates
more issues than it solves. The same probably applies if the master
process handles the socket closing.

So I still think it's best to close the socket as proposed in the patch.
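
For illustration only, here is a minimal sketch of the general idea of an
explicit, orderly close. It is not the patch itself and not PostgreSQL code.
A sender writes its final message, shuts down its sending side and closes the
socket, and a receiver that reads only after a short delay still gets the
message. The names serve_once, read_after_delay and demo, the message text
and the loopback setup are all assumptions made for this example.

```python
import socket
import threading
import time


def serve_once(listener, message):
    """Accept one client, send a final message, then close explicitly."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(message)
        # Half-close the sending side so the peer sees an orderly
        # end-of-stream after the data instead of an abortive reset.
        conn.shutdown(socket.SHUT_WR)
    # Leaving the with-block closes the socket.


def read_after_delay(port, delay_s):
    """Connect, wait delay_s seconds (a stand-in for a busy event loop),
    then read until the peer closes the connection."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        time.sleep(delay_s)
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)


def demo(delay_s=0.0005):
    """Run sender and receiver over loopback and return what was received."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    message = b"FATAL: terminating connection\n"  # hypothetical message text
    server = threading.Thread(target=serve_once, args=(listener, message))
    server.start()
    try:
        return read_after_delay(port, delay_s)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

The default delay of 0.5 ms mirrors the event loop delay mentioned above. The
sketch only shows the orderly-close side; it does not reproduce the
Windows-specific RST behavior.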

--

Regards,
Lars Kanis
