Re: Expansion of our checks for connection-loss errors

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, masao(dot)fujii(at)oss(dot)nttdata(dot)com
Subject: Re: Expansion of our checks for connection-loss errors
Date: 2020-10-09 01:41:55
Message-ID: 2668788.1602207715@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> At Thu, 08 Oct 2020 15:15:54 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
>> Accordingly, I propose the attached patch (an expansion of
>> Fujii-san's) that causes us to test for all five errnos anyplace
>> we had been checking for ECONNRESET.

> +1 for the direction.

> In terms of connection errors, connect(2) and bind(2) can return
> EADDRNOTAVAIL. bind(2) and listen(2) can return EADDRINUSE. FWIW I
> recetnly saw pgbench getting EADDRNOTAVAIL. (They have mapping from
> respective WSA errors in TranslateSocketError())

I do not think we have any issues with connection-time errors;
or at least, if we do, the spots being touched here certainly
shouldn't need to worry about them. These places are dealing
with already-established connections.

> I'd make errno_is_connection_loss use ALL_CONNECTION_LOSS_ERRNOS to
> avoid duplication definition of the errno list.

Hmm, might be worth doing, but I'm not sure. I am worried about
whether compilers will generate equally good code that way.

> - if (ret < 0 && WSAGetLastError() == WSAECONNRESET)
> + if (ret < 0 && errno_is_connection_loss(WSAGetLastError()))

> Don't we need to use TranslateSocketError() before?

Oh, I missed that. But:

> Perhaps I'm confused but SOCK_ERROR doesn't seem portable between
> Windows and Linux.

In that case, nothing would have worked on Windows for the last
ten years, so you're mistaken. I think the actual explanation
why this works, and why that test in parallel.c probably still
works even with my mistake, is that win32_port.h makes sure that
our values of ECONNRESET etc match WSAECONNRESET etc.

IOW, we'd not actually need TranslateSocketError at all, except
that it maps some not-similarly-named error codes for conditions
that don't exist in Unix into ones that do. We probably do want
TranslateSocketError in this parallel.c test so that anything that
it maps to one of the errno_is_connection_loss codes will be
recognized; but the basic cases would work anyway, unless I
misunderstand this stuff entirely.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-10-09 02:06:25 Re: Assertion failure with LEFT JOINs among >500 relations
Previous Message Kyotaro Horiguchi 2020-10-09 01:05:38 Re: Expansion of our checks for connection-loss errors