Re: [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver
Date: 2017-03-04 06:45:58
Message-ID: 4144256b-dc4f-892c-a720-71f6e7b9ac87@2ndquadrant.com
Lists: pgsql-committers pgsql-hackers

On 04/03/17 05:11, Tom Lane wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
>> On 3/3/17 19:16, Tom Lane wrote:
>>> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>>>> Use asynchronous connect API in libpqwalreceiver
>
>>> Buildfarm member bowerbird has been failing in the pg_rewind test since
>>> this patch went in. It looks like it's failing to complete connections
>>> from the standby; which suggests that something platform-specific is
>>> missing from this commit, but I dunno what.
>
>> Hmm, I wonder how widely tested the async connection API is on Windows
>> at all. I only see bowerbird and jacana running bin-check on Windows.
>
> Yeah, I was wondering if this is just exposing a pre-existing bug.
> However, the "normal" path operates by repeatedly invoking PQconnectPoll
> (cf. connectDBComplete) so it's not immediately obvious how such a bug
> would've escaped detection.
>

I can see one difference, though (I hadn't looked at this code before):
connectDBComplete starts by waiting for the socket to become writable and
only then calls PQconnectPoll, while my patch starts with a PQconnectPoll
call. I also see the following comment in connectDBStart:
> /*
> * The code for processing CONNECTION_NEEDED state is in PQconnectPoll(),
> * so that it can easily be re-executed if needed again during the
> * asynchronous startup process. However, we must run it once here,
> * because callers expect a success return from this routine to mean that
> * we are in PGRES_POLLING_WRITING connection state.
> */

So I guess I implemented it in a subtly wrong way that breaks on Windows.

If that's the case, the attached patch should fix it. I have no way of
testing it on Windows, though; I can only say that it still works on my
machine, so hopefully it at least does not make things worse.
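
To illustrate the ordering described above (wait on the socket first, then
call the poll function), here is a rough, libpq-free sketch. It is not the
attached patch. PollStatus, drive_connect, FakeConn and the fake_* callbacks
are hypothetical stand-ins: in real code the poll callback would be
PQconnectPoll and the wait would be something like select() or
WaitLatchOrSocket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libpq's PostgresPollingStatusType. */
typedef enum { POLL_FAILED, POLL_READING, POLL_WRITING, POLL_OK } PollStatus;

/* Callbacks standing in for PQconnectPoll() and a socket wait. */
typedef PollStatus (*poll_fn) (void *conn);
typedef void (*wait_fn) (void *conn, int for_write);

/*
 * Drive an already-started asynchronous connect to completion.
 * Returns 1 on success, 0 on failure.
 */
static int
drive_connect(void *conn, poll_fn poll, wait_fn wait_ready)
{
    /* After a successful start, behave as if the last poll said WRITING. */
    PollStatus  status = POLL_WRITING;

    while (status != POLL_OK && status != POLL_FAILED)
    {
        /* Wait for the socket first ... */
        wait_ready(conn, status == POLL_WRITING);
        /* ... and only then advance the connection state machine. */
        status = poll(conn);
    }
    return status == POLL_OK;
}

/* Fake connection that records the order of waits and polls. */
typedef struct
{
    size_t      step;           /* number of polls answered so far */
    size_t      len;            /* number of events recorded */
    char        trace[32];      /* 'w'/'r' = waited for write/read, 'p' = polled */
} FakeConn;

static void
record(FakeConn *f, char ch)
{
    assert(f->len + 1 < sizeof(f->trace));
    f->trace[f->len++] = ch;
}

static void
fake_wait(void *c, int for_write)
{
    record((FakeConn *) c, for_write ? 'w' : 'r');
}

static PollStatus
fake_poll(void *c)
{
    /* Scripted answers: want read, want write, then done. */
    static const PollStatus script[] = {POLL_READING, POLL_WRITING, POLL_OK};
    FakeConn   *f = (FakeConn *) c;

    record(f, 'p');
    if (f->step >= sizeof(script) / sizeof(script[0]))
        return POLL_FAILED;
    return script[f->step++];
}
```

Running drive_connect over a zero-initialized FakeConn with fake_poll and
fake_wait records each wait before the corresponding poll, which is the
order connectDBComplete uses.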

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Reorder-the-asynchronous-libpq-calls-for-replication.patch text/plain 1.9 KB
