| From: | Korry Douglas <korry(dot)douglas(at)enterprisedb(dot)com> | 
|---|---|
| To: | Victor Wagner <vitus(at)wagner(dot)pp(dot)ru> | 
| Cc: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Patch: Implement failover on libpq connect level. | 
| Date: | 2015-12-07 21:12:41 | 
| Message-ID: | 5665F649.1040206@enterprisedb.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers pgsql-jdbc | 
>
>> I've tried to deal with some of these problems.
>>
>> My patch have support for following things:
>>
>> 1. Check whether database instance is in the recovery/standby mode and
>> try to find another one if so.
>> 2. Let cluster management software to have some time to promote one of
>> the standbys to master. I.e. there can be failover timeout specified to
>> allow retry after some time if no working master found.
>>
>> Really there is room for some improvements in handling of connect
>> timeout (which became much more important thing when ability to try
>> next host appears). Now it is handled only by blocking-mode connect
>> functions, not by async state machine. But I decided to publish patch
>> without these improvements to get feedback from community.
> A bit of testing on this turns up a problem.
>
> Consider a connection string that specifies two hosts and a read/write 
> connection:
>
>   postgresql://korry(at)127(dot)0(dot)0(dot)1:5301,127.0.0.1:5300/edb?readonly=0
>
> If the first host is a healthy standby (meaning that I can connect to 
> it but pg_is_in_recovery() returns 't'), the state machine will never 
> move on to the second host.
>
> The problem seems to be in PQconnectPoll() in the case for 
> CONNECTION_AUTH_OK, specifically this code:
>
>   /* We can release the address list now. */
>   pg_freeaddrinfo_all(conn->addrlist_family, conn->addrlist);
>   conn->addrlist = NULL;
>   conn->addr_cur = NULL;
>
> That frees up the list of alternative host addresses.  The state 
> machine then progresses to CONNECTION_CHECK_RO (which invokes 
> pg_is_in_recovery()), then CONNECTION_CHECK_RW (waiting for the 
> response from the server).  Since we just connected to a standby 
> replica, pg_is_in_recovery() returns 't' and the state changes to 
> CONNECTION_NEEDED.  The next call to try_next_address() will fail to 
> find a next address because we freed the list in the case for 
> CONNECTION_AUTH_OK.
>
> A related issue:  when the above sequence occurs, no error message is 
> returned (because the case for CONNECTION_NEEDED thinks "an 
> appropriate error message is already set up").
>
> In short, if you successfully connect to standby replica (and specify 
> readonly=0), the remaining hosts are ignored, even though one of those 
> hosts is a master.
A follow-up - the conn->addrlist is also freed when the case for 
CONNECTION_CHECK_RW decides that conn->status != CONNECTION_OK and calls 
closePGConn().
-- Korry
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alvaro Herrera | 2015-12-07 22:43:36 | Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc. | 
| Previous Message | Korry Douglas | 2015-12-07 20:26:33 | Re: Patch: Implement failover on libpq connect level. | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Georg Brünsing | 2015-12-08 10:37:35 | jdbc high memory consumption for RAISE NOTICE output using function-call in Java | 
| Previous Message | Korry Douglas | 2015-12-07 20:26:33 | Re: Patch: Implement failover on libpq connect level. |