Re: Patch: Implement failover on libpq connect level.

From: Victor Wagner <vitus(at)wagner(dot)pp(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: Implement failover on libpq connect level.
Date: 2016-09-09 08:49:53
Message-ID: 20160909114953.10e702e4@fafnir.local.vm
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-jdbc

On Fri, 9 Sep 2016 11:20:59 +0530
Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:

> If concern is only about excluding address which was tried
> previously. Then I think we can try this way, randomly permute given
> address list (for example fisher-yates shuffle) before trying any of
> those address in it. After that you can treat the new list exactly
> like we do for sequential connections. I think this makes code less
> complex.

Random permutation is much more computationally complex than random
picking.

Alexander suggests to use other random pick algorithm than I use.

I use algorithm which works with lists of unknown length and chooses
line of such list with equal probability. This algorithm requires one
random number for each element of the list, but we are using C library
rand function which uses quite cheap linear congruental pseudo random
numbers, so random number generation is infinitely cheap than network
connection attempt.

As this algorithm doesn't need beforehand knowledge of the list length,
it is easy to wrap this algorithm into loop, which modifies the list,
moving already tried elements out of scope of next picking.
(it is really quite alike fisher-yates algorithm).

Alexander suggests to store somewhere number of elements of the list,
than generate one random number and pick element with this number.

Unfortunately, it doesn't save much effort in the linked list.
At average we'll need to scan half of the list to find desired element.
So, it is O(N) anyway.

Now, the main point is that we have two different timeframes of
operation.

1. Attempt to connect. It is short period, and we can assume that state
of servers doesn't change. So if, server is down, we should ignore it
until the end of this attempt to connect.

2. Attempt to failover. If no good candidate was found, we might need
to retry connection until one of standbys would be promoted to master
(with targetServerType=master) or some host comes up. See documentation
of failover_timeout configuration parameter.

It means that we cannot just throw away those hosts from list which
didn't respond during this connect attempt.

So, using random permutation instead of random pick wouln't help.
And keeping somewhere number of elements in the list wouldn't help
either unless we change linked list to completely different data
structure which allows random access.
--

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-09-09 08:55:48 Re: Declarative partitioning - another take
Previous Message Andreas 'ads' Scherbaum 2016-09-09 08:48:50 Re: to_date_valid()

Browse pgsql-jdbc by date

  From Date Subject
Next Message Jonathan S. Katz 2016-09-09 17:52:53 Re: [webmaster] Link on https://jdbc.postgresql.org/download.html results in 404
Previous Message Mithun Cy 2016-09-09 05:50:59 Re: Patch: Implement failover on libpq connect level.