
Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch

From:	Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To:	Evgeny Kuzin <evgeny(dot)kuzin(at)outlook(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
Date:	2026-03-10 22:18:00
Message-ID:	6919b4d51c5aa36f6de8f99c1874fab58dae40eb.camel@cybertec.at
Lists:	pgsql-hackers

On Thu, 2026-03-05 at 14:59 +0000, Evgeny Kuzin wrote:
> We run PostgreSQL clusters with streaming replication. After a failover, the old primary
> becomes a standby and vice versa. The challenge is: how do clients find the new primary?
>
> Current options:
> 1. Update DNS on every failover - operationally complex, TTL delays, requires automation

Your proposal would also suffer from TTL delays in the case of a cluster reconfiguration.

> 2. Consul/etcd - adds operational complexity and another failure domain
> 3. Multiple hosts in connection string - requires application changes when cluster
>    topology changes (e.g., adding a new standby)
>
> The proposed approach:
> * Single A-record (db.internal) pointing to all cluster member IPs
> * Clients connect with
>    host=db.internal target_session_attrs=read-write
> * libpq tries each IP until it finds the primary
>
> IIUC this is how JDBC's targetServerType=primary works - it iterates through all resolved
> addresses. The "useless connection attempts" are actually the feature: it's probing to
> find the right server, same as when you specify multiple hosts explicitly.
> The only difference from host=pg1,pg2,pg3 is that DNS provides the list instead of the
> connection string. From libpq's perspective, why should it matter where the address list came from?
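
To make the difference concrete, here is a minimal sketch of the two behaviors as the patch subject describes them: moving to the next host on a target_session_attrs mismatch versus trying every resolved address first. It is a simplified model and not libpq's actual code. The function name find_read_write, the 10.0.0.x addresses, and the assumption that every address accepts the connection are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple


def find_read_write(
    hosts: List[str],
    resolved: Dict[str, List[str]],
    is_read_write: Callable[[str], bool],
    try_all_addresses: bool,
) -> Tuple[Optional[str], List[str]]:
    """Return (chosen address or None, addresses attempted in order)."""
    attempted: List[str] = []
    for host in hosts:
        for addr in resolved[host]:
            attempted.append(addr)
            if is_read_write(addr):
                return addr, attempted
            if not try_all_addresses:
                # Session-attribute mismatch: give up on this host name.
                break
    return None, attempted


# Hypothetical cluster: one name, three members, the second is the primary.
resolved = {"db.internal": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
primary = "10.0.0.2"

skip_host = find_read_write(["db.internal"], resolved, lambda a: a == primary, False)
try_all = find_read_write(["db.internal"], resolved, lambda a: a == primary, True)
print(skip_host)  # (None, ['10.0.0.1'])
print(try_all)    # ('10.0.0.2', ['10.0.0.1', '10.0.0.2'])
```

In this model the primary is only found with a single host name when all of its addresses are tried, at the cost of one extra attempt for each standby that is listed before it.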

I see the point of your proposal.

One example of what Tom worries about is "localhost" resolving to both "127.0.0.1" and "::1",
a very common case. With the proposed change, any connection attempt to "localhost" that fails
would now take twice as long to fail. Also, if the problem is authentication, the server would
perform two authentication attempts. That is a clear regression that may affect many people.
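
As a rough way to check how many addresses a name yields on a given machine, the following sketch asks the local resolver through Python's standard socket.getaddrinfo. The helper name resolved_addresses is hypothetical, 5432 is simply PostgreSQL's default port, and the result depends on the local resolver configuration, so no particular output is claimed here.

```python
import socket
from typing import List


def resolved_addresses(host: str, port: int = 5432) -> List[str]:
    """Return the distinct addresses the local resolver gives for host."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name does not resolve on this machine
    addresses: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr[0] is the address string for both IPv4 and IPv6 entries.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


addrs = resolved_addresses("localhost")
print(addrs)
print(f"worst case: {len(addrs)} attempt(s) if every address is tried")
```

On a system where "localhost" maps to both "127.0.0.1" and "::1", the list has two entries, which is the doubling described above.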

The question is whether the overall benefits of your proposal (which certainly makes sense
in a setup like you describe) would be worth a performance and resource usage regression like
the one I described above. Or can you see a way to modify your approach so that that problem
can be avoided?

Yours,
Laurenz Albe

In response to

Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch at 2026-03-05 14:59:21 from Evgeny Kuzin

Responses

Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch at 2026-03-11 10:01:15 from Andrey Borodin
Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch at 2026-03-11 14:29:39 from Evgeny Kuzin
