| From: | Evgeny Kuzin <evgeny(dot)kuzin(at)outlook(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch |
| Date: | 2026-03-05 14:59:21 |
| Message-ID: | AM9PR09MB49001C4CAA8C9B2E4EC947F9977DA@AM9PR09MB4900.eurprd09.prod.outlook.com |
| Lists: | pgsql-hackers |
Hi Tom,
Thanks for the feedback. I should clarify the use case - we're not mixing read-write and read-only hosts under one DNS name by accident. This is intentional for HA failover.
We run PostgreSQL clusters with streaming replication. After a failover, the old primary becomes a standby and vice versa. The challenge is: how do clients find the new primary?
Current options:
1. Update DNS on every failover - operationally complex, TTL delays, requires automation
2. Consul/etcd - adds operational complexity and another failure domain
3. Multiple hosts in connection string - requires application changes when cluster topology changes (e.g., adding a new standby)
The proposed approach:
* A single DNS name (db.internal) with one A record per cluster member IP
* Clients connect with host=db.internal target_session_attrs=read-write
* libpq tries each IP until it finds the primary
IIUC this is how JDBC's targetServerType=primary works - it iterates through all resolved addresses. The "useless connection attempts" are actually the feature: it's probing to find the right server, same as when you specify multiple hosts explicitly.
The only difference from host=pg1,pg2,pg3 is that DNS provides the list instead of the connection string. From libpq's perspective, why should it matter where the address list came from?
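To make the difference concrete, here is a minimal sketch of the two address-iteration policies discussed in this thread. It is a simulation in Python, not libpq code: the names `find_primary`, `resolver`, `probe`, and `try_all_addresses` are hypothetical, the IP addresses are made up, and the DNS lookup and the server-state check are replaced by plain dictionaries.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for DNS: hostname -> list of resolved IPs.
Resolver = Dict[str, List[str]]
# Hypothetical probe: returns "read-write" or "read-only" for a reachable
# server, or None when the connection attempt itself fails.
Probe = Callable[[str], Optional[str]]


def find_primary(
    hosts: List[str],
    resolver: Resolver,
    probe: Probe,
    try_all_addresses: bool,
) -> Tuple[Optional[str], List[str]]:
    """Return (chosen_ip, attempted_ips) for target_session_attrs=read-write."""
    attempts: List[str] = []
    for host in hosts:
        for ip in resolver.get(host, []):
            attempts.append(ip)
            state = probe(ip)
            if state == "read-write":
                return ip, attempts
            if state is None:
                # Connection failure: both policies move to the next address.
                continue
            # Connected, but the session attributes do not match.
            if not try_all_addresses:
                # Behaviour described in the thread as current: skip the
                # remaining addresses of this host, go to the next host.
                break
            # Proposed behaviour: keep trying the remaining addresses.
    return None, attempts


# Assumed topology: one name, three members, the second one is the primary.
resolver = {"db.internal": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
roles = {"10.0.0.1": "read-only", "10.0.0.2": "read-write", "10.0.0.3": "read-only"}

current = find_primary(["db.internal"], resolver, roles.get, try_all_addresses=False)
proposed = find_primary(["db.internal"], resolver, roles.get, try_all_addresses=True)
print("current: ", current)
print("proposed:", proposed)
```

Under these assumptions, the first policy stops after the first standby it reaches, while the second continues through the address list until it reaches the primary, which is the same sequence of attempts an explicit multi-host list would produce.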
________________________________
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Thursday, March 5, 2026 2:55 PM
To: Evgeny Kuzin <evgeny(dot)kuzin(at)outlook(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
Evgeny Kuzin <evgeny(dot)kuzin(at)outlook(dot)com> writes:
> We've been running into an issue with "target_session_attrs" when using DNS-based service discovery. Currently, when libpq connects to a host with multiple A-records and the connection succeeds but is rejected due to a target_session_attrs mismatch (e.g., connecting to a read-only server with target_session_attrs=read-write), it skips all remaining addresses for that hostname and moves directly to the next host in the connection string.
> Looking at git history, I found this was a deliberate choice by Robert Haas in commit 721f7bd3cbc (2016), where he noted "I changed Mithun's patch to skip all remaining IPs for a host if we reject a connection based on this new parameter." The original mailing list discussion is at [1], though I wasn't able to find a clear explanation of why this approach was preferred over trying all addresses.
> This makes it impractical to use a single multi-A-record DNS name pointing to all cluster members with target_session_attrs=read-write to find the primary - only the first responding IP is tried before giving up on that hostname.
> The attached patch changes the behavior to try all addresses for a hostname before moving to the next host, matching the existing behavior for connection failures. This would enable simpler DNS-based service discovery without requiring external tools like Consul or explicit multi-host connection strings.
TBH, I'd say that your DNS setup is broken and you should fix it.
It makes no sense to have the same DNS entry pointing to both
read-write and read-only hosts. The proposed patch will mainly
result in useless connection attempts in more-sanely-constructed
setups.
regards, tom lane