Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch

From: Evgeny Kuzin <evgeny(at)hudson-trading(dot)com>
To: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
Cc: Andrew Jackson <andrewjackson947(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Evgeny Kuzin <evgeny(dot)kuzin(at)outlook(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
Date: 2026-05-23 19:05:58
Message-ID: 41F87F28-F5CD-4ADB-B4D3-33A39927CBAF@hudson-trading.com
Lists: pgsql-hackers

I spent some more time digging into this to make sure I was not overlooking something fundamental in the resolver behavior here.

I agree that POSIX does not specify getaddrinfo() as a DNS RRset API. It is a socket-address selection API, and the specification leaves room for address family filtering, ordering, mapped addresses, AI_ADDRCONFIG, and other system policy decisions [1].

But I think there is an important distinction between:

- getaddrinfo() is not specified to preserve exact DNS semantics
- getaddrinfo() will normally lose arbitrary DNS answers

The second conclusion does not seem to follow from the first one.

In the specific A/AAAA case discussed here, the Linux libc implementations I checked generally do expose the full usable RRset returned by the resolver unless there is an explicit policy reason not to (AI_ADDRCONFIG, IPv4-mapped address handling, requested address family, resolver policy, etc.).
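To make that concrete, here is a minimal sketch of how one could inspect what getaddrinfo() hands back for a given host, address family, and flag set. The helper name list_addresses() is hypothetical and not part of libpq; it only walks the ai_next chain and counts the entries, so the result reflects whatever the local resolver path and policy produce.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper (not a libpq function): resolve "host" and walk every
 * entry getaddrinfo() returns. Each address is printed to "out" when "out"
 * is not NULL. Returns the number of entries, or -1 if resolution fails.
 */
static int
list_addresses(const char *host, int family, int flags, FILE *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    struct addrinfo *ai;
    int         count = 0;

    assert(host != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = family;           /* AF_UNSPEC, AF_INET or AF_INET6 */
    hints.ai_socktype = SOCK_STREAM;    /* avoid one entry per socket type */
    hints.ai_flags = flags;             /* e.g. AI_ADDRCONFIG, AI_V4MAPPED */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next)
    {
        char        buf[128];

        /* NI_NUMERICHOST formats the address without a reverse lookup */
        if (out != NULL &&
            getnameinfo(ai->ai_addr, ai->ai_addrlen, buf, sizeof(buf),
                        NULL, 0, NI_NUMERICHOST) == 0)
            fprintf(out, "%s\n", buf);
        count++;
    }

    freeaddrinfo(res);
    return count;
}
```

Calling list_addresses(name, AF_UNSPEC, 0, stdout) and then repeating the call with AF_INET or AI_ADDRCONFIG on the same multi-address name shows which entries drop out because of the requested family or flags rather than because of the DNS answer itself.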

This behavior is also consistent with DNS resolver semantics themselves. RFC 1035 defines truncation signaling via the TC bit, and RFC 1123 says a requester that supports TCP should retry over TCP when the answer is truncated [2][3]. In the ordinary DNS case, I would therefore not expect a conforming resolver stack to silently hand libc an arbitrary partial RRset.
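For reference, the truncation signal is a single bit in the fixed 12-byte DNS header described in RFC 1035 section 4.1.1. The sketch below shows how a stub resolver might test it before deciding to retry over TCP. The function name dns_response_truncated() and the two macros are hypothetical, chosen for illustration only, and the sketch does not reflect any particular libc's internals.

```c
#include <assert.h>
#include <stddef.h>

#define DNS_HEADER_LEN 12   /* fixed header size, RFC 1035 section 4.1.1 */
#define DNS_FLAG_TC    0x02 /* TC bit within the third header byte */

/*
 * Hypothetical helper: report whether a raw DNS response has the TC
 * (truncation) bit set. The third byte of the header holds, from the most
 * significant bit down: QR, Opcode (4 bits), AA, TC, RD.
 *
 * Returns 1 if truncated, 0 if not, and -1 if the buffer is too short to
 * contain a complete header.
 */
static int
dns_response_truncated(const unsigned char *msg, size_t len)
{
    assert(msg != NULL);

    if (len < DNS_HEADER_LEN)
        return -1;

    return (msg[2] & DNS_FLAG_TC) != 0;
}
```

A resolver that sees a return value of 1 for a UDP response would normally discard the partial answer and repeat the query over TCP, which is why a silently shortened RRset should not reach libc in the ordinary case.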

The cases where getaddrinfo() may legitimately omit addresses are mostly the same cases where connection behavior is already policy-sensitive anyway:

- mixed IPv4/IPv6 environments
- AI_ADDRCONFIG filtering
- IPv4-mapped address handling
- resolver policy rules
- non-DNS NSS sources

Those are not really random losses of DNS data. They are explicit host resolution and connectivity policy decisions.

What concerns me more about introducing a DNS client inside libpq is that we would no longer be following the same resolver path as the rest of the system. That is user-visible behavior, not merely an implementation detail.

For example, it risks bypassing or changing behavior around:

- /etc/hosts
- nsswitch.conf
- mDNS
- LDAP integration
- systemd-resolved policy
- split DNS
- VPN-specific resolver routing
- container/runtime-specific resolution

The current behavior may not be theoretically perfect from a DNS abstraction perspective, but it is operationally well understood and consistent with existing Unix networking expectations.

My concern is that we may be trading a relatively narrow theoretical weakness in the getaddrinfo() contract for a much broader compatibility and behavioral change in existing deployments.

[1] https://pubs.opengroup.org/onlinepubs/9799919799/functions/getaddrinfo.html
[2] https://datatracker.ietf.org/doc/html/rfc1035
[3] https://datatracker.ietf.org/doc/html/rfc1123#section-6.1.3.2
