| From: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Cc: | Daniel Schreiber <daniel(dot)schreiber(at)hrz(dot)tu-chemnitz(dot)de> |
| Subject: | Re: PostgreSQL 17: Bug in libpq when libpq is dlopened/closed multiple times |
| Date: | 2026-04-22 18:29:04 |
| Message-ID: | CAOYmi+kac3wEE3iqxHfHCNd_n2i-Or=n+Qk8_G24UZn2uz3DyQ@mail.gmail.com |
| Lists: | pgsql-bugs pgsql-hackers |
[moving to -hackers]

On Fri, Apr 17, 2026 at 12:14 PM Jacob Champion
<jacob(dot)champion(at)enterprisedb(dot)com> wrote:
>
> On Fri, Apr 17, 2026 at 7:33 AM Daniel Schreiber
> <daniel(dot)schreiber(at)hrz(dot)tu-chemnitz(dot)de> wrote:
> > my colleagues and I probably found a bug in libpq when libpq is dlopened
> > and closed multiple times during the lifetime of a process. In our setup
> > we use a PAM module which links to libpq. The process using PAM is
> > linked against openssl, so openssl is loaded during the complete
> > lifetime of the process whereas libpq is loaded only during PAM
> > authentication (and unloaded when PAM has finished).
> >
> > [snip]
> >
> > According to our findings every time a connection is established after
> > dlopening libpq one of the 127 available BIO_METHOD structures in
> > OpenSSL is consumed:
> > https://github.com/postgres/postgres/blob/REL_17_9/src/interfaces/libpq/fe-secure-openssl.c#L1987
>
> Right. I think in this *particular* case, we should simply skip the
> call to BIO_get_new_index(). We don't need it, IIUC.

Attached is a proposal to do that.

> But I think we may also need to set expectations on whether or not
> infinite dlopen/dlclose loops are supported in general. If we ever
> come across a situation in which a call to BIO_get_new_index() is
> necessary, that leak just fundamentally can't be plugged. The same is
> true for any third-party libraries (or their dependencies, or
> theirs...) that require "one-time", irreversible calls which can't be
> tracked after we're unloaded. And we can't push these concerns up to
> the top level application developer, because they don't know we exist.
>
> (I'd be surprised if this were the only such resource leak across all
> supported versions and combinations of Kerberos, OpenSSL, OpenLDAP,
> Curl, etc. etc. From a quick search, you're the first to report this
> in the ten years since the leak was introduced, so there may be more
> dragons where you're headed.)

If anyone has thoughts on that, I'd love to hear them. I don't mind
removing this unnecessary code in HEAD, or even backpatching as a
courtesy -- but if it were up to me, I would not guarantee zero global
resource leaks across libpq and its entire dependency graph. (Even if
we magically had control over all those dependencies, I think it'd
still be reasonable for libpq devs to use "allocate once and move on"
patterns... and I want to continue using those in my new code.)
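
To make the failure mode concrete, here is a minimal simulation of an "allocate once" guard that lives in an unloadable library while the resource it consumes lives in a long-lived one. It is only a sketch: it is not libpq or OpenSSL code, and it does not reflect the attached patch. The 127-slot limit is taken from the report quoted above; every name in it (host_state, plugin_state, host_get_new_index, and so on) is hypothetical, and the dlopen/dlclose cycle is modeled by simply resetting the plugin's state.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for state owned by the long-lived library (OpenSSL in the
 * report). It survives every load/unload cycle of the plugin. */
#define HOST_INDEX_SLOTS 127    /* limit quoted in the report */

typedef struct {
    int slots_used;
} host_state;

/* Hypothetical analogue of an irreversible "give me a new index" call:
 * returns a fresh index, or -1 once every slot has been handed out. */
static int host_get_new_index(host_state *host)
{
    if (host->slots_used >= HOST_INDEX_SLOTS)
        return -1;
    return ++host->slots_used;
}

/* Stand-in for the static variables of the unloadable library (libpq in
 * the report). An unload followed by a load resets them. */
typedef struct {
    bool initialized;
    int  index;
} plugin_state;

/* Models a fresh load: the plugin's statics start out zeroed. */
static void plugin_load(plugin_state *plugin)
{
    plugin->initialized = false;
    plugin->index = 0;
}

/* "Allocate once and move on": fine within a single load, but the guard
 * is forgotten at unload while the host's counter is not. */
static int plugin_connect(plugin_state *plugin, host_state *host)
{
    if (!plugin->initialized) {
        int idx = host_get_new_index(host);
        if (idx < 0)
            return -1;          /* host has run out of indexes */
        plugin->index = idx;
        plugin->initialized = true;
    }
    return 0;
}

/* Runs `loads` load/connect/unload cycles with `connects_per_load`
 * connections each; returns how many loads connected successfully. */
static int run_cycles(host_state *host, int loads, int connects_per_load)
{
    int ok = 0;
    for (int i = 0; i < loads; i++) {
        plugin_state plugin;
        bool all_ok = true;

        plugin_load(&plugin);
        for (int j = 0; j < connects_per_load; j++) {
            if (plugin_connect(&plugin, host) != 0)
                all_ok = false;
        }
        if (all_ok)
            ok++;
        /* Unload: plugin state goes out of scope, host state stays. */
    }
    return ok;
}
```

In this model, calling run_cycles() with 200 loads against a zeroed host_state succeeds for the first 127 loads and fails for the rest, while a single load uses one slot no matter how many connections it makes. The guard works as intended within one load; nothing on the plugin's side can remember it across an unload.
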
Thanks,
--Jacob
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-Remove-call-to-BIO_get_new_index-in-OpenSSL-code.patch | application/octet-stream | 3.0 KB |