Regression tests fail with musl libc because libpq.so can't be loaded

From: Wolfgang Walther <walther(at)technowledgy(dot)de>
To: PostgreSQL Bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Regression tests fail with musl libc because libpq.so can't be loaded
Date: 2024-03-16 12:38:48
Message-ID: fddd1cd6-dc16-40a2-9eb5-d7fef2101488@technowledgy.de
Lists: pgsql-bugs pgsql-hackers

Running the regression tests when building with musl libc fails, with
errors like the following:

ERROR: could not load library
"<builddir>/tmp_install/usr/lib/postgresql/libpqwalreceiver.so": Error
loading shared library libpq.so.5: No such file or directory (needed by
<builddir>/tmp_install/usr/lib/postgresql/libpqwalreceiver.so)

This was observed in Alpine Linux [1] and nixpkgs [2] a few years ago. I
have now looked into it, and this is what happens:

- The temporary install location is set via LD_LIBRARY_PATH in the
regression tests, so that postgres can find those libs.

- All tests that load an extension or shared module via dlopen() fail
when the loaded library in turn depends on another library in
tmp_install. I think in practice that is always libpq.so.

- LD_LIBRARY_PATH is used correctly to find the library passed to
dlopen(...), but is no longer taken into account when locating that
library's dependency, libpq.so. This step fails only with musl and works
fine with glibc.

I can reproduce this with a simple Dockerfile (attached), which uses the
library/postgres-alpine image, moves libpq.so to a different folder and
points LD_LIBRARY_PATH at it. Build and run the Dockerfile like this:

docker build . -t pg-musl && docker run --rm pg-musl

This Dockerfile can easily be adjusted to use the Debian image, which
shows that the same setup works just fine with glibc.

Even though this originated in "just" the regression tests, I'm filing
this as a bug, because:
- The docs explicitly mention LD_LIBRARY_PATH support to point at a
different /lib folder in [3].
- This can clearly break outside the test suite, as shown with the
Dockerfile.

I tried a few more things:
- When I create an /etc/ld-musl-$(ARCH).path file containing the path to
libpq.so's libdir, libpq.so can be found.
- When I add libpq.so's directory as an rpath to the postgres binary,
libpq.so can be found.

Neither result is surprising, but both confirm that musl's dynamic linker
works as expected. Only LD_LIBRARY_PATH seems not to be passed on.

To rule out a musl bug, I also put together a very simple test case: an
executable loads liba with dlopen(), liba depends on libb, and
LD_LIBRARY_PATH is set up as in the scenario above. This works fine when
compiled with either glibc or musl. Thus, I believe the problem lies
somewhere in how postgres loads those libraries.
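
For illustration only, the shape of such a dlopen() probe might look like
the sketch below. It is not the test case mentioned in this message: it
is written in Python, where ctypes.CDLL calls dlopen() underneath, and
the helper name try_dlopen and the library name liba.so are hypothetical.
It assumes liba.so is linked against another library (libb.so) and that
both sit in a directory reachable only through LD_LIBRARY_PATH.

```python
import ctypes
import sys


def try_dlopen(name):
    """Load a shared library via dlopen(); return (ok, error_message)."""
    try:
        # ctypes.CDLL() calls dlopen() underneath. A bare name such as
        # "liba.so" is resolved through the dynamic linker's search
        # path, which includes LD_LIBRARY_PATH.
        ctypes.CDLL(name)
    except OSError as exc:
        # A dependency of the library that cannot be located is also
        # reported here, in the message of the OSError.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # "liba.so" is a hypothetical name; pass another one as argv[1].
    ok, err = try_dlopen(sys.argv[1] if len(sys.argv) > 1 else "liba.so")
    print("loaded" if ok else "failed: " + err)
```

Run with LD_LIBRARY_PATH pointing at the directory that holds both
libraries; a failure message naming the dependency rather than the
library itself would match the symptom described above.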

Best,

Wolfgang

[1]: https://github.com/alpinelinux/aports/commit/d67ceb66a1ca9e1899071c9ef09fffba29fa0417#diff-2bd25b5172fc52319de1b09086ac0db6314d2e9fa73497979f5198f8caaec1b9

[2]: https://github.com/NixOS/nixpkgs/commit/09ffd722072291f00f2a54d7404eb568a15e562a

[3]: https://www.postgresql.org/docs/current/install-post.html

Attachment: Dockerfile (text/plain, 369 bytes)
