Rethinking -L switch handling and construction of LDFLAGS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Rethinking -L switch handling and construction of LDFLAGS
Date: 2018-04-01 17:38:15
Message-ID: 25214.1522604295@sss.pgh.pa.us
Lists: pgsql-hackers

I noticed that if I build with --with-libxml on my Mac platforms,
"make installcheck" stops working for certain contrib modules such
as postgres_fdw. I finally got around to diagnosing the reason why,
and it goes like this:

1. --with-libxml causes configure to include
-L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib
in the LDFLAGS value put into Makefile.global. That's because
"xml2-config --libs" emits that, and we do need it if we want to link
to the platform-supplied libxml2.

2. However, that directory also contains a symlink to the
platform-supplied libpq.

3. When we go to build postgres_fdw.so, the link command line looks like

ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -g -O2 -bundle -multiply_defined suppress -o postgres_fdw.so postgres_fdw.o option.o deparse.o connection.o shippable.o -L../../src/port -L../../src/common -L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib -L/usr/local/ssl/lib -Wl,-dead_strip_dylibs -L../../src/interfaces/libpq -lpq -bundle_loader ../../src/backend/postgres

The details of this might vary depending on your configure options,
but the key point is that the -L/Applications/... switch is before the
-L../../src/interfaces/libpq one. This means that the linker resolves
"-lpq" to the platform-supplied libpq, not the one in the build tree.
We can confirm that with

$ otool -L postgres_fdw.so
postgres_fdw.so:
	/usr/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.6.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)

So, quite aside from any problems stemming from using a 9.3-vintage libpq
with HEAD client code, we are stuck with a libpq that uses Apple's idea of
the default socket location, rather than what the rest of our build uses.
That explains the failures seen in "make installcheck", which look like

2018-04-01 13:09:48.744 EDT [10758] ERROR: could not connect to server "loopback"
2018-04-01 13:09:48.744 EDT [10758] DETAIL: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?

Of course, /var/pgsql_socket is *not* where my postmaster is putting
its socket.
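
As a rough illustration of the search-order effect described above, here is
a minimal Python sketch of how a linker settles on a file for "-lpq": it
walks the -L directories in command-line order and takes the first match.
This is a simplified model, not the real ld algorithm. The function name
resolve_library, the shortened /sdk/usr/lib path (standing in for the long
Xcode SDK directory), and the file names in "available" are all assumptions
made for the example.

```python
import posixpath


def resolve_library(link_args, name, available_files):
    """Return the first lib<name> file found along the -L dirs, or None.

    Simplified model: every -L applies to every -l, directories are
    searched in the order given, and the first hit wins.
    """
    search_dirs = [a[2:] for a in link_args if a.startswith("-L") and len(a) > 2]
    for directory in search_dirs:
        for suffix in (".dylib", ".a"):
            candidate = posixpath.join(directory, "lib" + name + suffix)
            if candidate in available_files:
                return candidate
    return None


# Hypothetical file layout: both the SDK directory (shortened stand-in
# path) and the build tree offer a libpq.
available = {
    "/sdk/usr/lib/libxml2.dylib",
    "/sdk/usr/lib/libpq.dylib",
    "../../src/interfaces/libpq/libpq.dylib",
}

# Same relative ordering as the failing link command: external -L first.
as_built = [
    "-L../../src/port",
    "-L../../src/common",
    "-L/sdk/usr/lib",
    "-L../../src/interfaces/libpq",
    "-lpq",
]

# Build-tree libpq directory moved ahead of the external one.
reordered = [
    "-L../../src/port",
    "-L../../src/common",
    "-L../../src/interfaces/libpq",
    "-L/sdk/usr/lib",
    "-lpq",
]

print(resolve_library(as_built, "pq", available))
print(resolve_library(reordered, "pq", available))
```

With the ordering from the failing build the sketch picks the SDK copy;
with the build-tree directory listed first it picks the in-tree libpq.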

In short, we need to deal more honestly with the positioning of -L
switches in link commands. Somebody's idea that we could embed
both -L and -l into $(libpq), and then pay basically no attention to
where that ends up in the final link command, is just too simplistic.

I think that we want to establish an ironclad rule that -L switches
referencing directories in our own build tree must appear before -L
switches referencing external libraries.
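
To make the proposed rule concrete, here is a small Python sketch that
checks a link command for a build-tree -L switch appearing after an external
one. It is only an illustration: the helper names are hypothetical, and the
test "relative path means build tree, absolute path means external" is an
assumption that happens to fit the command line shown earlier. It would not
hold for PGXS builds, for instance, where the internal directory is an
absolute libdir.

```python
import posixpath


def is_build_tree_dir(path):
    # Assumption for this sketch only: relative directories belong to the
    # build tree, absolute directories point at external libraries.
    return not posixpath.isabs(path)


def first_ordering_violation(link_args):
    """Return (external_dir, build_tree_dir) for the first build-tree -L
    that follows an external -L, or None if the ordering rule holds."""
    first_external = None
    for arg in link_args:
        if not arg.startswith("-L") or len(arg) == 2:
            continue  # not a -L<dir> switch
        path = arg[2:]
        if is_build_tree_dir(path):
            if first_external is not None:
                return (first_external, path)
        elif first_external is None:
            first_external = path
    return None


SDK_LIB = (
    "/Applications/Xcode.app/Contents/Developer/Platforms/"
    "MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib"
)

# The -L/-l portion of the postgres_fdw link command quoted above.
from_email = [
    "-L../../src/port",
    "-L../../src/common",
    "-L" + SDK_LIB,
    "-L/usr/local/ssl/lib",
    "-Wl,-dead_strip_dylibs",
    "-L../../src/interfaces/libpq",
    "-lpq",
]

# Same switches with the libpq directory moved ahead of the external ones.
fixed = [
    "-L../../src/port",
    "-L../../src/common",
    "-L../../src/interfaces/libpq",
    "-L" + SDK_LIB,
    "-L/usr/local/ssl/lib",
    "-Wl,-dead_strip_dylibs",
    "-lpq",
]

print(first_ordering_violation(from_email))
print(first_ordering_violation(fixed))
```

On the quoted command the check reports the SDK directory preceding
../../src/interfaces/libpq; on the reordered list it reports nothing.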

I don't have a concrete patch to propose yet, but the design idea
I have in mind is to split LDFLAGS into two or more parts, so that
-L switches for the build tree go in the first part and external
-L switches go in the second. It'd be sufficient
to have Makefile.global do something like

ifdef PGXS
LDFLAGS_INTERNAL = -L$(libdir)
else
LDFLAGS_INTERNAL = -L$(top_builddir)/src/port -L$(top_builddir)/src/common
endif
LDFLAGS = $(LDFLAGS_INTERNAL) @LDFLAGS@

and then teach relevant places that they need to add $(libpq) to
LDFLAGS_INTERNAL, not LDFLAGS. (Perhaps "BUILD" would be a better keyword
than "INTERNAL" here?) Not sure how that would play exactly with
Makefile.shlib's SHLIB_LINK, but maybe we need SHLIB_LINK_INTERNAL along
with SHLIB_LINK. I'd also like to try to clean up the mess that is
$(libpq_pgport), though I'm not sure just how yet.

Or we could try to create a full separation between -L and -l switches,
ending up with three or more parts for LDFLAGS not just two. But I'm
not sure if that gains anything.

I have no idea whether the MSVC build infrastructure has comparable
problems, and would not be willing to fix it myself if it does.
But I am willing to try to fix this in the gmake infrastructure.

Comments, better ideas?

regards, tom lane
