Re: Shared PostgreSQL libraries and symbol versioning

From: Christoph Berg <myon(at)debian(dot)org>
To: Pavel Raiskup <praiskup(at)redhat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Shared PostgreSQL libraries and symbol versioning
Date: 2018-05-24 10:08:16
Message-ID: 20180524100816.GA21320@msg.df7cb.de
Lists: pgsql-hackers

Re: Pavel Raiskup 2018-05-24 <101829257(dot)NN0XsvVvxK(at)nb(dot)usersys(dot)redhat(dot)com>
> > Interesting, thanks. How is this implemented? What do you mean by "newer
> > library" -- a new soname, or just a newer package build (without any other
> > change), for example?
>
> I've already looked into how libraries are handled by Debian :-)
> sorry for my ignorance. If I got it right, packages that provide
> some SONAME must also provide the *.shlibs and *.symbols files
> (in repo metadata, apt-cache `ls -1 /var/lib/dpkg/info/libpq*.s*`) and
> that info is then used to generate safe dependencies for other packages
> (I mean the 'Depends' field in e.g. `apt-cache show python-psycopg2`).

The whole system relies on upstream getting the SONAME right by not
breaking the ABI, i.e. symbols must only be added, never removed, and
never changed in semantics. For a given program with a set of symbols
used, dpkg-shlibdeps then reads the *.symbols file to determine the
minimum package version that this set of symbols is compatible with
when linking against a shared object.
(*.shlibs is the predecessor of the *.symbols system; it is inferior
because it only tracks a global minimum compatible version, instead of
looking at each symbol individually.)
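
The difference between the two schemes can be sketched in a few lines of
Python. This is only an illustration of the idea, not how dpkg-shlibdeps is
implemented: the symbol table, its version numbers, and the function names
are made up for the example, and versions are compared as plain dotted
numbers rather than with dpkg's real ordering rules.

```python
# Hypothetical excerpt of a *.symbols file: symbol -> minimum package
# version that provides it. The values are illustrative only.
SYMBOLS = {
    "PQconnectdb": "8.0",
    "PQescapeLiteral": "9.0",
    "PQencryptPasswordConn": "10",
}


def version_key(version):
    """Simplified ordering: plain dotted numbers only (assumption).

    Real Debian versions need dpkg's comparison rules (epochs, tildes,
    revisions); this helper deliberately ignores all of that.
    """
    return tuple(int(part) for part in version.split("."))


def minimum_version_symbols(used_symbols, symbols):
    """*.symbols style: the dependency is the highest minimum version
    among the symbols the program actually uses."""
    return max((symbols[name] for name in used_symbols), key=version_key)


def minimum_version_shlibs(symbols):
    """*.shlibs style: one global minimum for the whole library, so it
    has to cover the newest symbol whether the program uses it or not."""
    return max(symbols.values(), key=version_key)


if __name__ == "__main__":
    used = {"PQconnectdb", "PQescapeLiteral"}
    print("per-symbol dependency:", minimum_version_symbols(used, SYMBOLS))
    print("global dependency:    ", minimum_version_shlibs(SYMBOLS))
```

A program that never calls PQencryptPasswordConn gets a looser dependency
under the per-symbol scheme, while the global scheme forces the newest
version on every user of the library.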

> The thing is that those metadata files can be either generated
> automatically (by dpkg-gensymbols at package build-time?) or have to be
> maintained manually, which is the case of PostgreSQL packaging, see e.g.
> change for added symbol in v10:
> https://salsa.debian.org/postgresql/postgresql/commit/c3eba5c8d04177f81a7b4a043302a209f3d2d2e7

The *.symbols files are updated automatically by dpkg-gensymbols at
build time, but it makes sense to edit the result for more accurate
(or nicer) information. For example, in the change you're linking to,
dpkg-gensymbols might have used "10~beta1-1" as the minimum version,
but it's better to change that to 10~~ because it's shorter, and
because all version 10 packages feature that symbol; beta1 might just
have been the first package version built.
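
Why 10~~ works as a lower bound for every version 10 package: in Debian's
version ordering a tilde sorts before everything else, even before the end
of the string. The following is a minimal Python sketch of that comparison,
written from the ordering rules in Debian Policy rather than copied from
dpkg; the function names are invented for the illustration, and it does no
validation of malformed version strings.

```python
import functools
import string

DIGITS = set(string.digits)
LETTERS = set(string.ascii_letters)


def _order(char):
    """Sort weight of one character in a non-digit run."""
    if char in DIGITS:
        return 0
    if char in LETTERS:
        return ord(char)
    if char == "~":
        return -1  # tilde sorts before everything, even end of string
    return ord(char) + 256  # other punctuation sorts after letters


def _compare_part(a, b):
    """Compare two upstream-version or revision strings."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character using _order().
        while (i < len(a) and a[i] not in DIGITS) or (
            j < len(b) and b[j] not in DIGITS
        ):
            weight_a = _order(a[i]) if i < len(a) else 0
            weight_b = _order(b[j]) if j < len(b) else 0
            if weight_a != weight_b:
                return weight_a - weight_b
            i += 1
            j += 1
        # Digit run: compare numerically, an empty run counts as 0.
        start_i, start_j = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        num_a = int(a[start_i:i] or "0")
        num_b = int(b[start_j:j] or "0")
        if num_a != num_b:
            return num_a - num_b
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return <0, 0 or >0 if version a sorts before, equal to or after b."""
    epoch_a, upstream_a, revision_a = _split(a)
    epoch_b, upstream_b, revision_b = _split(b)
    if epoch_a != epoch_b:
        return epoch_a - epoch_b
    result = _compare_part(upstream_a, upstream_b)
    if result != 0:
        return result
    return _compare_part(revision_a, revision_b)


if __name__ == "__main__":
    versions = ["10-1", "10~beta1-1", "10~~"]
    print(sorted(versions, key=functools.cmp_to_key(compare_versions)))
    # prints ['10~~', '10~beta1-1', '10-1']
```

Since 10~~ sorts before 10~beta1-1, a dependency of ">= 10~~" is satisfied
by the betas and by every later version 10 package.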

> So basically, Debian packagers seem to take care of the symbol versioning
> downstream, and they could have this solved automatically if upstream
> tarball allowed the symbol versioning (so they could leverage the
> `dpkg-gensymbols` thing, instead of the manual work). Can anyone confirm?

No. Versioned symbols in shared objects have the advantage that they
allow upstream to incompatibly change symbols without bumping the
SONAME (provided the old version is still shipped), but they don't
remove the need to declare a dependency on the *package* version where
this symbol was introduced. That cannot be extracted from the
upstream symbol version information; it needs to be handled on the
packaging side.

> It's similar with RPM, except that the tooling is seemingly less powerful;
> we have to have the symbol versions baked into *.so file, otherwise it
> simply doesn't work. It has some pros, :-) it motivates us more to solve
> it upstream, if possible. But yeah, since we'll face the same problem in
> Fedora/RHEL/CentOS -- we'll have to have the problem solved somehow
> (at least downstream) ...

How does RPM solve the "depend on this package version" problem? By
declaring "Provides: PQencryptPasswordConn@Base" in the .so's package
for each symbol?

> > I don't really want to push it upstream; I'm just saying that we consider
> > this important enough to go the downstream-only route. At the same
> > time, I'm convinced that having it upstream is an almost trivial change
> > and worth having... So I'm trying to offer help.

If you do this downstream-only, it might create a giant maintenance
burden that you will have to carry forever, I'd think.

Christoph
