Re: Shared PostgreSQL libraries and ABI versioning

From: Pavel Raiskup <praiskup(at)redhat(dot)com>
To: Christoph Berg <myon(at)debian(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Shared PostgreSQL libraries and ABI versioning
Date: 2018-05-24 12:01:00
Message-ID: 1916023.EZebk5UjTt@nb.usersys.redhat.com
Lists: pgsql-hackers

Hi Christoph,

On Thursday, May 24, 2018 12:08:16 PM CEST Christoph Berg wrote:
> Re: Pavel Raiskup 2018-05-24 <101829257(dot)NN0XsvVvxK(at)nb(dot)usersys(dot)redhat(dot)com>
> > > Interesting, thanks. How is this implemented? What do you mean by "newer
> > > library" -- a new soname, or just a newer package build (without any other
> > > change), for example?
> >
> > I've already looked a bit into how libraries are handled by Debian :-)
> > sorry for my ignorance. If I got it right, if some packages provide
> > some SONAME then those must also provide the *.shlibs and *.symbols files
> > (in repo metadata, apt-cache `ls -1 /var/lib/dpkg/info/libpq*.s*`) and
> > that info is then used to generate safe dependencies for other packages
> > (I mean the 'Depends' field in e.g. `apt-cache show python-psycopg2`).
>
> The whole system relies on upstream getting the SONAME right by not
> breaking the ABI, i.e. symbols must only be added, never removed, and
> never changed in semantics. For a given program with a set of symbols
> used, dpkg-shlibdeps then reads the *.symbols file to determine the
> minimum package version that this set of symbols is compatible with
> when linking against a shared object.
> (*.shlibs is the predecessor of the *.symbols system which is inferior
> because it only tracks a global minimum compatible version, instead of
> looking at each symbol individually.)
>
> > The thing is that those metadata files can be either generated
> > automatically (by dpkg-gensymbols at package build-time?) or have to be
> > maintained manually, which is the case of PostgreSQL packaging, see e.g.
> > change for added symbol in v10:
> > https://salsa.debian.org/postgresql/postgresql/commit/c3eba5c8d04177f81a7b4a043302a209f3d2d2e7
>
> The *.symbols files maintain themselves automatically via
> dpkg-gensymbols at build time, but it makes sense to edit the result
> for more accurate (or nicer) information. For example in the change
> you're linking to, dpkg-gensymbols might have used
> "10~beta1-1" as minimum version, but it's better to
> change that to 10~~ because it's shorter, and because all version 10
> packages feature that symbol, and beta1 might just have been the first
> package version built.

Thanks for that sum-up!
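
To check my understanding of the per-symbol lookup, here is a minimal Python
sketch of the idea, not of dpkg-shlibdeps itself. The symbol table and the
version ordering below are hypothetical; the real tool reads the *.symbols
file and uses dpkg's own version comparison.

```python
# Hypothetical excerpt of a *.symbols table: symbol -> minimum package version.
# The concrete entries are assumptions made up for this illustration.
SYMBOL_MIN_VERSION = {
    "PQconnectdb@Base": "0",
    "PQencryptPasswordConn@Base": "10~~",
}

# Assumed ordering, oldest first. This stands in for dpkg's real version
# comparison, which this sketch does not implement.
VERSION_ORDER = ["0", "10~~"]


def minimum_package_version(used_symbols):
    """Return the newest 'minimum version' among the symbols a program uses."""
    versions = [SYMBOL_MIN_VERSION[symbol] for symbol in used_symbols]
    return max(versions, key=VERSION_ORDER.index)


if __name__ == "__main__":
    print(minimum_package_version(["PQconnectdb@Base"]))
    print(minimum_package_version(
        ["PQconnectdb@Base", "PQencryptPasswordConn@Base"]))
```

A program that uses only old symbols gets a loose dependency, while using a
single newer symbol raises the minimum version for the whole package.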

> > So basically, Debian packagers seem to take care of the symbol versioning
> > downstream, and they could have this solved automatically if the upstream
> > tarball allowed symbol versioning (so they could leverage the
> > `dpkg-gensymbols` thing, instead of the manual work). Can anyone confirm?
>
> No. Versioned symbols in shared objects have the advantage that they
> allow upstream to incompatibly change symbols without bumping the
> SONAME (provided the old version is still shipped),

Well, it's a possible benefit, yes. But changing symbols without bumping the
SONAME would really be Linux-only and not a very easy change upstream, and
it's not what I'm asking for. At least as long as we claim libpq has a
backward compatible API/ABI, and if some incompatible change happens, it is
generally acceptable to bump the soname version.

The versioning semantics I'd like to have are the "Solaris OS" way; see
Ulrich's dsohowto.pdf, chapter "3.3 ABI Versioning". That is, a newly added
symbol means the API gets a newer version. By looking at the ELF, you can see
what ABI version you need.

Also from the upstream POV, symbol versions bring another benefit. Once we do
bump the SONAME one day, a server process may accidentally get two versions of
the libpq library loaded (through transitive dependencies, e.g. through
PostgreSQL server modules). Without symbol versions there's a danger that
plugins will be executed against an incompatible ABI, which might lead to
disaster.

> but they don't remove the need to declare a dependency on the *package*
> version where this symbol was introduced. That is not possible to
> extract from the upstream symbol version information, it needs to be
> handled on the packaging side.

But if an added symbol means a new ABI version, then you can bake the ABI
version into Depends (Requires), and you can be sure that if you have a
package which provides this ABI, you are fine. The "package name" or "package
version" providing the "library ABI" is then an orthogonal thing you don't
have to care about.

I started by speaking about symbol versioning, but that was a bad idea - I
should have started with ABI/API versioning. /me is changing subject.

> > It's similar with RPM, except that the tooling is seemingly less powerful;
> > we have to have the symbol versions baked into *.so file, otherwise it
> > simply doesn't work. It has some pros, :-) it motivates us more to solve
> > it upstream, if possible. But yeah, since we'll face the same problem in
> > Fedora/RHEL/CentOS -- we'll have to have the problem solved somehow
> > (at least downstream) ...
>
> How does RPM solve the "depend on this package version" problem? By
> declaring "Provides: PQencryptPasswordConn(at)Base" in the .so's package
> for each symbol?

We automatically generate Requires:/Provides: entries like
'lib<NAME>.so.<VERSION>(ABI_VERSION)', e.g.:

$ rpm -q --provides xz-libs | grep so.5
liblzma.so.5()(64bit)
liblzma.so.5(XZ_5.0)(64bit)
liblzma.so.5(XZ_5.2)(64bit)
liblzma.so.5
liblzma.so.5(XZ_5.0)
liblzma.so.5(XZ_5.2)
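
To make the shape of those entries explicit, here is a short Python sketch
that splits each capability into soname, ABI version and arch marker, and
checks a requirement against the provided list. The function names are made up
for this example, and the check assumes that such capabilities are matched by
exact name; it is not RPM's dependency solver.

```python
import re

# soname, then an optional "(ABI_VERSION)", then an optional "(arch marker)".
CAPABILITY = re.compile(
    r"^(?P<soname>[^()\s]+)"
    r"(?:\((?P<abi>[^()]*)\))?"
    r"(?:\((?P<arch>[^()]*)\))?$"
)

# The Provides list from the xz-libs output above.
PROVIDES = [
    "liblzma.so.5()(64bit)",
    "liblzma.so.5(XZ_5.0)(64bit)",
    "liblzma.so.5(XZ_5.2)(64bit)",
    "liblzma.so.5",
    "liblzma.so.5(XZ_5.0)",
    "liblzma.so.5(XZ_5.2)",
]


def parse_capability(line):
    """Split one capability into (soname, abi_version, arch); empty -> None."""
    match = CAPABILITY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized capability: {line!r}")
    return (
        match.group("soname"),
        match.group("abi") or None,
        match.group("arch") or None,
    )


def is_satisfied(requirement, provides):
    """Assumption: a requirement is met when the same name is provided."""
    return requirement.strip() in {p.strip() for p in provides}


if __name__ == "__main__":
    for entry in PROVIDES:
        print(parse_capability(entry))
    print(is_satisfied("liblzma.so.5(XZ_5.2)(64bit)", PROVIDES))
```

A binary built against a symbol from a newer ABI version would carry a
Requires: that no older library package provides, which is exactly the safety
we are after.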

> > > I don't really want to push it upstream; I'm just saying that we consider
> > > this to be important enough to go the downstream-only-fix way. At the same
> > > time, I'm convinced having it upstream is an almost trivial change and worth
> > > having... So I'm trying to offer help.
>
> If you do this downstream-only, it might create a giant maintenance
> burden that you have to carry on forever, I'd think.

Yes. It's not for free, but still an acceptable/necessary thing :-(. I'd
really love to deal with this upstream first.

Pavel

> Christoph
>
