Re: libpq's multi-threaded SSL callback handling is busted

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: libpq's multi-threaded SSL callback handling is busted
Date: 2015-02-12 09:46:01
Message-ID: 20150212094601.GW21017@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
>
> Andres Freund writes:
>
> > On 2015-02-09 18:17:14 +0100, Jan Urbański wrote:
> >> First of all, the current behaviour is crazy. We're setting and unsetting the
> >> locking callback every time a connection is made/closed, which is not how
> >> OpenSSL is supposed to be used. The *application* using libpq should set a
> >> callback before it starts threads, it's no business of the library's.
> >
> > I don't buy this argument at all. Delivering a libpq that's not
> > threadsafe in some circumstances is a really bad idea.
>
> I knew this would be a hard sell :( What I know is that the current situation
> is not good and leaving it as-is causes real grief.

It certainly causes less grief than just breaking working
applications. While really shitty the current situation works in a large
number of scenarios. It's not that common to have several users of
openssl in an application *and* unload libpq.

> > I fail to see why it's any better to do it in the VM. That relies on
> > either always loading the VM's openssl module, even if you don't
> > actually need it because the library you use deals with openssl. It
> > prevents other implementations of openssl in the VM.
>
> The callbacks are a thing that's owned by the application, so libraries have no
> business in setting them up. The way OpenSSL's API is specified (very
> unfortunately IMHO) is that before you use OpenSSL from threads, you need to
> set the callbacks. If you don't control your application's startup (like when
> you're a script executed through a VM), you assume the VM took care of it at
> startup. If you're a library, you assume the user took care of it, as they
> should.

That's a bogus argument. The VM cannot setup up every library in the
world in a correct way. It'd be obviously be completely insane to load
the ssl module in every library just because some part of some random
application might need it. In fact, the ssl library for python does
pretty much the same thing as libpq does (although it's slightly more
careful). It's not the VM at all.

> > What I think we should do is the following:
> >
> > * Emphasize the fact that it's important to use PQinitSSL(false) if you
> > did things yourself.
>
> PQinitOpenSSL(true, false) if anything. The reason that function got introduced
> is precisely because PQinitSSL(false) isn't exactly right, see
> http://www.postgresql.org/message-id/49820D7D.7030902@esilo.com

Well, that really depends on what the application is actually using...

> That doesn't solve the problem of the Python deadlock, where you're not at
> leisure to call a C function at the beginning of your module.

We could just never unload the hooks...

> > * If there's already callbacks set: Remember that fact and don't
> > overwrite. In the next major version: warn.
>
> So yeah, that was my initial approach - check if callbacks are set, don't do
> the dance if they are. It felt like a crutch, though, and racy at that. There's
> no atomic way to test-and-set those callbacks. The window for racyness is
> small, though.

If you do that check during library initialization instead of every
connection it shouldn't be racy - if that part is run in a multithreaded
fashion you're doing something crazy.

> > * Assert or something if the callback when unregistering isn't the same
> > as it was when registering. That's pretty much guaranteed to cause
> > issues.
>
> So let me try another proposal and see if it doesn't set alarm bells off.
>
> * When opening a connection, if there's a callback set, don't overwrite it
> (small race window).

> * When closing a connection, if the callback set is not
> pq_lockingcallback/pq_threadidcallback, don't NULL it (small race window)

If we do the tests once during initialization there shouldn't be a
race....

> Asserts in frontend code are impossible, right? There's no way to warn, either.

You can write to stderr...

> That would be a ~8 line patch. Does it feel back-patcheable?

I think we first need to find out what we all can agree on before
deciding about that.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Anastasia Lubennikova 2015-02-12 10:40:25 Re: Index-only scans for GiST.
Previous Message Naoya Anzai 2015-02-12 08:44:46 Re: Table-level log_autovacuum_min_duration