Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Benjamin <davidben(at)google(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions
Date: 2024-02-15 21:17:41
Message-ID: 20240215211741.qrtdvoumo7ff26hz@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-02-11 13:19:00 -0500, David Benjamin wrote:
> I've attached a patch for the master branch to fix up the custom BIOs used
> by PostgreSQL, in light of the issues with the OpenSSL update recently.
> While c82207a548db47623a2bfa2447babdaa630302b9 (switching from BIO_get_data
> to BIO_get_app_data) resolved the immediate conflict, I don't think it
> addressed the root cause, which is that PostgreSQL was mixing up two BIO
> implementations, each of which expected to be driving the BIO.

Yea, that's certainly not nice - and I think we've been somewhat lucky it
hasn't caused more issues. There's some nasty possibilities, e.g. sock_ctrl()
partially enabling ktls without our initialization having called
ktls_enable(). Right now that just means ktls is unusable, but it's not hard
to imagine accidentally ending up sending unencrypted data.

I've in the past looked into not using a custom BIO [1], but I have my doubts
about that being a good idea. I think medium term we want to be able to do
network IO asynchronously, which seems quite hard to do when using openssl's
socket BIO.

> Once we've done that, we're free to use BIO_set_data. While BIO_set_app_data
> works fine, I've reverted back to BIO_set_data because it's more commonly used.
> app_data depends on OpenSSL's "ex_data" mechanism, which is a tad heavier under
> the hood.

At first I was a bit wary of that, because it requires us to bring back the
fallback implementation. But you're right, it's noticeably heavier than
BIO_get_data(), and we do call BIO_get_app_data() fairly frequently.

> That leaves ctrl. ctrl is a bunch of operations (it's ioctl). The only
> operation that matters is BIO_CTRL_FLUSH, which is implemented as a no-op. All
> other operations are unused. It's once again good that they're unused because
> otherwise OpenSSL might mess with postgres's socket, break the assumptions
> around interrupt handling, etc.

How did you determine that only FLUSH is required? I didn't even really find
documentation about what the intended semantics actually are.

E.g. should we implement BIO_CTRL_EOF? Sure, it wasn't really supported so
far, because we never set it, but is that right? What about
BIO_CTRL_GET_CLOSE/BIO_CTRL_SET_CLOSE?

Another issue is that 0 doesn't actually seem like the universal error return
- e.g. BIO_C_GET_FD seems to return -1, because 0 is a valid fd.

As of your patch the bio doesn't actually have an FD anymore, should it still
set BIO_TYPE_DESCRIPTOR?

> +static long
> +my_sock_ctrl(BIO *h, int cmd, long num, void *ptr)
> +{
> + long res = 0;
> +
> + switch (cmd)
> + {
> + case BIO_CTRL_FLUSH:
> + /* libssl expects all BIOs to support BIO_flush. */
> + res = 1;
> + break;
> + }
> +
> + return res;
> +}

I'd move the res = 0 into a default: block. That way the compiler can warn if
some case doesn't set it in all branches.

> static BIO_METHOD *
> my_BIO_s_socket(void)
> {

Wonder if we should rename this. It's pretty odd that we still call it's not
really related to s_socket anymore, and doesn't even really implement the same
interface (e.g. get_fd doesn't work anymore). Similarly, my_SSL_set_fd()
doesn't actually call set_fd() anymore, which sure seems odd.

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/20210715021747.h2hih7mw56ivyntt%40alap3.anarazel.de

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Maiquel Grassi 2024-02-15 22:16:33 RE: Psql meta-command conninfo+
Previous Message Noah Misch 2024-02-15 20:48:16 Re: 035_standby_logical_decoding unbounded hang