Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

From: David Benjamin <davidben(at)google(dot)com>
To: daniel(at)yesql(dot)se
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions
Date: 2024-02-15 03:58:17
Message-ID: CAF8qwaBe5YYEtTfQHEE2skiTHDgc3Mgqo=vH+nsNok+UywXztg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

By the way, I'm unable to add the patch to the next commitfest due to the
cool off period for new accounts. How long is that period? I don't suppose
there's a way to avoid it?

On Mon, Feb 12, 2024 at 11:31 AM David Benjamin <davidben(at)google(dot)com> wrote:

> On Mon, Feb 12, 2024 at 9:38 AM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>
>> > On 11 Feb 2024, at 19:19, David Benjamin <davidben(at)google(dot)com> wrote:
>> > It turns out the parts that came from the OpenSSL socket BIO were a
>> no-op, and in fact PostgreSQL is relying on it being a no-op. Instead, it's
>> cleaner to just define a custom BIO the normal way, which then leaves the
>> more standard BIO_get_data mechanism usable. This also avoids the risk that
>> a future OpenSSL will add a now BIO_ctrl to the socket type, with libssl
>> calling into it, and then break some assumptions made by PostgreSQL.
>>
>> + case BIO_CTRL_FLUSH:
>> + /* libssl expects all BIOs to support BIO_flush.
>> */
>> + res = 1;
>> + break;
>>
>> Will this always be true? Returning 1 implies that we have flushed all
>> data on
>> the socket, but what if we just before called BIO_set_retry..XX()?
>>
>
> The real one is also just a no-op. :-)
> https://github.com/openssl/openssl/blob/master/crypto/bio/bss_sock.c#L215
>
> This is used in places like buffer BIO or the FILE* BIO, where BIO_write
> might accept data, but stage it into a buffer, to be flushed later. libssl
> ends up calling BIO_flush at the end of every flight, which in turn means
> that BIOs used with libssl need to support it, even if to return true
> because there's nothing to flush. (Arguably TCP sockets could have used a
> flush concept, to help control Nagle's algorithm, but for better or worse,
> that's a socket-wide TCP_NODELAY option, rather than an explicit flush
> call.)
>
> BIO_set_retry.. behaves like POSIX I/O, where a failed EWOULDBLOCK write
> is as if you never wrote to the socket at all and doesn't impact
> socket state. That is, the data hasn't been accepted yet. It's not expected
> for BIO_flush to care about the rejected write data. (Also I don't believe
> libssl will ever trigger this case.) It's confusing because unlike an
> EWOULDBLOCK errno, BIO_set_retry.. *is* itself BIO state, but that's just
> because the BIO calling convention is goofy and didn't just return the
> error out of the return value. So OpenSSL just stashes the bit on the BIO
> itself, for you to query out immediately afterwards.
>
>
>> > I've attached a patch which does that. The existing SSL tests pass with
>> it, tested on Debian stable. (Though it took me a few iterations to figure
>> out how to run the SSL tests, so it's possible I've missed something.)
>>
>> We've done a fair bit of work on making them easier to run, so I'm
>> curious if
>> you saw any room for improvements there as someone coming to them for the
>> first
>> time?
>>
>
> A lot of my time was just trying to figure out how to run the tests in
> the first place, so perhaps documentation? But I may just have been looking
> in the wrong spot and honestly didn't really know what I was doing. I can
> try to summarize what I did (from memory), and perhaps that can point to
> possible improvements?
>
> - I looked in the repository for instructions on running the tests and
> couldn't find any. At this point, I hadn't found src/test/README.
> - I found
> https://wiki.postgresql.org/wiki/Developer_FAQ#How_do_I_test_my_changes.3F,
> linked from https://www.postgresql.org/developer/
> - I ran the configure build with --enable-cassert, ran make check, tests
> passed.
> - I wrote my patch and then spent a while intentionally adding bugs to see
> if the tests would catch it (I wasn't sure whether there was ssl test
> coverage), finally concluding that I wasn't running any ssl tests
> - I looked some more and found src/test/ssl/README
> - I reconfigured with --enable-tap-tests and ran make check
> PG_TEST_EXTRA=ssl per those instructions, but the SSL tests still weren't
> running
> - I grepped for PG_TEST_EXTRA and found references in the CI config, but
> using the meson build
> - I installed meson, mimicked a few commands from the CI. That seemed to
> work.
> - I tried running only the ssl tests, looking up how you specify
> individual tests in meson, to make my compile/test cycles a bit faster, but
> they failed.
> - I noticed that the first couple "tests" were named like setup tasks, and
> guessed that the ssl tests depended on this setup to run. But by then I
> just gave up and waited out the whole test suite per run. :-)
>
> Once I got it running, it was quite smooth. I just wasn't sure how to do
> it.
>
> David
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrei Lepikhov 2024-02-15 04:11:38 Re: Memory consumed by paths during partitionwise join planning
Previous Message shveta malik 2024-02-15 03:52:23 Re: About a recently-added message