Re: basebackup/lz4 crash

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Jeevan Ladhe <jeevanladhe(dot)os(at)gmail(dot)com>, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: basebackup/lz4 crash
Date: 2022-03-30 16:57:35
Message-ID: CA+TgmoYx3wcnvqF_+h=GF43FCdsPZ624bV52DkhGSEWOPmoKuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 30, 2022 at 10:35 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> It looks like maybe this code was copied from
> bbstreamer_lz4_compressor_finalize() which has an Assert rather than expanding
> the buffer like in bbstreamer_lz4_compressor_new()

Hmm, this isn't great. On the server side, we set up a chain of bbsink
buffers that can't be resized later. Each bbsink tells the next bbsink
how to make its buffer, but the successor bbsink has control of that
buffer and resizing it on-the-fly is not allowed. It looks like
bbstreamer_lz4_compressor_new() is mimicking that logic, but not well.
It sets the initial buffer size to
LZ4F_compressBound(streamer->base.bbs_buffer.maxlen, prefs), but
streamer->base.bbs_buffer.maxlen is not any kind of promise from the
caller about future chunk sizes. It's just whatever initStringInfo()
happens to do. My guess is that Dipesh thought that the buffer
wouldn't need to be resized because "we made it big enough already"
but that's not the case. The server knows how much data it is going to
read from disk at a time, but the client has to deal with whatever the
server sends.

I think your proposed change is OK, modulo some comments. But I think
maybe we ought to delete all the stuff related to compressed_bound
from bbstreamer_lz4_compressor_new() as well, because I don't see that
there's any point. And then I think we should also add logic similar
to what you've added here to bbstreamer_lz4_compressor_finalize(), so
that we're not making the assumption that the buffer will get enlarged
at some earlier point.

Thoughts?

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-03-30 16:58:26 Re: pgsql: Add 'basebackup_to_shell' contrib module.
Previous Message Andres Freund 2022-03-30 16:54:50 Re: pgsql: Add 'basebackup_to_shell' contrib module.