Re: refactoring basebackup.c

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
Cc: tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: refactoring basebackup.c
Date: 2022-01-18 21:27:18
Message-ID: CA+TgmoZ5_j+ZLf66_EioReGXvYTCkgQZju9SDq8yg_QbY_1vTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 18, 2022 at 9:43 AM Jeevan Ladhe
<jeevan(dot)ladhe(at)enterprisedb(dot)com> wrote:
> The patch surely needs some grooming, but I am expecting some initial
> review, specially in the area where we are trying to close the zstd stream
> in bbsink_zstd_end_archive(). We need to tell the zstd library to end the
> compression by calling ZSTD_compressStream2() thereby sending a
> ZSTD_e_end flag. But, this also needs some input string, which per
> example[1] line # 686, I have taken as an empty ZSTD_inBuffer.

As far as I can see, this is correct. I found
https://zstd.docsforge.com/dev/api-documentation/#streaming-compression-howto
which seems to endorse what you've done here.

One (minor) thing that I notice is that, the way you've written the
loop in bbsink_zstd_end_archive(), I think it will typically call
bbsink_archive_contents() twice. It will flush whatever is already
present in the next sink's buffer as a result of the previous calls to
bbsink_zstd_archive_contents(), and then it will call
ZSTD_compressStream2() which will partially refill the buffer you just
emptied, and then there will be nothing left in the internal buffer,
so it will call bbsink_archive_contents() again. But ... the initial
flush may not have been necessary. It could be that there was enough
space already in the output buffer for the ZSTD_compressStream2() call
to succeed without a prior flush. So maybe:

do
{
yet_to_flush = ZSTD_compressStream2(..., ZSTD_e_end);
check ZSTD_isError here;
if (mysink->zstd_outBuf.pos > 0)
bbsink_archive_contents();
} while (yet_to_flush > 0);

I believe this might be very slightly more efficient.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-01-18 21:32:33 Re: Add last commit LSN to pg_last_committed_xact()
Previous Message Andrew Dunstan 2022-01-18 21:23:52 Re: SQL/JSON: JSON_TABLE