Re: refactoring basebackup.c

From: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: refactoring basebackup.c
Date: 2021-10-14 17:20:55
Message-ID: CAOgcT0OarU7csuf6Obgmg96DdjpGKUSETx5d6ss3eJx7VD+bdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks, Robert for reviewing the patch.

On Tue, Oct 12, 2021 at 11:09 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

This is the only place where CHUNK_SIZE gets used, and I don't think I
> see any point to it. I think the 5th argument to LZ4F_compressUpdate
> could just be avail_in. And as soon as you do that then I think
> bbsink_lz4_archive_contents() no longer needs to be a loop.
>

Agree. Removed the CHUNK_SIZE and the loop.

>
> + /* First of all write the frame header to destination buffer. */
> + headerSize = LZ4F_compressBegin(mysink->ctx,
> + mysink->base.bbs_next->bbs_buffer,
> + mysink->base.bbs_next->bbs_buffer_length,
> + &mysink->prefs);
>
> + compressedSize = LZ4F_compressEnd(mysink->ctx,
> + mysink->base.bbs_next->bbs_buffer + mysink->bytes_written,
> + mysink->base.bbs_next->bbs_buffer_length - mysink->bytes_written,
> + NULL);
>
> I think there's some issue with these two chunks of code. What happens
> if one of these functions wants to write more data than will fit in
> the output buffer? It seems like either there needs to be some code
> someplace that ensures adequate space in the output buffer at the time
> of these calls, or else there needs to be a retry loop that writes as
> much of the data as possible, flushes the output buffer, and then
> loops to generate more output data. But there's clearly no retry loop
> here, and I don't see any code that guarantees that the output buffer
> has to be large enough (and in the case of LZ4F_compressEnd, have
> enough remaining space) either. In other words, all the same concerns
> that apply to LZ4F_compressUpdate() also apply here ... but in
> LZ4F_compressUpdate() you seem to BOTH have a retry loop and ALSO code
> to make sure that the buffer is certain to be large enough (which is
> more than you need, you only need one of those) and here you seem to
> have NEITHER of those things (which is not enough, you need one or the
> other).
>

Fair enough. I have made the change in the bbsink_lz4_begin_backup() to
make sure we reserve enough extra bytes for the header and the footer those
are written by LZ4F_compressBegin() and LZ4F_compressEnd() respectively.
The LZ4F_compressBound() when passed the input size as "0", would give
the upper bound for output buffer needed by the LZ4F_compressEnd().

How about instead using memset() to zero the whole thing and then
> omitting the zero initializations? That seems like it would be less
> fragile, if the upstream structure definition ever changes.
>

Made this change.

Please review the patch, and let me know your comments.

Regards,
Jeevan Ladhe

Attachment Content-Type Size
lz4_compress_v5.patch application/octet-stream 10.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-10-14 17:24:12 Re: [RFC] building postgres with meson
Previous Message Isaac Morland 2021-10-14 16:44:37 Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?