Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: yangyz <1197620467(at)qq(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart
Date: 2026-02-25 13:54:03
Message-ID: 79349F18-1A09-4E90-96FF-DB07E10CD32D@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Feb 25, 2026, at 21:10, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>
>> On 25 Feb 2026, at 13:41, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>>> On Feb 25, 2026, at 18:21, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>>>
>>>> On 25 Feb 2026, at 07:31, yangyz <1197620467(at)qq(dot)com> wrote:
>>>
>>>> 2.Performance Overhead
>>>> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized. Since these memory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an unnecessary consumption of CPU resources.
>>>
>>> When proposing a performance improvement it's important to provide some level
>>> of benchmarks to show the improvement. Is removing this memset noticeable?
>>
>> I don’t think this patch is about performance. Although removing the memset might save a few CPU cycles, the real benefit seems to be cleanup and consistency. The memset appears unnecessary, and similar functions don’t use it, so I think this change mainly improves maintainability.
>
> I would argue the opposite, clearing a buffer before passing it to an external
> library function writing to it seems the right thing to do unless it can be
> proven to regress performance too much. Also, "appears unnecessary" doesn't
> instill enough confidence to perform a change IMO.
>
> --
> Daniel Gustafsson

As I pointed out earlier, ReadDataFromArchiveLZ4() has a very similar loop that doesn’t zero out the output buffer:
```
while (readp < readend)
{
size_t out_size = DEFAULT_IO_BUFFER_SIZE;
size_t read_size = readend - readp;

status = LZ4F_decompress(ctx, outbuf, &out_size,
readp, &read_size, &dec_opt);
if (LZ4F_isError(status))
pg_fatal("could not decompress: %s",
LZ4F_getErrorName(status));

ahwrite(outbuf, 1, out_size, AH);
readp += read_size;
}
```

Do you think we should add a memset there? There are a couple of more callers of LZ4F_decompress that don’t zero out the output buffer.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2026-02-25 13:54:09 Re: Adding REPACK [concurrently]
Previous Message Ashutosh Sharma 2026-02-25 13:51:17 Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication