Re: Fix uninitialized PruneFreezeResult in pruneheap and vacuumlazy

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Fix uninitialized PruneFreezeResult in pruneheap and vacuumlazy
Date: 2025-12-11 22:36:50
Message-ID: 81B5A4FF-5D9E-4C1A-AA3E-7E579AE176AC@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Dec 11, 2025, at 22:59, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
>
> On Wed, Dec 10, 2025 at 11:02 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>> While reviewing Melanie's patch [1], I found this bug where presult is not initialized. Let me explain the logic.
>
> Thanks for looking closely at the code.
>
>> static int
>> lazy_scan_prune(LVRelState *vacrel,
>> presult, &prstate); <== immediately pass uninitialized presult to prune_freeze_setup
>>
>> Then in prune_freeze_setup():
>>
>> prstate->deadoffsets = (OffsetNumber *) presult->deadoffsets; <== presult->deadoffsets could be a random value
>>
>> Attached is a simple fix by just initializing presult in the first place with {0}.
>
> I don't think zero-initializing deadoffsets is needed. We don't read
> offsets from it in heap_page_prune_and_freeze() -- it's a result
> variable. We only set offsets in it (see heap_prune_record_dead()).
> And because we track exactly how many are initialized in
> prstate->lpdead_items, the caller (lazy_scan_heap() via
> lazy_scan_prune()) will only access those dead offsets which have been
> initialized. I think this is a pretty common pattern in C. We don't
> zero-initialize the other arrays of offsets in the PruneState
> (redirected, dead, etc) and, for example, lazy_scan_noprune() doesn't
> zero-initialize the deadoffsets array that it fills in.

Thanks for the explanation. I didn’t notice deadoffsets is an array, so prune_freeze_setup() only assign the array address to prstate, which doesn’t care about content stored in the array. In this case, initializing presult is not required.

>
> The reason PruneFreezeResult is passed into prune_freeze_setup() is
> that we save a pointer to the deadoffsets array in the PruneState
> instead of having a copy of the whole array (to save stack space and
> effort copying the array from PruneState into PruneFreezeResult at the
> end).
>
> Other than that, we wait to initialize PruneFreezeResult's members
> until the end of heap_page_prune_and_freeze() to make it clear that we
> are actually setting all the members. If we filled them out throughout
> the various functions and helpers, it would be less clear that we have
> filled in all the return values.

I don’t get this point. presult is a local variable defined in the caller function, filling with random values, there is no way to distinct if a field has been set or not because of random values. From this perspective, zero-out presult may make it clear that we are actually setting the members.

>
> I could add a comment to prune_freeze_setup() where we save the
> deadoffsets pointer that explains why we are doing that instead of
> just having a deadoffsets array in the PruneState. Would that help
> with the confusion?
>

That will be great.

From “clearly knowing which members are set” perspective, I still feel initializing presult = {0} is useful, at least harmless. There are only 2 places, not a big change.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-12-11 22:39:58 Re: Proposal: Add a callback data parameter to GetNamedDSMSegment
Previous Message Sami Imseih 2025-12-11 22:12:02 Re: Proposal: Add a callback data parameter to GetNamedDSMSegment