| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
| Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Fix uninitialized PruneFreezeResult in pruneheap and vacuumlazy |
| Date: | 2025-12-11 22:36:50 |
| Message-ID: | 81B5A4FF-5D9E-4C1A-AA3E-7E579AE176AC@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Dec 11, 2025, at 22:59, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
>
> On Wed, Dec 10, 2025 at 11:02 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>> While reviewing Melanie's patch [1], I found this bug where presult is not initialized. Let me explain the logic.
>
> Thanks for looking closely at the code.
>
>> static int
>> lazy_scan_prune(LVRelState *vacrel,
>> presult, &prstate); <== immediately pass uninitialized presult to prune_freeze_setup
>>
>> Then in prune_freeze_setup():
>>
>> prstate->deadoffsets = (OffsetNumber *) presult->deadoffsets; <== presult->deadoffsets could be a random value
>>
>> Attached is a simple fix by just initializing presult in the first place with {0}.
>
> I don't think zero-initializing deadoffsets is needed. We don't read
> offsets from it in heap_page_prune_and_freeze() -- it's a result
> variable. We only set offsets in it (see heap_prune_record_dead()).
> And because we track exactly how many are initialized in
> prstate->lpdead_items, the caller (lazy_scan_heap() via
> lazy_scan_prune()) will only access those dead offsets which have been
> initialized. I think this is a pretty common pattern in C. We don't
> zero-initialize the other arrays of offsets in the PruneState
> (redirected, dead, etc) and, for example, lazy_scan_noprune() doesn't
> zero-initialize the deadoffsets array that it fills in.
Thanks for the explanation. I didn’t notice deadoffsets is an array, so prune_freeze_setup() only assign the array address to prstate, which doesn’t care about content stored in the array. In this case, initializing presult is not required.
>
> The reason PruneFreezeResult is passed into prune_freeze_setup() is
> that we save a pointer to the deadoffsets array in the PruneState
> instead of having a copy of the whole array (to save stack space and
> effort copying the array from PruneState into PruneFreezeResult at the
> end).
>
> Other than that, we wait to initialize PruneFreezeResult's members
> until the end of heap_page_prune_and_freeze() to make it clear that we
> are actually setting all the members. If we filled them out throughout
> the various functions and helpers, it would be less clear that we have
> filled in all the return values.
I don’t get this point. presult is a local variable defined in the caller function, filling with random values, there is no way to distinct if a field has been set or not because of random values. From this perspective, zero-out presult may make it clear that we are actually setting the members.
>
> I could add a comment to prune_freeze_setup() where we save the
> deadoffsets pointer that explains why we are doing that instead of
> just having a deadoffsets array in the PruneState. Would that help
> with the confusion?
>
That will be great.
From “clearly knowing which members are set” perspective, I still feel initializing presult = {0} is useful, at least harmless. There are only 2 places, not a big change.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nathan Bossart | 2025-12-11 22:39:58 | Re: Proposal: Add a callback data parameter to GetNamedDSMSegment |
| Previous Message | Sami Imseih | 2025-12-11 22:12:02 | Re: Proposal: Add a callback data parameter to GetNamedDSMSegment |