Re: Fw: Re: heap_force_common in contrib/pg_surgery/heap_surgery.c has an off by one stack buffer overflow

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: surya poondla <suryapoondla4(at)gmail(dot)com>, "violin0613(at)tju(dot)edu(dot)cn" <violin0613(at)tju(dot)edu(dot)cn>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Fw: Re: heap_force_common in contrib/pg_surgery/heap_surgery.c has an off by one stack buffer overflow
Date: 2026-06-04 09:12:23
Message-ID: CAE9k0P=sDshMaBDZzEfNPVo62w7PmnecKhtXX-Zn=AzeQ3k5kA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Michael,

On Thu, Jun 4, 2026 at 1:01 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Jun 03, 2026 at 03:31:27PM -0700, surya poondla wrote:
> > Thank you for reporting the issue, I am able to reproduce it on master.
> > The include_this_tid[] array is sized MaxHeapTuplesPerPage but indexed
> > using 1-based OffsetNumber,
> > so the largest legal offset (MaxHeapTuplesPerPage itself) lands one slot
> > past the end.
>
> - bool include_this_tid[MaxHeapTuplesPerPage];
> + /* Sized +1 because OffsetNumbers are 1-based and can reach MaxHeapTuplesPerPage. */
> + bool include_this_tid[MaxHeapTuplesPerPage + 1];
>
> The offset number begins at 1. Hence, instead of making this array
> larger by one, you could keep it at the same size and adjust the array
> index to use (offno - 1) instead.

I think Surya's approach is worth considering here. Making the helper
array 1-based aligns it naturally with PostgreSQL's convention, where
page offsets are 1-based, and that consistency has real readability
benefits.

With a 1-based helper array:

bool include_this_tid[MaxHeapTuplesPerPage + 1];

we can write:

include_this_tid[offno] = true;

if (!include_this_tid[curoff])
continue;

i.e. we can simply "mark this offset number" and "check whether this
offset number is included" - no mental translation required.

With a zero-based helper array:

bool include_this_tid[MaxHeapTuplesPerPage];

every access has to do a conversion:

include_this_tid[offno - 1] = true;

if (!include_this_tid[curoff - 1])
continue;

This works correctly, but it places an ongoing burden on anyone
reading or modifying the code - they need to keep in mind that page
offsets are 1-based, that this particular array is 0-based, that the
subtraction must be applied consistently, that it should be skipped
for InvalidOffsetNumber, and that the two index spaces should not be
inadvertently mixed in future edits.

These are admittedly small risks, but they are real ones. Keeping the
array 1-based eliminates that entire class of potential confusion and
makes the code easier to maintain going forward. I'd lean toward
Surya's approach for that reason.

--
With Regards,
Ashutosh Sharma.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Ajin Cherian 2026-06-04 11:37:30 Re: BUG #19360: Bug Report: Logical Replication initial sync fails with "conflict=update_origin_differs" PG12 toPG18
Previous Message PG Bug reporting form 2026-06-04 07:37:13 BUG #19507: Auto-named partition table constraint conflicts