Re: pg_combinebackup: incorrect size of VM fork after combine

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Oleg Tkachenko <oatkachenko(at)gmail(dot)com>
Subject: Re: pg_combinebackup: incorrect size of VM fork after combine
Date: 2026-03-09 06:19:26
Message-ID: CAAJ_b96BWWN-OpLUx4PUGH9hNLEnR=U=WHOOAEkfoO1XqGRETg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 7, 2026 at 12:25 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Mar 3, 2026 at 11:50 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > I think the fix will be to correct the wal summary entry that records
> > an incorrect truncation limit for the VM fork. Attached are the
> > patches: 0001 is a refactoring patch that moves the necessary macro
> > definitions from visibilitymap.c to visibilitymap.h to correctly
> > calculate the VM fork limit recorded in the wal summary file, and 0002
> > provides the actual fix.
>
> I don't like exposing those macros to the rest of the system, because
> they're not named very generically.
>
> Also, I don't think that using HEAPBLK_TO_MAPBLOCK() directly is
> correct. That macro returns the visibility map page that contains the
> VM bit for the indicated heap block number, but that's not what we
> need here. For example, if we truncate the heap to a length of 1,
> HEAPBLK_TO_MAPBLOCK() will return 0, but if we set the limit block for
> the VM to zero, that means the VM was truncated away entirely, which
> in this scenario wouldn't be the case. So, instead of computing too
> large a value for the VM's limit block, I think your patch would cause
> us to compute too small a value for the VM's limit block.
>
> The question we need to answer here is: if we entire remove all heap
> blocks >= N, then for what value of M do we remove all visibility map
> blocks >= M? The answer is that if the the N (the heap block limit) is
> a multiple of HEAPBLOCKS_PER_PAGE, then it's just
> N/HEAPBLOCKS_PER_PAGE. Otherwise, it's one more, because we need an
> extra VM page as soon as it's necessary to use at least one bit on
> that page. So, basically, we need to divide by HEAPBLOCKS_PER_PAGE and
> round up.
>
> So here's my attempt at a patch. Instead of exposing
> HEAPBLK_TO_MAPBLOCK() etc., I added a new helper function,
> visibilitymap_truncation_length. It's just a wrapper around a new
> macro HEAPBLK_TO_MAPBLOCK_LIMIT(), but I think keeping the exposed
> symbols having visibilitymap in the name is a good idea. Also, I added
> a test case, and I have verified that this test case passes with the
> fix and fails without it.
>

Understood, the patch looks good to me, thanks.

Regards,
Amul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2026-03-09 06:23:56 Re: [19] CREATE SUBSCRIPTION ... SERVER
Previous Message Shreeya Sharma 2026-03-09 06:11:45 Re: POC: PLpgSQL FOREACH IN JSON ARRAY