Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date: 2025-10-14 23:26:57
Message-ID: CAAKRu_YR-COJ9aGnMUQqt5yoWmUBjikqrd4jGNZYouHaXpis9g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks so much for the review! I've addressed all your feedback except
what is commented on inline below.
I've gone ahead and committed the preliminary patches that you thought
were ready to commit.

Attached v18 is what remains.

0001 - 0003: refactoring
0004 - 0006: finish eliminating XLOG_HEAP2_VISIBLE
0007 - 0009: refactoring
0010: Set VM on-access
0011: Set prune xid on insert
0012: Some refactoring for discussion

For 0001, I got feedback heap_page_prune_and_freeze() has too many
arguments, so I tried to address that. I'm interested to know if folks
like this more.

0011 still needs a bit of investigation to understand fully if
anything else in the index-killtuples test needs to be changed to make
sure we have the same coverage.

0012 is sort of WIP. I got feedback heap_page_prune_and_freeze() was
too long and should be split up into helpers. I want to know if this
split makes sense. I can pull it down the patch stack if so.

Only 0001 and 0012 are optional amongst the refactoring patches. The
others are required to make on-access VM-setting possible or viable.

On Thu, Oct 9, 2025 at 2:18 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > @@ -71,12 +84,12 @@ heap_xlog_prune_freeze(XLogReaderState *record)
> > }
> > - action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL,
> > - (xlrec.flags & XLHP_CLEANUP_LOCK) != 0,
> > - &buffer);
> > - if (action == BLK_NEEDS_REDO)
> > + if (XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL,
> > + (xlrec.flags & XLHP_CLEANUP_LOCK) != 0,
> > + &buffer) == BLK_NEEDS_REDO)
> > {
> > Page page = BufferGetPage(buffer);
> > OffsetNumber *redirected;
>
> Why move it around this way?

Because there will be an action for the visibility map
XLogReadBufferForRedoExtended(). I could have renamed it heap_action,
but it is being used only in one place, so I preferred to just cut it
to avoid any confusion.

> > Advance the page LSN
> > + * only if the record could include an FPI, since recovery skips
> > + * records <= the stamped LSN. Otherwise it might skip an earlier FPI
> > + * needed to repair a torn page.
> > + */
>
> This is confusing, should probably just reference the stuff we did in the
> !recovery case.

I fixed this and addressed all your feedback related to this before committing.

> > + if (do_prune || nplans > 0 ||
> > + ((vmflags & VISIBILITYMAP_VALID_BITS) && XLogHintBitIsNeeded()))
> > + PageSetLSN(page, lsn);
> > +
> > /*
> > * Note: we don't worry about updating the page's prunability hints.
> > * At worst this will cause an extra prune cycle to occur soon.
> > */
>
> Not your fault, but that seems odd? Why aren't we just doing the right thing?

The comment dates back to 6f10eb2. I imagine no one ever bothered to
fuss with extracting the XID. You could change
heap_page_prune_execute() to return the right value -- though that's a
bit ugly since it is used in normal operation as well as recovery.

> I wonder if the VM specific redo portion should be in a common helper? Might
> not be enough code to worry though...

I think it might be more code as a helper at this point.

> > @@ -2860,6 +2867,29 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
> > VACUUM_ERRCB_PHASE_VACUUM_HEAP, blkno,
> > InvalidOffsetNumber);
> >
> > + /*
> > + * Before marking dead items unused, check whether the page will become
> > + * all-visible once that change is applied.
>
> So the function is named _would_ but here you say will :)

I thought about it more and still feel that this function name should
contain "would". From vacuum's perspective it is "will" -- because it
knows it will remove those dead items, but from the function's
perspective it is hypothetical. I changed the comment though.

> > + if (heap_page_would_be_all_visible(vacrel, buffer,
> > + deadoffsets, num_offsets,
> > + &all_frozen, &visibility_cutoff_xid))
> > + {
> > + vmflags |= VISIBILITYMAP_ALL_VISIBLE;
> > + if (all_frozen)
> > + {
> > + vmflags |= VISIBILITYMAP_ALL_FROZEN;
> > + Assert(!TransactionIdIsValid(visibility_cutoff_xid));
> > + }
> > +
> > + /* Take the lock on the vmbuffer before entering a critical section */
> > + LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
>
> It sure would be nice if we had documented the lock order between the heap
> page and the corresponding VM page anywhere. This is just doing what we did
> before, so it's not this patch's fault, but I did get worried about it for a
> moment.

Well, the comment above the visibilitymap_set* functions says what
expectations they have for the heap page being locked.

> > +static bool
> > +heap_page_would_be_all_visible(LVRelState *vacrel, Buffer buf,
> > + OffsetNumber *deadoffsets,
> > + int ndeadoffsets,
> > + bool *all_frozen,
> > + TransactionId *visibility_cutoff_xid)
> > +{
> > Page page = BufferGetPage(buf);

> Hm, what about an assert checking that matched_dead_count == ndeadoffsets at
> the end?

I was going to put an Assert(ndeadoffsets <= matched_dead_count), but
then I started wondering if there is a way we could end up with fewer
dead items than we collected during phase I.

I had thought about if we dropped an index and then did on-access
pruning -- but we don't allow setting LP_DEAD items LP_UNUSED in
on-access pruning. So, maybe this is safe... I can do a follow-on
commit to add the assert. But I'm just not 100% sure I've thought of
all the cases where we might end up with fewer dead items.

> > During vacuum's first and third phases, we examine tuples' visibility
> > to determine if we can set the page all-visible in the visibility map.
> >
> > Previously, this check compared tuple xmins against a single XID chosen at
> > the start of vacuum (OldestXmin). We now use GlobalVisState, which also
> > enables future work to set the VM during on-access pruning, since ordinary
> > queries have access to GlobalVisState but not OldestXmin.
> >
> > This also benefits vacuum directly: GlobalVisState may advance
> > during a vacuum, allowing more pages to become considered all-visible.
> > In the rare case that it moves backward, VACUUM falls back to OldestXmin
> > to ensure we don’t attempt to freeze a dead tuple that wasn’t yet
> > prunable according to the GlobalVisState.
>
> It could, but it currently won't advance in vacuum, right?

I thought it was possible for it to advance when calling
heap_prune_satisfies_vacuum() ->
GlobalVisTestIsRemovableXid()->...GlobalVisUpdate(). This case isn't
going to be common, but some things can cause us to update it.

We have talked about explicitly updating GlobalVisState more often
during vacuums of large tables. But I was under the impression that it
was at least possible for it to advance during vacuum now.

- Melanie

Attachment Content-Type Size
v18-0001-Refactor-heap_page_prune_and_freeze-parameters-i.patch text/x-patch 12.8 KB
v18-0002-Keep-all_frozen-updated-in-heap_page_prune_and_f.patch text/x-patch 5.3 KB
v18-0003-Update-PruneState.all_-visible-frozen-earlier-in.patch text/x-patch 9.7 KB
v18-0004-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch text/x-patch 41.1 KB
v18-0005-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch text/x-patch 2.5 KB
v18-0006-Remove-XLOG_HEAP2_VISIBLE-entirely.patch text/x-patch 26.2 KB
v18-0007-Rename-GlobalVisTestIsRemovableXid-to-GlobalVisX.patch text/x-patch 8.2 KB
v18-0008-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch text/x-patch 10.5 KB
v18-0009-Unset-all_visible-sooner-if-not-freezing.patch text/x-patch 2.4 KB
v18-0010-Allow-on-access-pruning-to-set-pages-all-visible.patch text/x-patch 28.0 KB
v18-0011-Set-pd_prune_xid-on-insert.patch text/x-patch 6.7 KB
v18-0012-Split-heap_page_prune_and_freeze-into-helpers.patch text/x-patch 17.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2025-10-14 23:29:32 Re: pg_createsubscriber - more logging to say if there are no pubs to drop
Previous Message Jeff Davis 2025-10-14 23:26:47 Re: Remaining dependency on setlocale()