| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Subject: | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) |
| Date: | 2025-12-19 03:38:24 |
| Message-ID: | CABPTF7X1X810eT7hjRz9M69sjMv1Vui5gXQEGhqYXmkmywhHCQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
He Melanie,
Thanks for working on this.
On Wed, Dec 17, 2025 at 12:59 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Wed, Dec 3, 2025 at 6:07 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > If we're just talking about the renaming, looking at procarray.c, it
> > is full of the word "removable" because its functions were largely
> > used to examine and determine if everyone can see an xmax as committed
> > and thus if that tuple is removable from their perspective. But
> > nothing about the code that I can see means it has to be an xmax. We
> > could just as well use the functions to determine if everyone can see
> > an xmin as committed.
>
> In the attached v27, I've removed the commit that renamed functions in
> procarray.c. I've added a single wrapper GlobalVisTestXidNotRunning()
> that is used in my code where I am testing live tuples. I think you'll
> find that I've addressed all of your review comments now -- as I've
> also gotten rid of the confusing blk_known_av logic through a series
> of refactors.
>
> The one outstanding point is which commits should bump
> XLOG_PAGE_MAGIC. (also review of the reworked patches).
>
> - Melanie
I’ve done a basic review of patches 1 and 2. Here are some comments
which may be somewhat immature, as this is a fairly large change set
and I’m new to some parts of the code.
1) Potential stale old_vmbits after VM repair n v2
// Corruption check 1
if (!PageIsAllVisible(page) &&
(old_vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
{
visibilitymap_clear(...); // VM now cleared to 0
// but old_vmbits still holds ALL_VISIBLE
}
// ... later ...
if (!presult.all_visible)
return presult.ndeleted; // Not taken if presult.all_visible=true
new_vmbits = VISIBILITYMAP_ALL_VISIBLE; // Want to set this
if (old_vmbits == new_vmbits) // Stale old_vmbits=ALL_VISIBLE,
new_vmbits=ALL_VISIBLE
return presult.ndeleted; // issue: early return
After corruption repair clears the VM, old_vmbits is stale. The early
return can fire unexpectedly, leaving the VM cleared when it should be
re-set. Should we reset old_vmbits = 0 after the visibilitymap_clear?
2) Add Assert(BufferIsDirty(buf))
Since the patch's core claim is "buffer must be dirty before WAL
registration", an assertion encodes this invariant. Should we add:
Assert(BufferIsValid(buf));
Assert(BufferIsDirty(buf));
right before the visibilitymap_set() call?
3) Comment about "only scenario"
The comment at lines:
> "The only scenario where it is not already dirty is if the VM was removed…"
This phrasing could become misleading after future refactors. Can we
make it more direct like:
> "We must mark the heap buffer dirty before calling visibilitymap_set(), because it may WAL-log the buffer and XLogRegisterBuffer() requires it."
4) Comment clarity
Current comment:
> "Even if PD_ALL_VISIBLE is already set, we don't need to worry about unnecessarily dirtying the heap buffer, as it must be marked dirty before adding it to the WAL chain. The only scenario where it is not already dirty is if the VM was removed..."
In this test we now call MarkBufferDirty() on the heap page even when
only setting the VM, so the comments claiming “does not need to modify
the heap buffer”/“no heap page modification” might be misleading. It
might be better to say the test doesn’t need to modify heap
tuples/page contents or doesn’t need to prune/freeze.
--
Best,
Xuneng
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-12-19 03:53:16 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Previous Message | Marcos Magueta | 2025-12-19 03:25:51 | Re: WIP - xmlvalidate implementation from TODO list |