Re: Fix some inconsistencies with open-coded visibilitymap_set() callers

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Fix some inconsistencies with open-coded visibilitymap_set() callers
Date: 2025-07-10 13:57:05
Message-ID: CAAKRu_aR3GPy7ALsCr_WYAwsJkYoeV49+rLY=Lg2vW-Mxh6-pw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 7, 2025 at 11:38 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Jul 2, 2025 at 7:33 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
>
> > One thing we could do is check if the heap buffer is dirty before
> > setting the LSN in visibilitymap_set():
>
> I don't think this is the way. visibilitymap_set() shouldn't guess
> what the caller has done or will do; the caller should make it clear
> explicitly.

A direct translation of this would be to add a boolean parameter to
visibilitymap_set() like "heap_page_modified" or "set_heap_lsn". Is
that along the lines you were thinking?

> > For all the cases where we modify the heap and VM page not in the same
> > critical section, we could error out after making the heap page
> > changes before setting the VM. But because this would just lead to
> > PD_ALL_VISIBLE being set and the VM bit not being set, which is
> > technically fine.
> >
> > It seems like it would be better to make all of the changes in the
> > same critical section, and, in some of the cases, it wouldn't be hard
> > to do so. But, since I cannot claim a correctness issue, we can leave
> > it the way it is.
>
> If it's useful to combine things as a precursor to future work, that
> may be fair enough, but otherwise I don't see the point. It's not
> "technically fine;" it's just straight-up fine. It's not desirable, in
> the sense that we might end up having to do extra page reads later to
> correct the situation, but crashes won't intervene between those two
> critical sections often enough to matter. Unifying those critical
> sections won't change anything about what states code elsewhere in the
> tree must be prepared to see on disk.

I think one argument for having it in the same critical section is the
defensive coding perspective. Our rule is that you make changes to the
heap page, mark it dirty, emit WAL, and then stamp it with that LSN
all in the same critical section.

In the case of PD_ALL_VISIBLE, it is okay to break this rule because
it is okay to lose the PD_ALL_VISIBLE hint as long as the VM bit
doesn't get set. But, future programmers could see this and make other
modifications to the heap page in the same area we set PD_ALL_VISIBLE
(the one outside of a critical section).

Anyway, I'm now convinced that the way forward for this particular
issue is to represent the VM changes in the same WAL record that has
the heap page changes, so I won't further belabor the issue.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2025-07-10 14:06:44 Re: Improving and extending int128.h to more of numeric.c
Previous Message jian he 2025-07-10 13:34:12 Re: SQL:2023 JSON simplified accessor support