Re: Why clearing the VM doesn't require registering vm buffer in wal record

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Why clearing the VM doesn't require registering vm buffer in wal record
Date: 2026-06-24 00:01:46
Message-ID: CAAKRu_asU2aX5tsZLY6JBXUVKufp1jsFwV0fmkDpk-jj4iQmRQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 7, 2026 at 8:54 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 5/6/26 23:46, Melanie Plageman wrote:
>
> > Based on an LLM-assisted self-review of this code, I think there is a
> > pre-existing bug in heap_lock_updated_tuple_rec() where we don't reset
> > cleared_all_frozen and may have a stale value if we iterate more than
> > once (new tuple versions) or use the l4 goto label (retries for the
> > same tuple version). This could mean we unnecessarily clear all-frozen
> > on the standby (and unnecessarily register the VM buffer after my
> > patch). I couldn't come up with a repro though, so I didn't include a
> > patch to fix it, as I wanted to confirm that someone else also thought
> > this is a bug.
>
> Hmm, would this be a correctness or efficiency issue?
>
> I'm not very familiar with this code, so I can't say if there is an
> issue or not. Could you explain the sequence of steps leading to an
> issue? Maybe that assumes something that's not quite possible, or it
> might help constructing a reproducer.
>
> I don't quite see how could it be related to the "l4" goto label, when
> all the "goto l4" cases are before setting "cleared_all_frozen = true"
> (in any given iteration).

You're right that it couldn't happen with goto l4. However, it is
possible in the loop when walking an update chain, it can set
cleared_all_frozen to true, clear the page's all frozen bits and then
later, on the next iteration, clear it again (even though it's already
clear). It's only an efficiency issue in master, actually (as you
suggest). In my patch it would be a serious bug because it would try
to register the VM page and set an LSN on that VM page when it isn't
locked or marked dirty.

- Melanie

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2026-06-24 00:21:51 Re: Add a hook for handling logical decoding messages on subscribers.
Previous Message Baji Shaik 2026-06-23 23:57:24 Re: Protect against timestamp underflow in uuidv7(interval)