Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joel Jacobson" <joel(at)compiler(dot)org>
Cc: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date: 2025-10-29 16:48:23
Message-ID: 2759499.1761756503@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Joel Jacobson" <joel(at)compiler(dot)org> writes:
> On Wed, Oct 29, 2025, at 12:40, Heikki Linnakangas wrote:
>> That said, we could add COMMITTED/INVALID hint bits to AsyncQueueEntry,
>> similar to heap tuples, and set the hint bit instead of replacing the
>> original xid. That might be good for performance too: If the first
>> backend to call TransactionIdDidCommit() on an AsyncQueueEntry would set
>> the hint bit, that would save the effort for other listening backends.

> Interesting, sounds like a really promising approach.

I don't see any room for adding hint bits without enlarging
AsyncQueueEntry, which I'd rather not do. But we could mechanize that
without a separate hint bit by replacing the tested XID with FrozenXid
or InvalidXid, as the patch already does. We already require writing
an XID to shared memory to be atomic, so that seems okay.

However ... that won't actually work, the reason being that
asyncQueueProcessPageEntries() doesn't work directly from an SLRU page
but from a local copy. Even if it were to modify the state of that
copy, no other backend would see the effects.

The reason it's like that is stated in the comments:

* The current page must have been fetched into page_buffer from shared
* memory. (We could access the page right in shared memory, but that
* would imply holding the SLRU bank lock throughout this routine.)

The patch proposed here likewise appears to involve holding an SLRU
bank lock throughout what could be a significant number of
TransactionIdDidCommit tests. That seems like it could result in a
pretty bad "burp" in NOTIFY throughput. That problem is ameliorated
by only doing it when VACUUM is trying to advance datfrozenxid, but
still I wonder if we can't find a less concurrency-unfriendly answer.

The local-copy behavior also means that this patch isn't quite a 100%
fix. We could have a race condition like so:

1. Backend A grabs a copy of some SLRU page and begins running
asyncQueueProcessPageEntries(), but then loses the CPU.

2. Backend B completes a VACUUM, finds it can advance datfrozenxid,
marks some relevant XIDs as frozen in the notify SLRU, and truncates
clog.

3. Backend A gets control back, tries to discover the state of
some XID that's still present in its local copy of the XID page,
and fails.

Step 2 will take long enough that this isn't very plausible
timing-wise, but it's still theoretically a hole.

All of this is a problem mainly because of the presumption that
holding an SLRU bank lock for a long time is bad. I wonder how
dangerous that really is.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-10-29 16:50:13 Re: Consistently use the XLogRecPtrIsInvalid() macro
Previous Message Robert Haas 2025-10-29 16:47:23 Re: apply_scanjoin_target_to_paths and partitionwise join