Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue

From: Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Joel Jacobson <joel(at)compiler(dot)org>, Rishu Bagga <rishu(dot)postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Daniil Davydov <3danissimo(at)gmail(dot)com>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date: 2025-10-24 08:55:02
Message-ID: CAE7r3M+LfTsGDYmUqFWq1DwWyqGC4syWFHT5BZjJA2+w2bEtgw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 22, 2025 at 3:02 AM Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> On Wed, Oct 22, 2025, at 02:16, Matheus Alcantara wrote:
> >> Regarding the v8 patch, it introduces a fundamentally new way of
> >> managing notification entries (adding entries with 'committed' state
> >> and marking them 'aborted' in abort paths). This affects all use
> >> cases, not just those involving very old unconsumed notifications, and
> >> could introduce more serious bugs like PANIC or SEGV. For
> >> backpatching, I prefer targeting just the problematic behavior while
> >> leaving unrelated parts unchanged. Though Álvaro might have a
> >> different perspective on this.
> >>
> > Thanks very much for this explanation and for what you've previously
> > wrote on [1]. It's clear to me now that the v8 architecture is not a
> > good way to go.
>
> How about doing some more work in vac_update_datfrozenxid()?
>
> Pseudo-code sketch:
>
> ```
> void
> vac_update_datfrozenxid(void)
> {
>
> /* After computing newFrozenXid from all known sources... */
>
> TransactionId oldestNotifyXid = GetOldestQueuedNotifyXid();
>
> if (TransactionIdIsValid(oldestNotifyXid) &&
> TransactionIdPrecedes(oldestNotifyXid, newFrozenXid))
> {
> /*
> * The async queue has XIDs older than our proposed freeze point.
> * Attempt cleanup, then back off and let the next VACUUM benefit.
> */
>
> if (asyncQueueHasListeners())
> {
> /*
> * Wake all listening backends across *all* databases
> * that are not already at QUEUE_HEAD.
> * They'll hopefully process notifications and advance
> * their pointers, allowing the next VACUUM to freeze further.
> */
> asyncQueueWakeAllListeners();
> }
> else
> {
> /*
> * No listeners exist - discard all unread notifications.
> * The next VACUUM should succeed in advancing datfrozenxid.
> * asyncQueueAdvanceTailNoListeners() would take exclusive lock
> * on NotifyQueueLock before checking
> * QUEUE_FIRST_LISTENER == INVALID_PROC_NUMBER
> */
> asyncQueueAdvanceTailNoListeners();
> }
>
> /*
> * Back off datfrozenxid to protect the old XIDs.
> * The cleanup we just performed should allow the next VACUUM
> * to freeze further.
> */
> newFrozenXid = oldestNotifyXid;
> }
> }
> ```
>
> Maybe it wouldn't solve all problematic situations, but to me it seems
> like these measures could help many of them, or am I missing some
> crucial insight here?

I agree we need to add something like this. Looks like with v10 it's
possible for the listen/notify queue to block datfrozenxid advancing
even without extreme circumstances (without hanging listeners etc).

I see two thing we should take care of in v10:

1) Currently asyncQueueAdvanceTail is called regularly only if we have
a constant flow of notifications. We try to advance the tail every
time when the head reaches every 4th page. So if we don't have new
notifications, the tail will stay where it is forever. It means that
if we have at least 1 notification in the queue without constant flow
of notifications then GetOldestQueuedNotifyXid will constantly return
the same result which will block the advancement of datfrozenxid. So
it looks like we need something that will advance the tail in this
case.

2) Currently we wake up all listeners regularly only if we have a
constant flow of notifications (again). Even if we have a listener and
we never wake up such a listener because of the new notifications (for
example there are no new notifications in the listener's database), we
still signal such a listener sometimes as its position lags too much
from the head. But again, if we don't have new notifications, it's
possible that such a listener will never process some notification
(from another database) and tail advancement will be blocked. As a
result, datfrozenxid advancement also will be blocked. So probably we
need something that will wake up lagging listeners if they block tail
advancement.

Best regards,
Arseniy Mukhin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2025-10-24 09:29:37 Re: Improve pg_sync_replication_slots() to wait for primary to advance
Previous Message Michael Paquier 2025-10-24 08:38:37 Re: Making pg_rewind faster