Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
Cc: Daniil Davydov <3danissimo(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date: 2025-08-19 17:40:50
Message-ID: CAD21AoAYjnGZAyAfim9QyajXeZJb_gQoNGb501k4dQP8gRf3DA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 18, 2025 at 5:14 PM Matheus Alcantara
<matheusssilv97(at)gmail(dot)com> wrote:
>
> On Wed Aug 13, 2025 at 4:29 PM -03, Daniil Davydov wrote:
> > Hi,
> >
> > On Mon, Aug 11, 2025 at 8:41 PM Matheus Alcantara
> > <matheusssilv97(at)gmail(dot)com> wrote:
> >>
> >> On Wed Aug 6, 2025 at 7:44 AM -03, Álvaro Herrera wrote:
> >> >> My questions:
> >> >>
> >> >> 1. Is it acceptable to drop notifications from the async queue if
> >> >> there are no active listeners? There might still be notifications that
> >> >> haven’t been read by any previous listener.
> >> >
> >> > I'm somewhat wary of this idea -- could these inactive listeners become
> >> > active later and expect to be able to read their notifies?
> >> >
> >> I'm a bit worried about this too.
> >
> > What exactly do we mean by "active listener"? According to the source code,
> > the active listener (as far as I understand) is the one who listens to at least
> > one channel. If we have no active listeners in the database, the new listener
> > will set its pointer to the tail of the async queue. Thus, messages with old
> > xid will not be touched by anybody. I don't see any point in dropping them
> > in this case.
> >
> I think that this definition is correct, but IIUC the tail can still
> have notifications with xids that were already truncated by vacuum
> freeze. When LISTEN is executed, we first loop through the
> notification queue to try to advance the queue pointers, and we can
> eventually reach a notification that was added to the queue while
> there was no listener and whose xid has already been truncated by
> vacuum freeze, so in this case it will fail to get the transaction status.

When a process first executes LISTEN, it iterates through the
notification queue, but apparently only to advance its own queue
pointer: it has not yet added any channels to its listen list, and it
is not interested in notifications queued before it registered as a
listener. I'm wondering if we could optimize this by letting the
queue pointer fast-forward without checking transaction status. If
feasible, this might resolve the reported issue.
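
To make the fast-forward idea concrete, here is a toy model in C. It
is only a sketch under simplifying assumptions, not code from async.c:
every name in it (ToyQueue, toy_advance_checking,
toy_advance_fast_forward, and so on) is hypothetical, XIDs are compared
as plain integers with no wraparound handling, and the "status lookup"
is a stand-in that fails for any XID below an assumed truncation point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

typedef struct ToyEntry
{
    ToyXid xid;      /* XID of the transaction that queued the entry */
    int    channel;  /* hypothetical channel id */
} ToyEntry;

typedef struct ToyQueue
{
    const ToyEntry *entries;
    size_t tail;             /* index of the oldest entry */
    size_t head;             /* one past the newest entry */
    ToyXid oldest_clog_xid;  /* assumption: status below this is truncated */
    int    status_lookups;   /* counts simulated status lookups */
} ToyQueue;

/* Stand-in for a transaction status lookup; false means "clog is gone". */
static bool
toy_xid_status_available(ToyQueue *q, ToyXid xid)
{
    q->status_lookups++;
    return xid >= q->oldest_clog_xid;
}

/* Walk entry by entry, checking the status of each XID on the way. */
static bool
toy_advance_checking(ToyQueue *q, size_t *pos)
{
    while (*pos < q->head)
    {
        if (!toy_xid_status_available(q, q->entries[*pos].xid))
            return false;    /* models the failure reported in this thread */
        (*pos)++;
    }
    return true;
}

/* A brand-new listener has no channels yet, so jump straight to the head. */
static bool
toy_advance_fast_forward(ToyQueue *q, size_t *pos)
{
    assert(*pos <= q->head);
    *pos = q->head;          /* no status lookups at all */
    return true;
}
```

In this model, a queue holding an entry whose XID is below
oldest_clog_xid makes toy_advance_checking() stop at that entry, while
toy_advance_fast_forward() reaches the head without touching it.
Whether the real code can skip the lookup safely is exactly the open
question.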

However, I have a more fundamental concern regarding the LISTEN/NOTIFY
implementation. Since vacuum doesn't consider the XIDs of notification
entries, it may truncate clog entries that LISTEN/NOTIFY still
requires. As I understand it, notification queue entries aren't ordered
by XID, so a notification with an older XID can be positioned at the
queue's head. If vacuum freeze then truncates the corresponding clog
segments, listeners attempting to retrieve this notification would fail
to obtain the transaction status. To address this, we likely need
either to implement Álvaro's suggestion[1] to make vacuum aware of the
oldest XID in the notification queue, or to develop a mechanism to
remove or freeze the XIDs of the notification entries.
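
As a rough illustration of the first option, the following C sketch
clamps a proposed truncation cutoff to the oldest XID found in a queue.
It is a simplified model, not a patch: the names (QXid, q_xid_precedes,
q_queue_oldest_xid, q_clamp_cutoff) are hypothetical, the queue is
just an array of XIDs, and the comparison is a plain 32-bit modular
"precedes" test that ignores special XIDs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t QXid;

/* Modular comparison: true if a is logically older than b. */
static bool
q_xid_precedes(QXid a, QXid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Entries are not ordered by XID, so the whole queue has to be scanned.
 * Returns false if the queue is empty.
 */
static bool
q_queue_oldest_xid(const QXid *xids, size_t n, QXid *oldest)
{
    if (n == 0)
        return false;

    *oldest = xids[0];
    for (size_t i = 1; i < n; i++)
    {
        if (q_xid_precedes(xids[i], *oldest))
            *oldest = xids[i];
    }
    return true;
}

/* Never let the cutoff move past an XID that the queue still references. */
static QXid
q_clamp_cutoff(QXid proposed, const QXid *xids, size_t n)
{
    QXid oldest;

    if (q_queue_oldest_xid(xids, n, &oldest) &&
        q_xid_precedes(oldest, proposed))
        return oldest;

    return proposed;
}
```

The full scan reflects the point above that entries are not ordered by
XID; a real implementation would presumably want something cheaper
than walking the queue on every vacuum.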

Regards,

[1] https://www.postgresql.org/message-id/202508061044.ptcyt7aqsaaa%40alvherre.pgsql

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
