Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue

From: Daniil Davydov <3danissimo(at)gmail(dot)com>
To: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date: 2025-08-13 19:29:29
Message-ID: CAJDiXgg1ArRB1-6wLtXfVVnQ38P9Y+CDfEc9M2TXiOf_4kfBMg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Aug 11, 2025 at 8:41 PM Matheus Alcantara
<matheusssilv97(at)gmail(dot)com> wrote:
>
> On Wed Aug 6, 2025 at 7:44 AM -03, Álvaro Herrera wrote:
> >> My questions:
> >>
> >> 1. Is it acceptable to drop notifications from the async queue if
> >> there are no active listeners? There might still be notifications that
> >> haven’t been read by any previous listener.
> >
> > I'm somewhat wary of this idea -- could these inactive listeners become
> > active later and expect to be able to read their notifies?
> >
> I'm bit worry about this too.

What exactly do we mean by "active listener"? According to the source code,
the active listener (as far as I understand) is the one who listens to at least
one channel. If we have no active listeners in the database, the new listener
will set its pointer to the tail of the async queue. Thus, messages with old
xid will not be touched by anybody. I don't see any point in dropping them
in this case.

If the "inactive" listener is the backend which is stuck somewhere, the
answer is "no" - this backend should be able to process all notifications.

>
> >> 2. If the answer to 1 is no, how can we teach VACUUM to respect the
> >> minimum xid stored in all AsyncQueueEntries?
> >
> > Maybe we can have AsyncQueueAdvanceTail return the oldest XID of
> > listeners, and back off the pg_clog truncation based on that. This
> > could be done by having a new boolean argument that says to look up the
> > XID from the PGPROC using BackendPidGetProc(QUEUE_BACKEND_PID) (which
> > would only be passed true by vac_update_datfrozenxid(), to avoid
> > overhead by other callers), then collect the oldest of those and return
> > it.
> >
> The problem with only considering the oldest XID of listeners is that
> IIUC we may have notifications without listeners, and in this case we
> may still get this error because when the LISTEN is executed we loop
> through the AsyncQueueEntry's on asyncQueueProcessPageEntries() and we
> call TransactionIdDidCommit() that raise the error before
> IsListeningOn(channel) is called.
>

Agree.

> > This does create the problem that an inactive listener could cause the
> > XID counter to stay far in the past. Maybe we could try to avoid this
> > by adding more signalling (e.g, AsyncQueueAdvanceTail() itself could
> > send PROCSIG_NOTIFY_INTERRUPT signal?), and terminating backends that
> > are way overdue on reading notifies. I'm not sure if this is really
> > needed or useful; consider a backend stuck on SIGSTOP (debugger or
> > whatever): it will just sit there forever.
> >
> With this idea that I've proposed we still could have this problem, if a
> listener take too long to consume a message we would block vacuum freeze
> to advance the xid. For this I think that we could have two GUC's; One
> to enable and disable the oldest xmin check on async queue and the
> second to control how far we want to prevent the vacuum from freezing
> the oldest async queue xid, and if the min xid raises this limit we
> ignore and truncate the xid.
>

1) About Álvaro's comment:
I don't think that killing the lagging backend will save us, because entries
in the async queue are not ordered by xid. Thus, even after killing such a
backend, we have no guarantees that the problem is gone, because not
lagging backends still may encounter a super old xid.

2) About Matheus's comment:
I guess that there is only one reliable way to determine the precise minimal xid
in the queue (after message processing) - scan all pages from head to tail.
It obviously can take a lot of time. The second GUC (if I get it
right) logic is based
on minimal xid determination, so it won't be the best solution for
highload systems
(that are forced to turn on the first GUC, because of safety requirements).

> Another option would be to add a minXid field on AsyncQueueControl and
> then update this value on asyncQueueProcessPageEntries() and
> asyncQueueAddEntries() routines, and then we could check this value on
> vac_update_datfrozenxid().
>
> I've write a draft patch that plays with the idea, see attached.

Thanks for the patch! I have few comments on it :
1)
Maybe we should add an assert that NotifyQueueLock is held to the StoreEntryXid
function?

2)
For fields in AsyncQueueControl we have macros like QUEUE_HEAD,
QUEUE_TAIL and so on. Maybe minXid should also be wrapped with something
like QUEUE_MIN_XID?

3)
This logic will not work with (for example) two listeners. Steps :
session1=# listen c1;
session2=# listen c1;
session3=# notify c1; // sets minimal xid to N inside StoreEntryXid
session3=# notify c1;
session1=# listen c1; // sets minimal xid to (N + 1) inside ReleaseEntryXid

After these steps, session2 still needs to process a message with xid = N, but
autovacuum legally can freeze it and truncate clog.

Maybe we should check whether there are no listeners with smaller queue
position compared to the current backend?

4)
After ReleaseEntryXid, minXid field contains not actually minimal xid
in the queue,
but maximum processed xid. Maybe this parameter can be renamed?

--
Best regards,
Daniil Davydov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-08-13 19:29:59 Re: Making type Datum be 8 bytes everywhere
Previous Message Pavel Stehule 2025-08-13 19:24:09 Re: proposal: schema variables