Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
Cc: Joel Jacobson <joel(at)compiler(dot)org>, Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Rishu Bagga <rishu(dot)postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Daniil Davydov <3danissimo(at)gmail(dot)com>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date: 2025-10-21 01:31:54
Message-ID: CAD21AoCFZxXCBy+5DoarfG9LC9VdNwWRDpDHE5sdTh5Ym0EcqQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 20, 2025 at 5:37 AM Matheus Alcantara
<matheusssilv97(at)gmail(dot)com> wrote:
>
> On Sat Oct 18, 2025 at 2:43 AM -03, Joel Jacobson wrote:
> > On Fri, Oct 17, 2025, at 22:50, Arseniy Mukhin wrote:
> > What a funny coincidence that the approach in this patch,
> > has one similarity with the "Direct advancement" approach
> > in the patch in the "Optimize LISTEN/NOTIFY" [1] thread,
> > namely that we're both interested in QUEUE_HEAD before/after
> > we push the notifications into the queue, in PreCommit_Notify().
> >
> > It looks to me like our new data structures are Interchangeable,
> > so I guess we probably want both patches to eventually settle
> > on one and the same?
> >
> > The differences I note between our queue head before/after code are:
> >
> > - In this patch, you are palloc'ing a struct with two fields.
> > In [1], we're using two separate static QueuePosition variables.
> >
> > - In this patch, you are taking/releasing a shared lock before/after
> > the loop to read QUEUE_HEAD and set previousHead/head.
> > In [1], we avoid the need of the shared lock, by doing the reads
> > within the existing exclusive lock inside the loop, but instead
> > therefore need a firstIteration bool, to know which is the first
> > iteration, and need to overwrite the after-var in each iteration.
> >
> > I don't think the noted differences above matter, both seems fine.
> >
> Yeah, I also think that both approach seems fine. I keep the v8 version
> with the palloc, if someone has any concern about this I'm open to
> switch to another approach.
>
> > Another thing I noticed in your patch that made me wonder,
> > is the naming of the new AsyncQueueEntry bool field,
> > which is given the name "committed".
> >
> > I think this name is not entirely faithful, since when set to true,
> > the entry has not been committed yet.
> >
> > How about negating the meaning of this boolean field?
> > To instead indicate when the entry has been rollbacked.
> > Then, it would clearly communicate just that.
> >
> > Maybe naming it something like "rollbacked" or "aborted"?
> >
> Good point. I've renamed this field on the attached v8 version.

I've reviewed the v8 patch and I'm not sure it's a bullet-proof
approach. The basic idea of the v8 patch is to add async entries with
rollbacked=false and we set rollbacked=true in
asyncQueueRollbackNotifications called from AtAbort_Notify() if an
error happens between PreCommit_Notify() and AtCommit_Notify(). The
asyncQueueRollbackNotifications() reads SLRU pages to mark entries
'rollbacked' but reading SLRU pages could fail for some reason. In
this case, we would end up with a PANIC for recursive errors, or even
if we skip marking entries we would end up leaving entries as
committed. Also, if there are a lot of notifications across multiple
SLRU pages we need to mark as 'rollbacked',
asyncQueueRollbackNotifications() could take time to complete but it's
not interruptible. I don't think it's a good idea to introduce such an
operation in abort paths.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-10-21 01:36:28 Re: Skip unregistered custom kinds on stats load
Previous Message Sami Imseih 2025-10-21 01:16:51 Re: Skip unregistered custom kinds on stats load