Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue

From: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com>, Rishu Bagga <rishu(dot)postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Daniil Davydov <3danissimo(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joel Jacobson <joel(at)compiler(dot)org>
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date: 2025-09-18 21:34:40
Message-ID: CAFY6G8fui7omKfUXF+JJ82B34ExrwK6J7vKe61S-DhEmr_jBhA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon Sep 15, 2025 at 2:40 PM -03, Masahiko Sawada wrote:
> While the WAL-based approach discussed on another thread is promising,
> I think it would not be acceptable for back branches as it requires
> quite a lot of refactoring. Given that this is a long-standing bug in
> listen/notify, I think we can continue discussing how to fix the issue
> on backbranches on this thread.
>
Please see the new attached patch, it has a different implementation
that I've previously posted which is based on the idea that Arseniy
posted on [1].

This new version include the "committed" field on AsyncQueueEntry struct
so that we can use this info when processing the notification instead of
call TransactionIdDidCommit()

The "committed" field is set to true when the AsyncQueueEntry is being
added on the SLRU page buffer when the PreCommit_Notify() is called. If
an error occurs between the PreCommit_Notify() and AtCommit_Notify() the
AtAbort_Notify() will be called and will set the "committed" field to
false for the notifications inside the aborted transaction.

It's a bit tricky to know at AtAbort_Notify() which notifications were
added on the SLRU page buffer by the aborted transaction, so I created a
new data structure and a global variable to keep track of this
information. See the commit message for more information.

On the previously patch that I've posted I've created a TAP test to
reproduce the issue with the VACUUM FREEZE, this new version also
include this test and also a new test case that use the injection points
extension to force an error between the PreCommit_Notify() and
AtCommit_Notify() so that we can ensure that these notifications of an
aborted transaction are not visible to other listener backends.

[1] https://www.postgresql.org/message-id/CAE7r3M%2BXwf0A_aNhu7pYQd7nDQaqaEnyCjtvqg8XsgManmPOUA%40mail.gmail.com

--
Matheus Alcantara

Attachment Content-Type Size
v1-0001-Make-AsyncQueueEntry-s-self-contained.patch application/octet-stream 15.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-09-18 22:53:15 Re: REPACK and naming
Previous Message Tomas Vondra 2025-09-18 21:04:45 Re: Adding basic NUMA awareness