From: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
---|---|
To: | Matheus Alcantara <matheusssilv97(at)gmail(dot)com> |
Cc: | Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue |
Date: | 2025-08-19 03:57:42 |
Message-ID: | CAJDiXgheRHTnK8RJuBBcF0VjQA83wi81oX2s-GvBV3x+eSwBbA@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
On Tue, Aug 19, 2025 at 7:14 AM Matheus Alcantara
<matheusssilv97(at)gmail(dot)com> wrote:
>
> On Wed Aug 13, 2025 at 4:29 PM -03, Daniil Davydov wrote:
> >
> > What exactly do we mean by "active listener"? According to the source code,
> > the active listener (as far as I understand) is the one who listens to at least
> > one channel. If we have no active listeners in the database, the new listener
> > will set its pointer to the tail of the async queue. Thus, messages with old
> > xid will not be touched by anybody. I don't see any point in dropping them
> > in this case.
> >
> I think that this definition is correct, but IIUC the tail can still
> have notifications with xids that were already truncated by vacuum
> freeze. When LISTEN is executed, we first loop through the
> notification queue to try to advance the queue pointers, and we can
> eventually iterate over a notification that was added to the queue
> without any listener but has a xid that is already truncated by vacuum
> freeze, so in this case it will fail to get the transaction status. In
> Alex's steps to reproduce the issue, NOTIFY is executed first and
> LISTEN afterwards, and the LISTEN fails after vacuum freeze.
>
Yeah, you are right. I looked at the code again and found that even
if there are no active listeners, a new listener still iterates over the
queue from the tail to the head. Thus, it may encounter a truncated xid.
Anyway, I still think that dropping notifications is not the best way to
resolve this issue.
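
To make the failure mode concrete, here is a toy Python model of it. It
is only a sketch, not the actual async.c code: the class, its methods and
the xid numbers are all made up for illustration, and xid wraparound is
ignored. It shows a notification queued without any listener, a freeze
that advances past its xid, and a new listener that then fails while
scanning the queue.

```python
from dataclasses import dataclass


class XidTruncatedError(Exception):
    """The commit status of a xid is no longer available."""


@dataclass
class Notification:
    xid: int
    channel: str


class ToyAsyncQueue:
    """Hypothetical stand-in for the async queue; index 0 is the tail."""

    def __init__(self):
        self.entries = []
        # xids below this value have had their status truncated away.
        self.oldest_known_xid = 0

    def notify(self, xid, channel):
        # New entries are appended at the head of the queue.
        self.entries.append(Notification(xid, channel))

    def vacuum_freeze(self, new_oldest_xid):
        # Assumption: the freeze does not look at the queue at all.
        self.oldest_known_xid = max(self.oldest_known_xid, new_oldest_xid)

    def xid_status(self, xid):
        if xid < self.oldest_known_xid:
            raise XidTruncatedError(f"no status for xid {xid}")
        return "committed"

    def listen(self):
        # A new listener with no peers starts at the tail and reads up to
        # the head, checking the status of every xid on the way.
        scanned = 0
        for entry in self.entries:
            self.xid_status(entry.xid)
            scanned += 1
        return scanned
```

In this model, calling notify(100, "ch"), then vacuum_freeze(200), then
listen() raises XidTruncatedError, which mirrors NOTIFY, VACUUM FREEZE
and LISTEN in the reproduction steps.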
> > If the "inactive" listener is the backend which is stuck somewhere, the
> > answer is "no" - this backend should be able to process all notifications.
> >
> I tried to reproduce the issue by using some kind of "inactive"
> listener but so far I didn't manage to trigger the error.
>
> After the vacuum freeze I can still see the same files in pg_xact/, and
> if I cancel the long query the notification is received correctly, and
> then if I execute vacuum freeze again on every database the oldest
> pg_xact file is truncated.
>
> So, if my tests are correct, I don't think that storing the oldest xid
> is necessary anymore, since I don't think that we can lose notifications
> with the patch from Daniil. Or am I missing something here?
>
You started a very long transaction, which holds its xid and prevents
vacuum from freezing it. But what if the backend is stuck outside of a
transaction? Maybe we can just hardcode a huge delay (not inside a
transaction) or stop the process via a breakpoint in gdb. If we use that
instead of a long query, I think this error may be reproducible.
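
The difference between the two cases can be sketched in Python as well.
Again, this is only a toy model of my hypothesis, not PostgreSQL code
and not a confirmed reproduction: the Backend fields, both function
names and all xid values are invented. The assumption is that only a
backend inside a transaction holds the freeze horizon back.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    name: str
    # xid held by an open transaction; None if not inside a transaction.
    held_xid: Optional[int] = None
    # xid of the oldest notification this listener has not read yet.
    unread_xid: Optional[int] = None


def freeze_horizon(backends, next_xid):
    """Oldest xid still held by a transaction, or next_xid if none is."""
    held = [b.held_xid for b in backends if b.held_xid is not None]
    return min(held, default=next_xid)


def listeners_at_risk(backends, next_xid):
    """Listeners whose unread notification is older than the horizon."""
    horizon = freeze_horizon(backends, next_xid)
    return [
        b.name
        for b in backends
        if b.unread_xid is not None and b.unread_xid < horizon
    ]


# A listener busy with a long query inside a transaction protects itself.
long_query = Backend("long_query", held_xid=90, unread_xid=100)
# A listener stopped outside a transaction (e.g. under gdb) does not.
stuck = Backend("stuck", held_xid=None, unread_xid=100)
```

With next_xid = 500, listeners_at_risk([long_query], 500) is empty, while
listeners_at_risk([stuck], 500) contains "stuck". If the real system
behaves like this model, the second case is the one worth testing.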
--
Best regards,
Daniil Davydov