Re: BUG #16961: Could not access status of transaction

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stepan Yankevych <Stepan_Yankevych(at)epam(dot)com>, Stepan Yankevych <stepya(at)ukr(dot)net>, "sergii(dot)zhuravlev(at)smartnet(dot)ua" <sergii(dot)zhuravlev(at)smartnet(dot)ua>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16961: Could not access status of transaction
Date: 2021-06-05 21:31:28
Message-ID: 20210605213128.GF228552@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, Jun 05, 2021 at 05:25:39PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Mon, Apr 19, 2021 at 06:49:15PM +0000, Stepan Yankevych wrote:
> >> The issue happening each morning when application starts on the production DataBase during about a month.
> >> Always the same transaction id is mentioned in the error (1954017648)
> >> We tried to do UNLISTEN - no changes. the same issue.
> >> LISTEN works good for any other channels.
>
> > ...
> > On the other hand, if that's what happened, your report that "LISTEN works
> > good for any other channels" surprises me. Perhaps something completely
> > different happened on your system.
>
> I suspect the true state of affairs was more like "LISTEN works good in
> any other databases". The described symptoms are consistent with there
> being a message in the NOTIFY queue that has a pruned-away xid. An
> incoming listener would try to scan over already-committed notify
> messages, but testing to see whether this stale message is committed
> would fail. However, since asyncQueueProcessPageEntries just ignores
> messages not targeted for the current database, incoming listeners
> in other databases wouldn't notice the problem.
>
> ISTM the interesting question here is what was holding back truncation
> of the NOTIFY queue. Could there have been an open transaction somewhere
> that was failing to collect NOTIFY data?

Open transactions always decline to collect notify data, don't they? See
ProcessNotifyInterrupt().

> But a transaction sitting open
> for a month is likely to cause far more severe problems than that one.

True.

> > We could prevent the trouble if vac_truncate_clog() had access to the oldest
> > xid in the notification queue; it would set frozenXID to that value if
> > frozenXID would otherwise be older.
>
> Perhaps. I'm not sure how hard it is to extract the oldest xid in the
> queue (where "oldest" is defined as "numerically smallest"). The entries
> are in xid commit order which is a different thing.

Yeah, it wouldn't be cheap in the general case. The value could be a field in
pg_control, updated by a separate VACUUM NOTIFY, which autovacuum would also
run roughly as often as autovacuum visits template0.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Noah Misch 2021-06-06 06:33:31 Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch
Previous Message Tom Lane 2021-06-05 21:25:39 Re: BUG #16961: Could not access status of transaction