Re: Do we need to handle orphaned prepared transactions in the server?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Thomas Kellerer <shammat(at)gmx(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Do we need to handle orphaned prepared transactions in the server?
Date: 2020-01-23 04:56:41
Message-ID: CAMsr+YGSS6JBHmEHbxqMdc1XJ7sobDSq62YwaEkOHN-KBQYr-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 23 Jan 2020 at 01:20, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > I think the big question is whether we want to make active prepared
> > transactions more visible to administrators, either during server start
> > or idle duration.
>
> There's already the pg_prepared_xacts view ...

I think Bruce has a point here. We shouldn't go around "resolving"
prepared xacts, but the visibility of them is a problem for users.
I've seen that myself quite enough times, even now that they cannot be
used by default.

Our monitoring and admin views are not keeping up with Pg's
complexity. Resource retention is one area where that's becoming a
usability and admin challenge. If a user has growing bloat (and have
managed to figure that out, since we don't make it easy to do that
either) or unexpected WAL retention they may find it hard to quickly
work out why.

We could definitely improve on that by exposing a view that integrates
everything that holds down xmin and catalog_xmin. It'd show

* the datfrozenxid and datminmxid for the oldest database
* if that database is the current database, the relation(s) with the
oldest relfrozenxid and relminmxd
* ... and the catalog relation(s) with the oldest relfrozenxid and
relminmxid if greater
* the absolute xid and xid-age positions of entries in pg_replication_slots
* pg_stat_replication connections (joined to pg_stat_replication if
connected) with their feedback xmin
* pg_stat_activity backend_xid and backend_xmin for the backend(s)
with oldest values; this may be different sets of backends
* pg_prepared_xacts entries by oldest xid

... probably sorted by xid age.

It'd be good to expose some internal state too, which would usually
correspond to the oldest values found in the above, but is useful for
cross-checking:

* RecentGlobalXmin and RecentGlobalDataXmin to show the xmin and
catalog_xmin actually used
* procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin

I'm not sure whether WAL retention (lsn tracking) should be in the
same view or a different one, but I lean toward different.

I already have another TODO kicking around for me to write a view that
generates a blocking locks graph, since pg_locks is really more of a
building block than a directly useful view for admins to understand
the system's state. And if that's not enough I also want to write a
decent bloat-checking view to include in the system views, since IMO
lock-blocking, bloat, and resource retention are real monitoring pain
points right now.

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-01-23 06:30:16 Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
Previous Message Amit Kapila 2020-01-23 04:50:19 Re: Error message inconsistency