Re: Report oldest xmin source when autovacuum cannot remove tuples

From: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
To: Japin Li <japinli(at)hotmail(dot)com>
Cc: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report oldest xmin source when autovacuum cannot remove tuples
Date: 2026-05-22 13:05:31
Message-ID: CAOzEurSa37w_wgP7SeVicsoo5ERXc4V7XQH2xee5YM-W9uRRBA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for your feedback!

On Mon, Mar 16, 2026 at 8:19 PM Japin Li <japinli(at)hotmail(dot)com> wrote:
>
> On Mon, 16 Mar 2026 at 15:59, wenhui qiu <qiuwenhuifx(at)gmail(dot)com> wrote:
> > HI Shinya
> >> typedef enum XidHorizonBlockerType
> >> {
> >> XHB_NONE = 0,
> >> XHB_ACTIVE_TRANSACTION,
> >> XHB_IDLE_IN_TRANSACTION,
> >> XHB_PREPARED_TRANSACTION,
> >> XHB_XMIN_ACTIVE_TRANSACTION,
> >> XHB_XMIN_IDLE_IN_TRANSACTION,
> >> XHB_HOT_STANDBY_FEEDBACK,
> >> XHB_REPLICATION_SLOT,
> >> }
> > Thank you for your working on this ,I have another small suggestion
> > The priority ordering encoded in XidHorizonBlockerType determines which blocker gets reported when multiple candidates
> > exist. In particular:
> >
> > ACTIVE_TRANSACTION
> > IDLE_IN_TRANSACTION
> > PREPARED_TRANSACTION
> >
> > Prepared transactions are currently ranked after idle-in-transaction sessions. Operationally, prepared transactions are
> > often harder for DBAs to resolve than idle sessions, so it might be worth clarifying the rationale behind this ordering
> > or reconsidering whether prepared transactions should have higher priority.
>
> Agreed. Explaining the reason for this priority is very helpful.

We always pick a blocker from the xid-match group first (it is the
transaction actually holding the horizon, while the xmin-match entries
are just held back by it). Within the xid-match group, the
active/idle/prepared order never matters: a given xid is owned by only
one backend, so when the horizon equals a proc's xid there is only one
matching entry, and it is exactly one of active, idle, or prepared. So
moving prepared ahead of idle would not change which blocker we
report.

> >> typedef enum XidHorizonBlockerType
> >> {
> >> XHB_NONE = 0,
> >> XHB_ACTIVE_TRANSACTION,
> >> XHB_PREPARED_TRANSACTION,
> >> XHB_IDLE_IN_TRANSACTION,
> >> XHB_XMIN_ACTIVE_TRANSACTION,
> >> XHB_XMIN_IDLE_IN_TRANSACTION,
> >> XHB_HOT_STANDBY_FEEDBACK,
> >> XHB_REPLICATION_SLOT,
> >> }
> > Another one:
> > Currently GetXidHorizonBlocker() selects only one blocker (based on the enum priority) even though multiple independent
> > sources could hold back the xmin horizon simultaneously. For example, it is possible to have both a prepared transaction
> > and a replication slot preventing the horizon from advancing.
> > Have you considered reporting all detected blockers instead of just the highest-priority one? Returning only a single
> > entry might hide other relevant blockers from the user.
> >
>
> I'm also curious — why don't we list all the blockers? Did I miss anything?

I did think about this, but I would like to keep reporting one blocker
in the VACUUM log, for two reasons.

First, the log can get very large. In Sami's earlier example [0], a
pgbench run had many backends all sharing the same xmin while only one
idle-in-transaction backend actually owned the cutoff xid. Reporting
every blocker would print 20+ lines, almost all of them just victims
of the same root cause, which makes the log harder to read, not
easier.

Second, the one blocker we report is the root cause (the xid owner).
Once the DBA resolves it, the next VACUUM will show the next blocker
if one remains.

This is also why the code is split into GetXidHorizonBlockers(), which
already collects every candidate, and GetXidHorizonBlocker(), which
picks the highest-priority one for the log. The "show everything" case
is what I would like to expose later through a dynamic statistics
view, where a full list makes more sense than in a VACUUM log line.

I've rebased the patch.

P.S. It might be better to use bottom posting instead of top posting [1].

[0] https://www.postgresql.org/message-id/CAA5RZ0sjMgMo4Xg-niyyF-CpkQ_CK6uOfNKYT%3D9RmiBkAxQkbQ%40mail.gmail.com
[1] https://wiki.postgresql.org/wiki/Mailing_Lists#Email_etiquette_mechanics

--
Best regards,
Shinya Kato
NTT OSS Center

Attachment Content-Type Size
v3-0001-Report-oldest-xmin-blocker-when-VACUUM-cannot-rem.patch application/octet-stream 27.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Akshay Joshi 2026-05-22 13:32:46 Re: [PATCH] Add pg_get_policy_ddl() function to reconstruct CREATE POLICY statement
Previous Message Pavel Borisov 2026-05-22 12:39:33 Permission elevation by pg_amcheck operator overloading via search_path possible?