> 2026年4月30日 16:29,Rui Zhao 写道: > > Hi hackers, > > I'd like to propose a fix for a long-standing issue where hot standby > feedback catalog_xmin incorrectly holds back vacuuming of user data > tables on the primary when no physical replication slot is used. > > == Problem == > > When a standby sends hot standby feedback to a primary without a > physical replication slot, ProcessStandbyHSFeedbackMessage() takes > min(feedbackCatalogXmin, feedbackXmin) and stores it into > MyProc->xmin: > > if (TransactionIdIsNormal(feedbackCatalogXmin) > && TransactionIdPrecedes(feedbackCatalogXmin, feedbackXmin)) > MyProc->xmin = feedbackCatalogXmin; > else > MyProc->xmin = feedbackXmin; > > Since ComputeXidHorizons() treats proc->xmin uniformly for both data > and catalog horizons, the catalog_xmin ends up holding back > data_oldest_nonremovable, preventing vacuum from cleaning dead tuples > in regular user tables. > > The existing code even acknowledges this limitation: > > "We can only track the catalog xmin separately when using a slot, > so we store the least of the two provided when not using a slot." > > == Why this matters == > > One might argue "just use a replication slot." However, many > production HA deployments intentionally avoid physical replication > slots because of their lifecycle management complexity: > > - When a primary fails, physical slots on the old primary are lost > and cannot be automatically migrated to the promoted standby. > - Other standbys that were using slots on the old primary must > re-establish their slots on the new primary, potentially requiring > a fresh base backup. > - Dangling slots from disconnected standbys can cause unbounded WAL > accumulation until manually dropped. > > These deployments use wal_keep_size or WAL archiving for WAL > retention, combined with hot_standby_feedback for visibility horizon > management. This is a legitimate production configuration -- for > example, some HA frameworks (Patroni with certain configurations, > custom HA scripts) operate this way. > > The issue becomes severe when the standby also hosts a logical > replication slot (e.g., for change data capture or logical replication > to a downstream). The logical slot's catalog_xmin can be very old > (retained for logical decoding catalog access), and this old value > gets propagated to the primary's walsender via hot standby feedback, > blocking vacuum on ALL user data tables on the primary. This leads > to table bloat that is difficult to diagnose since the DBA may not > realize the connection between a standby's logical slot and the > primary's vacuum behavior. > > == Fix == > > The patch adds a catalog_xmin field to PGPROC (4 bytes), so the > walsender can track catalog_xmin separately from xmin even without a > replication slot. This mirrors how replication slots already separate > slot->data.xmin from slot->data.catalog_xmin. > > In ComputeXidHorizons(), the new proc_catalog_xmin is accumulated > from PGPROC entries and applied only to catalog_oldest_nonremovable > and shared_oldest_nonremovable -- exactly how slot_catalog_xmin is > already handled. It does NOT affect data_oldest_nonremovable. > > GetReplicationHorizons() is updated to include proc_catalog_xmin in > the catalog_xmin sent upstream, ensuring correct behavior in > cascading standby configurations. > > Changes summary: > - proc.h: add catalog_xmin to PGPROC > - proc.c: initialize catalog_xmin in InitProcess/InitAuxiliaryProcess > - procarray.c: accumulate and apply proc_catalog_xmin in > ComputeXidHorizons(); include in GetReplicationHorizons() > - walsender.c: set MyProc->xmin and MyProc->catalog_xmin separately > in the no-slot path of ProcessStandbyHSFeedbackMessage() > > == Alternatives considered == > > 1. Generalize the ephemeral slot concept (as suggested by the existing > XXX comment): this would automatically create a temporary slot for > slot-less walsenders. More invasive, requires slot allocation > (max_replication_slots), and adds slot lifecycle management. > > 2. Simply ignore catalog_xmin in the no-slot path: simpler but loses > catalog protection for the standby's logical decoding. > > The proposed approach is minimal, correct, and consistent with how > slots already handle the separation. > > == Testing == > > A new TAP test (053_hs_feedback_catalog_xmin.pl) verifies: > 1. With hot_standby_feedback=on and no physical replication slot, when > the standby has a logical slot with an old catalog_xmin, VACUUM on > the primary can still clean dead tuples in user data tables. > 2. The standby's logical slot catalog_xmin remains properly set, > confirming catalog protection is preserved. > > Patch attached. > > Regards, > Rui Zhao > >