| From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
|---|---|
| To: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
| Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, John H <johnhyvr(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Introduce XID age based replication slot invalidation |
| Date: | 2026-03-23 16:00:00 |
| Message-ID: | CALj2ACX_o+dKeAaK76mpAtG646UnDHpGUWziUkCvicVz8mz6=A@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Fri, Mar 20, 2026 at 11:29 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram(at)gmail(dot)com> wrote:
>
> Do you think we need different GUCs for catalog_xmin and xmin? If table bloat is a concern (not catalog bloat), then logical slots are not required to invalidate unless the cluster is close to wraparound.
IMO the main purpose of max_slot_xid_age is to prevent XID wraparound.
For bloat, I still think max_slot_wal_keep_size is the better choice.
Where max_slot_xid_age is really useful is when vacuum can't
freeze because a replication slot (physical or logical) is holding
back the XID horizon and the system is getting close to wraparound.
Invalidating such a slot clears the way for vacuum. Setting
max_slot_xid_age above vacuum_failsafe_age allows vacuum to waste
cycles scanning tables it cannot freeze. Keeping max_slot_xid_age <=
vacuum_failsafe_age (default 1.6B) prevents this by invalidating the
slot before vacuum effort is wasted.
As far as XID wraparound is concerned, both xmin and catalog_xmin need
to be treated similarly. Either one can hold back freezing and push
the system toward wraparound. So I don't think we need separate GUCs
for xmin and catalog_xmin unless I'm missing something. One GUC
covering both keeps things simple.
>> I made the following design choice: try invalidating only once per
>> vacuum cycle, not per table. While this keeps the cost of checking
>> (incl. the XidGenLock contention) for invalidation to a minimum when
>> there are a large number of tables and replication slots, it can be
>> less effective when individual tables/indexes are large. Invalidating
>> during checkpoints can help to some extent with the large table/index
>> cases. But I'm open to thoughts on this.
>
> It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly when there is heavy bloat.
This design choice boils down to whether a database instance has
1/ a large number of small tables or 2/ large tables. In my
experience, I have seen both cases but mostly case 2 (others can
correct me). For case 2, doing the XID age based slot invalidation
check once per relation makes sense. However, I'm open to more
thoughts here.
>> Please find the attached patch for further review. I fixed the XID age
>> calculation in ReplicationSlotIsXIDAged and adjusted the code
>> comments.
>
> I applied the patch and all the tests passed. A few comments:
Thank you for reviewing the patch.
> @@ -495,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg
> MemoryContext vac_context, bool isTopLevel)
> {
> static bool in_vacuum = false;
> -
> + static bool first_time = true;
>
> first_time variable is not self explanatory, maybe something like try_replication_slot_invalidation and add comments that it will be set to false after the first check?
+1. Changed the variable name and simplified the comments around.
> + if (TransactionIdIsValid(xmin))
> + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."),
> + xmin,
> + max_slot_xid_age);
>
> Slot invalidates even when the age is max_slot_xid_age, isn't it?
Nice catch! I changed it to use TransactionIdPrecedes so that the
behavior matches the above error message, in line with two of the
existing XID age GUCs (autovacuum_freeze_max_age, vacuum_failsafe_age).
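The boundary behavior can be sketched standalone as follows. This is not the patch's code: xid_precedes re-implements the comparison semantics of TransactionIdPrecedes locally, and slot_xmin_exceeds_age and the cutoff clamping are hypothetical simplifications for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PostgreSQL definitions, so the sketch compiles alone. */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/* Same semantics as TransactionIdPrecedes: strict, modulo-2^32 "is older than". */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    /* Special (non-normal) XIDs compare as plain unsigned values. */
    if (id1 < FirstNormalTransactionId || id2 < FirstNormalTransactionId)
        return id1 < id2;

    return (int32_t) (id1 - id2) < 0;
}

/*
 * Hypothetical check: the slot is invalidated only when its xmin is strictly
 * older than the cutoff, i.e. when its age exceeds max_slot_xid_age.
 */
static bool
slot_xmin_exceeds_age(TransactionId next_xid, TransactionId xmin,
                      int max_slot_xid_age)
{
    TransactionId cutoff;

    assert(max_slot_xid_age > 0);

    cutoff = next_xid - (TransactionId) max_slot_xid_age;

    /* Simplification: keep the cutoff a normal XID. */
    if (cutoff < FirstNormalTransactionId)
        cutoff = FirstNormalTransactionId;

    return xid_precedes(xmin, cutoff);
}
```

With next_xid = 2000 and max_slot_xid_age = 1000, an xmin of 1000 (age exactly 1000) is kept, while an xmin of 999 is treated as exceeding the limit.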
Please find the attached v2 patch for further review. Thank you!
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Add-XID-age-based-replication-slot-invalidation.patch | application/x-patch | 23.9 KB |