Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-19 19:18:55
Message-ID: CALj2ACXib_+Kuy3_WbTBJkURYGwO7ngS8+HYATReJtXShNCtNw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> > > Hm. Are you suggesting inactive_timeout to be a slot level parameter
> > > similar to 'failover' property added recently by
> > > c393308b69d229b664391ac583b9e07418d411b6 and
> > > 73292404370c9900a96e2bebdc7144f7010339cf?
> >
> > Yeah, I have something like that in mind. You can prepare the patch
> > but it would be good if others involved in this thread can also share
> > their opinion.
>
> I think it makes sense to put the inactive_timeout granularity at the slot
> level (as the activity could vary a lot say between one slot linked to a
> subscription and one linked to some plugins). As for max_slot_xid_age, I have the
> feeling that a new GUC is good enough.

I've implemented the above idea. The attached v12 patches
mainly contain the following changes:

1. inactive_timeout is now slot-level, that is, one can set it while
creating the slot either via SQL functions or via replication commands
or via subscription.
2. last_inactive_at and inactive_timeout are now tracked in the on-disk
replication slot data structure.
3. last_inactive_at is now set even for non-walsenders whenever the
slot is released, as opposed to the initial versions of the patches,
which set it only for walsenders.
4. slot's inactive_timeout parameter is now migrated to the new
cluster with pg_upgrade.
5. slot's inactive_timeout parameter is now synced to the standby when
failover is enabled for the slot.
6. Test cases are added to cover most of the above cases including new
invalidation mechanisms.
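
For illustration only, the slot-level timeout check described above can be
modelled roughly as below. This is a simplified Python sketch, not code from
the patches: the Slot class, the helper functions, and the convention that a
timeout of 0 disables the check are all assumptions. Only the names
last_inactive_at and inactive_timeout come from the patch set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a replication slot's persistent data."""
    name: str
    inactive_timeout: int = 0                 # seconds; assumption: 0 disables the check
    last_inactive_at: Optional[float] = None  # None while the slot is acquired


def release(slot: Slot, now: float) -> None:
    """Record when the slot became inactive (done for any backend, not only walsenders)."""
    slot.last_inactive_at = now


def acquire(slot: Slot) -> None:
    """An acquired slot is active, so it carries no inactivity timestamp."""
    slot.last_inactive_at = None


def should_invalidate(slot: Slot, now: float) -> bool:
    """True if the slot has been inactive for longer than its own timeout."""
    if slot.inactive_timeout <= 0 or slot.last_inactive_at is None:
        return False
    return now - slot.last_inactive_at > slot.inactive_timeout
```

Because the timeout lives on the slot rather than in a GUC, two slots released
at the same moment can expire at different times, which is the point of making
it slot-level.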

Following are some open points:

1. Where exactly to do inactive_timeout invalidation, if not in the checkpointer.
2. Where exactly to do XID age invalidation, if not in the checkpointer.
3. How to go about recomputing XID horizons based on max_slot_xid_age.
Do the slot's horizons need to be adjusted in ComputeXidHorizons()?
4. Interaction of the new invalidation mechanisms with the slot sync feature.
5. Review comments on 0001 from Bertrand.
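
To make the XID age discussion concrete, here is a rough Python sketch of the
age comparison only. It is not taken from the patches and says nothing about
where the check should run: the function names, the treatment of 0 as "no
limit", and the plain modulo-2^32 age computation (which ignores the special
handling of reserved XIDs) are simplifying assumptions.

```python
XID_MODULUS = 2 ** 32   # transaction IDs are 32-bit and wrap around
INVALID_XID = 0         # a slot with no xmin holds back no horizon


def xid_age(xid: int, next_xid: int) -> int:
    """Number of XIDs consumed since `xid`, allowing for wraparound."""
    return (next_xid - xid) % XID_MODULUS


def exceeds_max_age(slot_xmin: int, next_xid: int, max_slot_xid_age: int) -> bool:
    """True if the slot's xmin is older than the configured limit.

    Assumption: max_slot_xid_age <= 0 means the check is disabled.
    """
    if max_slot_xid_age <= 0 or slot_xmin == INVALID_XID:
        return False
    return xid_age(slot_xmin, next_xid) > max_slot_xid_age
```

The same comparison would apply to a slot's catalog_xmin; whether the
horizons are then recomputed is exactly open point 3.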

Please see the attached v12 patches.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v12-0001-Track-invalidation_reason-in-pg_replication_slot.patch application/octet-stream 20.2 KB
v12-0002-Track-last_inactive_at-for-replication-slots.patch application/octet-stream 8.0 KB
v12-0003-Allow-setting-inactive_timeout-for-replication-s.patch application/octet-stream 33.2 KB
v12-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch application/octet-stream 17.9 KB
v12-0005-Add-inactive_timeout-option-to-subscriptions.patch application/octet-stream 61.1 KB
v12-0006-Add-inactive_timeout-based-replication-slot-inva.patch application/octet-stream 15.1 KB
v12-0007-Add-XID-age-based-replication-slot-invalidation.patch application/octet-stream 13.6 KB
