Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-26 07:45:19
Message-ID: ZgJ9D0p28ijwsjsl@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, Mar 26, 2024 at 11:07:51AM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > But immediately after promotion, we can not rely on the above check
> > and thus possibility of synced slots invalidation is there. To
> > maintain consistent behavior regarding the setting of
> > last_inactive_time for synced slots, similar to user slots, one
> > potential solution to prevent this invalidation issue is to update the
> > last_inactive_time of all synced slots within the ShutDownSlotSync()
> > function during FinishWalRecovery(). This approach ensures that
> > promotion doesn't immediately invalidate slots, and henceforth, we
> > possess a correct last_inactive_time as a basis for invalidation going
> > forward. This will be equivalent to updating last_inactive_time during
> > restart (but without actual restart during promotion).
> > The plus point of maintaining last_inactive_time for synced slots
> > could be, this can provide data to the user on when last time the sync
> > was attempted on that particular slot by background slot sync worker
> > or SQl function. Thoughts?
>
> Please find the attached v21 patch implementing the above idea. It
> also has changes for renaming last_inactive_time to inactive_since.

Thanks!

A few comments:

1 ===

One trailing whitespace:

Applying: Fix review comments for slot's last_inactive_time property
.git/rebase-apply/patch:433: trailing whitespace.
# got a valid inactive_since value representing the last slot sync time.
warning: 1 line adds whitespace errors.

2 ===

It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:

primary:

postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:39:19.745517+00
lsub28_slot | 2024-03-26 07:40:24.953826+00

standby:

postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:43:56.387324+00
lsub28_slot | 2024-03-26 07:43:56.387338+00

I don't think that should be the case.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-03-26 07:56:32 Re: make dist using git archive
Previous Message walther 2024-03-26 07:43:54 Re: Regression tests fail with musl libc because libpq.so can't be loaded