Re: Introduce XID age and inactive timeout based replication slot invalidation

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-04-03 09:27:55
Message-ID: CAJpy0uBoK=d6Eoz03mrDK1+BK-mQmGre+Pn2GQcs3RFBTghk2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > > > Or a simple solution is that the slotsync worker updates
> > > > inactive_since as it does for non-synced slots, and disables
> > > > timeout-based slot invalidation for synced slots.
> >
> > I like this idea better, it takes care of such a case too when the
> > user is relying on sync-function rather than worker and does not want
> > to get the slots invalidated in between 2 sync function calls.
>
> Please find the attached v31 patches implementing the above idea:
>

Thanks for the patches, please find few comments:

v31-001:

1)
system-views.sgml:
value will get updated after every synchronization from the
corresponding remote slot on the primary.

--This is confusing. It will be good to rephrase it.

2)
update_synced_slots_inactive_since()

--May be, we should mention in the header that this function is called
only during promotion.

3) 040_standby_failover_slots_sync.pl:
We capture inactive_since_on_primary when we do this for the first time at #175
ALTER SUBSCRIPTION regress_mysub1 DISABLE"

But we again recreate the sub and disable it at line #280.
Do you think we shall get inactive_since_on_primary again here, to be
compared with inactive_since_on_new_primary later?

v31-002:
(I had reviewed v29-002 but missed to post comments, I think these
are still applicable)

1) I think replication_slot_inactivity_timeout was recommended here
(instead of replication_slot_inactive_timeout, so please give it a
thought):
https://www.postgresql.org/message-id/202403260739.udlp7lxixktx%40alvherre.pgsql

2) Commit msg:
a)
"It is often easy for developers to set a timeout of say 1
or 2 or 3 days at slot level, after which the inactive slots get
dropped."

Shall we say invalidated rather than dropped?

b)
"To achieve the above, postgres introduces a GUC allowing users
set inactive timeout and then a slot stays inactive for this much
amount of time it invalidates the slot."

Broken sentence.

<have not reviewed 002 patch in detail yet>

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-04-03 09:36:50 Re: remaining sql/json patches
Previous Message Daniel Gustafsson 2024-04-03 09:11:31 Re: Add missing error codes to PANIC/FATAL error reports in xlog.c and relcache.c