Re: Introduce XID age and inactive timeout based replication slot invalidation

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-26 09:47:36
Message-ID: CAJpy0uDkTW+t1k3oPkaipFBzZePfFNB5DmiA==pxRGcAdpF=Pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
> > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >
> > > 2 ===
> > >
> > > It looks like inactive_since is set to the current timestamp on the standby
> > > each time the sync worker does a cycle:
> > >
> > > primary:
> > >
> > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
> > > slot_name | inactive_since
> > > -------------+-------------------------------
> > > lsub27_slot | 2024-03-26 07:39:19.745517+00
> > > lsub28_slot | 2024-03-26 07:40:24.953826+00
> > >
> > > standby:
> > >
> > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
> > > slot_name | inactive_since
> > > -------------+-------------------------------
> > > lsub27_slot | 2024-03-26 07:43:56.387324+00
> > > lsub28_slot | 2024-03-26 07:43:56.387338+00
> > >
> > > I don't think that should be the case.
> > >
> >
> > But why? This is exactly what we discussed in another thread where we
> > agreed to update inactive_since even for sync slots.
>
> Hum, I thought we agreed to "sync" it and to "update it to current time"
> only at promotion time.

I think there may have been some misunderstanding here. But now if I
rethink this, I am fine with 'inactive_since' getting synced from
primary to standby. But if we do that, we need to add docs stating
"inactive_since" represents primary's inactivity and not standby's
slots inactivity for synced slots. The reason for this clarification
is that the synced slot might be generated much later, yet
'inactive_since' is synced from the primary, potentially indicating a
time considerably earlier than when the synced slot was actually
created.

Another approach could be that "inactive_since" for synced slot
actually gives its own inactivity data rather than giving primary's
slot data. We update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.

I have attached a sample patch for this idea as.txt file.

I am fine with any of these approaches. One gives data synced from
primary for synced slots, while another gives actual inactivity data
of synced slots.

thanks
Shveta

Attachment Content-Type Size
v1-0001-inactive_since-for-synced-slots.patch.txt text/plain 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-03-26 10:14:29 Re: pgsql: Track last_inactive_time in pg_replication_slots.
Previous Message Bertrand Drouvot 2024-03-26 09:42:40 Re: Introduce XID age and inactive timeout based replication slot invalidation