Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-16 10:24:46
Message-ID: CAA4eK1+7Y6bknNNentcNfkmCBfm3j2fLjA4JjVydz3TB2MTjVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 15, 2024 at 10:45 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > BTW, is XID the based parameter 'max_slot_xid_age' not have similarity
> > with 'max_slot_wal_keep_size'? I think it will impact the rows we
> > removed based on xid horizons. Don't we need to consider it while
> > vacuum computing the xid horizons in ComputeXidHorizons() similar to
> > what we do for WAL w.r.t 'max_slot_wal_keep_size'?
>
> I'm having a hard time understanding why we'd need something up there
> in ComputeXidHorizons(). Can you elaborate it a bit please?
>
> What's proposed with max_slot_xid_age is that during checkpoint we
> look at slot's xmin and catalog_xmin, and the current system txn id.
> Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses
> max_slot_xid_age, we invalidate the slot.
>

I can see that in your patch (in function
InvalidatePossiblyObsoleteSlot()). As per my understanding, we need
something similar for slot xids in ComputeXidHorizons() as we are
doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN
location required by slots and then adjust it for
'max_slot_wal_keep_size'. On similar lines, currently in
ComputeXidHorizons(), we compute the minimum xid required by slots
(procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this and try to move with another parameter
'inactive_replication_slot_timeout' which according to me can be kept
at slot level instead of a GUC but OTOH we need to see the arguments
on both side and then decide which makes more sense.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-03-16 10:56:22 Re: [HACKERS] make async slave to wait for lsn to be replayed
Previous Message Bharath Rupireddy 2024-03-16 07:02:43 Re: [HACKERS] make async slave to wait for lsn to be replayed