Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-13 03:51:32
Message-ID: CAA4eK1++=fOpmsQKugQQRtQr2wmiintjKEhpuOch9=phxirAfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 12, 2024 at 8:55 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > Hm. I get the concern. Are you okay with having inavlidation_reason
> > > separately for both logical and physical slots? In such a case,
> > > logical slots that got invalidated on the standby will have duplicate
> > > info in conflict_reason and invalidation_reason, is this fine?
> > >
> >
> > If we have duplicate information in two columns that could be
> > confusing for users. BTW, isn't the recovery conflict occur only
> > because of rows_removed and wal_level_insufficient reasons? The
> > wal_removed or the new reasons you are proposing can't happen because
> > of recovery conflict. Am, I missing something here?
>
> My understanding aligns with yours that the rows_removed and
> wal_level_insufficient invalidations can occur only upon recovery
> conflict.
>
> FWIW, a test named 'synchronized slot has been invalidated' in
> 040_standby_failover_slots_sync.pl inappropriately uses
> conflict_reason = 'wal_removed' logical slot on standby. As per the
> above understanding, it's inappropriate to use conflict_reason here
> because wal_removed invalidation doesn't conflict with recovery.
>
> > > Another idea is to make 'conflict_reason text' as a 'conflicting
> > > boolean' again (revert 007693f2a3), and have 'invalidation_reason
> > > text' for both logical and physical slots. So, whenever 'conflicting'
> > > is true, one can look at invalidation_reason for the reason for
> > > conflict. How does this sound?
> > >
> >
> > So, does this mean that conflicting will only be true for some of the
> > reasons (say wal_level_insufficient, rows_removed, wal_removed) and
> > logical slots but not for others? I think that will also not eliminate
> > the duplicate information as user could have deduced that from single
> > column.
>
> So, how about we turn conflict_reason to only report the reasons that
> actually cause conflict with recovery for logical slots, something
> like below, and then have invalidation_cause as a generic column for
> all sorts of invalidation reasons for both logical and physical slots?
>

If our above understanding is correct then coflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-03-13 03:53:11 Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Previous Message vignesh C 2024-03-13 03:48:48 Re: Race condition in FetchTableStates() breaks synchronization of subscription tables