Re: min_safe_lsn column in pg_replication_slots view

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: michael(at)paquier(dot)xyz
Cc: amit(dot)kapila16(at)gmail(dot)com, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: min_safe_lsn column in pg_replication_slots view
Date: 2020-06-19 03:13:56
Message-ID: 20200619.121356.2101874112165807899.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in
> >> It is a little unclear to me how this or any proposed patch will solve
> >> the original problem reported by Fujii-San? Basically, the problem
> >> arises because we don't have an interlock between when the checkpoint
> >> removes the WAL segment and the view tries to acquire the same. Am I
> >> missing something?
>
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include. So it is an attempt to take the safest spot possible.

The minimum LSN (lastRemovedSegNo) is not protected by that lock, so
that makes no difference.

> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good

I also see it as silly. I think I said upthread that the column was
originally the distance to the point where the slot loses a segment.
That was rejected, but simply removing it would leave us unable to
estimate the distance, so min_safe_lsn is there instead.

> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slots provides as we have a min()
> aggregate for pg_lsn.

min(lastRemovedSegNo) is the earliest value. It is enough to read it
once at the start and then use it for all slots.
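
A minimal sketch of that read-once approach, for illustration only. It
uses stand-in types rather than the real PostgreSQL headers, and
SlotRow, first_kept_lsn(), fill_min_safe_lsn() and min_restart_lsn()
are hypothetical names, not functions in the tree. It assumes that
min_safe_lsn is the first byte of the segment following the last
removed one, and that slots with an invalid restart_lsn are skipped
when taking the minimum:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's XLogRecPtr and XLogSegNo (assumption). */
typedef uint64_t XLogRecPtr;
typedef uint64_t XLogSegNo;

#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, simplified view row: only the fields this sketch needs. */
typedef struct SlotRow
{
    XLogRecPtr restart_lsn;   /* oldest WAL position the slot still needs */
    XLogRecPtr min_safe_lsn;  /* filled in below, identical for every row */
} SlotRow;

/* First byte of the segment that follows the last removed segment. */
static XLogRecPtr
first_kept_lsn(XLogSegNo last_removed_segno, uint64_t wal_segment_size)
{
    assert(wal_segment_size > 0);
    return (last_removed_segno + 1) * wal_segment_size;
}

/*
 * The caller reads the last removed segment number once and passes it in;
 * the same derived LSN is then stored in every row, with no lock held.
 */
static void
fill_min_safe_lsn(SlotRow *rows, size_t nrows,
                  XLogSegNo last_removed_segno, uint64_t wal_segment_size)
{
    XLogRecPtr min_safe = first_kept_lsn(last_removed_segno, wal_segment_size);

    for (size_t i = 0; i < nrows; i++)
        rows[i].min_safe_lsn = min_safe;
}

/*
 * Minimum restart_lsn across rows, skipping slots that reserve no WAL.
 * This is roughly what min(restart_lsn) over the view would return.
 */
static XLogRecPtr
min_restart_lsn(const SlotRow *rows, size_t nrows)
{
    XLogRecPtr min = InvalidXLogRecPtr;

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].restart_lsn == InvalidXLogRecPtr)
            continue;
        if (min == InvalidXLogRecPtr || rows[i].restart_lsn < min)
            min = rows[i].restart_lsn;
    }
    return min;
}
```

Because the value is a single snapshot, it can be stale by the time the
rows are returned, which is the lossiness being discussed here.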

> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary. We could for
> example just have a separate SQL function that just grabs this value
> (or a more global SQL view for XLogCtl data that includes this data).

I think we need at least one of the "distance" above or min_safe_lsn
somewhere reachable by users.

> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
>
> We should really not do that anyway for a monitoring view.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
