Re: min_safe_lsn column in pg_replication_slots view

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: amit(dot)kapila16(at)gmail(dot)com, michael(at)paquier(dot)xyz, alvherre(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: min_safe_lsn column in pg_replication_slots view
Date: 2020-06-23 02:17:15
Message-ID: 3518948c-c175-c028-c2a4-78d85bd4c41f@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/06/23 10:10, Kyotaro Horiguchi wrote:
> At Mon, 22 Jun 2020 22:02:51 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>>
>>
>> On 2020/06/22 21:01, Amit Kapila wrote:
>>> On Mon, Jun 22, 2020 at 11:19 AM Michael Paquier <michael(at)paquier(dot)xyz>
>>> wrote:
>>>>
>>>> On Sat, Jun 20, 2020 at 03:53:54PM +0900, Michael Paquier wrote:
>>>>> On Sat, Jun 20, 2020 at 09:45:52AM +0530, Amit Kapila wrote:
>>>>>> Isn't this information specific to checkpoints, so maybe better to
>>>>>> display in view pg_stat_bgwriter?
>>>>>
>>>>> Not sure that's a good match. If we decide to expose that, a separate
>>>>> function returning a LSN based on the segment number from
>>>>> XLogGetLastRemovedSegno() sounds fine to me, like
>>>>> pg_wal_last_recycled_lsn(). Perhaps somebody has a better name in
>>>>> mind?
>>>>
>>>> I was thinking on this one for the last couple of days, and came up
>>>> with the name pg_wal_oldest_lsn(), as per the attached, traking the
>>>> oldest WAL location still available.
>>
>> Thanks for the patch!
>>
>> + <literal>NULL</literal> if no WAL segments have been removed since
>> + startup.
>>
>> Isn't this confusing? I think that we should store the last removed
>> WAL segment to somewhere (e.g., pg_control) and restore it at
>> the startup, so that we can see the actual value even after the
>> startup.
>> Or we should scan pg_wal directory and find the "minimal" WAL segment
>> and return its LSN.
>
> Running a separate scan on pg_wal at startup or first time the oldest
> WAL segno is referenced is something that was rejected before. But
> with the current behavior we don't find the last removed segment until
> any slot loses a segment if all WAL files are retained by a slot.

Because scanning pg_wal can be heavy operation especially when
max_wal_size is high and there are lots of WAL files? If so, it might
be better to save the value in pg_control as I told upthread.

However I'm not sure the use case of this function yet...

> FWIW
> I recently proposed a patch to find the oldest WAL file while trying
> removing old WAL files.
>
>>> I feel such a function is good to have but I am not sure if there is a
>>> need to tie it with the removal of min_safe_lsn column.
>>
>> We should expose the LSN calculated from
>> "the current WAL LSN - max(wal_keep_segments * 16MB,
>> max_slot_wal_keep_size)"?
>> This indicates the minimum LSN of WAL files that are guaraneed to be
>> currently retained by wal_keep_segments and max_slot_wal_keep_size.
>> That is, if checkpoint occurs when restart_lsn of replication slot is
>> smaller than that minimum LSN, some required WAL files may be removed.
>> So DBAs can periodically monitor and compare restart_lsn and that
>> minimum
>> LSN. If they see frequently that difference of those LSN is very
>> small,
>> they can decide to increase wal_keep_segments or
>> max_slot_wal_keep_size,
>> to prevent required WAL files from being removed. Thought?
>
> I'm not sure about the consensus here about showing that number in the
> view. It is similar to "remain" in the earlier versions of this patch
> but a bit simpler. It would be usable in a similar way. I can live
> with either numbers.

It's useless to display this value in each replication slot in the view.
I'm thinking to expose it as a function.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2020-06-23 02:29:12 Re: Parallel Seq Scan vs kernel read ahead
Previous Message Michael Paquier 2020-06-23 02:16:33 Re: min_safe_lsn column in pg_replication_slots view