Re: [HACKERS] Restricting maximum keep segments by repslots

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, sk(at)zsrv(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2018-07-20 01:13:58
Message-ID: CAD21AoDayePWwu4t=VPP5P1QFDSBvks1d8j76bXp5rbXoPbZcA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 13, 2018 at 5:40 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hello.
>
> At Wed, 11 Jul 2018 15:09:23 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoCFtW6+SN_eVTszDAjQeeU2sSea2VpCEx08ejNafk8H9w(at)mail(dot)gmail(dot)com>
>> On Mon, Jul 9, 2018 at 2:47 PM, Kyotaro HORIGUCHI
>> ----
>> min_keep_lsn in pg_replication_slots currently shows the same value in
>> every slots but I felt that the value seems not easy to understand
>> intuitively for users because users will have to confirm that value
>> and to compare the current LSN in order to check if replication slots
>> will become the "lost" status. So how about showing values that
>> indicate how far away from the point where we become "lost" for
>> individual slots?
>
> Yeah, that is what I did in the first cut of this patch from the
> same thought. pg_replication_slots have two additional columns
> "live" and "distance".
>
> https://www.postgresql.org/message-id/20171031.184310.182012625.horiguchi.kyotaro@lab.ntt.co.jp
>
> Thre current design is changed following a comment.
>
> https://www.postgresql.org/message-id/20171108.131431.170534842.horiguchi.kyotaro%40lab.ntt.co.jp
>
>> > I don't think 'distance' is a good metric - that's going to continually
>> > change. Why not store the LSN that's available and provide a function
>> > that computes this? Or just rely on the lsn - lsn operator?
>>
>> It seems reasonable.,The 'secured minimum LSN' is common among
>> all slots so showing it in the view may look a bit stupid but I
>> don't find another suitable place for it. distance = 0 meant the
>> state that the slot is living but insecured in the previous patch
>> and that information is lost by changing 'distance' to
>> 'min_secure_lsn'.
>
> As I reconsidered this, I noticed that "lsn - lsn" doesn't make
> sense here. The correct formula for the value is
> "max_slot_wal_keep_size * 1024 * 1024 - ((oldest LSN to keep) -
> restart_lsn). It is not a simple formula to write by hand but
> doesn't seem general enough. I re-changed my mind to show the
> "distance" there again.
>
> pg_replication_slots now has the column "remain" instaed of
> "min_keep_lsn", which shows an LSN when wal_status is "streaming"
> and otherwise "0/0". In a special case, "remain" can be "0/0"
> while "wal_status" is "streaming". It is the reason for the
> tristate return value of IsLsnStillAvaialbe().
>
> wal_status | remain
> streaming | 0/19E3C0 -- WAL is reserved
> streaming | 0/0 -- Still reserved but on the boundary
> keeping | 0/0 -- About to be lost.
> lost | 0/0 -- Lost.
>

The "remain" column still shows same value at all rows as follows
because you always compare between the current LSN and the minimum LSN
of replication slots. Is that you expected? My comment was to show the
distance from the restart_lsn of individual slots to the critical
point where it will lost WAL. That way, we can easily find out which
slots is about to get lost.

postgres(1:126712)=# select pg_current_wal_lsn(), slot_name,
restart_lsn, remain from pg_replication_slots ;
pg_current_wal_lsn | slot_name | restart_lsn | remain
--------------------+-----------+-------------+------------
0/4000108 | 5 | 0/1645CA0 | 0/3DFFFEF8
0/4000108 | 4 | 0/40000D0 | 0/3DFFFEF8
(2 rows)

Also, I'm not sure it's a good way to show the distance as LSN. LSN is
a monotone increasing value but in your patch, a value of the "remain"
column can get decreased. As an alternative way I'd suggest to show it
as the number of segments. Attached patch is a patch for your v5 patch
that changes it so that the column shows how many WAL segments of
individual slots are remained until they get lost WAL.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
report_remaining_xlogsegs.patch application/octet-stream 9.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Karen Huddleston 2018-07-20 01:26:06 Re: Making "COPY partitioned_table FROM" faster
Previous Message Amit Langote 2018-07-20 00:30:34 Re: documentation about explicit locking