Re: [HACKERS] Restricting maximum keep segments by repslots

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2018-09-06 10:55:39
Message-ID: CAD21AoAZCdvdMN-vG4D_653vb_FN-AaMAP5+GXgF1JRjy+LeyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 6, 2018 at 4:10 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Thank you for the comment.
>
> At Wed, 5 Sep 2018 14:31:10 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoB-HJvL+uKsv40Gb8Dymh9uBBQUXTucqv4MDtH_AGKh4g(at)mail(dot)gmail(dot)com>
>> On Tue, Sep 4, 2018 at 7:52 PM, Kyotaro HORIGUCHI
>> Thank you for updating! Here is the review comment for v8 patch.
>>
>> + /*
>> + * This slot still has all required segments. Calculate how many
>> + * LSN bytes the slot has until it loses restart_lsn.
>> + */
>> + fragbytes = wal_segment_size - (currLSN % wal_segment_size);
>> + XLogSegNoOffsetToRecPtr(restartSeg + limitSegs - currSeg,
>> fragbytes,
>> + wal_segment_size, *restBytes);
>>
>> For the calculation of fragbytes, I think we should calculate the
>> fragment bytes of restartLSN instead. The the formula "restartSeg +
>> limitSegs - currSeg" means the # of segment between restartLSN and the
>> limit by the new parameter. I don't think that the summation of it and
>> fragment bytes of currenLSN is correct. As the following result
>> (max_slot_wal_keep_size is 128MB) shows, the remain column shows the
>> actual remains + 16MB (get_bytes function returns the value of
>> max_slot_wal_keep_size in bytes).
>
> Since a oldest segment is removed after the current LSN moves to
> the next segmen, current LSN naturally determines the fragment
> bytes. Maybe you're concerning that the number of segments looks
> too much by one segment.
>
> One arguable point of the feature is how max_slot_wal_keep_size
> works exactly. I assume that even though the name is named as
> "max_", we actually expect that "at least that bytes are
> kept". So, for example, with 16MB of segment size and 50MB of
> max_s_w_k_s, I designed this so that the size of preserved WAL
> doesn't go below 50MB, actually (rounding up to multples of 16MB
> of 50MB), and loses the oldest segment when it reaches 64MB +
> 16MB = 80MB as you saw.
>
> # I believe that the difference is not so significant since we
> # have around hunderd or several hundreds of segments in common
> # cases.
>
> Do you mean that we should define the GUC parameter literally as
> "we won't have exactly that many bytes of WAL segmetns"? That is,
> we have at most 48MB preserved WAL records for 50MB of
> max_s_w_k_s setting. This is the same to how max_wal_size is
> counted but I don't think max_slot_wal_keep_size will be regarded
> in the same way.

I might be missing something but what I'm expecting to this feature is
to restrict the how much WAL we can keep at a maximum for replication
slots. In other words, the distance between the current LSN and the
minimum restart_lsn of replication slots doesn't over the value of
max_slot_wal_keep_size. It's similar to wal_keep_segments except for
that this feature affects only replication slots. And
wal_keep_segments cannot restrict WAL that replication slots are
holding. For example, with 16MB of segment size and 50MB of
max_slot_wal_keep_size, we can keep at most 50MB WAL for replication
slots. However, once we consumed more than 50MB WAL while not
advancing any restart_lsn the required WAL might be lost by the next
checkpoint, which depends on the min_wal_size. On the other hand, if
we mostly can advance restart_lsn to approximately the current LSN the
size of preserved WAL for replication slots can go below 50MB.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-09-06 11:17:02 Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process
Previous Message Andrey Lepikhov 2018-09-06 10:26:47 Re: [WIP] [B-Tree] Retail IndexTuple deletion