Re: [HACKERS] Restricting maximum keep segments by repslots

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com, andres(at)anarazel(dot)de, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2018-09-10 10:19:21
Message-ID: 20180910.191921.82261601.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Thu, 6 Sep 2018 19:55:39 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoAZCdvdMN-vG4D_653vb_FN-AaMAP5+GXgF1JRjy+LeyA(at)mail(dot)gmail(dot)com>
> On Thu, Sep 6, 2018 at 4:10 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Thank you for the comment.
> >
> > At Wed, 5 Sep 2018 14:31:10 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoB-HJvL+uKsv40Gb8Dymh9uBBQUXTucqv4MDtH_AGKh4g(at)mail(dot)gmail(dot)com>
> >> On Tue, Sep 4, 2018 at 7:52 PM, Kyotaro HORIGUCHI
> >> Thank you for updating! Here is the review comment for v8 patch.
> >>
> >> + /*
> >> + * This slot still has all required segments. Calculate how many
> >> + * LSN bytes the slot has until it loses restart_lsn.
> >> + */
> >> + fragbytes = wal_segment_size - (currLSN % wal_segment_size);
> >> + XLogSegNoOffsetToRecPtr(restartSeg + limitSegs - currSeg,
> >> fragbytes,
> >> + wal_segment_size, *restBytes);
> >>
> >> For the calculation of fragbytes, I think we should calculate the
> >> fragment bytes of restartLSN instead. The the formula "restartSeg +
> >> limitSegs - currSeg" means the # of segment between restartLSN and the
> >> limit by the new parameter. I don't think that the summation of it and
> >> fragment bytes of currenLSN is correct. As the following result
> >> (max_slot_wal_keep_size is 128MB) shows, the remain column shows the
> >> actual remains + 16MB (get_bytes function returns the value of
> >> max_slot_wal_keep_size in bytes).
> >
> > Since a oldest segment is removed after the current LSN moves to
> > the next segmen, current LSN naturally determines the fragment
> > bytes. Maybe you're concerning that the number of segments looks
> > too much by one segment.
> >
> > One arguable point of the feature is how max_slot_wal_keep_size
> > works exactly. I assume that even though the name is named as
> > "max_", we actually expect that "at least that bytes are
> > kept". So, for example, with 16MB of segment size and 50MB of
> > max_s_w_k_s, I designed this so that the size of preserved WAL
> > doesn't go below 50MB, actually (rounding up to multples of 16MB
> > of 50MB), and loses the oldest segment when it reaches 64MB +
> > 16MB = 80MB as you saw.
> >
> > # I believe that the difference is not so significant since we
> > # have around hunderd or several hundreds of segments in common
> > # cases.
> >
> > Do you mean that we should define the GUC parameter literally as
> > "we won't have exactly that many bytes of WAL segmetns"? That is,
> > we have at most 48MB preserved WAL records for 50MB of
> > max_s_w_k_s setting. This is the same to how max_wal_size is
> > counted but I don't think max_slot_wal_keep_size will be regarded
> > in the same way.
>
> I might be missing something but what I'm expecting to this feature is
> to restrict the how much WAL we can keep at a maximum for replication
> slots. In other words, the distance between the current LSN and the
> minimum restart_lsn of replication slots doesn't over the value of
> max_slot_wal_keep_size.

Yes, it's one possible design, the same with "we won't have more
than exactly that many bytes of WAL segmetns" above ("more than"
is added, which I meant). But anyway we cannot keep the limit
strictly since WAL segments are removed only at checkpoint
time. So If doing so, we can reach the lost state before the
max_slot_wal_keep_size is filled up meanwhile WAL can exceed the
size by a WAL flood. We can define it precisely at most as "wal
segments are preserved at most aorund the value". So I choosed
the definition so that we can tell about this as "we don't
guarantee more than that bytes".

# Uuuu. sorry for possiblly hard-to-read sentence..

> It's similar to wal_keep_segments except for
> that this feature affects only replication slots. And

It defines the *extra* segments to be kept, that is, if we set it
to 2, at least 3 segments are present. If we set
max_slot_wal_keep_size to 32MB (= 2 segs here), we have at most 3
segments since 32MB range before the current LSN almost always
spans over 3 segments. Doesn't this seemingly in a similar way
with wal_keep_segments?

If the current LSN is at the very last of a segment and
restart_lsn is catching up to the current LSN, the "remain" is
equal to max_slot_wal_keep_size as the guaranteed size. If very
beginning of a segments, it gets extra 16MB.

> wal_keep_segments cannot restrict WAL that replication slots are
> holding. For example, with 16MB of segment size and 50MB of
> max_slot_wal_keep_size, we can keep at most 50MB WAL for replication
> slots. However, once we consumed more than 50MB WAL while not
> advancing any restart_lsn the required WAL might be lost by the next
> checkpoint, which depends on the min_wal_size.

I don't get the last phrase. With small min_wal_size, we don't
recycle most of the "removed" segments. If large, we recycle more
of them. It doesn't affect up to where the checkpoint removes WAL
files. But it is right that LSN advance with
max_slot_wal_keep_size bytes immediately leands to breaking a
slot and it is intended behavior.

> On the other hand, if
> we mostly can advance restart_lsn to approximately the current LSN the
> size of preserved WAL for replication slots can go below 50MB.

Y..eah.. That's right. It is just how this works. But I don't
understand how this is related to the intepretation of the "max"
of the GUC variable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2018-09-10 10:24:15 Re: Adding a note to protocol.sgml regarding CopyData
Previous Message Amit Khandekar 2018-09-10 10:12:55 Re: Query is over 2x slower with jit=on