Re: [HACKERS] Restricting maximum keep segments by repslots

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com, andres(at)anarazel(dot)de, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2018-07-31 09:11:21
Message-ID: 20180731.181121.32422923.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hello.

At Tue, 24 Jul 2018 16:47:41 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoD0rChq7wQE=_o95quopcQGjcVG9omwdH07nT5cm81hzg(at)mail(dot)gmail(dot)com>
> On Mon, Jul 23, 2018 at 4:16 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Hello.
> >
> > At Fri, 20 Jul 2018 10:13:58 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoDayePWwu4t=VPP5P1QFDSBvks1d8j76bXp5rbXoPbZcA(at)mail(dot)gmail(dot)com>
..
> > Instead, I made the field be shown in flat "bytes" using bigint,
> > which can be nicely shown using pg_size_pretty;
>
> Thank you for updating. I agree showing the remain in bytes.
>
> Here is review comments for v6 patch.
>
> @@ -967,9 +969,9 @@ postgres=# SELECT * FROM
> pg_create_physical_replication_slot('node_a_slot');
> node_a_slot |
>
> postgres=# SELECT * FROM pg_replication_slots;
> - slot_name | slot_type | datoid | database | active | xmin |
> restart_lsn | confirmed_flush_lsn
> --------------+-----------+--------+----------+--------+------+-------------+---------------------
> - node_a_slot | physical | | | f | | |
> + slot_name | slot_type | datoid | database | active | xmin |
> restart_lsn | confirmed_flush_lsn | wal_status | min_keep_lsn
> +-------------+-----------+--------+----------+--------+------+-------------+---------------------+------------+--------------
> + node_a_slot | physical | | | f | |
> | | unknown | 0/1000000
>
> This funk should be updated.

Perhaps you need a fresh database cluster.

> -----
> +/*
> + * Returns minimum segment number the next checktpoint must leave considering
> + * wal_keep_segments, replication slots and max_slot_wal_keep_size.
> + *
> + * If resetBytes is not NULL, returns remaining LSN bytes to advance until any
> + * slot loses reserving a WAL record.
> + */
> +static XLogSegNo
> +GetOldestKeepSegment(XLogRecPtr currLSN, XLogRecPtr minSlotLSN,
> XLogRecPtr restartLSN, uint64 *restBytes)
> +{
>
> You're assuming that the minSlotLSN is the minimum LSN of replication
> slots but it's not mentioned anywhere. Since you check minSlotSeg <=

I added description for parameters in the function comment.

> currSeg but not force it, if a caller sets a wrong value to minSlotLSN
> this function will return a wrong value with no complaints. Similarly

I don't think such case can happen on a sane system. Even that
happenes it behaves in the same way as minSlotLSN being invalid
in the case. KeepLogSeg() also behaves in the same way and the
wal recycling will be performed as pg_replication_losts
predicted. Nothing can improve the behavior and I don't think
placing assertion there is overkill.

> there is not explanation about the resetartLSN, so you can add it. I'm
> not sure the augment name restartLSN is suitable for the function in
> xlog.c but I'd defer it to committers.

Done.

> Since this function assumes that both restartLSN and *restBytes are
> valid or invalid (and NULL) it's better to add assertions for safety.
> The current code accepts even the case where only either argment is
> valid.
> -----
> + if (limitSegs > 0 && currSeg <= restartSeg + limitSegs)
> + {

Even if the caller gives InvalidRecPtr as restartLSN, which is an
insane situation, the function just treats the value as zero and
reuturns the "correct" value for the restartLSN, which doesn't
harm anything.

> + /*
> + * This slot still has all required segments.
> Calculate how many
> + * LSN bytes the slot has until it loses restart_lsn.
> + */
> + fragbytes = wal_segment_size - (currLSN %
> wal_segment_size);
> + *restBytes =
> + (restartSeg + limitSegs - currSeg) *
> wal_segment_size
> + + fragbytes;
> + }
> + }
>
> This code doesn't consider the case where wal_keep_segments >
> max_slot_keep_size. In the case I think we should use (currSeg -
> wal_keep_segments) as the lower bound in order to avoid showing
> "streaming" in the wal_status although the remain is 0.

Thanks. It should use keepSegs instead of limitSegs. Fixed.

> -----
> + *restBytes =
> + (restartSeg + limitSegs - currSeg) *
> wal_segment_size
> + + fragbytes;
>
> Maybe you can use XLogSegNoOffsetToRecPtr instead.

Indeed. I'm not sure it is easier to read, though. (Maybe the
functions should use wal_segment_size out-of-band. (That is, not
passed as a parameter)).

> -----
> + * 0 means that WAL record at targetLSN is alredy removed.
> + * 1 means that WAL record at tagetLSN is availble.
> + * 2 means that WAL record at tagetLSN is availble but about to be removed by
>
> s/alredy/already/
> s/tagetLSN/targetLSN/
> s/availble/available/
> -----
> + * If resetBytes is not NULL, returns remaining LSN bytes to advance until any
> + * slot loses reserving a WAL record.
>
> s/resetBytes/restBytes/

Ugggh! Sorry that my fingers are extra-fat.. Fixed. I rechecked
through the whole patch and found one more typo.

> -----
> + Specify the maximum size of WAL files
> + that <link linkend="streaming-replication-slots">replication
> + slots</link> are allowed to reatin in the <filename>pg_wal</filename>
>
> s/reatin/retain/

Thank you. I also found other leftovers in catalogs.sgml and
high-availability.sgml.

# The latter file seems needing amendment for v11.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v7-0004-Documentation-for-slot-limit-feature.patch text/x-patch 5.3 KB
v7-0003-TAP-test-for-the-slot-limit-feature.patch text/x-patch 5.3 KB
v7-0002-Add-monitoring-aid-for-max_slot_wal_keep_size.patch text/x-patch 11.4 KB
v7-0001-Add-WAL-relief-vent-for-replication-slots.patch text/x-patch 6.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yamaji, Ryo 2018-07-31 09:12:21 RE: [HACKERS] Cached plans and statement generalization
Previous Message MyungKyu LIM 2018-07-31 08:56:57 [Todo item] Add entry creation timestamp column to pg_stat_replication