Re: [HACKERS] Restricting maximum keep segments by repslots

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com, andres(at)anarazel(dot)de
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2018-10-25 12:55:18
Message-ID: 20181025.215518.189844649.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Mon, 22 Oct 2018 19:35:04 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoBdfoLSgujPZ_TpnH5zdQz0jg-Y8OXtZ=TCO787Sey-=w(at)mail(dot)gmail(dot)com>
> On Thu, Sep 13, 2018 at 6:30 PM Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Sorry for the late response. The patch still can be applied to the

It's alright. Thanks.

> current HEAD so I reviewed the latest patch.
> The value of 'remain' and 'wal_status' might not be correct. Although
> 'wal_status' shows 'lost', we can still get changes from the slot. I've
> tested it with the following steps.
>
> =# alter system set max_slot_wal_keep_size to '64MB'; -- while
> wal_keep_segments is 0
> =# select pg_reload_conf();
> =# select slot_name, wal_status, remain, pg_size_pretty(remain) as
> remain_pretty from pg_replication_slots ;
> slot_name | wal_status | remain | remain_pretty
> -----------+------------+----------+---------------
> 1 | streaming | 83885648 | 80 MB
> (1 row)
>
> ** consume 80MB WAL, and do CHECKPOINT **
>
> =# select slot_name, wal_status, remain, pg_size_pretty(remain) as
> remain_pretty from pg_replication_slots ;
> slot_name | wal_status | remain | remain_pretty
> -----------+------------+--------+---------------
> 1 | lost | 0 | 0 bytes
> (1 row)
> =# select count(*) from pg_logical_slot_get_changes('1', NULL, NULL);
> count
> -------
> 15
> (1 row)

Mmm. The function reads from the segment it had already opened before
the segment was lost from the file system (precisely, before its directory
entry was deleted). So losing just one segment doesn't matter. Please try
losing one more segment.

=# select * from pg_logical_slot_get_changes('s1', NULL, NULL);
ERROR: unexpected pageaddr 0/29000000 in log segment 000000010000000000000023, offset 0

Or, instead, just restarting makes the server forget the opened segment.

...
> 1 | lost | 0 | 0 bytes
(just restart)
> =# select * from pg_logical_slot_get_changes('s1', NULL, NULL);
> ERROR: requested WAL segment pg_wal/000000010000000000000029 has already been removed

I'm not sure whether this counts as a bug...
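
The "segment already open" behavior is ordinary POSIX unlink semantics: removing a directory entry does not invalidate descriptors that are already open on the file. The following is a minimal standalone sketch of that effect, not PostgreSQL code. The function name read_after_unlink and the temporary path are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): write `payload` to a temporary
 * file, delete its directory entry while keeping the descriptor open, then
 * read the data back through that descriptor.
 * Returns the number of bytes read back, or -1 on any error.
 */
ssize_t
read_after_unlink(const char *payload, char *buf, size_t buflen)
{
    char    path[] = "/tmp/walseg_demo_XXXXXX";    /* illustrative path */
    size_t  len = strlen(payload);
    ssize_t nread;
    int     fd = mkstemp(path);

    if (fd < 0)
        return -1;

    /* Fill the file, then remove its name from the directory. */
    if (write(fd, payload, len) != (ssize_t) len || unlink(path) != 0)
    {
        close(fd);
        return -1;
    }

    /* The name is gone, so anything that looks the file up by path fails. */
    if (access(path, F_OK) == 0)
    {
        close(fd);
        return -1;
    }

    /* The already-open descriptor still reaches the data. */
    if (lseek(fd, 0, SEEK_SET) < 0)
    {
        close(fd);
        return -1;
    }
    nread = read(fd, buf, buflen);
    close(fd);
    return nread;
}
```

Once the descriptor is closed (for example by a restart), the data can no longer be reached, which matches the "has already been removed" error shown above.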

> -----
> I got the following result with wal_keep_segments set larger than
> max_slot_wal_keep_size. The 'wal_status' shows 'streaming' although the
> 'remain' is 0.
>
> =# select slot_name, wal_status, remain from pg_replication_slots limit 1;
> slot_name | wal_status | remain
> -----------+------------+--------
> 1 | streaming | 0
> (1 row)
>
> + XLByteToSeg(targetLSN, restartSeg, wal_segment_size);
> + if (max_slot_wal_keep_size_mb >= 0 && currSeg <=
> restartSeg + limitSegs)
> + {
>
> You use limitSegs here but shouldn't we use keepSegs instead? Actually
> I commented on this point for the v6 patch before[1], and it was
> fixed in the v7 patch. However, you're using limitSegs again from the v8
> patch. I might be missing something though.

No. keepSegs is the number of segments *actually* kept around, so
reverting to keepSegs just resurrects the bug you pointed out
upthread. What is needed here is the maximum number of segments that
will be kept, so raising limitSegs by wal_keep_segments fixes that.
Sorry for the sequence of silly bugs. A TAP test for the case has been
added.
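
To illustrate the point about limitSegs, here is a rough standalone sketch, not the code in the attached patch. It assumes that "raising limitSegs by wal_keep_segments" means taking the larger of the two values, and it takes the segment size in megabytes for simplicity. The function names slot_limit_segs and slot_within_limit are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogSegNo;     /* segment number, 64-bit as in PostgreSQL */

/*
 * Hypothetical helper: the maximum number of segments that may be kept.
 * ASSUMPTION: the limit derived from max_slot_wal_keep_size is raised to at
 * least wal_keep_segments (the larger of the two wins).
 * Precondition: max_slot_wal_keep_size_mb >= 0 and wal_segment_size_mb > 0.
 */
uint64_t
slot_limit_segs(int max_slot_wal_keep_size_mb, int wal_keep_segments,
                int wal_segment_size_mb)
{
    uint64_t limitSegs =
        (uint64_t) (max_slot_wal_keep_size_mb / wal_segment_size_mb);

    if (limitSegs < (uint64_t) wal_keep_segments)
        limitSegs = (uint64_t) wal_keep_segments;
    return limitSegs;
}

/*
 * Mirrors only the condition quoted in the review:
 *   max_slot_wal_keep_size_mb >= 0 && currSeg <= restartSeg + limitSegs
 * What the patch does in each branch is not shown in this mail.
 */
bool
slot_within_limit(XLogSegNo currSeg, XLogSegNo restartSeg,
                  int max_slot_wal_keep_size_mb, int wal_keep_segments,
                  int wal_segment_size_mb)
{
    if (max_slot_wal_keep_size_mb < 0)
        return false;           /* the quoted condition is false here */

    return currSeg <= restartSeg +
        slot_limit_segs(max_slot_wal_keep_size_mb, wal_keep_segments,
                        wal_segment_size_mb);
}
```

With 16MB segments, a 64MB limit and wal_keep_segments = 0, this gives 4 segments. With wal_keep_segments = 8 it gives 8, so a slot 6 segments behind is still within the limit in the second case but not in the first.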

> Changed the status to 'Waiting on Author'.
>
> [1] https://www.postgresql.org/message-id/CAD21AoD0rChq7wQE%3D_o95quopcQGjcVG9omwdH07nT5cm81hzg%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/20180904.195250.144186960.horiguchi.kyotaro%40lab.ntt.co.jp

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                    Content-Type  Size
v10-0001-Add-WAL-relief-vent-for-replication-slots.patch      text/x-patch  6.8 KB
v10-0002-Add-monitoring-aid-for-max_slot_wal_keep_size.patch  text/x-patch  12.3 KB
v10-0003-TAP-test-for-the-slot-limit-feature.patch            text/x-patch  6.8 KB
v10-0004-Documentation-for-slot-limit-feature.patch           text/x-patch  4.5 KB
