Re: [HACKERS] Restricting maximum keep segments by repslots

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: jgdr(at)dalibo(dot)com
Cc: andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, sawada(dot)mshk(at)gmail(dot)com, peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2019-12-24 12:26:14
Message-ID: 20191224.212614.633369820509385571.horikyota.ntt@gmail.com
Lists: pgsql-hackers

I'm very sorry for the late reply.

At Wed, 2 Oct 2019 17:08:07 +0200, Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote in
> On Tue, 30 Jul 2019 21:30:45 +0900 (Tokyo Standard Time)
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > In "pg_replication_slots" view, the new "wal_status" field is misleading.
> > > Consider this sentence and the related behavior from documentation
> > > (catalogs.sgml):
> > >
> > > <literal>keeping</literal> means that some of them are to be removed by
> > > the next checkpoint.
> > >
> > > "keeping" appears when the current checkpoint will delete some WAL further
> > > than "current_lsn - max_slot_wal_keep_size", but still required by at least
> > > one slot. As some WAL required by some slots will be deleted quite soon,
> > > probably before anyone can react, "keeping" status is misleading here. We
> > > are already in the red zone.
> >
> > It may be "losing", which would be less misleading.
>
> Indeed, "losing" is a better match for this state.
>
> However, what's the point of this state from the admin's point of view? In various
> situations, the admin will have no time to react immediately and fix whatever
> could help.
>
> How useful is this specific state?

If we labeled "losing" segments as "lost", a segment once reported as
"lost" could later return to the "keeping" or "streaming" state, which is
intuitively impossible. On the other hand, if we labeled them as
"keeping", they should not be removed by the next checkpoint, but in fact
they can be. The "losing" state denotes exactly that unstable situation,
distinct from both "lost" and "keeping".

> > > I would expect this "wal_status" to be:
> > >
> > > - streaming: slot lag between 0 and "max_wal_size"
> > > - keeping: slot lag between "max_wal_size" and "max_slot_wal_keep_size". The
> > > slot actually protects some WAL files from being deleted
> > > - lost: slot lag greater than max_slot_wal_keep_size. The slot couldn't
> > > protect some WAL files from deletion
> >
> > I agree that comparing to max_wal_size is meaningful. The revised
> > version behaves that way.
>
> The v16-0006 patch doesn't apply anymore because of commit 709d003fbd. Here is
> the fix:
>
> --- a/src/backend/access/transam/xlogreader.c
> +++ b/src/backend/access/transam/xlogreader.c
> @@ -304,7 +304,7 @@
> - XLByteToSeg(targetPagePtr, targetSegNo, state->wal_segment_size);
> + XLByteToSeg(targetPagePtr, targetSegNo, state->segcxt.ws_segsize);
>
> I suppose you might have more refactoring to do in regard to Alvaro's
> review.
>
> I confirm the new patch behaves correctly in my tests with regard to the
> "wal_status" field.

Thanks for testing. I fixed it in the attached patch.

> + <entry>Availability of WAL records claimed by this
> + slot. <literal>streaming</literal>, <literal>keeping</literal>,
>
> Slots keep WAL files, not WAL records. Shouldn't it be "Availability of WAL
> files claimed by this slot"?

I chose "record" since a slot points to a record. I'm not sure, but I'm
fine with "file". I fixed catalogs.sgml and config.sgml that way.

> + <literal>streaming</literal> means that the claimed records are
> + available within max_wal_size. <literal>keeping</literal> means
>
> I wonder if streaming is the appropriate name here. The WALs required might be
> available for streaming, but the slot not active, thus not "streaming". What
> about merging with the "active" field, in the same fashion as
> pg_stat_activity.state? We would have an enum "pg_replication_slots.state" with
> the following states:
> * inactive: non active slot
> * active: activated, required WAL within max_wal_size
> * keeping: activated, max_wal_size is exceeded but still held by replication
> slots or wal_keep_segments.
> * lost: some WAL are definitely lost
>
> Thoughts?

In the first place, I realized that I had missed a point about the
relationship between max_wal_size and max_slot_wal_keep_size here. Since
v15 of this patch, GetLsnAvailability returns "streaming" when the
restart_lsn is within max_wal_size. That behavior makes sense when
max_slot_wal_keep_size > max_wal_size. In the contrary case, however,
restart_lsn can be lost even while it is within max_wal_size, so we would
see "streaming" (or "normal") even though restart_lsn is already lost.
That is broken.

In short, the "streaming/normal" state is useless if
max_slot_wal_keep_size < max_wal_size.

In the end, I used the following wording (a rough sketch of the decision
logic follows the list).

(there is no "inactive" wal_status)

* normal: required WAL within max_wal_size when max_slot_wal_keep_size
is larger than max_wal_size.
* keeping: required segments are held by replication slots or
wal_keep_segments.

* losing: required segments are about to be removed or may already have
been removed, but streaming is not dead yet.

* lost: cannot continue streaming using this slot.
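
To make the mapping concrete, here is a minimal, self-contained sketch of
the decision logic. It is not the patch's GetLsnAvailability(); the
parameters restart_seg, oldest_existing_seg, normal_boundary_seg and
walsender_active are hypothetical stand-ins for what the real code derives
from the slot's restart_lsn, the segments still on disk, max_wal_size,
max_slot_wal_keep_size and the walsender state.

/*
 * Illustrative sketch only; not the patch's GetLsnAvailability().
 * All parameters are hypothetical stand-ins.
 */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogSegNo;

typedef enum WalAvailability
{
	WALAVAIL_NORMAL,			/* required WAL is still within max_wal_size */
	WALAVAIL_KEEPING,			/* held only by slots or wal_keep_segments */
	WALAVAIL_LOSING,			/* removed or about to be; walsender still alive */
	WALAVAIL_LOST				/* streaming can no longer continue on this slot */
} WalAvailability;

static WalAvailability
classify_slot(XLogSegNo restart_seg,			/* segment of the slot's restart_lsn */
			  XLogSegNo oldest_existing_seg,	/* oldest segment still on disk */
			  XLogSegNo normal_boundary_seg,	/* oldest segment within max_wal_size */
			  bool walsender_active)
{
	if (restart_seg < oldest_existing_seg)
	{
		/*
		 * The required segment is already gone.  While the walsender is
		 * still running, a delayed ack may yet advance restart_lsn back
		 * into the safe zone, so report "losing"; otherwise it is "lost".
		 */
		return walsender_active ? WALAVAIL_LOSING : WALAVAIL_LOST;
	}

	/*
	 * The real code reports "normal" only when max_slot_wal_keep_size is
	 * larger than max_wal_size; this sketch glosses over that check.
	 */
	if (restart_seg >= normal_boundary_seg)
		return WALAVAIL_NORMAL;

	return WALAVAIL_KEEPING;	/* retained only because of slots/wal_keep_segments */
}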

> [...]
> > > * "remain" should be NULL if "max_slot_wal_keep_size=-1 or if the slot isn't
> > > active
> >
> > The revised version shows the following statuses.
> >
> > streaming / NULL max_slot_wal_keep_size is -1
> > unknown / NULL mswks >= 0 and restart_lsn is invalid
> > <status> / <bytes> otherwise
>
> Works for me.

Thanks.

> > > * the "lost" status should be a definitive status
> > > * it seems related, but maybe the "wal_status" should be set to "lost"
> > > only when the slot has been deactivated?
> >
> > Agreed. While replication is active, even if required segments seem
> > to be lost once, a delayed walreceiver ack can advance restart_lsn
> > back into the "safe" zone later. So, in the revised version, if the
> > segment for restart_lsn has been removed, GetLsnAvailability() returns
> > "losing" if the walsender is active and "lost" if not.
>
> ok.
>
> > > * logs should warn about a failing slot as soon as it is effectively
> > > deactivated, not before.
> >
> > Agreed. Slots on which a walsender is running are excluded from the
> > output of ReplicationSlotsEnumerateBehinds. As a result, the "some
> > replication slots lost..." message is emitted after the related
> > walsender stops.
>
> Once a slot has lost WAL and has been deactivated, the following message appears
> during every checkpoint:
>
> WARNING: some replication slots have lost required WAL segments
> DETAIL: Slot slot_limit_st lost 177 segment(s)
>
> I wonder whether it is useful to show these messages for slots that were already
> dead before this checkpoint?

Makes sense. I changed KeepLogSeg so that it emits the message only when
the list of slot names changes.
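
To illustrate the idea only (this is a sketch, not the actual KeepLogSeg()
change): remember the list of slot names reported last time and stay
silent while it is unchanged. The names last_reported and
report_lost_slots() are hypothetical.

/*
 * Illustrative sketch only; not the actual KeepLogSeg() change.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* List of slot names reported last time; NULL means nothing reported yet. */
static char *last_reported = NULL;

static void
report_lost_slots(const char *slot_names, long nsegments)
{
	/* Nothing lost this time: forget the previous report. */
	if (slot_names == NULL || slot_names[0] == '\0')
	{
		free(last_reported);
		last_reported = NULL;
		return;
	}

	/* Same set of slots as at the previous checkpoint: stay silent. */
	if (last_reported != NULL && strcmp(last_reported, slot_names) == 0)
		return;

	fprintf(stderr,
			"WARNING:  some replication slots have lost required WAL segments\n"
			"DETAIL:  Slot %s lost %ld segment(s)\n",
			slot_names, nsegments);

	free(last_reported);
	last_reported = strdup(slot_names);
}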

The attached v17 patch is changed in the following ways.

- Rebased to the current master.

- Changed KeepLogSeg so that it does not emit the message "Slot %s lost
  %ld segment(s)" if the slot list has not changed.

- Fixed the documentation following the change of the state names.

- Changed GetLsnAvailability so that it returns a more correct state in a
  wider range of situations. It previously returned a wrong status when
  max_slot_wal_keep_size is smaller than max_wal_size, or when
  max_slot_wal_keep_size is increased so that the new value covers the
  restart_lsn of a slot that had lost required segments under the old
  setting.

Since the above change needs it, I revived GetOldestXLogFileSegNo(),
which was removed in v15, as FindOldestXLogFileSegNo() in a slightly
different shape.
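
For reference, here is a rough, self-contained sketch of what such a
function could look like; it is not the actual FindOldestXLogFileSegNo().
It scans a WAL directory and returns the smallest segment number among the
plain 24-character segment file names. wal_dir and segment_size are
hypothetical parameters standing in for pg_wal and wal_segment_size.

/*
 * Illustrative sketch only; not the revived FindOldestXLogFileSegNo().
 */
#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t XLogSegNo;

static XLogSegNo
find_oldest_segment(const char *wal_dir, uint64_t segment_size)
{
	DIR		   *dir = opendir(wal_dir);
	struct dirent *de;
	XLogSegNo	oldest = UINT64_MAX;

	if (dir == NULL)
		return 0;				/* cannot open the WAL directory */

	while ((de = readdir(dir)) != NULL)
	{
		unsigned int tli,
					log,
					seg;
		XLogSegNo	segno;

		/* Plain WAL segment names are exactly 24 hexadecimal digits. */
		if (strlen(de->d_name) != 24 ||
			sscanf(de->d_name, "%08X%08X%08X", &tli, &log, &seg) != 3)
			continue;

		/* Same convention as XLogFromFileName(): log-id * segments-per-id + seg */
		segno = (XLogSegNo) log * (UINT64_C(0x100000000) / segment_size) + seg;
		if (segno < oldest)
			oldest = segno;
	}
	closedir(dir);

	return (oldest == UINT64_MAX) ? 0 : oldest;	/* 0 means no segment found */
}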

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v17-0001-Add-WAL-relief-vent-for-replication-slots.patch text/x-patch 11.1 KB
v17-0002-Add-monitoring-aid-for-max_slot_wal_keep_size.patch text/x-patch 15.7 KB
v17-0003-Add-primary_slot_name-to-init_from_backup-in-TAP.patch text/x-patch 1.0 KB
v17-0004-TAP-test-for-the-slot-limit-feature.patch text/x-patch 8.1 KB
v17-0005-Documentation-for-slot-limit-feature.patch text/x-patch 5.0 KB
v17-0006-Check-removal-of-in-reading-segment-file.patch text/x-patch 2.3 KB
