Re: [HACKERS] Restricting maximum keep segments by repslots

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: jgdr(at)dalibo(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, sawada(dot)mshk(at)gmail(dot)com, peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2020-04-06 18:58:39
Message-ID: 20200406185839.GA12500@alvherre.pgsql
Lists: pgsql-hackers

On 2020-Apr-06, Kyotaro Horiguchi wrote:

> At Fri, 3 Apr 2020 20:14:03 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in

> Agreed and thanks for the code. The patch is enough to express the
> intention. I fixed some compilation errors and made a clean up of
> KeepLogSeg. InvalidateObsoleteReplicationSlots requires the "oldest
> preserved segment" so it should be called before _logSegNo--, not
> after.

Ah, of course, thanks.

> I agree to the above. When a slot is invalidated, the following
> message is logged.
>
> LOG: slot rep1 is invalidated at 0/1C00000 due to exceeding max_slot_wal_keep_size

Sounds good. Here are a couple of further adjustments to your v24. This
passes the existing tests (pg_basebackup exception noted below), but I
don't have the updated 019_replslot_limit.pl, so that still needs to be
verified.

First, cosmetic changes in xlog.c.

Second, an unrelated bugfix: ReplicationSlotsComputeLogicalRestartLSN()
can return InvalidXLogRecPtr if any slot has an invalid restart_lsn.
I'm fairly certain that's bogus. I think this needs to be backpatched.
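
The fix is just to skip such slots when computing the minimum. Roughly
this shape, inside that function's loop (an untested sketch of the idea,
not the exact patch hunk):

	SpinLockAcquire(&s->mutex);
	restart_lsn = s->data.restart_lsn;
	SpinLockRelease(&s->mutex);

	/* ignore slots whose restart_lsn has not been set yet */
	if (XLogRecPtrIsInvalid(restart_lsn))
		continue;

	if (result == InvalidXLogRecPtr || restart_lsn < result)
		result = restart_lsn;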

Third: The loop in InvalidateObsoleteReplicationSlots was reading
restart_lsn without acquiring the mutex. Split the "continue" line in
two, so that in_use is checked without the spinlock and restart_lsn is
checked with it held. This means we also need to store restart_lsn in a
local variable before logging the message (because we don't want to log
with the spinlock held). Also, use ereport() rather than elog() for
that, and add quotes around the slot name.
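
In other words, the loop now looks roughly like this (a sketch only; the
attached patch has the authoritative version):

	for (int i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
		XLogRecPtr	restart_lsn;

		/* in_use can be tested without the spinlock ... */
		if (!s->in_use)
			continue;

		/* ... but restart_lsn must be read while holding it */
		SpinLockAcquire(&s->mutex);
		restart_lsn = s->data.restart_lsn;
		SpinLockRelease(&s->mutex);

		if (XLogRecPtrIsInvalid(restart_lsn) || restart_lsn >= oldestLSN)
			continue;

		/* report using the local copy, with no lock held */
		ereport(LOG,
				(errmsg("slot \"%s\" is invalidated at %X/%X due to exceeding max_slot_wal_keep_size",
						NameStr(s->data.name),
						(uint32) (restart_lsn >> 32), (uint32) restart_lsn)));

		/* actual invalidation of the slot goes here */
	}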

Lastly, I noticed that we're now changing the slot's restart_lsn to
Invalid without being the slot's owner, which runs counter to what
slot.h says:

* - Individual fields are protected by mutex where only the backend owning
* the slot is authorized to update the fields from its own slot. The
* backend owning the slot does not need to take this lock when reading its
* own fields, while concurrent backends not owning this slot should take the
* lock when reading this slot's data.

What this means is that if the owning walsender updates the restart_lsn
to a newer value just as we (the checkpointer) are trying to set it to
Invalid, the owner's value might persist and our value would be lost.

AFAICT if we were really stressed about getting this exactly correct,
then we would have to kill the walsender, wait for it to die, then
ReplicationSlotAcquire and *then* update
MyReplicationSlot->data.restart_lsn. But I don't think we want to do
that during checkpoint, and I'm not sure we need to be as strict anyway:
it seems to me that it suffices to check restart_lsn for being invalid
in the couple of places where the slot's owner advances it (which are
the two auxiliary functions for ProcessStandbyReplyMessage). I have
done so in the attached. There are other places where restart_lsn is
set, but those seem to be used only when the slot is created. I don't
think we need to cover those, but I'm not 100% sure about that.
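
Concretely, the guard in PhysicalConfirmReceivedLocation() would look
something like this (a sketch of the idea rather than the exact patch;
the logical-slot counterpart gets the analogous treatment):

	static void
	PhysicalConfirmReceivedLocation(XLogRecPtr lsn)
	{
		ReplicationSlot *slot = MyReplicationSlot;
		bool		changed = false;

		Assert(lsn != InvalidXLogRecPtr);
		SpinLockAcquire(&slot->mutex);
		/* if the checkpointer invalidated the slot, do not resurrect it */
		if (!XLogRecPtrIsInvalid(slot->data.restart_lsn) &&
			slot->data.restart_lsn != lsn)
		{
			changed = true;
			slot->data.restart_lsn = lsn;
		}
		SpinLockRelease(&slot->mutex);

		if (changed)
		{
			ReplicationSlotMarkDirty();
			ReplicationSlotsComputeRequiredLSN();
		}
	}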

However, the change in PhysicalConfirmReceivedLocation() breaks
the way slots work for pg_basebackup: apparently the slot is created
with a restart_lsn of Invalid and we only advance it the first time we
process a feedback message from pg_basebackup. I have a vague feeling
that that's bogus, but I'll have to look at the involved code a little
bit more closely to be sure about this.

One last thing: I think we need to ReplicationSlotMarkDirty() and
ReplicationSlotSave() after changing the LSN. My patch doesn't do that.
I noticed that the checkpoint already saved the slot once; maybe it
would make more sense to avoid writing the files twice by removing
CheckPointReplicationSlots() from CheckPointGuts, and instead calling it
just after InvalidateObsoleteReplicationSlots(). But this is not very
important, since we don't expect to be modifying slots for disk-space
reasons very frequently anyway.
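
For the record, what I have in mind is roughly this at the point where
we reset the LSN (again a sketch; note that both routines operate on
MyReplicationSlot, so this assumes we have acquired the slot first,
which is exactly the ownership problem discussed above):

	SpinLockAcquire(&s->mutex);
	s->data.restart_lsn = InvalidXLogRecPtr;
	SpinLockRelease(&s->mutex);

	/* make the invalidation durable */
	ReplicationSlotMarkDirty();
	ReplicationSlotSave();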

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment: 0001-Further-changes.patch (text/x-diff, 6.0 KB)
