Re: [HACKERS] Restricting maximum keep segments by repslots

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: alvherre(at)2ndquadrant(dot)com
Cc: jgdr(at)dalibo(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, sawada(dot)mshk(at)gmail(dot)com, peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2020-04-01 05:39:22
Message-ID: 20200401.143922.231086326869486790.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 31 Mar 2020 18:01:36 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in
> I noticed some other things:
>
> 1. KeepLogSeg sends a warning message when slots fall behind. To do
> this, it searches for "the most affected slot", that is, the slot that
> lost the most data. But it seems to me that that's a bit pointless; if
> a slot lost data, it's now useless and anything that was using that slot must
> be recreated. If you only know what's the most affected slot, it's not
> possible to see which *other* slots are affected. It doesn't matter if
> the slot missed one segment or twenty segments or 9999 segments -- the
> slot is now useless, or it is not useless. I think we should list the
> slot that was *least* affected, i.e., the slot that lost the minimum
> amount of segments; then the user knows that all slots that are older
> than that one are *also* affected.

Mmm. The v17-0001 patch [1] showed it as the following:

> WARNING: some replication slots have lost required WAL segments
> DETAIL: Slot s1 lost 8 segment(s).
> WARNING: some replication slots have lost required WAL segments
> DETAIL: Slots s1, s2, s3 lost at most 9 segment(s).

And it was removed following the comment at [2] :p

I restored the feature in a simpler shape in v22.

[1] https://www.postgresql.org/message-id/flat/20191224.212614.633369820509385571.horikyota.ntt%40gmail.com#cbc193425b95edd166a5c6d42fd579c6
[2] https://www.postgresql.org/message-id/20200123.212854.658794168913258596.horikyota.ntt%40gmail.com
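
For illustration only -- this is a rough sketch, not the v22 code -- a report
following your suggestion of naming the *least* affected slot could look like
the following; ReportLostSlots() and oldestSegNo are made-up names:

/*
 * Sketch: warn about slots that lost required WAL, naming the slot that
 * lost the fewest segments so the user knows every older slot is also
 * affected.  Illustrative only.
 */
static void
ReportLostSlots(XLogSegNo oldestSegNo)
{
    ReplicationSlot *least_affected = NULL;
    XLogSegNo   least_lost = 0;
    int         i;

    LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
    for (i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
        XLogSegNo   slotSegNo;

        if (!s->in_use || XLogRecPtrIsInvalid(s->data.restart_lsn))
            continue;

        XLByteToSeg(s->data.restart_lsn, slotSegNo, wal_segment_size);

        /* skip slots that have not lost anything */
        if (slotSegNo >= oldestSegNo)
            continue;

        /* remember the slot that lost the fewest segments */
        if (least_affected == NULL ||
            oldestSegNo - slotSegNo < least_lost)
        {
            least_affected = s;
            least_lost = oldestSegNo - slotSegNo;
        }
    }
    LWLockRelease(ReplicationSlotControlLock);

    if (least_affected != NULL)
        ereport(WARNING,
                (errmsg("replication slot \"%s\" and all slots further behind have lost required WAL segments",
                        NameStr(least_affected->data.name))));
}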

> 2. KeepLogSeg ignores slots that are active. I guess the logic here is
> that if a slot is active, then it'll keep going until it catches up and
> we don't need to do anything about the used disk space. But that seems
> a false premise, because if a standby is so slow that it cannot keep up,
> it will eventually run the master out of diskspace even if it's active
> all the time. So I'm not seeing the reasoning that makes it useful to
> skip checking active slots.

Right. I unconsciously assumed synchronous replication. That skipping
should be removed. Fixed.
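
Roughly, the adjusted calculation in KeepLogSeg() works like the sketch
below (illustrative, not the exact patch hunk): the slot minimum is taken
over all in-use slots, active or not, and then clamped by
max_slot_wal_keep_size. Here "segno" and "logSegNo" are the variables the
existing xlog.c code already uses for the current and the returned segment
number.

/* Sketch only: slot minimum over all slots, then the keep-size clamp. */
XLogRecPtr  keep = XLogGetReplicationSlotMinimumLSN();

if (keep != InvalidXLogRecPtr)
{
    XLogSegNo   slotSegNo;

    XLByteToSeg(keep, slotSegNo, wal_segment_size);

    /* enforce max_slot_wal_keep_size even for active slots */
    if (max_slot_wal_keep_size_mb >= 0)
    {
        XLogSegNo   limitSegs;

        limitSegs = ConvertToXSegs(max_slot_wal_keep_size_mb,
                                   wal_segment_size);
        if (segno > limitSegs && slotSegNo < segno - limitSegs)
            slotSegNo = segno - limitSegs;  /* slots fell too far behind */
    }

    if (slotSegNo < *logSegNo)
        *logSegNo = slotSegNo;
}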

> (BTW I don't think you need to keep that many static variables in that
> function. Just the slot name should be sufficient, I think ... or maybe
> even the *pointer* to the slot that was last reported.

Agreed. Fixed.

> I think if a slot is behind and it lost segments, we should kill the
> walsender that's using it, and unreserve the segments. So maybe
> something like

At Tue, 31 Mar 2020 19:07:49 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in
> > I think we should kill(SIGTERM) the walsender using the slot (slot->active_pid),
> > then acquire the slot and set it to some state indicating that it is now
> > useless, no longer reserving WAL; so when the walsender is restarted, it
> > will find the slot cannot be used any longer.
>
> Ah, I see ioguix already pointed this out and the response was that the
> walsender stops by itself. Hmm. I suppose this works too ... it seems
> a bit fragile, but maybe I'm too sensitive. Do we have other opinions
> on this point?

Yes, the check is performed after every block read, so the walsender
doesn't send a wrong record. Patch 0002 added the same check on a
per-record basis, so it can be considered redundant. But I think things
get simpler by killing such walsenders rather than relying on that
subtle condition.

In the attached version, 0002 is removed and the walsender-kill code is added.
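
The added path behaves roughly like the sketch below (illustrative only,
not the exact patch text; KillLaggingWalSenders and oldestKeptSegNo are
made-up names): a walsender whose slot's restart_lsn has fallen behind the
oldest segment we keep is sent SIGTERM, so it cannot keep streaming from
removed and possibly recycled segments.

/* Sketch: terminate walsenders whose slots fell behind the kept range. */
static void
KillLaggingWalSenders(XLogSegNo oldestKeptSegNo)
{
    int         i;

    LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
    for (i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
        XLogSegNo   slotSegNo;
        pid_t       active_pid;

        if (!s->in_use || XLogRecPtrIsInvalid(s->data.restart_lsn))
            continue;

        XLByteToSeg(s->data.restart_lsn, slotSegNo, wal_segment_size);
        if (slotSegNo >= oldestKeptSegNo)
            continue;           /* still within the kept range */

        SpinLockAcquire(&s->mutex);
        active_pid = s->active_pid;
        SpinLockRelease(&s->mutex);

        if (active_pid != 0)
        {
            ereport(LOG,
                    (errmsg("terminating walsender %d because replication slot \"%s\" is too far behind",
                            (int) active_pid, NameStr(s->data.name))));
            (void) kill(active_pid, SIGTERM);
        }
    }
    LWLockRelease(ReplicationSlotControlLock);
}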

> I sense some attempt to salvage slots that are reading a segment that is
> "outdated" and removed, but for which the walsender has an open file
> descriptor. (This appears to be the "losing" state.) This seems
> dangerous, for example the segment might be recycled and is being
> overwritten with different data. Trying to keep track of that seems
> doomed. And even if the walsender can still read that data, it's only a
> matter of time before the next segment is also removed. So keeping the
> walsender alive is futile; it only delays the inevitable.

Agreed.

The attached is v22, only one patch file.

- 0002 is removed

- I didn't add an "unknown" status to wal_status, because it is quite
hard to explain it reasonably. Instead, I added the following comment
(a rough sketch of the status classification follows after this list).

+ * Find the oldest extant segment file. We get 1 until checkpoint removes
+ * the first WAL segment file since startup, which causes the status to be
+ * wrong under certain abnormal conditions, but that does no actual harm.

- Changed the message in KeepLogSeg as described above.

- Don't ignore active slots in KeepLogSeg.

- Out-of-sync walsenders are killed immediately.
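
As mentioned in the list above, the wal_status value can be derived from a
slot's restart_lsn roughly as in the sketch below (illustrative only;
WalStatusForSlot, oldestSegNo and oldestKeepSegNo are made-up names, and
the actual status strings in the patch may differ):

/*
 * Sketch: classify a slot's WAL availability.  oldestSegNo is the oldest
 * extant segment file (see the comment quoted above); oldestKeepSegNo is
 * the oldest segment the next checkpoint will keep.
 */
static const char *
WalStatusForSlot(XLogRecPtr restart_lsn,
                 XLogSegNo oldestSegNo,
                 XLogSegNo oldestKeepSegNo)
{
    XLogSegNo   slotSegNo;

    if (XLogRecPtrIsInvalid(restart_lsn))
        return "unreserved";    /* slot reserves no WAL at all */

    XLByteToSeg(restart_lsn, slotSegNo, wal_segment_size);

    if (slotSegNo < oldestSegNo)
        return "lost";          /* required segments already removed */
    if (slotSegNo < oldestKeepSegNo)
        return "losing";        /* will be removed by the next checkpoint */
    return "normal";            /* within max_slot_wal_keep_size */
}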

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v22-0001-Add-WAL-relief-vent-for-replication-slots.patch text/x-patch 33.6 KB
