Re: Review for GetWALAvailability()

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: alvherre(at)2ndquadrant(dot)com
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Review for GetWALAvailability()
Date: 2020-06-17 01:17:07
Message-ID: 20200617.101707.1735599255100002667.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 16 Jun 2020 14:31:43 -0400, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in
> On 2020-Jun-16, Kyotaro Horiguchi wrote:
>
> > I noticed another issue. If some required WAL files are removed, the
> > slot will be "invalidated", that is, restart_lsn is set to an invalid
> > value. As a result, we hardly ever see the "lost" state.
> >
> > It can be "fixed" by remembering the validity of a slot separately
> > from restart_lsn. Is that worth doing?
>
> We discussed this before. I agree it would be better to do this
> in some way, but I fear that if we do it naively, some code might exist
> that reads the LSN without realizing that it needs to check the validity
> flag first.

Yes, that was my main concern about it. That's error-prone. How about
remembering the LSN at which the invalidation happened? It's safe, since
nothing other than the slot-monitoring functions would look at
last_invalidated_lsn. It can be reset if active_pid is a valid pid.

InvalidateObsoleteReplicationSlots:
...
SpinLockAcquire(&s->mutex);
+ s->data.last_invalidated_lsn = s->data.restart_lsn;
s->data.restart_lsn = InvalidXLogRecPtr;
SpinLockRelease(&s->mutex);
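
To make the idea a bit more concrete, here is a rough standalone sketch, not a patch. It does not use the PostgreSQL headers: MockSlot, MockSlotState and the mock_* functions are hypothetical stand-ins I made up for illustration, locking is omitted, and the reset rule is only one possible reading of "reset if active_pid is a valid pid".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL definitions; assumptions for illustration only. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, simplified slot; the real slot data has more fields. */
typedef struct MockSlot
{
	XLogRecPtr	restart_lsn;
	XLogRecPtr	last_invalidated_lsn;	/* the proposed new field */
	int			active_pid;				/* 0 means no process is using the slot */
} MockSlot;

/* Hypothetical states as seen by a slot-monitoring function. */
typedef enum MockSlotState
{
	MOCK_SLOT_UNUSED,		/* no restart_lsn and never invalidated */
	MOCK_SLOT_HAS_WAL,		/* restart_lsn is valid */
	MOCK_SLOT_LOST			/* invalidated; required WAL was removed */
} MockSlotState;

/* Mirrors the snippet above: remember where the slot was, then invalidate it. */
static void
mock_invalidate_slot(MockSlot *s)
{
	assert(s != NULL);
	s->last_invalidated_lsn = s->restart_lsn;
	s->restart_lsn = InvalidXLogRecPtr;
}

/* One possible reset rule (assumption): forget the mark while a process is active. */
static void
mock_reset_if_active(MockSlot *s)
{
	assert(s != NULL);
	if (s->active_pid != 0)
		s->last_invalidated_lsn = InvalidXLogRecPtr;
}

/* Only monitoring code looks at last_invalidated_lsn. */
static MockSlotState
mock_slot_state(const MockSlot *s)
{
	assert(s != NULL);
	if (s->restart_lsn != InvalidXLogRecPtr)
		return MOCK_SLOT_HAS_WAL;
	if (s->last_invalidated_lsn != InvalidXLogRecPtr)
		return MOCK_SLOT_LOST;
	return MOCK_SLOT_UNUSED;
}
```

The point is that code reading restart_lsn keeps seeing an invalid value after invalidation, so nothing needs to learn about a separate validity flag, while the monitoring side can still tell "lost" apart from "never reserved".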

> On the other hand, maybe this is not a problem in practice, because if
> such a bug occurs, what will happen is that trying to read WAL from such
> a slot will return the error message that the WAL file cannot be found.
> Maybe this is acceptable?

I'm not sure. For my part, the problem with that is that we would need
to look into the server logs to know what is actually going on.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
