Re: Replication slot stats misgivings

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-03-22 02:56:19
Message-ID: CAA4eK1LNCBCsB9b6zXPFnqySZxfaeVjqACs5TgJUAoCCyKRA-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 22, 2021 at 3:10 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2021-03-21 16:08:00 +0530, Amit Kapila wrote:
> > On Sun, Mar 21, 2021 at 2:57 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > On 2021-03-20 10:28:06 +0530, Amit Kapila wrote:
> > > > On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > This idea is worth exploring to address the complaints but what do we
> > > > > do when we detect that the stats are from the different slot? It has
> > > > > mixed of stats from the old and new slot. We need to probably reset it
> > > > > after we detect that.
> > > > >
> > > >
> > > > What if the user created a slot with the same name after dropping the
> > > > slot and it has used the same index. I think chances are less but
> > > > still a possibility, but maybe that is okay.
> > > >
> > > > > What if after some frequency (say whenever we
> > > > > run out of indexes) we check whether the slots we are maintaining is
> > > > > pgstat.c have some stale slot entry (entry exists but the actual slot
> > > > > is dropped)?
> > > > >
> > > >
> > > > A similar drawback (the user created a slot with the same name after
> > > > dropping it) exists with this as well.
> > >
> > > pgstat_report_replslot_drop() already prevents that, no?
> > >
> >
> > Yeah, normally it would prevent that but what if a drop message is lost?
>
> That already exists as a danger, no? pgstat_recv_replslot() uses
> pgstat_replslot_index() to find the slot by name. So if a drop message
> is lost we'd potentially accumulate into stats of an older slot. It'd
> probably a lower risk with what I suggested, because the initial stat
> report slot.c would use something like pgstat_report_replslot_create(),
> which the stats collector can use to reset the stats to 0?
>

okay, but I guess if we miss the create message as well then we will
have a similar danger. I think the benefit your idea will bring is to
use index-based lookup instead of name-based lookup. IIRC, we have
initially used the name here because we thought there is nothing like
OID for slots but your suggestion of using
ReplicationSlotCtl->replication_slots can address that.

> If we do it right the lossiness will be removed via shared memory stats
> patch...
>

Okay.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-03-22 03:01:21 Re: Type of wait events WalReceiverWaitStart and WalSenderWaitForWAL
Previous Message Tatsuo Ishii 2021-03-22 02:36:27 Re: Why logical replication lancher exits 1?