Re: Replication slot stats misgivings

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-03-23 06:09:23
Message-ID: CAA4eK1+GW98sKkooQT1en4EU=Gaugg-5Yi57FNDmvsY60McAFw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 22, 2021 at 12:20 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > - If max_replication_slots was lowered between a restart,
> > > pgstat_read_statfile() will happily write beyond the end of
> > > replSlotStats.
> >
> > I think we cannot restart the server after lowering
> > max_replication_slots to a value less than the number of replication
> > slots actually created on the server. No?
>
> This problem happens in the case where max_replication_slots is
> lowered and there still are stats for a slot.
>

I think this can happen only if the drop message is lost, right?

> I understood the risk of running out of replSlotStats. If we use the
> index in replSlotStats instead, IIUC we need to somehow synchronize
> the indexes in between replSlotStats and
> ReplicationSlotCtl->replication_slots. The order of replSlotStats is
> preserved across restarting whereas the order of
> ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by
> StartupReplicationSlots() doesn’t guarantee the order of the returned
> entries in the directory). Maybe we can compare the slot name in the
> received message to the name in the element of replSlotStats. If they
> don’t match, we swap entries in replSlotStats to synchronize the index
> of the replication slot in ReplicationSlotCtl->replication_slots and
> replSlotStats. If we cannot find the entry in replSlotStats that has
> the name in the received message, it probably means either it's a new
> slot or the previous create message was dropped, so we can create new
> stats for the slot. Is that what you mean, Andres?
>
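
For illustration only, the swap scheme described in the quoted text could
be sketched roughly as below. This is a simplified stand-alone model, not
the actual pgstat code: SlotStats, MAX_SLOTS, NAME_LEN and
sync_slot_stats() are hypothetical names, and an empty name is assumed to
mark an unused entry.

```c
#include <assert.h>
#include <string.h>

#define NAME_LEN  64   /* hypothetical stand-in for NAMEDATALEN */
#define MAX_SLOTS 4    /* hypothetical stand-in for max_replication_slots */

/* Hypothetical, simplified stats entry; not the real stats struct. */
typedef struct SlotStats
{
    char name[NAME_LEN];   /* empty string means "unused entry" */
    long spill_txns;
} SlotStats;

static SlotStats stats[MAX_SLOTS];

/*
 * Make stats[idx] the entry for the slot called "name", where idx is the
 * slot's index in the slot array.  Returns the entry, or NULL if idx is
 * out of range.
 */
static SlotStats *
sync_slot_stats(int idx, const char *name)
{
    int found = -1;
    int free_idx = -1;

    assert(name != NULL && name[0] != '\0');

    if (idx < 0 || idx >= MAX_SLOTS)
        return NULL;
    if (strcmp(stats[idx].name, name) == 0)
        return &stats[idx];          /* already in sync */

    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (strcmp(stats[i].name, name) == 0)
            found = i;
        else if (stats[i].name[0] == '\0' && free_idx < 0)
            free_idx = i;
    }

    if (found >= 0)
    {
        /* Entry exists at a different index: swap it into place. */
        SlotStats tmp = stats[idx];

        stats[idx] = stats[found];
        stats[found] = tmp;
        return &stats[idx];
    }

    /*
     * No entry with this name: a new slot, or the create message was lost.
     * Move the current occupant to a free entry if there is one; if there
     * is none, the occupant's stats are overwritten and lost.
     */
    if (stats[idx].name[0] != '\0' && free_idx >= 0)
        stats[free_idx] = stats[idx];

    memset(&stats[idx], 0, sizeof(SlotStats));
    strncpy(stats[idx].name, name, NAME_LEN - 1);
    return &stats[idx];
}
```

Note that the sketch has no good answer when the array is full of entries
for slots that no longer exist; it simply overwrites one of them.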

I wonder how, in this scheme, we would remove the risk of running out
of 'replSlotStats' and still restore the correct stats if the drop
message is lost. Do we want to check, after restoring each slot's info,
whether a slot with that name exists?
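
To make the question concrete, a per-entry check at restore time might
look something like the sketch below. It is a stand-alone model under
assumed names (SlotStats, slot_exists(), restore_slot_stats() are all
hypothetical), not the real pgstat_read_statfile(); the "live" array
stands in for a lookup in ReplicationSlotCtl->replication_slots.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 64   /* hypothetical stand-in for NAMEDATALEN */

/* Hypothetical, simplified stats entry; not the real stats struct. */
typedef struct SlotStats
{
    char name[NAME_LEN];
    long spill_txns;
} SlotStats;

/* Hypothetical lookup: does a slot with this name currently exist? */
static int
slot_exists(const char *name, const char *const *live, size_t nlive)
{
    for (size_t i = 0; i < nlive; i++)
    {
        if (strcmp(live[i], name) == 0)
            return 1;
    }
    return 0;
}

/*
 * Copy saved entries into dst (capacity cap), skipping entries whose slot
 * no longer exists.  Returns the number of entries restored and never
 * writes past dst[cap - 1].
 */
static size_t
restore_slot_stats(const SlotStats *saved, size_t nsaved,
                   SlotStats *dst, size_t cap,
                   const char *const *live, size_t nlive)
{
    size_t n = 0;

    for (size_t i = 0; i < nsaved; i++)
    {
        /* Stale entry, e.g. the drop message was lost: do not restore. */
        if (!slot_exists(saved[i].name, live, nlive))
            continue;

        /* Bounds check: stop instead of writing beyond the array. */
        if (n >= cap)
            break;

        dst[n++] = saved[i];
    }

    assert(n <= cap);
    return n;
}
```

With a check of this kind, stale entries would be dropped at restore time
and would not count against the array size, though entries beyond the
capacity would still be silently discarded.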

--
With Regards,
Amit Kapila.
