Re: Replication slot stats misgivings

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-05-07 00:39:56
Message-ID: CAD21AoDcSPT+H53ZcDcUe5pCGvkZHQv1RuVGen9UKFL3+oyvWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> I started to write this as a reply to
> https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
> but I think it doesn't really fit under that header anymore.
>
> On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> > It does make it easier for the shared memory stats patch, because if
> > there's a fixed number + location, the relevant stats reporting doesn't
> > need to go through a hashtable with the associated locking. I guess
> > that may have colored my perception that it's better to just have a
> > statically sized memory allocation for this. Noteworthy that SLRU stats
> > are done in a fixed size allocation as well...

Through a long discussion, all review comments pointed out here have
been addressed. I summarized that individual review comments are fixed
by which commit to avoid any confusion later.

>
> As part of reviewing the replication slot stats patch I looked at
> replication slot stats a fair bit, and I've a few misgivings. First,
> about the pgstat.c side of things:
>
> - If somehow slot stat drop messages got lost (remember pgstat
> communication is lossy!), we'll just stop maintaining stats for slots
> created later, because there'll eventually be no space for keeping
> stats for another slot.
>
> - If max_replication_slots was lowered between a restart,
> pgstat_read_statfile() will happily write beyond the end of
> replSlotStats.
>
> - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
> think pgstat.c has absolutely no business doing things on that level.
>
> - We do a linear search through all replication slots whenever receiving
> stats for a slot. Even though there'd be a perfectly good index to
> just use all throughout - the slots index itself. It looks to me like
> slots stat reports can be fairly frequent in some workloads, so that
> doesn't seem great.

Fixed by 3fa17d37716.

>
> - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?
>
> - pgstat_report_replslot() already has a lot of stats parameters, it
> seems likely that we'll get more. Seems like we should just use a
> struct of stats updates.

Fixed by cca57c1d9bf.

>
> And then more generally about the feature:
> - If a slot was used to stream out a large amount of changes (say an
> initial data load), but then replication is interrupted before the
> transaction is committed/aborted, stream_bytes will not reflect the
> many gigabytes of data we may have sent.

Fixed by 592f00f8d.

> - I seems weird that we went to the trouble of inventing replication
> slot stats, but then limit them to logical slots, and even there don't
> record the obvious things like the total amount of data sent.

Fixed by f5fc2f5b23d.

>
> I think the best way to address the more fundamental "pgstat related"
> complaints is to change how replication slot stats are
> "addressed". Instead of using the slots name, report stats using the
> index in ReplicationSlotCtl->replication_slots.
>
> That removes the risk of running out of "replication slot stat slots":
> If we loose a drop message, the index eventually will be reused and we
> likely can detect that the stats were for a different slot by comparing
> the slot name.
>
> It also makes it easy to handle the issue of max_replication_slots being
> lowered and there still being stats for a slot - we simply can skip
> restoring that slots data, because we know the relevant slot can't exist
> anymore. And we can make the initial pgstat_report_replslot() during
> slot creation use a
>
> I'm wondering if we should just remove the slot name entirely from the
> pgstat.c side of things, and have pg_stat_get_replication_slots()
> inquire about slots by index as well and get the list of slots to report
> stats for from slot.c infrastructure.

We fixed the problem of "running out of replication slot stat slots"
by using HTAB to store slot stats, see 3fa17d37716. The slot stats
could be orphaned if a slot drop message gets lost. But we constantly
check and remove them in pgstat_vacuum_stat().

For the record, there are two known issues that are unlikely to happen
in practice or don't affect users much: (1) if the messages for
creation and drop slot of the same name get lost and create happens
before (auto)vacuum cleans up the dead slot, the stats will be
accumulated into the old slot, and (2) we could miss the total_txn and
total_bytes updates if logical decoding is interrupted.

For (1), there is an idea of having OIDs for each slot to avoid the
accumulation of stats but that doesn't seem worth doing as in practice
this won't happen frequently. For (2), there are some ideas of
reporting slot stats at releasing slot (by keeping stats in
ReplicationSlot or by using callback) but we decided to go with
reporting slot stats after every stream/spill. Because it covers most
cases in practice and is much simpler than other approaches.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-05-07 01:43:32 Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
Previous Message Jeff Davis 2021-05-07 00:24:25 Re: alter table set TABLE ACCESS METHOD