Replication slot stats misgivings

From: Andres Freund <andres(at)anarazel(dot)de>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Replication slot stats misgivings
Date: 2021-03-19 18:52:47
Message-ID: 20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I started to write this as a reply to
https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
but I think it doesn't really fit under that header anymore.

On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> It does make it easier for the shared memory stats patch, because if
> there's a fixed number + location, the relevant stats reporting doesn't
> need to go through a hashtable with the associated locking. I guess
> that may have colored my perception that it's better to just have a
> statically sized memory allocation for this. Noteworthy that SLRU stats
> are done in a fixed size allocation as well...

As part of reviewing the replication slot stats patch I looked at
replication slot stats a fair bit, and I've a few misgivings. First,
about the pgstat.c side of things:

- If somehow slot stat drop messages got lost (remember pgstat
communication is lossy!), we'll just stop maintaining stats for slots
created later, because there'll eventually be no space for keeping
stats for another slot.

- If max_replication_slots was lowered between a restart,
pgstat_read_statfile() will happily write beyond the end of
replSlotStats.

- pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
think pgstat.c has absolutely no business doing things on that level.

- We do a linear search through all replication slots whenever receiving
stats for a slot. Even though there'd be a perfectly good index to
just use all throughout - the slots index itself. It looks to me like
slots stat reports can be fairly frequent in some workloads, so that
doesn't seem great.

- PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just NameData?

- pgstat_report_replslot() already has a lot of stats parameters, it
seems likely that we'll get more. Seems like we should just use a
struct of stats updates.

And then more generally about the feature:
- If a slot was used to stream out a large amount of changes (say an
initial data load), but then replication is interrupted before the
transaction is committed/aborted, stream_bytes will not reflect the
many gigabytes of data we may have sent.
- I seems weird that we went to the trouble of inventing replication
slot stats, but then limit them to logical slots, and even there don't
record the obvious things like the total amount of data sent.

I think the best way to address the more fundamental "pgstat related"
complaints is to change how replication slot stats are
"addressed". Instead of using the slots name, report stats using the
index in ReplicationSlotCtl->replication_slots.

That removes the risk of running out of "replication slot stat slots":
If we loose a drop message, the index eventually will be reused and we
likely can detect that the stats were for a different slot by comparing
the slot name.

It also makes it easy to handle the issue of max_replication_slots being
lowered and there still being stats for a slot - we simply can skip
restoring that slots data, because we know the relevant slot can't exist
anymore. And we can make the initial pgstat_report_replslot() during
slot creation use a

I'm wondering if we should just remove the slot name entirely from the
pgstat.c side of things, and have pg_stat_get_replication_slots()
inquire about slots by index as well and get the list of slots to report
stats for from slot.c infrastructure.

Greetings,

Andres Freund

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-03-19 19:00:47 Re: Do we work with LLVM 12 on s390x?
Previous Message Matthias van de Meent 2021-03-19 18:42:08 Re: non-HOT update not looking at FSM for large tuple update