Re: Replication slot drop message is sent after pgstats shutdown.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication slot drop message is sent after pgstats shutdown.
Date: 2021-08-31 03:45:30
Message-ID: 20210831034530.notpifrvnlouhoti@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2021-08-31 11:37:08 +0900, Masahiko Sawada wrote:
> At step #2, wal sender waits for another transaction started at step
> #1 to complete after creating the replication slot. When the server is
> stopping, wal sender process drops the slot on releasing the slot
> since it's still RS_EPHEMERAL. Then, after dropping the slot we report
> the message for dropping the slot (see ReplicationSlotDropPtr()).
> These are executed in ReplicationSlotRelease() called by ProcKill()
> which is called during calling on_shmem_exit callbacks, which is after
> shutting down pgstats during before_shmem_exit callbacks. I’ve not
> tested yet but I think this can potentially happen also when dropping
> a temporary slot. ProcKill() also calls ReplicationSlotCleanup() to
> clean up temporary slots.
>
> There are some ideas to fix this issue but I don’t think it’s a good
> idea to move either ProcKill() or the slot releasing code to
> before_shmem_exit in this case, like we did for other similar
> issues[1][2].

Yea, that's clearly not an option.

I wonder why the replication slot stuff is in ProcKill() rather than its
own callback. That's probably my fault, but I don't remember what led
to that.

> Reporting the slot dropping message on dropping the slot
> isn’t necessarily essential actually since autovacuums periodically
> check already-dropped slots and report to drop the stats. So another
> idea would be to move pgstat_report_replslot_drop() to a higher layer
> such as ReplicationSlotDrop() and ReplicationSlotsDropDBSlots() that
> are not called during callbacks. The replication slot stats are
> dropped when it’s dropped via commands such as
> pg_drop_replication_slot() and DROP_REPLICATION_SLOT. On the other
> hand, for temporary slots and ephemeral slots, we rely on autovacuums
> to drop their stats. Even if we delay dropping the stats for those
> slots, pg_stat_replication_slots doesn’t show the stats for
> already-dropped slots.

Yea, we could do that, but I think it'd be nicer to find a somewhat
more principled solution...

Perhaps moving the replication slot handling out of ProcKill() and into
its own before_shmem_exit() callback would do the trick?
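
To make the ordering argument concrete, here is a rough toy model of the
two exit phases. It is not PostgreSQL code: every toy_* name and
simulate_exit() are made up for illustration. The only things it assumes
about the real machinery are that before_shmem_exit callbacks run before
on_shmem_exit callbacks, and that each list runs in reverse order of
registration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of the toy_* names below exist in PostgreSQL. */
#define MAX_CALLBACKS 8

typedef void (*exit_callback)(void);

static exit_callback before_list[MAX_CALLBACKS];
static int before_count = 0;
static exit_callback on_list[MAX_CALLBACKS];
static int on_count = 0;

static bool stats_alive = true;     /* stand-in for "pgstats not yet shut down" */
static int reports_delivered = 0;   /* slot-drop reports that reached stats */

/* Register a callback for the first exit phase. */
static void toy_before_shmem_exit(exit_callback cb)
{
    assert(before_count < MAX_CALLBACKS);
    before_list[before_count++] = cb;
}

/* Register a callback for the second exit phase. */
static void toy_on_shmem_exit(exit_callback cb)
{
    assert(on_count < MAX_CALLBACKS);
    on_list[on_count++] = cb;
}

/* Run both phases, each in reverse order of registration. */
static void toy_shmem_exit(void)
{
    while (before_count > 0)
        before_list[--before_count]();
    while (on_count > 0)
        on_list[--on_count]();
}

/* Stand-in for the stats shutdown done in the first phase. */
static void toy_pgstat_shutdown(void)
{
    stats_alive = false;
}

/* Stand-in for slot release/cleanup, which reports the slot drop. */
static void toy_slot_cleanup(void)
{
    if (stats_alive)
        reports_delivered++;
    /* otherwise the report is issued after stats shutdown: the bug */
}

/*
 * Simulate one backend exit and return how many drop reports were
 * delivered while stats were still alive.
 */
static int simulate_exit(bool slot_cleanup_in_before_phase)
{
    before_count = 0;
    on_count = 0;
    stats_alive = true;
    reports_delivered = 0;

    /* Stats shutdown is registered early, in the first phase. */
    toy_before_shmem_exit(toy_pgstat_shutdown);

    if (slot_cleanup_in_before_phase)
        toy_before_shmem_exit(toy_slot_cleanup);  /* registered later, runs earlier */
    else
        toy_on_shmem_exit(toy_slot_cleanup);      /* like being part of ProcKill() */

    toy_shmem_exit();
    return reports_delivered;
}
```

In this model simulate_exit(false) returns 0, because the cleanup runs
in the second phase after stats are gone, while simulate_exit(true)
returns 1. Note that the second case only works because the slot
callback is registered after the stats shutdown callback; a real patch
would have to make sure of that registration order.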

Greetings,

Andres Freund
