Re: Replication slot drop message is sent after pgstats shutdown.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication slot drop message is sent after pgstats shutdown.
Date: 2021-08-31 05:22:39
Message-ID: CAD21AoDHQgSpr11epL44qS=u8pJbBpx6G0E8SKewjDJpsEJtnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 31, 2021 at 12:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-08-31 11:37:08 +0900, Masahiko Sawada wrote:
> > At step #2, wal sender waits for another transaction started at step
> > #1 to complete after creating the replication slot. When the server is
> > stopping, wal sender process drops the slot on releasing the slot
> > since it's still RS_EPHEMERAL. Then, after dropping the slot we report
> > the message for dropping the slot (see ReplicationSlotDropPtr()).
> > These are executed in ReplicationSlotRelease() called by ProcKill()
> > which is called during calling on_shmem_exit callbacks, which is after
> > shutting down pgstats during before_shmem_exit callbacks. I’ve not
> > tested yet but I think this can potentially happen also when dropping
> > a temporary slot. ProcKill() also calls ReplicationSlotCleanup() to
> > clean up temporary slots.
> >
> > There are some ideas to fix this issue but I don’t think it’s a good
> > idea to move either ProcKill() or the slot releasing code to
> > before_shmem_exit in this case, like we did for other similar
> > issues[1][2].
>
> Yea, that's clearly not an option.
>
> I wonder why the replication slot stuff is in ProcKill() rather than its
> own callback. That's probably my fault, but I don't remember what lead
> to that.
>
>
> > Reporting the slot dropping message on dropping the slot
> > isn’t necessarily essential actually since autovacuums periodically
> > check already-dropped slots and report to drop the stats. So another
> > idea would be to move pgstat_report_replslot_drop() to a higher layer
> > such as ReplicationSlotDrop() and ReplicationSlotsDropDBSlots() that
> > are not called during callbacks. The replication slot stats are
> > dropped when it’s dropped via commands such as
> > pg_drop_replication_slot() and DROP_REPLICATION_SLOT. On the other
> > hand, for temporary slots and ephemeral slots, we rely on autovacuums
> > to drop their stats. Even if we delay to drop the stats for those
> > slots, pg_stat_replication_slots don’t show the stats for
> > already-dropped slots.
>
> Yea, we could do that, but I think it'd be nicer to find a bit more
> principled solution...
>
> Perhaps moving this stuff out from ProcKill() into its own
> before_shmem_exit() callback would do the trick?

You mean to move only the part of sending the message to its own
before_shmem_exit() callback? or move ReplicationSlotRelease() and
ReplicationSlotCleanup() from ProcKill() to it?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo NAGATA 2021-08-31 05:28:35 Re: Fix around conn_duration in pgbench
Previous Message Greg Nancarrow 2021-08-31 05:20:31 Re: Added schema level support for publication.