Re: Synchronizing slots from primary to standby

From: Andres Freund <andres(at)anarazel(dot)de>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2022-02-07 20:32:22
Message-ID: 20220207203222.22aktwxrt3fcllru@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-02-07 13:38:38 +0530, Ashutosh Sharma wrote:
> Are you talking about this scenario - what if the logical replication
> slot on the publisher is dropped, but is being referenced by the
> standby where the slot is synchronized?

It's a bit hard to say, because I haven't found a clear description, either
in this thread or in the patch, of what the syncing needs to guarantee and
what it tries to guarantee. It might be that this was discussed in one of the
precursor threads, but...

Generally I don't think we can permit scenarios where a slot can be in a
"corrupt" state, i.e. missing required catalog entries, after "normal"
administrative commands (i.e. not mucking around in catalog entries / on-disk
files). Even if the sequence of commands may be a bit weird. All such cases
need to be either prevented or detected.

As far as I can tell, the way this patch keeps slots on physical replicas
"valid" is solely by reorderbuffer.c blocking during replay via
wait_for_standby_confirmation().

This means that if, e.g., the standby_slot_names GUC differs from
synchronize_slot_names on the physical replica, the slots synchronized on the
physical replica are not going to be valid. The same is true if the primary
drops its logical slots.

> Should the redo function for the drop replication slot have the capability
> to drop it on standby and its subscribers (if any) as well?

Slots are not WAL logged (and shouldn't be).

I think you pretty much need the recovery conflict handling infrastructure I
referenced upthread, which recognizes during replay if a record has a conflict
with a slot on a standby. And then on top of that you can build something like
this patch.
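
To make "recognizes during replay if a record has a conflict with a slot"
concrete, here is a rough toy model in Python. It is only a sketch, not
PostgreSQL code. It assumes that a conflict means a replayed record removes
catalog rows that a slot may still need, i.e. the record's removal horizon is
at or past the slot's catalog_xmin. The names Slot, WalRecord,
conflict_horizon, conflicts() and replay() are all made up for illustration,
and xids are plain integers, so wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a logical slot that exists on a standby."""
    name: str
    catalog_xmin: Optional[int]  # oldest xid whose catalog rows are still needed
    invalidated: bool = False


@dataclass
class WalRecord:
    """Hypothetical stand-in for a WAL record being replayed."""
    lsn: int
    # Newest xid whose catalog rows this record removes; None if it removes none.
    conflict_horizon: Optional[int]


def conflicts(slot: Slot, record: WalRecord) -> bool:
    """Assumed rule: conflict if the record removes rows the slot may still need."""
    if slot.invalidated:
        return False
    if slot.catalog_xmin is None or record.conflict_horizon is None:
        return False
    # Toy comparison on plain integers; real xids would need wraparound-aware logic.
    return slot.catalog_xmin <= record.conflict_horizon


def replay(record: WalRecord, slots: List[Slot]) -> List[str]:
    """Detect conflicts before applying the record; return the slots invalidated."""
    hit = []
    for slot in slots:
        if conflicts(slot, record):
            # The slot is marked as unusable instead of being left silently corrupt.
            slot.invalidated = True
            hit.append(slot.name)
    return hit
```

The point of the sketch is only the ordering: the check happens as part of
replay itself, so a slot is either still valid or explicitly marked as
invalid, never quietly missing catalog entries it needs.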

Greetings,

Andres Freund
