Re: Physical replication slot advance is not persistent

From: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, andres(at)anarazel(dot)de, dim(at)tapoueh(dot)org, pgsql-hackers(at)postgresql(dot)org, simon(at)2ndquadrant(dot)com
Subject: Re: Physical replication slot advance is not persistent
Date: 2020-06-18 18:46:28
Message-ID: 0a14a258081aea401222498d402d129b@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-06-16 10:27, Michael Paquier wrote:
> On Wed, Jun 10, 2020 at 08:57:17PM +0300, Alexey Kondratov wrote:
>> New test reproduces this issue well. Left it running for a couple of
>> hours
>> in repeat and it seems to be stable.
>
> Thanks for testing. I have been thinking about the minimum xmin and
> LSN computations on advancing, and actually I have switched the
> recomputing to be called at the end of pg_replication_slot_advance().
> This may be a waste if no advancing is done, but it could also be an
> advantage to enforce a recalculation of the thresholds for each
> function call. And that's more consistent with the slot copy, drop
> and creation.
>

Sorry for a bit late response, but I see a couple of issues with this
modified version of the patch in addition to the waste call if no
advancing is done, mentioned by you:

1. Both ReplicationSlotsComputeRequiredXmin() and
ReplicationSlotsComputeRequiredLSN() may have already been done in the
LogicalConfirmReceivedLocation() if it was a logical slot. It may be
fine and almost costless to do it twice, but it looks untidy for me.

2. It seems that we do not need ReplicationSlotsComputeRequiredXmin() at
all if it was a physical slot, since we do not modify xmin in the
pg_physical_replication_slot_advance(), doesn't it?

That's why I wanted (somewhere around v5 of the patch in this thread) to
move all dirtying and recomputing calls to the places, where xmin / lsn
slot modifications are actually done —
pg_physical_replication_slot_advance() and
LogicalConfirmReceivedLocation(). LogicalConfirmReceivedLocation()
already does this, so we only needed to teach
pg_physical_replication_slot_advance() to do the same.

However, just noted that LogicalConfirmReceivedLocation() only does
ReplicationSlotsComputeRequiredLSN() if updated_xmin flag was set, which
looks wrong from my perspective, since updated_xmin and updated_restart
flags are set separately.

That way, I would solve this all as per attached, which works well for
me, but definitely worth of a better testing.

Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Attachment Content-Type Size
slots-advancing.diff text/x-diff 1.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2020-06-18 20:00:05 re: BUG #14053: postgres crashes in plpgsql under load of concurrent transactions
Previous Message Andres Freund 2020-06-18 18:30:41 Re: global barrier & atomics in signal handlers (Re: Atomic operations within spinlocks)