Quick Links

Re: Physical replication slot advance is not persistent

From:	Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org, michael(at)paquier(dot)xyz, simon(at)2ndquadrant(dot)com
Subject:	Re: Physical replication slot advance is not persistent
Date:	2020-01-16 17:09:09
Message-ID:	298e5b24-2628-3be0-840c-9ed6a45127ed@postgrespro.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 09.01.2020 09:36, Kyotaro Horiguchi wrote:
> Hello.
>
> At Sun, 29 Dec 2019 15:12:16 +0300, Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru> wrote in
>> On 2019-12-26 16:35, Alexey Kondratov wrote:
>>> Another concern is that ReplicationSlotIsDirty is added with the only
>>> one user. It also cannot be used by SaveSlotToPath due to the
>>> simultaneous usage of both flags dirty and just_dirtied there.
>>> In that way, I hope that we should call ReplicationSlotSave
>>> unconditionally in the pg_replication_slot_advance, so slot will be
>>> saved or not automatically based on the slot->dirty flag. In the same
>>> time, ReplicationSlotsComputeRequiredXmin and
>>> ReplicationSlotsComputeRequiredLSN should be called by anyone, who
>>> modifies xmin and LSN fields in the slot. Otherwise, currently we are
>>> getting some leaky abstractions.
> Sounds reasonable.

Great, so it seems that we have reached some agreement about who should
mark slot as dirty, at least for now.

>
>> If someone will utilise old WAL and after that crash will happen
>> between steps 2) and 3), then we start with old value of restart_lsn,
>> but without required WAL. I do not know how to properly reproduce it
>> without gdb and power off, so the chance is pretty low, but still it
>> could be a case.
> In the first place we advance required LSN for every reply message but
> save slot data only at checkpoint on physical repliation. Such a
> strict guarantee seems too much.
>
> ...
>
> I think we shouldn't touch the paths used by replication protocol. And
> don't we focus on how we make a change of a replication slot from SQL
> interface persistent? It seems to me that generaly we don't need to
> save dirty slots other than checkpoint, but the SQL function seems
> wanting the change to be saved immediately.
>
> As the result, please find the attached, which is following only the
> first paragraph cited above.

OK, I have definitely overthought that, thanks. This looks like a
minimal subset of changes that actually solves the bug. I would only
prefer to keep some additional comments (something like the attached),
otherwise after half a year it will be unclear again, why we save slot
unconditionally here.

Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Attachment	Content-Type	Size
v5-0001-Make-physical-slot-advance-to-be-persistent.diff	text/x-patch	1.6 KB

In response to

Re: Physical replication slot advance is not persistent at 2020-01-09 06:36:07 from Kyotaro Horiguchi

Responses

Re: Physical replication slot advance is not persistent at 2020-01-20 06:45:40 from Michael Paquier
Re: Physical replication slot advance is not persistent at 2020-01-21 01:44:12 from Craig Ringer

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Haas	2020-01-16 17:14:55	Re: our checks for read-only queries are not great
Previous Message	Tomas Vondra	2020-01-16 17:04:32	Re: SlabCheck leaks memory into TopMemoryContext