Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()

From: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Vyacheslav Makarov <v(dot)makarov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
Date: 2020-06-22 17:18:58
Message-ID: 2cd70350b7ac65dae647d175d62a258c@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-06-19 21:57, Fujii Masao wrote:
> On 2020/06/19 23:20, Alexey Kondratov wrote:
>> On 2020-06-19 03:59, Michael Paquier wrote:
>>> On Thu, Jun 18, 2020 at 03:39:09PM +0300, Vyacheslav Makarov wrote:
>>>> If the WAL segment for the specified restart_lsn (STOP_LSN of the
>>>> backup)
>>>> exists, then the function will create a physical replication slot
>>>> and will
>>>> keep all the WAL segments required by the replica to catch up with
>>>> the
>>>> primary. Otherwise, it returns error, which means that the required
>>>> WAL
>>>> segments have been already utilised, so we do need to take a new
>>>> backup.
>>>> Without passing this newly added parameter
>>>> pg_create_physical_replication_slot() works as before.
>>>>
>>>> What do you think about this?
>
> Currently pg_create_physical_replication_slot() and
> CREATE_REPLICATION_SLOT
> replication command seem to be "idential". So if we add new option into
> one,
> we should add it also into another?
>

I wonder how it could be used via the replication protocol, but probably
this option should be added there as well for consistency.

>
> What happen if future LSN is specified in restart_lsn? With the patch,
> in this case, if the segment at that LSN exists (e.g., because it's
> recycled
> one), the slot seems to be successfully created. However if the LSN is
> far future and the segment doesn't exist, the creation of slot seems to
> fail.
> This behavior looks fragile and confusing. We should accept future LSN
> whether its segment currently exists or not?
>

But what about a possible timeline switch? If we allow specifying it as
further in the future as one wanted, then appropriate segment with
specified LSN may be created in the different timeline if it would be
switched, so it may be misleading. I am not even sure about allowing
future LSN for existing segments, since PITR / timeline switch may occur
just after the slot creation, so the pointer may never be valid. Would
it be better to completely disallow future LSN?

And here I noticed another moment in the patch. TimeLineID of the last
restart/checkpoint is used to detect whether WAL segment file exists or
not. It means that if we try to create a slot just after a timeline
switch, then we could not specify the oldest LSN actually available on
the disk, since it may be from the previous timeline. One can use LSN
only within the current timeline. It seems to be fine, but should be
covered in the docs.

>
> + if (!RecoveryInProgress() && !SlotIsLogical(MyReplicationSlot))
>
> With the patch, the given restart_lsn seems to be ignored during
> recovery.
> Why?
>

I have the same question, not sure that this is needed here. It looks
more like a forgotten copy-paste from ReplicationSlotReserveWal().

>>>
>>> I think that this was discussed in the past (perhaps one of the
>>> threads related to WAL advancing actually?),
>>>
>>
>> I have searched through the archives a bit and found one thread
>> related to slots advancing [1]. It was dedicated to a problem of
>> advancing slots which do not reserve WAL yet, if I get it correctly.
>> Although it is somehow related to the topic, it was a slightly
>> different issue, IMO.
>>
>>>
>>> and this stuff is full of
>>> holes when it comes to think about error handling with checkpoints
>>> running in parallel, potentially doing recycling of segments you
>>> would
>>> expect to be around based on your input value for restart_lsn *while*
>>> pg_create_physical_replication_slot() is still running and
>>> manipulating the on-disk slot information.
>>> ...
>>
>> These are the right concerns, but all of them should be applicable to
>> the pg_create_physical_replication_slot() + immediately_reserve ==
>> true in the same way, doesn't it? I think so, since in that case we
>> are doing a pretty similar thing — trying to reserve some WAL segment
>> that may be concurrently deleted.
>>
>> And this is exactly the reason why ReplicationSlotReserveWal() does it
>> in several steps in a loop:
>>
>> 1. Creates a slot with some restart_lsn.
>> 2. Does ReplicationSlotsComputeRequiredLSN() to prevent removal of the
>> WAL segment with this restart_lsn.
>> 3. Checks that required WAL segment is still there.
>> 4. Repeat if this attempt to prevent WAL removal has failed.
>
> What happens if concurrent checkpoint decides to remove the segment
> at restart_lsn before #2 and then actually removes it after #3?
> The replication slot is successfully created with the given
> restart_lsn,
> but the reserved segment has already been removed?
>

I though about it a bit more and it seems that yes, there is a race even
for a current pg_create_physical_replication_slot() +
immediately_reserve == true, i.e. ReplicationSlotReserveWal(). However,
the chance is very subtle since we take a current GetRedoRecPtr() there.
Probably one could reproduce it with wal_keep_segments = 1 by holding /
releasing backend doing the slot creation and checkpointer with gdb, but
not sure that it is an issue anywhere in the real world.

Maybe I am wrong, but it is not clear for me why current
ReplicationSlotReserveWal() routine does not have that race. I will try
to reproduce it though.

Things get worse when we allow specifying an older LSN, since it has a
higher chances to be at the horizon of deletion by checkpointer. Anyway,
if I get it correctly, with a current patch slot will be created
successfully, but will be obsolete and should be invalidated by the next
checkpoint.

Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2020-06-22 17:30:43 Re: Default setting for enable_hashagg_disk
Previous Message Ranier Vilela 2020-06-22 16:45:38 Re: Parallel Seq Scan vs kernel read ahead