Re: Copy function for logical replication slots

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Copy function for logical replication slots
Date: 2018-08-28 07:14:04
Message-ID: CAD21AoDPwxBEyv1d7dvihY=bQ9zqFdputbkfOidMCe60T6+nfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 20, 2018 at 2:53 PM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Tue, Aug 14, 2018 at 01:38:23PM +0900, Masahiko Sawada wrote:
>> On Thu, Jul 12, 2018 at 9:28 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> Attached new version of patch incorporated the all comments I got from
>>> Michael-san.
>>>
>>> To prevent the WAL segment file of restart_lsn of the origin slot from
>>> removal during creating the target slot, I've chosen a way to copy new
>>> one while holding the origin one. One problem to implement this way is
>>> that the current replication slot code doesn't allow us to do
>>> straightforwardly; the code assumes that the process creating a new
>>> slot is not holding another slot. So I've changed the copy functions
>>> so that it save temporarily MyReplicationSlot and then restore and
>>> release it after creation the target slot. To ensure that both the
>>> origin and target slot are released properly in failure cases I used
>>> PG_ENSURE_ERROR_CLEANUP(). That way, we can keep the code changes of
>>> the logical decoding at a minimum. I've thought that we can change the
>>> logical decoding code so that it can assumes that the process can have
>>> more than one slots at once but it seems overkill to me.
>
> Yeah, we may be able to live with this trick. For other processes, the
> process doing the copy would be seen as holding the slot so the
> checkpointer would not advance the oldest LSN to retain.
>
>> The previous patch conflicts with the current HEAD. Attached updated
>> version patch.

Thank you for reviewing this patch.

>
> +-- Now we have maximum 8 replication slots. Check slots are properly
> +-- released even when raise error during creating the target slot.
> +SELECT 'copy' FROM pg_copy_logical_replication_slot('orig_slot1',
> 'failed'); -- error
> +ERROR: all replication slots are in use
>
> installcheck is going to fail on an instance which does not use exactly
> max_replication_slots = 8. That lacks flexibility, and you could have
> the same coverage by copying, then immediately dropping each new slot.

Will fix.

> + to a physical replicaiton slot named <parameter>dst_slot_name</parameter>.
> s/replicaiton/replicaton/.
>
> + The copied physical slot starts to reserve WAL from the same
> <acronym>LSN</acronym> as the
> + source slot if the source slot already reserves WAL.
> Or restricting this case? In what is it useful to allow a copy from a
> slot which has done nothing yet? This would also simplify the acquire
> and release logic of the source slot.
>

I think the copying from a slot that already reserved WAL would be
helpful for backup cases (maybe you suggested?). Also, either way we
need to make a safe logic of acquring and releasing the source slot
for logical slots cases. Or you meant to restrict the case where the
copying a slot that doesn't reserve WAL?

> + /* Check type of replication slot */
> + if (SlotIsLogical(MyReplicationSlot))
> + {
> + ReplicationSlotRelease();
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + (errmsg("cannot copy a logical replication slot to a
> physical replication slot"))));
> + }
> No need to call ReplicationSlotRelease for an ERROR code path.
>

Will fix.

> Does it actually make sense to allow copy of temporary slots or change
> their persistence? Those don't live across sessions so they'd need to
> be copied in the same session which created them.

I think the copying of temporary slots would be an impracticable
feature but the changing their persistence might be helpful for some
cases, especially copying from persistent to temporary.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-08-28 07:52:13 Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
Previous Message Fabien COELHO 2018-08-28 06:58:59 Re: some more error location support