Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2022-08-06 13:01:08
Message-ID: CAA4eK1KdXirEBrEhU38AHnMpa1NKMM40TBQ7qPdveJ4BS5KtDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 5, 2022 at 7:25 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
>>
>> Why can't it be used to sync the other tables if any?
>
>
> It can be used. But I thought it would be better not to, for example in the following case:
> Let's say a sync worker starts with a table in INIT state. The worker creates a new replication slot to sync that table.
> When sync of the table is completed, it will move to the next one. This time the new table may be in FINISHEDCOPY state, so the worker may need to use the new table's existing replication slot.
> Before the worker will move to the next table again, there will be two replication slots used by the worker. We might want to keep one and drop the other.
> At this point, I thought it would be better to keep the replication slot created by this worker in the first place. I think it's easier to track slots this way since we know how to generate the rep slots name.
> Otherwise we would need to store the replication slot name somewhere too.
>

I think there is some basic flaw in slot reuse design. Currently, we
copy the table by starting a repeatable read transaction (BEGIN READ
ONLY ISOLATION LEVEL REPEATABLE READ) and create a slot that
establishes a snapshot which is first used for copy and then LSN
returned by it is used in the catchup phase after the copy is done.
The patch won't establish such a snapshot before a table copy as it
won't create a slot each time. If this understanding is correct, I
think we need to use ExportSnapshot/ImportSnapshot functionality to
achieve it or do something else to avoid the problem mentioned.

>
>>
>> This sounds reasonable. Let's do this unless we get some better idea.
>
>
>> There is no such restriction that origins should belong to only one
>> table. What makes you think like that?
>
>
> I did not reuse origins since I didn't think it would significantly improve the performance as reusing replication slots does.
> So I just kept the origins as they were, even if it was possible to reuse them. Does that make sense?
>

For small tables, it could have a visible performance difference as it
involves database write operations to each time create and drop the
origin. But if we don't want to reuse then also you need to set its
origin_lsn appropriately. Currently (without this patch), after
creating the slot, we directly use the origin_lsn returned by
create_slot API whereas now it won't be the same case as the patch
doesn't create a slot every time.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhang Mingli 2022-08-06 13:01:27 Re: [Code Comments]enum COPY_NEW_FE is removed
Previous Message Michael Paquier 2022-08-06 12:54:36 Re: Fix inconsistencies GUC categories