Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Ian Lawrence Barwick <barwick(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2022-11-04 02:47:15
Message-ID: CAB8KJ=j=uvrKL+B3oYbU0ft_RDWoDRruPgmpWhnW4u6kaxfyzA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 5, 2022 at 22:55, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
>
> Hi Amit,
>
>> >> Why after step 4, do you need to drop the replication slot? Won't just
>> >> clearing the required info from the catalog be sufficient?
>> >
>> >
>> > The replication slots that we read from the catalog will not be used for anything else once we're done syncing the table that the replication slot belongs to.
>> > The slot is removed from the catalog when the sync is completed, and it basically becomes a slot that is not linked to any table or worker. That's why I think it should be dropped rather than left behind.
>> >
>> > Note that if a worker dies and its replication slot continues to exist, that slot will only be used to complete the sync process of the one table that the dead worker was syncing but couldn't finish.
>> > When that particular table is synced and becomes ready, the replication slot has no use anymore.
>> >
>>
>> Why can't it be used to sync the other tables if any?
>
>
> It can be used. But I thought it would be better not to, for example in the following case:
> Let's say a sync worker starts with a table in INIT state. The worker creates a new replication slot to sync that table.
> When sync of the table is completed, it will move to the next one. This time the new table may be in FINISHEDCOPY state, so the worker may need to use the new table's existing replication slot.
> Before the worker moves on to the next table again, it will be using two replication slots. We might want to keep one and drop the other.
> At this point, I thought it would be better to keep the replication slot created by this worker in the first place. I think it's easier to track slots this way, since we know how to generate the replication slot's name.
> Otherwise we would need to store the replication slot name somewhere too.
>
>
>>
>> This sounds reasonable. Let's do this unless we get some better idea.
>
>
> I updated the patch to use a unique ID for replication slot names and to store the last used ID in the catalog.
> Can you look into it again please?
>
>
>> There is no such restriction that origins should belong to only one
>> table. What makes you think that?
>
>
> I did not reuse origins since I didn't think it would improve performance as significantly as reusing replication slots does.
> So I just kept the origins as they were, even if it was possible to reuse them. Does that make sense?
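
For readers following the slot-naming point above, here is a minimal sketch of the idea of deriving slot names from a per-subscription counter whose last used value is persisted. It is not taken from the patch: the class `SubscriptionCatalogEntry`, the field `last_slot_id`, the function `next_sync_slot_name`, and the name format `pg_<suboid>_sync_<id>` are all hypothetical, chosen only for illustration. The 64-byte limit assumes the default NAMEDATALEN.

```python
from dataclasses import dataclass

# Assumption: default PostgreSQL build, where identifiers (including
# replication slot names) are limited to NAMEDATALEN - 1 = 63 bytes.
NAMEDATALEN = 64


@dataclass
class SubscriptionCatalogEntry:
    """Hypothetical stand-in for the catalog row of one subscription."""
    sub_oid: int
    last_slot_id: int = 0  # last ID handed out; persisted in the catalog


def next_sync_slot_name(entry: SubscriptionCatalogEntry) -> str:
    """Allocate the next unique ID and build a slot name from it.

    Because the name depends only on the subscription OID and the ID,
    it can be regenerated later without storing the name itself.
    """
    entry.last_slot_id += 1
    # Hypothetical format; only lowercase letters, digits and underscores.
    name = f"pg_{entry.sub_oid}_sync_{entry.last_slot_id}"
    if len(name) >= NAMEDATALEN:
        raise ValueError(f"slot name too long: {name}")
    return name


if __name__ == "__main__":
    entry = SubscriptionCatalogEntry(sub_oid=16394)
    print(next_sync_slot_name(entry))
    print(next_sync_slot_name(entry))
```

With a scheme along these lines, only the counter needs to live in the catalog, which matches the stated motivation of not having to store the slot name anywhere.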

Hi

cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.

[1] http://cfbot.cputube.org/patch_40_3784.log

Thanks

Ian Barwick
