Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-04-05 06:15:50
Message-ID: CAA4eK1JEfUqQbSE31FGwx_BBEdA1_FEWis2o1z4VP2cpp8obZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 5, 2022 at 8:17 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are my comments for the latest patch v6-0002.
>
> PATCH v6-0002 comments
> ======================
>
> 2.1 General - should this be an independent patch?
>

I think both the patches are dependent and might get committed
together if the concept proved to be useful and doesn't have flaws.

> In many ways, I think most of this patch is unrelated to the other
> "local_only" patch (v6-0001).
>
> For example, IIUC even in the current HEAD, we could consider it to be
> a user error if multiple SUBSCRIPTIONS or multiple PUBLICATIONS of the
> same SUBSCRIPTION are replicating to the same TABLE on the same node
> and using "copy_data = on".
>
> So I think it would be ok to throw an ERROR if such a copy_data clash
> is detected, and then the user will have to change to use "copy_data =
> off" for some/all of them to avoid data duplication.
>
> The "local_only" option only adds some small logic to this new ERROR,
> but it's not really a prerequisite at all.
>
> e.g. this whole ERROR part of the patch can be a separate thread.
>
> ~~~
>
> 2.2 General - can we remove the "force" enum?
>
> Now, because I consider the clashing "copy_data = on" ERROR to be a
> user error, I think that is something that the user can already take
> care of themselves just using the copy_data = off.
>
> I did not really like the modifying of the "copy_data" option from
> just boolean to some kind of hybrid boolean + "force".
>
> a) It smells a bit off to me. IMO replication is supposed to end up
> with the same (replicated) data on the standby machine but this
> "force" mode seems to be just helping the user to break that concept
> and say - "I know what I'm doing, and I don't care if I get lots of
> duplicated data in the replica table - just let me do it"...
>
> b) It also somehow feels like the new "force" was introduced mostly to
> make the code ERROR handling implementation simpler, rather than to
> make the life of the end-user better. Yes, if force is removed maybe
> the copy-clash-detection-code will need to be internally quite more
> complex than it is now, but that is how it should be, instead of
> putting extra burden on the user (e.g. by complicating the PG docs and
> giving them yet more alternatives to configure). I think any clashing
> copy_data options really is a user error, but also I think the current
> boolean copy_data true/false already gives the user a way to fix it.
>
> c) Actually, your new error hint messages are similar to my
> perspective: They already say "Use CREATE/ALTER SUBSCRIPTION with
> copy_data = off or force". All I am saying is remove the "force", and
> the user can still fix the problem just by using "copy_data = off"
> appropriately.
>

How will it work if there is more than one table? copy_data = "off"
means it won't copy any of the other tables in the corresponding
publication(s). I think it is primarily to give the user some option
where she understands the logical replication setup better and
understands that this won't be a problem.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2022-04-05 07:08:16 Re: Skipping logical replication transactions on subscriber side
Previous Message Michael Paquier 2022-04-05 06:13:41 Re: [PATCH] Expose port->authn_id to extensions and triggers