Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-05-26 11:10:22
Message-ID: CAA4eK1KS+pJXnTW9c5Eb+OS=u779crS8nRRZcvAKOH7=j0MYog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 20, 2022 at 3:31 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Wed, May 18, 2022 at 4:22 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > 5.
> > * It is quite possible that subscriber has not yet pulled data to
> > + * the tables, but in ideal cases the table data will be subscribed.
> > + * To keep the code simple it is not checked if the subscriber table
> > + * has pulled the data or not.
> > + */
> > + if (copydata == COPY_DATA_ON && local_only && !slot_attisnull(slot, 3))
> >
> > Sorry, but I don't understand what you intend to say by the above
> > comment. Can you please explain?
>
> When the user specifies copy_data as on, we should check if the
> publisher has the publication tables being subscribed from a remote
> publisher. If so throw an error as it remote origin data present.
> Ex:
> Node1 - pub1 for table t1 -- no data
> Node2 - Sub1 subscribing to data from pub1
> Node2 - pub2 for table t1 -- no data
> Node3 - create subscription to Node2 with copy_data = ON
>
> In this case even though the table does not have any remote origin
> data, as Node2 is subscribing to data from Node1, throw an error.
> We throw an error irrespective of data present in Node1 or not to keep
> the code simple.
>

I think we can change the contents of comments something like: "Throw
an error if the publisher has subscribed to the same table from some
other publisher. We cannot differentiate between the local and
non-local data that is present in the HEAP during the initial sync.
Identification of local data can be done only from the WAL by using
the origin id. XXX: For simplicity, we don't check whether the table
has any data or not. If the table doesn't have any data then we don't
need to distinguish between local and non-local data so we can avoid
throwing error in that case."

Few more comments:
==================
Patch 0002
======
1.
+ if (copydata == COPY_DATA_ON && only_local && !slot_attisnull(slot, 3))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("CREATE/ALTER SUBSCRIPTION with only_local and copy_data as
true is not allowed when the publisher might have replicated data,
table:%s.%s might have replicated data in the publisher",
+ nspname, relname),
+ errhint("Use CREATE/ALTER SUBSCRIPTION with copy_data = off or force"));

Can we split the error message? errmsg: table:%s.%s has replicated
data in the publisher; errdetail:CREATE/ALTER SUBSCRIPTION with
only_local and copy_data as true is not allowed when the publisher has
replicated data

2.
+ <para>
+ Lock the required tables in the new node until the setup is complete.
+ </para>
+ <para>
+ Create subscriptions on existing nodes pointing to publication on
+ the new node with <literal>only_local</literal> option specified as
+ <literal>on</literal> and <literal>copy_data</literal> specified as
+ <literal>on</literal>.
+ </para>
+ <para>
+ Wait for data to be copied from the new node to existing nodes.
+ </para>
+ <para>
+ Alter the publication in new node so that the truncate operation is not
+ replicated to the subscribers.
+ </para>

Here and at other places, we should specify that the lock mode should
to acquire the lock on table should be EXCLUSIVE so that no concurrent
DML is allowed on it. Also, it is better if somewhere we explain why
and which nodes need locks?

Patch 0001:
==========
1.
+$node_A->append_conf(
+ 'postgresql.conf', qq(
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+));

I don't see why you need to set these parameters. There is no test
case that needs these parameters. Please remove these from here and
all other similar places in 032_onlylocal.pl.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Lepikhov 2022-05-26 11:25:52 Compare variables of composite type with slightly different column types
Previous Message Roffild 2022-05-26 09:22:35 postgres and initdb not working inside docker