Re: Logical replication - schema change not invalidating the relation cache

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Logical replication - schema change not invalidating the relation cache
Date: 2023-01-05 12:02:12
Message-ID: CALDaNm3LPJR_4QiWoVF-SdqrEA6P82to+U1c2=x6-4bAPT4mQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 5 Jan 2023 at 03:17, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> vignesh C <vignesh21(at)gmail(dot)com> writes:
> > [ v3-0001-Fix-for-invalidating-logical-replication-relation.patch ]
>
> (btw, please don't send multiple patch versions with the same number,
> it's very confusing.)

Since it was just rebasing on top of HEAD, I did not change the
version, I will take care of this point in the later versions.

> I looked briefly at this patch. I wonder why you wrote a whole new
> callback function instead of just using rel_sync_cache_publication_cb
> for this case too.

Yes we can use rel_sync_cache_publication_cb itself for the
invalidation of the relations, I have changed it.

> The bigger picture here though is that in examples such as the one
> you gave at the top of the thread, it's not very clear to me that
> there's *any* principled behavior. If the connection between publisher
> and subscriber tables is only the relation name, fine ... but exactly
> which relation name applies? If you've got a transaction that is both
> inserting some data and renaming the table, it's really debatable which
> insertions should be sent under which name(s). So how much should we
> actually care about such cases? Do we really want to force a cache flush
> any time somebody changes a (possibly unrelated) pg_namespace entry?
> We could be giving up significant performance and not accomplishing much
> except changing from one odd behavior to a different one.

The connection between publisher and subscriber table is based on
relation id, During the first change relid, relname and schema name
from publisher will be sent to the subscriber. Subscriber stores these
id, relname and schema name in the LogicalRepRelMap hash for which
relation id is the key. Subsequent data received in the subscriber
will use the relation id received from the publisher and apply the
changes in the subscriber.
The problem does not stop even after the transaction that renames the
schema is completed(Step3 in first mail). Even after the transaction
is completed i.e after Step 3 the inserts of sch1.t1 and sch2.t1 both
get replicated to sch1.t1 in the subscriber side. This happens because
the publisher id's of sch2.t1 and sch1.t1 are mapped to sch1.t1 in the
subscriber side and both inserts are successful.
Step4) In Publisher
postgres=# insert into sch2.t1 values(11);
INSERT 0 1
postgres=# insert into sch1.t1 values(12);
INSERT 0 1

Step5) In Subscriber
postgres=# select * from sch1.t1;
c1
----
11
12
(2 rows)

During the sch1.t1 first insertion the relid, relname and schema name
from publisher will be sent to the subscriber, this entry will be
mapped to sch1.t1 in subscriber side and any insert from the publisher
will insert to sch1.t1.
After the rename of schema(relid will not be changed) since this entry
is not invalidated, even though we are inserting to sch2.t1 as the
relid is not changed, subscriber will continue to insert into sch1.t1
in subscriber.
During the first insert of new table sch1.t1, the relid, relname and
schema name from publisher will be sent to the subscriber, this entry
will be again mapped to sch1.t1 in the subscriber side.
Since both the entries sch1.t1 and sch2.t1 are mapped to sch1.t1 in
the subscriber side, both inserts will insert to the same table.
This issue will get fixed if we invalidate the relation and update the
relmap in the subscriber.
I did not like the behavior where any insert on sch1.t1 or sch2.t1
replicates the changes to sch1.t1 in the subscriber. I felt it might
be better to fix this issue. I agree that the invalidations are
costly. If you feel this is a very corner case then we can skip it.

Attached an updated patch.

Regards,
Vignesh

Attachment Content-Type Size
v4-0001-Fix-for-invalidating-logical-replication-relation.patch text/x-patch 4.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2023-01-05 12:11:43 Re: POC: Lock updated tuples in tuple_update() and tuple_delete()
Previous Message Dilip Kumar 2023-01-05 11:53:36 Re: Perform streaming logical transactions by background workers and parallel apply