Re: row filtering for logical replication

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-17 06:45:58
Message-ID: CAJcOf-dUYB-ohyB8D8iODie6YxWQn3cwvhdAY=55yi5r=FW1ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 17, 2021 at 9:41 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> PSA the v47* patch set.
>

I found that even though there are now separately-maintained WHERE clauses
per pubaction, there still seem to be problems when applying the old/new
row rules for UPDATE.
A simple example of this was previously discussed in [1].
The example is repeated below:

---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');

---- Subscription
create table tbl1 (a int primary key, b int);
create subscription sub connection 'dbname=postgres host=localhost
port=10000' publication A,B;

---- Publication
insert into tbl1 values (1,1);
update tbl1 set a = 2;

So using the v47 patch-set, I still find that the UPDATE above results in
publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
This is according to the 2nd UPDATE rule below, from patch 0003.

+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE

This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)",
but the new row (2,1) does.
This functionality doesn't seem right to me. I don't think it can be
assumed that (1,1) was never published (and thus requires an INSERT rather
than UPDATE) based on these checks, because in this example, (1,1) was
previously published via a different operation - INSERT (and using a
different filter too).
I think the fundamental problem here is that these UPDATE rules assume that
the old (current) row was previously UPDATEd (and published, or not
published, according to the filter applicable to UPDATE), but this is not
necessarily the case.
Or am I missing something?

----
[1]
https://postgr.es/m/CAJcOf-dz0srExG0NPPgXh5X8eL2uxk7C=cZoGTbf8cNqoRUY6w@mail.gmail.com

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-12-17 06:46:02 Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Previous Message Peter Eisentraut 2021-12-17 06:42:58 Re: Emit a warning if the extension's GUC is set incorrectly