Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-25 04:20:44
Message-ID: CAA4eK1KMmWhRUw-reLxnBw0s40mU8H-oYhapGz_WLD-mb3a7ig@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 24, 2021 at 11:04 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> The current PG docs text for CREATE PUBLICATION (in the v54-0001
> patch) has a part that says
>
> + A nullable column in the <literal>WHERE</literal> clause could cause the
> + expression to evaluate to false; avoid using columns without not-null
> + constraints in the <literal>WHERE</literal> clause.
>
> I felt that the caution to "avoid using" nullable columns is too
> strongly worded. AFAIK nullable columns will work perfectly fine so
> long as you take due care of them in the WHERE clause. In fact, it
> might be very useful sometimes to filter on nullable columns.
>
> Here is a small test example:
>
> // publisher
> test_pub=# create table t1 (id int primary key, msg text null);
> test_pub=# create publication p1 for table t1 where (msg != 'three');
> // subscriber
> test_sub=# create table t1 (id int primary key, msg text null);
> test_sub=# CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost
> dbname=test_pub application_name=sub1' PUBLICATION p1;
>
> // insert some data
> test_pub=# insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'),
> (4, null), (5, 'five');
> test_pub=# select * from t1;
> id | msg
> ----+-------
> 1 | one
> 2 | two
> 3 | three
> 4 |
> 5 | five
> (5 rows)
>
> // data at sub
> test_sub=# select * from t1;
> id | msg
> ----+------
> 1 | one
> 2 | two
> 5 | five
> (3 rows)
>
> Notice that row 4, with the NULL, is also not replicated. But perhaps we
> were expecting it to be replicated (because NULL is not 'three'). To
> get that behavior, simply rewrite the WHERE clause to account for
> nulls.
>
> // truncate both sides
> test_pub=# truncate table t1;
> test_sub=# truncate table t1;
>
> // alter the WHERE clause
> test_pub=# alter publication p1 set table t1 where (msg is null or msg
> != 'three');
>
> // insert data at pub
> test_pub=# insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'),
> (4, null), (5, 'five');
> INSERT 0 5
> test_pub=# select * from t1;
> id | msg
> ----+-------
> 1 | one
> 2 | two
> 3 | three
> 4 |
> 5 | five
> (5 rows)
>
> // data at sub (now it includes row 4)
> test_sub=# select * from t1;
> id | msg
> ----+------
> 1 | one
> 2 | two
> 4 |
> 5 | five
> (4 rows)
>
> ~~
>
> So, IMO the PG docs wording for this part should be relaxed a bit.
>
> e.g.
> BEFORE:
> + A nullable column in the <literal>WHERE</literal> clause could cause the
> + expression to evaluate to false; avoid using columns without not-null
> + constraints in the <literal>WHERE</literal> clause.
> AFTER:
> + A nullable column in the <literal>WHERE</literal> clause could cause the
> + expression to evaluate to false. To avoid unexpected results, any possible
> + null values should be accounted for.
>

Your suggested wording sounds reasonable to me. Euler, others, any thoughts?
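
For anyone following along, the behavior behind this wording is ordinary SQL three-valued logic and can be checked without setting up a publication. The sketch below uses plain SELECTs with illustrative column aliases; it makes no claim about which expression forms the row filter in this patch accepts, so treat the IS DISTINCT FROM line as a general SQL alternative only.

```sql
-- A comparison against NULL yields NULL rather than true,
-- so a WHERE clause treats the row as not matching.
SELECT NULL::text != 'three' AS plain_compare;                        -- NULL

-- Handling the NULL case explicitly makes the expression true.
SELECT (NULL::text IS NULL OR NULL::text != 'three') AS null_aware;   -- true

-- IS DISTINCT FROM treats NULL as an ordinary comparable value.
-- Assumption: shown for plain queries only, not verified as a row filter.
SELECT NULL::text IS DISTINCT FROM 'three' AS distinct_from;          -- true
```

The second form is the one used in the rewritten publication above, which is why row 4 shows up on the subscriber after the ALTER PUBLICATION.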

--
With Regards,
Amit Kapila.
