Re: PG DOCS - logical replication filtering

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PG DOCS - logical replication filtering
Date: 2022-04-06 10:57:49
Message-ID: CAA4eK1JdwQQsxa+zpsBW5rCxEfXopYx381nwcCyeXk6mpF8ChQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 24, 2022 at 11:48 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>

Review comments:
===============
1.
+ The <literal>WHERE</literal> clause expression is evaluated with the same
+ role used for the replication connection. i.e. the role specified in the
+ <literal>CONNECTION</literal> clause of the <xref
linkend="sql-createsubscription"/>.

Can we use () around i.e. sentence? It looks bit odd to me now.
The <literal>WHERE</literal> clause expression is evaluated with the
same role used for the replication connection (i.e., the role
specified in the <literal>CONNECTION</literal> clause of the <xref
linkend="sql-createsubscription"/>).

2.
+ <para>
+ Whenever an <command>UPDATE</command> is processed, the row filter
+ expression is evaluated for both the old and new row (i.e. before
+ and after the data is updated).

Can we write the below part slightly differently?
Before:
(i.e. before and after the data is updated).
After:
(i.e the data before and after the update).

3.
+ <para>
+ Whenever an <command>UPDATE</command> is processed, the row filter
+ expression is evaluated for both the old and new row (i.e. before
+ and after the data is updated).
+ </para>
+
+ <para>
+ If both evaluations are <literal>true</literal>, it replicates the
+ <command>UPDATE</command> change.
+ </para>
+
+ <para>
+ If both evaluations are <literal>false</literal>, it doesn't replicate
+ the change.
+ </para>

I think we can combine these three separate paragraphs.

4.
+ <para>
+ If the publication contains a partitioned table, the publication parameter
+ <literal>publish_via_partition_root</literal> determines which row filter
+ is used.
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ If <literal>publish_via_partition_root</literal> is
<literal>false</literal>
+ (default), each <emphasis>partition's</emphasis> row filter is used.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ If <literal>publish_via_partition_root</literal> is
<literal>true</literal>,
+ the <emphasis>root partitioned table's</emphasis> row filter is used.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </para>

I think we can combine this into single para as Euler had in his version.

5.
+ <note>
+ <para>
+ Publication <literal>publish</literal> operations are ignored
when copying pre-existing table data.
+ </para>
+ </note>
+
+ <note>
+ <para>
+ If the subscriber is in a release prior to 15, copy pre-existing data
+ doesn't use row filters even if they are defined in the publication.
+ This is because old releases can only copy the entire table data.
+ </para>
+ </note>

I don't see the need for the first <note> here, the second one seems
to convey it.

6.
+ <para>
+ Create some tables to be used in the following examples.
+<programlisting>
+testpub=# CREATE TABLE t1(a int, b int, c text, primary key(a,c));
+CREATE TABLE
+testpub=# CREATE TABLE t2(d int primary key, e int, f int);
+CREATE TABLE
+testpub=# CREATE TABLE t3(g int primary key, h int, i int);
+CREATE TABLE
+</programlisting>
+ </para>
+
+ <para>
+ Create some publications.
+ </para>
+ <para>
+ - notice publication <literal>p1</literal> has 1 table with a row filter.
+ </para>
+ <para>
+ - notice publication <literal>p2</literal> has 2 tables, one without a row
+ filter, and one with a row filter.
+ </para>
+ <para>
+ - notice publication <literal>p3</literal> has 2 tables, both
with row filters.
+<programlisting>
+testpub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a > 5 AND c = 'NSW');
+CREATE PUBLICATION
+testpub=# CREATE PUBLICATION p2 FOR TABLE t1, t2 WHERE (e = 99);
+CREATE PUBLICATION
+testpub=# CREATE PUBLICATION p3 FOR TABLE t2 WHERE (d = 10), t3 WHERE (g = 10);
+CREATE PUBLICATION
+</programlisting>
+ </para>

I think it is better to use the corresponding content from Euler's version.

7.
+ <para>
+ The PSQL command <command>\d</command> shows what publications the table is
+ a member of, as well as that table's row filter expression (if defined) in
+ those publications.
+ </para>
+ <para>
+ - notice table <literal>t1</literal> is a member of 2 publications, but
+ has a row filter only in <literal>p1</literal>.
+ </para>
+ <para>
+ - notice table <literal>t2</literal> is a member of 2 publications, and
+ has a different row filter in each of them.

This looks unnecessary to me. Let's remove this part.

8.
+ <para>
+ - notice that only the rows satisfying the <literal>t1 WHERE</literal>
+ clause of publication <literal>p1</literal> are replicated.

Again, it is better to use Euler's version for this and at the place,
he had in his version. Similarly, adjust other notices if any like
this one.

9. I suggest adding an example for partition tables showing how the
row filter is used based on the 'publish_via_partition_root'
parameter.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2022-04-06 10:59:18 Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file
Previous Message Daniel Gustafsson 2022-04-06 10:53:37 Re: OpenSSL deprectation warnings in Postgres 10-13