RE: row filtering for logical replication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: row filtering for logical replication
Date: 2022-01-13 13:16:33
Message-ID: OS0PR01MB5716EBC728C5087DC80AA05A94539@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Thursday, January 13, 2022 6:23 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Jan 12, 2022 at 7:19 PM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Wed, Jan 12, 2022 5:38 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > Attach the v63 patch set which include the following changes.
> >

Thanks for the comments!

> Few comments:
> =============
> 1.
> +
> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>prqual</structfield> <type>pg_node_tree</type>
> + </para>
> + <para>Expression tree (in <function>nodeToString()</function>
> + representation) for the relation's qualifying condition</para></entry>
> + </row>
>
> Let's slightly modify this as: "Expression tree (in
> <function>nodeToString()</function> representation) for the relation's
> qualifying condition. Null if there is no qualifying condition."

Changed.

> 2.
> + A <literal>WHERE</literal> clause allows simple expressions. The simple
> + expression cannot contain any aggregate or window functions, non-immutable
> + functions, user-defined types, operators or functions.
>
> This part in the docs should be updated to say something similar to what we
> have in the commit message for this part or maybe additionally in some way we
> can say which other forms of expressions are not allowed.

Temporarily used the description from the commit message.

> 3.
> + for which the <replaceable class="parameter">expression</replaceable> returns
> + false or null will not be published.
> + If the subscription has several publications in which
> + the same table has been published with different
> + <literal>WHERE</literal>
>
> In the above text line spacing appears a bit odd to me. There doesn't seem to be
> a need for extra space after line-2 and line-3 in above-quoted text.

I adjusted these text lines.
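
For readers following along, the rule in the doc text quoted in comment 3 (a row is not published when the expression returns false or null) can be sketched roughly as below. This is only an illustration, not code from the patch: FilterResult and row_is_published are hypothetical names made up for this sketch.

```c
#include <stdbool.h>

/*
 * Hypothetical three-valued result of evaluating a row filter expression.
 * These names are assumptions for illustration; they do not come from the patch.
 */
typedef enum FilterResult
{
	FILTER_FALSE,
	FILTER_TRUE,
	FILTER_NULL
} FilterResult;

/*
 * Returns true only when the filter evaluated to true. Both false and
 * null mean the row is not published.
 */
bool
row_is_published(FilterResult result)
{
	return result == FILTER_TRUE;
}
```

The point of the sketch is that null is treated the same as false, so a filter that references a column holding NULL drops the row rather than publishing it.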

> 4.
> /*
> + * Return the relid of the topmost ancestor that is published via this
>
> We normally seem to use Returns in similar places.

Changed.

>
> 6.
> +static void
> +transformPubWhereClauses(List *tables, const char *queryString)
>
> To keep the function naming similar to other nearby functions, it is better to
> name this as TransformPubWhereClauses.

Changed.

> 7. In AlterPublicationTables(), won't it better if we
> transformPubWhereClauses() after
> CheckObjSchemaNotAlreadyInPublication() to avoid extra processing in case of
> errors.

Changed.

> 8.
> + /*
> + * Check if the relation is member of the existing schema in the
> + * publication or member of the schema list specified.
> + */
> CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
> PUBLICATIONOBJ_TABLE);
>
> I don't see the above comment addition has anything to do with this patch. Can
> we remove it?

Removed.

> 9.
> CheckCmdReplicaIdentity(Relation rel, CmdType cmd) {
> PublicationActions *pubactions;
> + AttrNumber bad_rfcolnum;
>
> /* We only need to do checks for UPDATE and DELETE. */
> if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
> return;
>
> + if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL) return;
> +
> + /*
> + * It is only safe to execute UPDATE/DELETE when all columns referenced in
> + * the row filters from publications which the relation is in are valid -
> + * i.e. when all referenced columns are part of REPLICA IDENTITY, or the
> + * table does not publish UPDATES or DELETES.
> + */
> + bad_rfcolnum = GetRelationPublicationInfo(rel, true);
>
> Can we name this variable as invalid_rf_column?

Changed.
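
As a rough illustration of the check described in the comment quoted in item 9, here is a minimal standalone sketch. It is not the patch's implementation: ColNum, first_invalid_rf_column and command_allowed are hypothetical names, columns are modelled as plain integer arrays, and using 0 to mean "no invalid column" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an attribute number; 0 means "no invalid column". */
typedef int ColNum;

/*
 * Returns the first row-filter column that is not part of the replica
 * identity, or 0 if every referenced column is covered.
 */
ColNum
first_invalid_rf_column(const ColNum *rf_cols, size_t n_rf,
						const ColNum *ri_cols, size_t n_ri)
{
	assert(rf_cols != NULL || n_rf == 0);
	assert(ri_cols != NULL || n_ri == 0);

	for (size_t i = 0; i < n_rf; i++)
	{
		bool		found = false;

		for (size_t j = 0; j < n_ri; j++)
		{
			if (rf_cols[i] == ri_cols[j])
			{
				found = true;
				break;
			}
		}
		if (!found)
			return rf_cols[i];
	}
	return 0;
}

/*
 * Returns true if the command may proceed. Follows the order of the quoted
 * checks: commands other than UPDATE/DELETE and REPLICA IDENTITY FULL both
 * return early, before any row filter columns are examined.
 */
bool
command_allowed(bool is_update_or_delete, bool replident_full,
				const ColNum *rf_cols, size_t n_rf,
				const ColNum *ri_cols, size_t n_ri)
{
	if (!is_update_or_delete)
		return true;
	if (replident_full)
		return true;
	return first_invalid_rf_column(rf_cols, n_rf, ri_cols, n_ri) == 0;
}
```

With a row filter on columns {1, 3} and a replica identity covering {1, 2}, first_invalid_rf_column returns 3, so an UPDATE or DELETE would be rejected unless the replica identity is FULL.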

Attached is the v64 patch set, which addresses Alvaro's, Amit's and Peter's comments.

The new version of the patch also includes some other changes:
- Fix a table sync bug[1] by using the SQL suggested by Tang[1].
- Adjust the row filter initialization code related to FOR ALL TABLES IN SCHEMA to
make sure it gets the correct row filter.
- Update the documentation.
- Rebase the patch on top of recent commit 025b92.

[1] https://www.postgresql.org/message-id/OS0PR01MB6113BB510435B16E9F0B2A59FB519%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Best regards,
Hou zj

Attachments:
- v64-0002-Row-filter-tab-auto-complete-and-pgdump.patch (application/octet-stream, 5.8 KB)
- v64-0001-Allow-specifying-row-filter-for-logical-replication-.patch (application/octet-stream, 145.5 KB)
