Re: row filtering for logical replication

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Peter Smith" <smithpb2250(at)gmail(dot)com>
Cc: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Rahila Syed" <rahilasyed90(at)gmail(dot)com>, "Peter Eisentraut" <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "David Steele" <david(at)pgmasters(dot)net>, "Craig Ringer" <craig(at)2ndquadrant(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Amit Langote" <amitlangote09(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-11 19:39:24
Message-ID: 532a18d8-ce90-4444-8570-8a9fcf09f329@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote:
> Hi.
>
> I have been looking at the latest patch set (v16). Below are my review
> comments and some patches.
>
Peter, thanks for your detailed review. Comments are inline.

> 1. Patch 0001 comment - typo
>
> you can optionally filter rows that does not satisfy a WHERE condition
>
> typo: does/does
Fixed.

>
> 2. Patch 0001 comment - typo
>
> The WHERE clause should probably contain only columns that are part of
> the primary key or that are covered by REPLICA IDENTITY. Otherwise,
> and DELETEs won't be replicated.
>
> typo: "Otherwise, and DELETEs" ??
Fixed.

> 3. Patch 0001 comment - typo and clarification
>
> If your publication contains partitioned table, the parameter
> publish_via_partition_root determines if it uses the partition row filter (if
> the parameter is false -- default) or the partitioned table row filter.
>
> Typo: "contains partitioned table" -> "contains a partitioned table"
Fixed.

> Also, perhaps the text "or the partitioned table row filter." should
> say "or the root partitioned table row filter." to disambiguate the
> case where there are more levels of partitions like A->B->C. e.g. What
> filter does C use?
I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
root partitioned table is used. We should improve that sentence too.

> 4. src/backend/catalog/pg_publication.c - misleading names
>
> -publication_add_relation(Oid pubid, Relation targetrel,
> +publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
> bool if_not_exists)
>
> Leaving this parameter name as "targetrel" seems a bit misleading now
> in the function code. Maybe this should be called something like "pri"
> which is consistent with other places where you have declared
> PublicationRelationInfo.
>
> Also, consider declaring some local variables so that the patch may
> have less impact on existing code. e.g.
> Oid relid = pri->relid
> Relation *targetrel = relationinfo->relation
Done.

> 5. src/backend/commands/publicationcmds.c - simplify code
>
> - rels = OpenTableList(stmt->tables);
> + if (stmt->tableAction == DEFELEM_DROP)
> + rels = OpenTableList(stmt->tables, true);
> + else
> + rels = OpenTableList(stmt->tables, false);
>
> Consider writing that code more simply as just:
>
> rels = OpenTableList(stmt->tables, stmt->tableAction == DEFELEM_DROP);
It is not a common pattern to use an expression as a function argument in
Postgres. I prefer to use a variable with a suggestive name.

> 6. src/backend/commands/publicationcmds.c - bug?
>
> - CloseTableList(rels);
> + CloseTableList(rels, false);
> }
>
> Is this a potential bug? When you called OpenTableList the 2nd param
> was maybe true/false, so is it correct to be unconditionally false
> here? I am not sure.
Good catch.

> 7. src/backend/commands/publicationcmds.c - OpenTableList function comment.
>
> * Open relations specified by a RangeVar list.
> + * AlterPublicationStmt->tables has a different list element, hence, is_drop
> + * indicates if it has a RangeVar (true) or PublicationTable (false).
> * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
> * add them to a publication.
>
> I am not sure about this. Should that comment instead say "indicates
> if it has a Relation (true) or PublicationTable (false)"?
Fixed.

> 8. src/backend/commands/publicationcmds.c - OpenTableList
>8

> For some reason it feels kind of clunky to me for this function to be
> processing the list differently according to the 2nd param. e.g. the
> name "is_drop" seems quite unrelated to the function code, and more to
> do with where it was called from. Sorry, I don't have any better ideas
> for improvement atm.
My suggestion is to rename it to "pub_drop_table".

> 9. src/backend/commands/publicationcmds.c - OpenTableList bug?
>8

> I felt maybe this is a possible bug here because there seems no code
> explicitly assigning the whereClause = NULL if "is_drop" is true so
> maybe it can have a garbage value which could cause problems later.
> Maybe this is fixed by using palloc0.
Fixed.

> 10. src/backend/commands/publicationcmds.c - CloseTableList function comment
>8

> Probably the meaning of "is_drop" should be described in this function comment.
Done.

> 11. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry signature.
>
> -static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
> +static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
>
> I see that this function signature is modified but I did not see how
> this parameter refactoring is actually related to the RowFilter patch.
> Perhaps I am mistaken, but IIUC this only changes the relid =
> RelationGetRelid(rel); to be done inside this function instead of
> being done outside by the callers.
It is not critical for this patch so I removed it.

> 12. src/backend/replication/pgoutput/pgoutput.c - missing function comments
>
> The static functions create_estate_for_relation and
> pgoutput_row_filter_prepare_expr probably should be commented.
Done.

> 13. src/backend/replication/pgoutput/pgoutput.c -
> pgoutput_row_filter_prepare_expr function name
>
> +static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode,
> EState *estate);
>
> This function has an unfortunate name with the word "prepare" in it. I
> wonder if a different name can be found for this function to avoid any
> confusion with pgoutput functions (coming soon) which are related to
> the two-phase commit "prepare".
The word "prepare" is related to the executor context. The function name
contains "row_filter" that is sufficient to distinguish it from any other
function whose context is "prepare". I replaced "prepare" with "init".

> 14. src/bin/psql/describe.c
>
> + if (!PQgetisnull(tabres, j, 2))
> + appendPQExpBuffer(&buf, " WHERE (%s)",
> + PQgetvalue(tabres, j, 2));
>
> Because the where-clause value already has enclosing parentheses so
> using " WHERE (%s)" seems overkill here. e.g. you can see the effect
> in your src/test/regress/expected/publication.out file. I think this
> should be changed to " WHERE %s" to give better output.
Peter E suggested that extra parenthesis be added. See 0005 [1].

> 15. src/include/catalog/pg_publication.h - new typedef
>
> +typedef struct PublicationRelationInfo
> +{
> + Oid relid;
> + Relation relation;
> + Node *whereClause;
> +} PublicationRelationInfo;
> +
>
> The new PublicationRelationInfo should also be added
> src/tools/pgindent/typedefs.list
Patches usually don't update typedefs.list. Check src/tools/pgindent/README.

> 16. src/include/nodes/parsenodes.h - new typedef
>
> +typedef struct PublicationTable
> +{
> + NodeTag type;
> + RangeVar *relation; /* relation to be published */
> + Node *whereClause; /* qualifications */
> +} PublicationTable;
>
> The new PublicationTable should also be added src/tools/pgindent/typedefs.list
Idem.

> 17. sql/publication.sql - show more output
>
> +CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1,
> testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
> +RESET client_min_messages;
> +ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000
> AND e < 2000);
> +ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
> +-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another
> WHERE expression)
> +ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300
> AND e < 500);
> +-- fail - functions disallowed
> +ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
> +-- fail - WHERE not allowed in DROP
> +ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
> +\dRp+ testpub5
>
> I felt that it would be better to have a "\dRp+ testpub5" after each
> of the valid ALTER PUBLICATION steps to show the intermediate results
> also; not just the final one at the end.
Done.

> 18. src/test/subscription/t/020_row_filter.pl - rename file
>
> I think this file should be renamed to 021_row_filter.pl as there is
> already an 020 TAP test present.
Done.

> 19. src/test/subscription/t/020_row_filter.pl - test comments
>
> AFAIK the test cases are all OK, but it was really quite hard to
> review these TAP tests to try to determine what the expected results
> should be.
I included your comments but heavily changed it.

> 20. src/test/subscription/t/020_row_filter.pl - missing test case?
>
> There are some partition tests, but I did not see any test that was
> like 3 levels deep like A->B->C, so I was not sure if there is any
> case C would ever make use of the filter of its parent B, or would it
> only use the filter of the root A?
I didn't include it yet. There is an issue with initial synchronization and
partitioned table when you set publish_via_partition_root. I'll start another
thread for this issue.

> 21. src/test/subscription/t/020_row_filter.pl - missing test case?
>
> If the same table is in multiple publications they can each have a row
> filter. And a subscription might subscribe to some but not all of
> those publications. I think this scenario is only partly tested.
8<
> e.g.
> pub_1 has tableX with RowFilter1
> pub_2 has tableX with RowFilter2
>
> Then sub_12 subscribes to pub_1, pub_2
> This is already tested in your TAP test (I think) and it makes sure
> both filters are applied
>
> But if there was also
> pub_3 has tableX with RowFilter3
>
> Then sub_12 still should only be checking the filtered RowFilter1 AND
> RowFilter2 (but NOT row RowFilter3). I think this scenario is not
> tested.
I added a new publication tap_pub_not_used to cover this case.

> POC PATCH FOR PLAN CACHE
> ========================
>
> PSA a POC patch for a plan cache which gets used inside the
> pgoutput_row_filter function instead of calling prepare for every row.
> I think this is implementing something like Andes was suggesting a
> while back [1].
I also had a WIP patch for it (that's very similar to your patch) so I merged
it.

This cache mechanism consists of caching ExprState and avoid calling
pgoutput_row_filter_init_expr() for every single row. Greg N suggested in
another email that tuple table slot should also be cached to avoid a few cycles
too. It is also included in this new patch.

> Measurements with/without this plan cache:
>
> Time spent processing within the pgoutput_row_filter function
> - Data was captured using the same technique as the
> 0002-Measure-row-filter-overhead.patch.
> - Inserted 1000 rows, sampled data for the first 100 times in this function.
> not cached: average ~ 28.48 us
> cached: average ~ 9.75 us
>
> Replication times:
> - Using tables and row filters same as in Onder's commands_to_test_perf.sql [2]
> 100K rows - not cached: ~ 42sec, 43sec, 44sec
> 100K rows - cached: ~ 41sec, 42sec, 42 sec.
>
> There does seem to be a tiny gain achieved by having the plan cache,
> but I think the gain might be a lot less than what people were
> expecting.
I did another measure using as baseline the previous patch (v16).

without cache (v16)
---------------------------

mean: 1.46 us
stddev: 2.13 us
median: 1.39 us
min-max: [0.69 .. 1456.69] us
percentile(99): 3.15 us
mode: 0.91 us

with cache (v18)
-----------------------

mean: 0.63 us
stddev: 1.07 us
median: 0.55 us
min-max: [0.29 .. 844.87] us
percentile(99): 1.38 us
mode: 0.41 us

It represents -57%. It is a really good optimization for just a few extra lines
of code.

[1] https://www.postgresql.org/message-id/57373e8b-1264-cd37-404e-8edbcf7884cc%40enterprisedb.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
v18-0001-Row-filter-for-logical-replication.patch text/x-patch 73.2 KB
v18-0002-Measure-row-filter-overhead.patch text/x-patch 2.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2021-07-11 19:48:26 Re: row filtering for logical replication
Previous Message Soumyadeep Chakraborty 2021-07-11 18:54:23 Re: pg_stats and range statistics