RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2022-10-14 02:25:58
Message-ID: OSZPR01MB63105BD4B03B3C1FBFCC7E6CFD249@OSZPR01MB6310.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 24, 2022 12:25 AM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
> Hi,
>
> Thanks for the review!
>

Thanks for your reply.

>
> >
> > 1.
> > In FilterOutNotSuitablePathsForReplIdentFull(), is
> > "nonPartialIndexPathList" a
> > good name for the list? Indexes on only expressions are also be filtered.
> >
> > +static List *
> > +FilterOutNotSuitablePathsForReplIdentFull(List *pathlist)
> > +{
> > + ListCell *lc;
> > + List *nonPartialIndexPathList = NIL;
> >
> >
> Yes, true. We only started filtering the non-partial ones first. Now
> changed to *suitableIndexList*, does that look right?
>

That looks ok to me.

>
>
> > 3.
> > It looks we should change the comment for FindReplTupleInLocalRel() in this
> > patch.
> >
> > /*
> > * Try to find a tuple received from the publication side (in
> > 'remoteslot') in
> > * the corresponding local relation using either replica identity index,
> > * primary key or if needed, sequential scan.
> > *
> > * Local tuple, if found, is returned in '*localslot'.
> > */
> > static bool
> > FindReplTupleInLocalRel(EState *estate, Relation localrel,
> >
> >
> I made a small change, just adding "index". Do you expect a larger change?
>
>

I think that's sufficient.

>
>
> > 5.
> > + if (!AttributeNumberIsValid(mainattno))
> > + {
> > + /*
> > + * There are two cases to consider. First, if the
> > index is a primary or
> > + * unique key, we cannot have any indexes with
> > expressions. So, at this
> > + * point we are sure that the index we deal is not
> > these.
> > + */
> > + Assert(RelationGetReplicaIndex(rel) !=
> > RelationGetRelid(idxrel) &&
> > + RelationGetPrimaryKeyIndex(rel) !=
> > RelationGetRelid(idxrel));
> > +
> > + /*
> > + * For a non-primary/unique index with an
> > expression, we are sure that
> > + * the expression cannot be used for replication
> > index search. The
> > + * reason is that we create relevant index paths
> > by providing column
> > + * equalities. And, the planner does not pick
> > expression indexes via
> > + * column equality restrictions in the query.
> > + */
> > + continue;
> > + }
> >
> > Is it possible that it is a usable index with an expression? I think
> > indexes
> > with an expression has been filtered in
> > FilterOutNotSuitablePathsForReplIdentFull(). If it can't be a usable index
> > with
> > an expression, maybe we shouldn't use "continue" here.
> >
>
>
>
> Ok, I think there are some confusing comments in the code, which I updated.
> Also, added one more explicit Assert to make the code a little more
> readable.
>
> We can support indexes involving expressions but not indexes that are only
> consisting of expressions. FilterOutNotSuitablePathsForReplIdentFull()
> filters out the latter, see IndexOnlyOnExpression().
>
> So, for example, if we have an index as below, we are skipping the
> expression while building the index scan keys:
>
> CREATE INDEX people_names ON people (firstname, lastname, (id || '_' ||
> sub_id));
>
> We can consider removing `continue`, but that'd mean we should also adjust
> the following code-block to handle indexprs. To me, that seems like an edge
> case to implement at this point, given such an index is probably not
> common. Do you think should I try to use the indexprs as well while
> building the scan key?
>
> I'm mostly trying to keep the complexity small. If you suggest this
> limitation should be lifted, I can give it a shot. I think the limitation I
> leave here is with a single sentence: *The index on the subscriber can only
> use simple column references. *
>

Thanks for your explanation. I get it and think it's OK.

> > 6.
> > In the following case, I got a result which is different from HEAD, could
> > you
> > please look into it?
> >
> > -- publisher
> > CREATE TABLE test_replica_id_full (x int);
> > ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;
> > CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full;
> >
> > -- subscriber
> > CREATE TABLE test_replica_id_full (x int, y int);
> > CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x,y);
> > CREATE SUBSCRIPTION tap_sub_rep_full_0 CONNECTION 'dbname=postgres
> > port=5432' PUBLICATION tap_pub_rep_full;
> >
> > -- publisher
> > INSERT INTO test_replica_id_full VALUES (1);
> > UPDATE test_replica_id_full SET x = x + 1 WHERE x = 1;
> >
> > The data in subscriber:
> > on HEAD:
> > postgres=# select * from test_replica_id_full ;
> > x | y
> > ---+---
> > 2 |
> > (1 row)
> >
> > After applying the patch:
> > postgres=# select * from test_replica_id_full ;
> > x | y
> > ---+---
> > 1 |
> > (1 row)
> >
> >
> Ops, good catch. it seems we forgot to have:
>
> skey[scankey_attoff].sk_flags |= SK_SEARCHNULL;
>
> On head, the index used for this purpose could only be the primary key or
> unique key on NOT NULL columns. Now, we do allow NULL values, and need to
> search for them. Added that (and your test) to the updated patch.
>
> As a semi-related note, tuples_equal() decides `true` for (NULL = NULL). I
> have not changed that, and it seems right in this context. Do you see any
> issues with that?
>
> Also, I realized that the functions in the execReplication.c expect only
> btree indexes. So, I skipped others as well. If that makes sense, I can
> work on a follow-up patch after we can merge this, to remove some of the
> limitations mentioned here.

Thanks for fixing it and updating the patch, I didn't see any issue about it.

Here are some comments on v17 patch.

1.
-LogicalRepRelMapEntry *
+LogicalRepPartMapEntry *
logicalrep_partition_open(LogicalRepRelMapEntry *root,
Relation partrel, AttrMap *map)
{

Is there any reason to change the return type of logicalrep_partition_open()? It
seems ok without this change.

2.

+ * of the relation cache entry (e.g., such as ANALYZE or
+ * CREATE/DROP index on the relation).

"e.g." and "such as" mean the same. I think we remove one of them.

3.
+$node_subscriber->poll_query_until(
+ 'postgres', q{select (idx_scan = 2) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+) or die "Timed out while waiting for'check subscriber tap_sub_rep_full deletes one row via index";
+

+$node_subscriber->poll_query_until(
+ 'postgres', q{select (idx_scan = 1) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idy';}
+) or die "Timed out while waiting for'check subscriber tap_sub_rep_full deletes one row via index";

"for'check" -> "for check"

3.
+$node_subscriber->safe_psql('postgres',
+ "SELECT pg_reload_conf();");
+
+# Testcase start: SUBSCRIPTION BEHAVIOR WITH ENABLE_INDEXSCAN
+# ====================================================================
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
+

"Testcase start" in the comment should be "Testcase end".

4.
There seems to be a problem in the following scenario, which results in
inconsistent data between publisher and subscriber.

-- publisher
CREATE TABLE test_replica_id_full (x int, y int);
ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;
CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full;

-- subscriber
CREATE TABLE test_replica_id_full (x int, y int);
CREATE UNIQUE INDEX test_replica_id_full_idx ON test_replica_id_full(x);
CREATE SUBSCRIPTION tap_sub_rep_full_0 CONNECTION 'dbname=postgres port=5432' PUBLICATION tap_pub_rep_full;

-- publisher
INSERT INTO test_replica_id_full VALUES (NULL,1);
INSERT INTO test_replica_id_full VALUES (NULL,2);
INSERT INTO test_replica_id_full VALUES (NULL,3);
update test_replica_id_full SET x=1 where y=2;

The data in publisher:
postgres=# select * from test_replica_id_full order by y;
x | y
---+---
| 1
1 | 2
| 3
(3 rows)

The data in subscriber:
postgres=# select * from test_replica_id_full order by y;
x | y
---+---
| 2
1 | 2
| 3
(3 rows)

There is no such problem on master branch.

Regards,
Shi yu

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-10-14 02:52:11 Re: create subscription - improved warning message
Previous Message Peter Smith 2022-10-14 02:15:58 Re: GUC values - recommended way to declare the C variables?