Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-10 04:04:35
Message-ID: CAHut+Ps_RMwV=ZOG4D4Q=Qm5idytEdNFn7EONPhK08Azi4yFsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for patch v39-0001 (mostly the test code).

======
src/backend/replication/logical/relation.c

1. FindUsableIndexForReplicaIdentityFull

+static Oid
+FindUsableIndexForReplicaIdentityFull(Relation localrel)
+{
+ List *indexlist = RelationGetIndexList(localrel);
+ Oid usableIndex = InvalidOid;
+ ListCell *lc;
+
+ foreach(lc, indexlist)
+ {
+ Oid idxoid = lfirst_oid(lc);
+ bool isUsableIndex;
+ Relation indexRelation = index_open(idxoid, AccessShareLock);
+ IndexInfo *indexInfo = BuildIndexInfo(indexRelation);
+
+ isUsableIndex = IsIndexUsableForReplicaIdentityFull(indexInfo);
+
+ index_close(indexRelation, AccessShareLock);
+
+ if (isUsableIndex)
+ {
+ /* we found one eligible index, don't need to continue */
+ usableIndex = idxoid;
+ break;
+ }
+ }
+
+ return usableIndex;
+}

This comment is not functional -- if you prefer the code as-is, then
ignore this comment.

But, personally I would:
- Move some of that code from the declarations. I feel it would be
better if the index_open/index_close were both in the code-body
instead of half in declarations and half not.
- Remove the 'usableIndex' variable, and just return directly.
- Shorten all the long names (and use consistent 'idx' instead of
sometimes 'idx' and sometimes 'index')

SUGGESTION (YMMV)

static Oid
FindUsableIndexForReplicaIdentityFull(Relation localrel)
{
List *idxlist = RelationGetIndexList(localrel);
ListCell *lc;

foreach(lc, idxlist)
{
Oid idxoid = lfirst_oid(lc);
bool isUsableIdx;
Relation idxRel;
IndexInfo *idxInfo;

idxRel = index_open(idxoid, AccessShareLock);
idxInfo = BuildIndexInfo(idxRel);
isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
index_close(idxRel, AccessShareLock);

/* Return the first eligible index found */
if (isUsableIdx)
return idxoid;
}

return InvalidOid;
}

======
.../subscription/t/032_subscribe_use_index.pl

2. SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES

2a.
# Testcase start: SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
#
# This test ensures that after CREATE INDEX, the subscriber can automatically
# use one of the indexes (provided that it fulfils the requirements).
# Similarly, after DROP index, the subscriber can automatically switch to
# sequential scan

The last sentence is missing full-stop.

~

2b.
# now, create index and see that the index is used
$node_subscriber->safe_psql('postgres',
"CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x)");

Don't say "and see that the index is used". Yes, that is what this
whole test is doing, but it is not what the psql following this
comment is doing.

~

2c.
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x = 15;");
$node_publisher->wait_for_catchup($appname);

# wait until the index is used on the subscriber

The double blank lines here should be single.

~

2d.
# now, the update could either use the test_replica_id_full_idx or
# test_replica_id_full_idy index it is not possible for user to control
# which index to use

This sentence should break at "it".

Aso "user" --> "the user"

SUGGESTION
# now, the update could either use the test_replica_id_full_idx or
# test_replica_id_full_idy index; it is not possible for the user to control
# which index to use

~

2e.
# let's also test dropping test_replica_id_full_idy and
# hence use test_replica_id_full_idx

I think you ought to have dropped the other (first) index because we
already know that the first index had been used (from earlier), but we
are not 100% sure if the 'y' index has been chosen yet.

~~~~

3. SUBSCRIPTION USES INDEX WITH MULTIPLE ROWS AND COLUMNS

3a.
# deletes 20 rows
$node_publisher->safe_psql('postgres',
"DELETE FROM test_replica_id_full WHERE x IN (5, 6);");

# updates 20 rows
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2);");

"deletes" --> "delete"

"updates" --> "update"

~~~

4. SUBSCRIPTION USES INDEX WITH DROPPED COLUMNS

# updates 200 rows
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x IN (5, 6);");

"updates" --> "update"

"200 rows" ??? is that right -- 20 maybe ???

~~~

5. SUBSCRIPTION USES INDEX ON PARTITIONED TABLES

5a.
# updates rows and moves between partitions
$node_publisher->safe_psql('postgres',
"UPDATE users_table_part SET value_1 = 0 WHERE user_id = 4;");

"updates rows and moves between partitions" --> "update rows, moving
them to other partitions"

~

5b.
# deletes rows from different partitions

"deletes" --> "delete"

~~~

6. SUBSCRIPTION DOES NOT USE INDEXES WITH ONLY EXPRESSIONS OR PARTIAL INDEX

6a.
# update 2 rows
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname = 'first_name_1';");
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname =
'first_name_2' AND lastname = 'last_name_2';");

IMO 'Nan' seemed a curious name to assign as a test value, because it
seems like it has a special meaning but in reality, I don't think it
does. Even 'xxx' would be better.

~

6b.
# make sure none of the indexes is not used on the subscriber
$result = $node_subscriber->safe_psql('postgres',
"select sum(idx_scan) from pg_stat_all_indexes where indexrelname
IN ('people_names_expr_only', 'people_names_partial')");
is($result, qq(0), 'ensure subscriber tap_sub_rep_full updates two
rows via seq. scan with index on expressions');

~

Looks like a typo in this comment: "none of the indexes is not used" ???

~~~

7. SUBSCRIPTION CAN USE INDEXES WITH EXPRESSIONS AND COLUMNS

7a.
# update 2 rows
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname = 'first_name_1';");
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname =
'first_name_3' AND lastname = 'last_name_3';");

Same as #6a, 'Nan' seems like a strange test value to assign to a name.

~~~

8. Some NULL values

$node_publisher->safe_psql('postgres',
"INSERT INTO test_replica_id_full VALUES (1), (2), (3);");
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x = 1;");
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x = 3;");

~

8a.
For some reason, this insert/update psql was not commented.

~

8b.
Maybe this test data could be more obvious by explicitly inserting the NULLs?

~~~

9. Unique index that is not primary key or replica identity

9a.
Why are other "Testcase start" comments all uppercase but not this one?

~~~

10. SUBSCRIPTION USES INDEX WITH PUB/SUB different data

Why is there a mixed case in this "Test case:" comment?

~~~

11. PUBLICATION LACKS THE COLUMN ON THE SUBS INDEX

11a.
# The subsriber has an additional column compared to publisher,
# and the index is on that column. We still pick the index scan
# on the subscriber even though it is practically similar to
# sequential scan

Typo "subsriber"

Missing full-stop on last sentence.

~

11b.
# make sure that the subscriber has the correct data
# we only deleted 1 row
$result = $node_subscriber->safe_psql('postgres',
"SELECT sum(x) FROM test_replica_id_full");
is($result, qq(232), 'ensure subscriber has the correct data at the
end of the test');

Why does that say "deleted 1 row", when the previous operation was not a DELETE?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-03-10 04:24:13 Re: Combine pg_walinspect till_end_of_wal functions with others
Previous Message Michael Paquier 2023-03-10 03:58:38 Re: Combine pg_walinspect till_end_of_wal functions with others