RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2022-10-20 02:37:43
Message-ID: OSZPR01MB63109FF52120649EECACC7AEFD2A9@OSZPR01MB6310.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 19, 2022 12:05 AM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>
> Attached v19.
>

Thanks for updating the patch. Here are some comments on v19.

1.
In execReplication.c:

+ TypeCacheEntry **eq = NULL; /* only used when the index is not unique */

Maybe the comment here should be changed. Now it is used when the index is not
primary key or replica identity index.

2.
+# wait until the index is created
+$node_subscriber->poll_query_until(
+ 'postgres', q{select count(*)=1 from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates one row via index";

The message doesn't seem right, should it be changed to "Timed out while
waiting for creating index test_replica_id_full_idx"?

3.
+# now, ingest more data and create index on column y which has higher cardinality
+# then create an index on column y so that future commands uses the index on column
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT 50, i FROM generate_series(0,3100)i;");

The comment say "create (an) index on column y" twice, maybe it can be changed
to:

now, ingest more data and create index on column y which has higher cardinality,
so that future commands will use the index on column y

4.
+# deletes 200 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6);");
+
+# wait until the index is used on the subscriber
+$node_subscriber->poll_query_until(
+ 'postgres', q{select (idx_scan = 200) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates 200 rows via index";

It would be better to call wait_for_catchup() after DELETE. (And some other
places in this file.)
Besides, the "updates" in the message should be "deletes".

5.
+# wait until the index is used on the subscriber
+$node_subscriber->poll_query_until(
+ 'postgres', q{select sum(idx_scan)=10 from pg_stat_all_indexes where indexrelname ilike 'users_table_part_%';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates partitioned table";

Maybe we should say "updates partitioned table with index" in this message.

Regards,
Shi yu

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-10-20 02:43:12 Re: Documentation for building with meson
Previous Message Kyotaro Horiguchi 2022-10-20 01:26:47 Re: thinko in basic_archive.c