Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Date: 2023-07-12 00:04:09
Message-ID: CAHut+Puuj3YhONUq8sWrYOMeq+KrDpA_CzCkY4fedF7NNRv1eQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, here are some review comments for the v3 patch

======
Commit message

1.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used.

~

Before giving details about the problems of the other index types it
might be good to summarize why these 2 types are OK.

SUGGESTION
These 2 types of indexes are the only ones supported because only these
- use fix strategy numbers
- implement the "equality" strategy
- implement the function amgettuple()

~~~

2.
I'm not sure why the next paragraphs are numbered 1) and 2). Is that
necessary? It seems maybe a cut/paste hangover from the similar code
comment.

~~~

3.
1) Other indexes do not have a fixed set of strategy numbers at all. In
build_replindex_scan_key(), the operator which corresponds to
"equality comparison"
is searched by using the strategy number, but it is difficult for such indexes.
For example, GiST index operator classes for two-dimensional geometric
objects have
a comparison "same", but its strategy number is different from the
gist_int4_ops,
which is a GiST index operator class that implements the B-tree equivalent.

~

Don't need to say "at all"

~~~

4.
2) Some other indexes do not implement "amgettuple".
RelationFindReplTupleByIndex()
assumes that tuples could be fetched one by one via
index_getnext_slot(), but such
indexes are not supported.

~

4a.
"Some other indexes..." --> Maybe give an example here (e.g. XXX, YYY)
of indexes that do not implement the function.

~

4b.
BEFORE
RelationFindReplTupleByIndex() assumes that tuples could be fetched
one by one via index_getnext_slot(), but such indexes are not
supported.

AFTER
RelationFindReplTupleByIndex() assumes that tuples will be fetched one
by one via index_getnext_slot(), but this currently requires the
"amgetuple" function.

======
src/backend/executor/execReplication.c

5.
+ * 2) Some other indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex() assumes that tuples could be fetched one by
+ * one via index_getnext_slot(), but such indexes are not supported. To make it
+ * use index_getbitmap() must be used, but not done yet because of the above
+ * reason.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)

~

Change this text to better match exactly in the commit message (see
previous review comments above). Also I am not sure it is necessary to
say how it *might* be implemented in the future if somebody wanted to
enhance it to work also for "amgetbitmap" function. E.g. do we need
that sentence "To make it..."

~~~

6. get_equal_strategy_number_for_am

+ case GIST_AM_OID:
+ case SPGIST_AM_OID:
+ case GIN_AM_OID:
+ case BRIN_AM_OID:
+ default:

I am not sure it is necessary to spell out all these not-supported
cases in the switch. If seems sufficient just to say 'default:'
doesn't it?

~~~

7. get_equal_strategy_number

Two blank lines are following this function

~~~

8. build_replindex_scan_key

- * This is not generic routine, it expects the idxrel to be a btree,
non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).

Take care. AFAIK this change will clash with changes Sawawa-san is
making to the same code comment in another thread here [1].

======
src/backend/replication/logical/relation.c

9. FindUsableIndexForReplicaIdentityFull

* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.

~

Take care. AFAIK this change will clash with changes Sawawa-san is
making to the same code comment in another thread here [1].

~

10.
+ /* Check whether the index is supported or not */
+ is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ return is_suitable_type && !is_partial && !is_only_on_expression;

I am not sure if the function IsIndexOnlyExpression() even needed
anymore. Isn't it sufficient just to check up-front is the leftmost
INDEX field is a column and that covers this condition also? Actually,
this same question is also open in the Sawada-san thread [1].

======
.../subscription/t/032_subscribe_use_index.pl

11.
AFAIK there is no test to verify that the leftmost index field must be
a column (e.g. cannot be an expression). Yes, there are some tests for
*only* expressions like (expr, expr, expr), but IMO there should be
another test for an index like (expr, expr, column) which should also
fail because the column is not the leftmost field.

------
[1] Sawada-san doc for RI restriction -
https://www.postgresql.org/message-id/CAD21AoBzp9H2WV4kDagat2WUOsiYJLo10zJ1E5dZYnRTx1pc_g%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Austalia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-07-12 00:16:18 Re: unrecognized node type while displaying a Path due to dangling pointer
Previous Message Michael Paquier 2023-07-11 23:41:45 Re: Got FATAL in lock_twophase_recover() during recovery