RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Peter Smith' <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Date: 2023-07-12 03:07:04
Message-ID: TYAPR01MB58663009FA044B8E37488EA9F536A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Peter,

Thank you for reviewing! PSA new version.

> 1.
> 89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY
> on the
> subscriber, but only the BTree index could be used. This commit extends the
> limitation, now the Hash index can be also used.
>
> ~
>
> Before giving details about the problems of the other index types it
> might be good to summarize why these 2 types are OK.
>
> SUGGESTION
> These 2 types of indexes are the only ones supported because only these
> - use fix strategy numbers
> - implement the "equality" strategy
> - implement the function amgettuple()

Added.

>
> 2.
> I'm not sure why the next paragraphs are numbered 1) and 2). Is that
> necessary? It seems maybe a cut/paste hangover from the similar code
> comment.

Yeah, this was just copied from code comments. Numbers were removed.

> 3.
> 1) Other indexes do not have a fixed set of strategy numbers at all. In
> build_replindex_scan_key(), the operator which corresponds to
> "equality comparison"
> is searched by using the strategy number, but it is difficult for such indexes.
> For example, GiST index operator classes for two-dimensional geometric
> objects have
> a comparison "same", but its strategy number is different from the
> gist_int4_ops,
> which is a GiST index operator class that implements the B-tree equivalent.
>
> ~
>
> Don't need to say "at all"

Removed.

> 4.
> 2) Some other indexes do not implement "amgettuple".
> RelationFindReplTupleByIndex()
> assumes that tuples could be fetched one by one via
> index_getnext_slot(), but such
> indexes are not supported.
>
> ~
>
> 4a.
> "Some other indexes..." --> Maybe give an example here (e.g. XXX, YYY)
> of indexes that do not implement the function.

I clarified like "BRIN and GIN indexes do not implement...", which are the built-in
indexes. Actually the bloom index cannot be supported due to the same reason, but
I did not mention because it is not in core.

> 4b.
> BEFORE
> RelationFindReplTupleByIndex() assumes that tuples could be fetched
> one by one via index_getnext_slot(), but such indexes are not
> supported.
>
> AFTER
> RelationFindReplTupleByIndex() assumes that tuples will be fetched one
> by one via index_getnext_slot(), but this currently requires the
> "amgetuple" function.

Changed.

> src/backend/executor/execReplication.c
>
> 5.
> + * 2) Some other indexes do not implement "amgettuple".
> + * RelationFindReplTupleByIndex() assumes that tuples could be fetched one by
> + * one via index_getnext_slot(), but such indexes are not supported. To make it
> + * use index_getbitmap() must be used, but not done yet because of the above
> + * reason.
> + */
> +int
> +get_equal_strategy_number_for_am(Oid am)
>
> ~
>
> Change this text to better match exactly in the commit message (see
> previous review comments above).

Copied from commit message.

> Also I am not sure it is necessary to
> say how it *might* be implemented in the future if somebody wanted to
> enhance it to work also for "amgetbitmap" function. E.g. do we need
> that sentence "To make it..."

Added, how do you think?

> 6. get_equal_strategy_number_for_am
>
> + case GIST_AM_OID:
> + case SPGIST_AM_OID:
> + case GIN_AM_OID:
> + case BRIN_AM_OID:
> + default:
>
> I am not sure it is necessary to spell out all these not-supported
> cases in the switch. If seems sufficient just to say 'default:'
> doesn't it?

Yeah, it's sufficient. This is a garbage for previous PoC.

> 7. get_equal_strategy_number
>
> Two blank lines are following this function

Removed.

> 8. build_replindex_scan_key
>
> - * This is not generic routine, it expects the idxrel to be a btree,
> non-partial
> - * and have at least one column reference (i.e. cannot consist of only
> - * expressions).
> + * This is not generic routine, it expects the idxrel to be a btree or a hash,
> + * non-partial and have at least one column reference (i.e. cannot consist of
> + * only expressions).
>
> Take care. AFAIK this change will clash with changes Sawawa-san is
> making to the same code comment in another thread here [1].

Thanks for reminder. I thought that this change seems not needed anymore if the patch
is pushed. But anyway I kept it because this may be pushed first.

> src/backend/replication/logical/relation.c
>
> 9. FindUsableIndexForReplicaIdentityFull
>
> * Returns the oid of an index that can be used by the apply worker to scan
> - * the relation. The index must be btree, non-partial, and have at least
> - * one column reference (i.e. cannot consist of only expressions). These
> + * the relation. The index must be btree or hash, non-partial, and have at
> + * least one column reference (i.e. cannot consist of only expressions). These
> * limitations help to keep the index scan similar to PK/RI index scans.
>
> ~
>
> Take care. AFAIK this change will clash with changes Sawawa-san is
> making to the same code comment in another thread here [1].

Thanks for reminder. I thought that this change must be kept even if it will be
pushed. We must check the thread...

> 10.
> + /* Check whether the index is supported or not */
> + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
> + != InvalidStrategy));
> +
> + is_partial = (indexInfo->ii_Predicate != NIL);
> + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
> +
> + return is_suitable_type && !is_partial && !is_only_on_expression;
>
> I am not sure if the function IsIndexOnlyExpression() even needed
> anymore. Isn't it sufficient just to check up-front is the leftmost
> INDEX field is a column and that covers this condition also? Actually,
> this same question is also open in the Sawada-san thread [1].
>
> ======
> .../subscription/t/032_subscribe_use_index.pl
>
> 11.
> AFAIK there is no test to verify that the leftmost index field must be
> a column (e.g. cannot be an expression). Yes, there are some tests for
> *only* expressions like (expr, expr, expr), but IMO there should be
> another test for an index like (expr, expr, column) which should also
> fail because the column is not the leftmost field.

I'm not sure they should be fixed in the patch. You have raised these points
at the Sawada-san's thread[1] and not sure he has done.
Furthermore, these points are not related with our initial motivation.
Fork, or at least it should be done by another patch.

[1]: https://www.postgresql.org/message-id/CAHut%2BPv3AgAnP%2BJTsPseuU-CT%2BOrSGiqzxqw4oQmWeKLkAea4Q%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v4-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patch application/octet-stream 12.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-07-12 03:21:53 Re: Cleaning up threading code
Previous Message Andy Fan 2023-07-12 02:57:44 Re: The same 2PC data maybe recovered twice