Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

From: Önder Kalacı <onderkalaci(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Date: 2023-06-26 07:52:31
Message-ID: CACawEhWf8fXKicVfJVGs+Ce2U0LcX+-TJ2JBKhvREVGM0g7r+Q@mail.gmail.com
Lists: pgsql-hackers

Hi Hayato, all

>
>
> This is a follow-up thread of [1]. The commit allowed subscribers to use
> indexes other than PK and REPLICA IDENTITY when REPLICA IDENTITY is FULL on
> publisher, but the index must be a B-tree. In this proposal, I aim to extend
> this functionality to allow for hash indexes and potentially other types.
>
>
Cool, thanks for taking the time to work on this.

> # Current problem
>
> The current limitation comes from the function build_replindex_scan_key(),
> specifically these lines:
>

When I last looked at this issue, I examined it from a slightly broader
perspective. I think my conclusion was that RelationFindReplTupleByIndex()
was designed around the constraints of unique indexes and primary keys,
hence the btree limitation.

So, my main point is that it might be useful to check
RelationFindReplTupleByIndex() once more in detail to see if there is
anything else that is specific to btree indexes.

build_replindex_scan_key() is definitely one of the major culprits, but see
below as well.

I think we should also be mindful of the tuples_equal() function. When an
index returns more than one tuple, we rely on tuples_equal() to make sure
that irrelevant tuples are skipped.

For btree indexes, it was safe to rely on that function because columns
indexed with btree always have an equality operator. I think we can safely
assume the same for hash indexes.

However, say we indexed a "point" column using a "gist" index. Then, if we
let this logic kick in, I think tuples_equal() would fail, complaining that
no equality operator exists.

One might argue that this is already the case for RelationFindReplTupleSeq(),
or when you have an index but it is on a different column. Still, it seems
useful to make sure you are aware of this limitation as well.
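
To make the tuples_equal() concern concrete, here is a minimal standalone
sketch. It is not PostgreSQL code: ColumnTypeSketch, rows_equal_sketch() and
the "pointlike" type are hypothetical names used only for illustration, and
a NULL function pointer stands in for "no equality operator found for this
type".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a type's equality operator. */
typedef bool (*eq_fn_sketch) (long a, long b);

typedef struct ColumnTypeSketch
{
    const char   *name;
    eq_fn_sketch  eq;       /* NULL models "no equality operator" */
} ColumnTypeSketch;

typedef enum
{
    CMP_EQUAL,
    CMP_NOT_EQUAL,
    CMP_NO_EQ_OPERATOR      /* the real code would raise an error here */
} CmpResultSketch;

static bool
long_eq(long a, long b)
{
    return a == b;
}

/*
 * Compare two rows column by column. The comparison cannot proceed as soon
 * as a column type without an equality operator is encountered, regardless
 * of which index (if any) produced the candidate row.
 */
static CmpResultSketch
rows_equal_sketch(const ColumnTypeSketch *types,
                  const long *row1, const long *row2, size_t ncols)
{
    for (size_t i = 0; i < ncols; i++)
    {
        if (types[i].eq == NULL)
            return CMP_NO_EQ_OPERATOR;
        if (!types[i].eq(row1[i], row2[i]))
            return CMP_NOT_EQUAL;
    }
    return CMP_EQUAL;
}
```

In this model, a row that contains a "pointlike" column can never be
confirmed as a match, no matter how the candidate tuple was found.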

>
> ## Current difficulties
>
> The challenge with supporting other indexes is that they lack a fixed set
> of strategies, making it difficult to choose the correct strategy number
> based on the index access method. Even within the same index type,
> different operator classes can use different strategy numbers for the same
> operation.
> E.g. [2] shows that number 6 can be used for the purpose, but other
> operator classes added by btree_gist [3] seem to use number 3 for the
> equality comparison.
>
>
Also, build_replindex_scan_key() seems like too late a place to check this.
I mean, if there is no equality operator, how should the code react? It
effectively becomes RelationFindReplTupleSeq(), so maybe it is better to
follow that route upfront?

In other words, maybe that decision should happen in
IsIndexUsableForReplicaIdentityFull()?
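
A rough standalone sketch of what deciding upfront could look like. This is
not the actual IsIndexUsableForReplicaIdentityFull() logic: IndexSketch,
index_usable_sketch() and choose_scan_sketch() are hypothetical, and the
sketch simply assumes that whether the indexed column supports an equality
lookup is already known when the index is evaluated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of a candidate index. */
typedef struct IndexSketch
{
    const char *am_name;                /* e.g. "btree", "hash", "gist" */
    bool        column_has_equality;    /* assumed known at check time */
} IndexSketch;

typedef enum
{
    SCAN_INDEX,     /* would go through RelationFindReplTupleByIndex() */
    SCAN_SEQ        /* would go through RelationFindReplTupleSeq() */
} ScanChoiceSketch;

/* Reject the index early if no equality lookup can be built for it. */
static bool
index_usable_sketch(const IndexSketch *idx)
{
    return idx != NULL && idx->column_has_equality;
}

/*
 * Make the decision once, before any scan key is built, so the scan-key
 * code never has to handle a missing equality operator.
 */
static ScanChoiceSketch
choose_scan_sketch(const IndexSketch *idx)
{
    return index_usable_sketch(idx) ? SCAN_INDEX : SCAN_SEQ;
}
```

With this shape, an index without a usable equality operator is never
selected, and the apply path takes the sequential scan route from the start.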

For the specific notes you raised about strategy numbers / operator
classes, I need to
study a bit :) Though, I'll be available to do that early next week.

Thanks,
Onder
