Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.
Date: 2023-07-05 03:30:41
Message-ID: CAHut+Pv0cLXEKXORsRk=aqg-KszgGTbvwyK_HFCKD7W0n5gzxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

Actually, I thought the FindUsableIndexForReplicaIdentityFull()
function comment is *already* describing the limitation about the
leftmost column (see fragment below), so IIUC the Sawada-san patch is
only trying to expose that same information in the PG docs.

[FindUsableIndexForReplicaIdentityFull comment fragment]
* We also skip indexes if the remote relation does not contain the leftmost
* column of the index. This is because in most such cases sequential scan is
* favorable over index scan.

~

OTOH, it may be better if these limitation rule details were not
scattered in the code. e.g. build_replindex_scan_key() function
comment can be simplified:

CURRENT:
* This is not generic routine, it expects the idxrel to be a btree, non-partial
* and have at least one column reference (i.e. cannot consist of only
* expressions).

SUGGESTION:
This is not a generic routine. It expects the 'idxrel' to be an index
deemed "usable" by the function
FindUsableIndexForReplicaIdentityFull().

------
doc/src/sgml/logical-replication.sgml

1.
the key. When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows.
Candidate
indexes must be btree, non-partial, and have at least one column reference
- (i.e. cannot consist of only expressions). These restrictions
- on the non-unique index properties adhere to some of the restrictions that
- are enforced for primary keys. If there are no such suitable indexes,
+ at the leftmost column indexes (i.e. cannot consist of only
expressions). These
+ restrictions on the non-unique index properties adhere to some of
the restrictions
+ that are enforced for primary keys. If there are no such suitable indexes,
the search on the subscriber side can be very inefficient, therefore
replica identity <literal>FULL</literal> should only be used as a
fallback if no other solution is possible. If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields. TBH, I am not sure the patch
wording is even describing the limitation in quite the same way as
what the code is actually doing.

HEAD (code comment):
* We also skip indexes if the remote relation does not contain the leftmost
* column of the index. This is because in most such cases sequential scan is
* favorable over index scan.

HEAD (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions). These
restrictions on the non-unique index properties adhere to some of the
restrictions that are enforced for primary keys.

PATCHED (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference at the leftmost column indexes (i.e. cannot consist
of only expressions). These restrictions on the non-unique index
properties adhere to some of the restrictions that are enforced for
primary keys.

MY SUGGESTION:
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions).
Furthermore, the leftmost field of the candidate index must be a
column of the published table. These restrictions on the non-unique
index properties adhere to some of the restrictions that are enforced
for primary keys.

------
[1] Amit suggestions -
https://www.postgresql.org/message-id/CAA4eK1LZ_A-UmC_P%2B_hLi%2BPbwyqak%2BvRKemZ7imzk2puVTpHOA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-07-05 04:29:37 pg_upgrade and cross-library upgrades
Previous Message YANG Xudong 2023-07-05 02:15:51 Re: [PATCH] Add loongarch native checksum implementation.