Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.
Date: 2023-07-06 01:41:48
Message-ID: CAD21AoB_3JtA0hC2+eVbA_VUe10v6Cj7Qpm3hLC9zpEVXs4nzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 6, 2023 at 7:58 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Wed, Jul 5, 2023 at 4:32 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 5, 2023 at 2:46 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > Hi. Here are some review comments for this patch.
> > > >
> > > > +1 for the patch idea.
> > > >
> > > > ------
> > > >
> > > > I wasn't sure about the code comment adjustments suggested by Amit [1]:
> > > > "Accordingly, the comments atop build_replindex_scan_key(),
> > > > FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
> > > > should also be adjusted."
> >
> > As for IsIndexOnlyOnExpression(), what part do you think we need to
> > adjust? It says:
> >
> > /*
> > * Returns true if the given index consists only of expressions such as:
> > * CREATE INDEX idx ON table(foo(col));
> > *
> > * Returns false even if there is one column reference:
> > * CREATE INDEX idx ON table(foo(col), col_2);
> > */
> >
> > and it seems to me that the function doesn't check if the leftmost
> > index column is a non-expression.
> >
>
> TBH, this IsIndexOnlyOnExpression() function seemed redundant to me,
> otherwise, there can be some indexes that are firstly considered
> "useable" but then fail the subsequent leftmost check. It does not
> seem right.

I see your point. IsIndexUsableForReplicaIdentityFull(), the sole user
of IsIndexOnlyOnExpression(), is also called by
RelationFindReplTupleByIndex() in an assertion build. I thought this
is the reason why we have separate IsIndexOnlyOnExpression() ( and
IsIndexUsableForReplicaIdentityFull()). But this assertion doesn't
check if the leftmost index column exists on the remote relation. What
are we doing this check for? If it's not necessary, we can remove this
assertion and merge both IsIndexOnlyOnExpression() and
IsIndexUsableForReplicaIdentityFull() into
FindUsableIndexForReplicaIdentityFull().

>
> > > > doc/src/sgml/logical-replication.sgml
> > > >
> > > > 1.
> > > > the key. When replica identity <literal>FULL</literal> is specified,
> > > > indexes can be used on the subscriber side for searching the rows.
> > > > Candidate
> > > > indexes must be btree, non-partial, and have at least one column reference
> > > > - (i.e. cannot consist of only expressions). These restrictions
> > > > - on the non-unique index properties adhere to some of the restrictions that
> > > > - are enforced for primary keys. If there are no such suitable indexes,
> > > > + at the leftmost column indexes (i.e. cannot consist of only
> > > > expressions). These
> > > > + restrictions on the non-unique index properties adhere to some of
> > > > the restrictions
> > > > + that are enforced for primary keys. If there are no such suitable indexes,
> > > > the search on the subscriber side can be very inefficient, therefore
> > > > replica identity <literal>FULL</literal> should only be used as a
> > > > fallback if no other solution is possible. If a replica identity other
> > > >
> > > > Isn't this using the word "indexes" with different meanings in the
> > > > same sentence? e.g. IIUC "leftmost column indexes" is referring to the
> > > > ordinal number of the index fields.
> >
> > That was my mistake, it should be " at the leftmost column".
>
> IIUC the subscriber-side table can have more columns than the
> publisher-side table, so just describing in the docs that the leftmost
> INDEX field must be a column may not be quite enough; it also needs to
> say that column has to exist on the publisher-table, doesn't it?

Right. I've updated the patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
doc_update_v2.patch application/octet-stream 3.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 蔡梦娟 (玊于) 2023-07-06 02:02:15 The same 2PC data maybe recovered twice
Previous Message Anton Kirilov 2023-07-06 00:42:59 Re: Add PQsendSyncMessage() to libpq