Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Önder Kalacı <onderkalaci(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-06 09:08:20
Message-ID: CACawEhUG9yg3DS+MD=Y2RL--LXUJHh=stfodZ7SVxY0-MXOqpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Amit, all

> >
> > Given Amit's suggestion on [1], I'm planning to drop this check
> altogether, and
> > rely on table storage parameters.
> >
>
> This still seems to be present in the latest version. I think we can
> just remove this and then add the additional check as suggested by you
> as part of the second patch.
>
>
Now attaching the second patch as well, which implements a new
storage parameter as
you suggested earlier.

I'm open for naming suggestions, I wanted to make the name explicit, so it
is a little long.

I'm also not very familiar with the sgml format. I mostly followed the
existing docs and
built the docs for inspection, but it would be good to have a look into
that part
a little bit further in case there I missed something important etc.

> Few other comments on latest version:
> ==============================
> 1.
> +/*
> + * Returns true if the index is usable for replica identity full. For
> details,
> + * see FindUsableIndexForReplicaIdentityFull.
> + */
> +bool
> +IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
> +{
> + bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
> + bool is_partial = (indexInfo->ii_Predicate != NIL);
> + bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
> +
> + if (is_btree && !is_partial && !is_only_on_expression)
> + {
> + return true;
> ...
> ...
> +/*
> + * Returns the oid of an index that can be used via the apply worker. The
> index
> + * should be btree, non-partial and have at least one column reference
> (e.g.,
> + * should not consist of only expressions). The limitations arise from
> + * RelationFindReplTupleByIndex(), which is designed to handle PK/RI and
> these
> + * limitations are inherent to PK/RI.
>
> By these two, the patch infers that it picks an index that adheres to
> the limitations of PK/RI. Apart from unique, the other properties of
> RI are "not partial, not deferrable, and include only columns marked
> NOT NULL". See ATExecReplicaIdentity() for corresponding checks. We
> don't try to ensure the last two from the list. It is fine to do so if
> we document the reasons for the same in comments or we can even try to
> enforce the remaining restrictions as well. For example, it should be
> okay to allow NULL column values because we anyway compare the entire
> tuple after getting the value from the index.
>

I think this is a documentation issue of this patch. I improved the wording
a bit
more. Does that look better?

I also went over the code / docs to see if we have
any other such documentation issues, I also
updated logical-replication.sgml.

I'd prefer to support NULL values as there is no harm in doing that and it
is
pretty useful feature (we also have tests covering it).

To my knowledge, I don't see any problems with deferrable as we are only
interested in the indexes, not with the constraint. Still, if you see any,
I can
add the check for that. (Here, the user could still have unique index that
is associated with a constraint, but still, I don't see any edge cases
regarding deferrable constraints).

> 2.
> + {
> + /*
> + * This attribute is an expression, and
> + * FindUsableIndexForReplicaIdentityFull() was called earlier
> + * when the index for subscriber was selected. There, the indexes
> + * comprising *only* expressions have already been eliminated.
> + *
> + * Also, because PK/RI can't include expressions we
> + * sanity check the index is neither of those kinds.
> + */
> + Assert(!IdxIsRelationIdentityOrPK(rel, idxrel->rd_id));
>
> This comment doesn't make much sense after you have moved the
> corresponding Assert in RelationFindReplTupleByIndex(). Either we
> should move or remove this Assert as well or at least update the
> comments to reflect the latest code.
>

I think removing that Assert is fine after having a more generic
Assert in RelationFindReplTupleByIndex().

I mostly left that comment so that the meaning of
AttributeNumberIsValid() is easier for readers to follow. But, now
I'm also leaning towards removing the comment and Assert.

>
> 3. When FindLogicalRepUsableIndex() is invoked from
> logicalrep_partition_open(), the current memory context would be
> LogicalRepPartMapContext which would be a long-lived context and we
> allocate memory for indexes in FindLogicalRepUsableIndex() which can
> accumulate over a period of time. So, I think it would be better to
> switch to the old context in logicalrep_partition_open() before
> invoking FindLogicalRepUsableIndex() provided that is not a long-lived
> context.
>
>
>
Hmm, makes sense, that'd avoid any potential leaks that this patch
might bring. Applied your suggestion. Also, looking at the same function
call in logicalrep_rel_open(), that already seems safe regarding leaks. Do
you see any problems with that?

Attached v32. I'll continue replying to the e-mails on this thread with
different
patches. I'm assuming this is easier for you to review such that we have
different
patches for each review. If not, please let me know, I can reply to all
mails
at once.

Thanks,
Onder KALACI

Attachment Content-Type Size
v32_0001_use_index_on_subs_when_pub_rep_ident_full.patch application/octet-stream 67.1 KB
v32_0002_use_index_on_subs_when_pub_rep_ident_full.patch application/octet-stream 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message tender wang 2023-03-06 09:30:31 wrong results due to qual pushdown
Previous Message Daniel Gustafsson 2023-03-06 09:02:55 Re: Allow tests to pass in OpenSSL FIPS mode