Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-08 13:09:29
Message-ID: CALDaNm1uAuO2RnER+5GGR4wTUo3U2+ZvS1bfnTndmGmQzUyszg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 3 Mar 2023 at 18:40, Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>
> Hi Vignesh,
>
> Thanks for the review
>
>>
>> 1) We are currently calling RelationGetIndexList twice, once in
>> FindUsableIndexForReplicaIdentityFull function and in the caller too,
>> we could avoid one of the calls by passing the indexlist to the
>> function or removing the check here, index list check can be handled
>> in FindUsableIndexForReplicaIdentityFull.
>> + if (remoterel->replident == REPLICA_IDENTITY_FULL &&
>> + RelationGetIndexList(localrel) != NIL)
>> + {
>> + /*
>> + * If we had a primary key or relation identity with a
>> unique index,
>> + * we would have already found and returned that oid.
>> At this point,
>> + * the remote relation has replica identity full and
>> we have at least
>> + * one local index defined.
>> + *
>> + * We are looking for one more opportunity for using
>> an index. If
>> + * there are any indexes defined on the local
>> relation, try to pick
>> + * a suitable index.
>> + *
>> + * The index selection safely assumes that all the
>> columns are going
>> + * to be available for the index scan given that
>> remote relation has
>> + * replica identity full.
>> + */
>> + return FindUsableIndexForReplicaIdentityFull(localrel);
>> + }
>> +
>
>
> makes sense, done
>
>>
>>
>> 2) Copyright year should be mentioned as 2023
>> diff --git a/src/test/subscription/t/032_subscribe_use_index.pl
>> b/src/test/subscription/t/032_subscribe_use_index.pl
>> new file mode 100644
>> index 0000000000..db0a7ea2a0
>> --- /dev/null
>> +++ b/src/test/subscription/t/032_subscribe_use_index.pl
>> @@ -0,0 +1,861 @@
>> +# Copyright (c) 2021-2022, PostgreSQL Global Development Group
>> +
>> +# Test logical replication behavior with subscriber uses available index
>> +use strict;
>> +use warnings;
>> +use PostgreSQL::Test::Cluster;
>> +use PostgreSQL::Test::Utils;
>> +use Test::More;
>> +
>
>
> I changed it to #Copyright (c) 2022-2023, but I'm not sure if it should be only 2023 or
> like this.
>
>>
>>
>> 3) Many of the tests are using the same tables, we need not
>> drop/create publication/subscription for each of the team, we could
>> just drop and create required indexes and verify the update/delete
>> statements.
>> +# ====================================================================
>> +# Testcase start: SUBSCRIPTION USES INDEX
>> +#
>> +# Basic test where the subscriber uses index
>> +# and only updates 1 row and deletes
>> +# 1 other row
>> +#
>> +
>> +# create tables pub and sub
>> +$node_publisher->safe_psql('postgres',
>> + "CREATE TABLE test_replica_id_full (x int)");
>> +$node_publisher->safe_psql('postgres',
>> + "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
>> +$node_subscriber->safe_psql('postgres',
>> + "CREATE TABLE test_replica_id_full (x int)");
>> +$node_subscriber->safe_psql('postgres',
>> + "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x)");
>>
>> +# ====================================================================
>> +# Testcase start: SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
>> +#
>> +# This test ensures that after CREATE INDEX, the subscriber can automatically
>> +# use one of the indexes (provided that it fulfils the requirements).
>> +# Similarly, after DROP index, the subscriber can automatically switch to
>> +# sequential scan
>> +
>> +# create tables pub and sub
>> +$node_publisher->safe_psql('postgres',
>> + "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
>> +$node_publisher->safe_psql('postgres',
>> + "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
>> +$node_subscriber->safe_psql('postgres',
>> + "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
>
>
> Well, not all the tables are exactly the same, there are 4-5 different
> tables. Mostly the table names are the same.
>
> Plus, the overhead does not seem to be large enough to complicate
> the test. Many of the src/test/subscription/t files follow this pattern.
>
> Do you have strong opinions on changing this?

I felt that once you remove the create publication/subscription/wait
for sync steps, the test execution might become faster and save some
time in the local execution, cfbot and the various machines in
buildfarm. If the execution time will not reduce, then no need to
change.

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2023-03-08 13:11:46 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message Melih Mutlu 2023-03-08 12:47:22 Re: Allow logical replication to copy tables in binary format