Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Önder Kalacı <onderkalaci(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-03 13:10:33
Message-ID: CACawEhXDxdFSC+FRLomS3u1=i4osSd8SrUj0xB1qvA63hbh+bQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Vignesh,

Thanks for the review

> 1) We are currently calling RelationGetIndexList twice, once in
> FindUsableIndexForReplicaIdentityFull function and in the caller too,
> we could avoid one of the calls by passing the indexlist to the
> function or removing the check here, index list check can be handled
> in FindUsableIndexForReplicaIdentityFull.
> + if (remoterel->replident == REPLICA_IDENTITY_FULL &&
> + RelationGetIndexList(localrel) != NIL)
> + {
> + /*
> + * If we had a primary key or relation identity with a
> unique index,
> + * we would have already found and returned that oid.
> At this point,
> + * the remote relation has replica identity full and
> we have at least
> + * one local index defined.
> + *
> + * We are looking for one more opportunity for using
> an index. If
> + * there are any indexes defined on the local
> relation, try to pick
> + * a suitable index.
> + *
> + * The index selection safely assumes that all the
> columns are going
> + * to be available for the index scan given that
> remote relation has
> + * replica identity full.
> + */
> + return FindUsableIndexForReplicaIdentityFull(localrel);
> + }
> +
>

makes sense, done

>
> 2) Copyright year should be mentioned as 2023
> diff --git a/src/test/subscription/t/032_subscribe_use_index.pl
> b/src/test/subscription/t/032_subscribe_use_index.pl
> new file mode 100644
> index 0000000000..db0a7ea2a0
> --- /dev/null
> +++ b/src/test/subscription/t/032_subscribe_use_index.pl
> @@ -0,0 +1,861 @@
> +# Copyright (c) 2021-2022, PostgreSQL Global Development Group
> +
> +# Test logical replication behavior with subscriber uses available index
> +use strict;
> +use warnings;
> +use PostgreSQL::Test::Cluster;
> +use PostgreSQL::Test::Utils;
> +use Test::More;
> +
>

I changed it to #Copyright (c) 2022-2023, but I'm not sure if it should be
only 2023 or
like this.

>
> 3) Many of the tests are using the same tables, we need not
> drop/create publication/subscription for each of the team, we could
> just drop and create required indexes and verify the update/delete
> statements.
> +# ====================================================================
> +# Testcase start: SUBSCRIPTION USES INDEX
> +#
> +# Basic test where the subscriber uses index
> +# and only updates 1 row and deletes
> +# 1 other row
> +#
> +
> +# create tables pub and sub
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE test_replica_id_full (x int)");
> +$node_publisher->safe_psql('postgres',
> + "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE test_replica_id_full (x int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE INDEX test_replica_id_full_idx ON
> test_replica_id_full(x)");
>
> +# ====================================================================
> +# Testcase start: SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
> +#
> +# This test ensures that after CREATE INDEX, the subscriber can
> automatically
> +# use one of the indexes (provided that it fulfils the requirements).
> +# Similarly, after DROP index, the subscriber can automatically switch to
> +# sequential scan
> +
> +# create tables pub and sub
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
> +$node_publisher->safe_psql('postgres',
> + "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
>

Well, not all the tables are exactly the same, there are 4-5 different
tables. Mostly the table names are the same.

Plus, the overhead does not seem to be large enough to complicate
the test. Many of the src/test/subscription/t files follow this pattern.

Do you have strong opinions on changing this?

> 4) These additional blank lines can be removed to keep it consistent:
> 4.a)
> +# Testcase end: SUBSCRIPTION DOES NOT USE PARTIAL INDEX
> +# ====================================================================
> +
> +
> +# ====================================================================
> +# Testcase start: SUBSCRIPTION DOES NOT USE INDEXES WITH ONLY EXPRESSIONS
>
> 4.b)
> +# Testcase end: Unique index that is not primary key or replica identity
> +# ====================================================================
> +
> +
> +
> +# ====================================================================
> +# Testcase start: SUBSCRIPTION BEHAVIOR WITH ENABLE_INDEXSCAN
>

Thanks, fixed.

Attached v30

Attachment Content-Type Size
v30_0001_use_index_on_subs_when_pub_rep_ident_full.patch application/x-patch 67.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2023-03-03 13:30:00 Re: Improve WALRead() to suck data directly from WAL buffers when possible
Previous Message Peter Eisentraut 2023-03-03 12:45:11 Re: Allow tailoring of ICU locales with custom rules