Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-08 03:15:32
Message-ID: CAA4eK1J_gnrSJOekj577qQy2jV5eYO1VUpQLL1QQVTxeMZd3Sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 7, 2023 at 7:17 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>
>>
>> > > Let me give an example to demonstrate why I thought something is fishy here:
>> > >
>> > > Imagine rel has a (non-default) REPLICA IDENTITY with Oid=1111.
>> > > Imagine the same rel has a PRIMARY KEY with Oid=2222.
>> > >
>
>
> Hmm, alright, this is syntactically possible, but not sure if any user
> would do this. Still thanks for catching this.
>
> And, you are right, if a user has created such a schema, IdxIsRelationIdentityOrPK()
> would return the wrong result and we'd use sequential scan instead of index scan.
> This would be a regression. I think we should change the function.
>
>
> Here is the example:
> DROP TABLE tab1;
> CREATE TABLE tab1 (a int NOT NULL);
> CREATE UNIQUE INDEX replica_unique ON tab1(a);
> ALTER TABLE tab1 REPLICA IDENTITY USING INDEX replica_unique;
> ALTER TABLE tab1 ADD CONSTRAINT pkey PRIMARY KEY (a);
>

You have not given complete steps to reproduce the problem where
instead of the index scan, a sequential scan would be picked. I have
tried to reproduce by extending your steps but didn't see the problem.
Let me know if I am missing something.

Publisher
----------------
postgres=# CREATE TABLE tab1 (a int NOT NULL);
CREATE TABLE
postgres=# Alter Table tab1 replica identity full;
ALTER TABLE
postgres=# create publication pub2 for table tab1;
CREATE PUBLICATION
postgres=# insert into tab1 values(1);
INSERT 0 1
postgres=# update tab1 set a=2;

Subscriber
-----------------
postgres=# CREATE TABLE tab1 (a int NOT NULL);
CREATE TABLE
postgres=# CREATE UNIQUE INDEX replica_unique ON tab1(a);
CREATE INDEX
postgres=# ALTER TABLE tab1 REPLICA IDENTITY USING INDEX replica_unique;
ALTER TABLE
postgres=# ALTER TABLE tab1 ADD CONSTRAINT pkey PRIMARY KEY (a);
ALTER TABLE
postgres=# create subscription sub2 connection 'dbname=postgres'
publication pub2;
NOTICE: created replication slot "sub2" on publisher
CREATE SUBSCRIPTION
postgres=# select * from tab1;
a
---
2
(1 row)

I have debugged the above example and it uses an index scan during
apply without your latest change which is what I expected. AFAICS, the
use of IdxIsRelationIdentityOrPK() is to decide whether we will do
tuples_equal() or not during the index scan and I see it gives the
correct results with the example you provided.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-03-08 03:39:26 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message wangw.fnst@fujitsu.com 2023-03-08 02:54:26 RE: Rework LogicalOutputPluginWriterUpdateProgress