Replica Identity check of partition table on subscriber

From: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Replica Identity check of partition table on subscriber
Date: 2022-06-08 08:46:46
Message-ID: OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

I saw a problem in logical replication, when the target table on subscriber is a
partitioned table, it only checks whether the Replica Identity of partitioned
table is consistent with the publisher, and doesn't check Replica Identity of
the partition.

For example:
-- publisher --
create table tbl (a int not null, b int);
create unique INDEX ON tbl (a);
alter table tbl replica identity using INDEX tbl_a_idx;
create publication pub for table tbl;

-- subscriber --
-- table tbl (parent table) has RI index, while table child has no RI index.
create table tbl (a int not null, b int) partition by range (a);
create table child partition of tbl default;
create unique INDEX ON tbl (a);
alter table tbl replica identity using INDEX tbl_a_idx;
create subscription sub connection 'port=5432 dbname=postgres' publication pub;

-- publisher --
insert into tbl values (11,11);
update tbl set a=a+1;

It caused an assertion failure on subscriber:
TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)

The backtrace is attached.

We got the assertion failure because idxoid is invalid, as table child has no
Replica Identity or Primary Key. We have a check in check_relation_updatable(),
but what it checked is table tbl (the parent table) and it passed the check.

I think one approach to fix it is to check the target partition in this case,
instead of the partitioned table.

When trying to fix it, I saw some other problems about updating partition map
cache.

a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
LogicalRepRelMapEntry.

b) In logicalrep_partition_open(), it didn't check if the entry is valid.

c) When the publisher send new relation mapping, only relation map cache will be
updated, and partition map cache wouldn't. I think it also should be updated
because it has remote relation information, too.

Attach two patches which tried to fix them.
0001 patch: fix the above three problems about partition map cache.
0002 patch: fix the assertion failure, check the Replica Identity of the
partition if the target table is a partitioned table.

Thanks to Hou Zhijie for helping me in the first patch.

I will add a test for it later if no one doesn't like this fix.

Regards,
Shi yu

Attachment Content-Type Size
v1-0001-Fix-partition-map-cache-issues.patch application/octet-stream 6.6 KB
v1-0002-Check-partition-table-replica-identity-on-subscri.patch application/octet-stream 7.8 KB
backtrace.txt text/plain 1.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2022-06-08 08:59:38 Re: effective_io_concurrency and NVMe devices
Previous Message Michail Nikolaev 2022-06-08 07:15:19 Re: CPU time for pg_stat_statement