Problem with logical replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Problem with logical replication
Date: 2020-04-16 08:48:22
Message-ID: CAFiTN-u64S5bUiPL1q5kwpHNd0hRnf1OE-bzxNiOs5zo84i51w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While try to setup a cascading replication, I have observed that if we
set the REPLICA IDENTITY to FULL on the subscriber side then there is
an Assert hit.

After analysis I have found that, when we set the REPLICA IDENTITY to
FULL on subscriber side (because I wanted to make this a publisher for
another subscriber).
then it will set relation->rd_replidindex to InvalidOid refer below code snippet
RelationGetIndexList()
{
....
if (replident == REPLICA_IDENTITY_DEFAULT && OidIsValid(pkeyIndex))
relation->rd_replidindex = pkeyIndex;
else if (replident == REPLICA_IDENTITY_INDEX && OidIsValid(candidateIndex))
relation->rd_replidindex = candidateIndex;
else
relation->rd_replidindex = InvalidOid;
}

But, while appying the update and if the table have an index we have
this assert in build_replindex_scan_key

static bool
build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
TupleTableSlot *searchslot)
{
...
Assert(RelationGetReplicaIndex(rel) == RelationGetRelid(idxrel));
}

To me it appears like this assert is not correct. Attached patch has
removed this assert and things works fine.

#0 0x00007ff2a0c8d5d7 in raise () from /lib64/libc.so.6
#1 0x00007ff2a0c8ecc8 in abort () from /lib64/libc.so.6
#2 0x0000000000aa7c7d in ExceptionalCondition (conditionName=0xc1bb30
"RelationGetReplicaIndex(rel) == RelationGetRelid(idxrel)",
errorType=0xc1bad9 "FailedAssertion",
fileName=0xc1bb1c "execReplication.c", lineNumber=60) at assert.c:67
#3 0x00000000007153c3 in build_replindex_scan_key
(skey=0x7fff25711560, rel=0x7ff2a1b0b800, idxrel=0x7ff2a1b0bd98,
searchslot=0x21328c8) at execReplication.c:60
#4 0x00000000007156ac in RelationFindReplTupleByIndex
(rel=0x7ff2a1b0b800, idxoid=16387, lockmode=LockTupleExclusive,
searchslot=0x21328c8, outslot=0x2132bb0) at execReplication.c:141
#5 0x00000000008aeba5 in FindReplTupleInLocalRel (estate=0x2150170,
localrel=0x7ff2a1b0b800, remoterel=0x214a7c8, remoteslot=0x21328c8,
localslot=0x7fff25711f28) at worker.c:989
#6 0x00000000008ae6f2 in apply_handle_update_internal
(relinfo=0x21327b0, estate=0x2150170, remoteslot=0x21328c8,
newtup=0x7fff25711fd0, relmapentry=0x214a7c8) at worker.c:820
#7 0x00000000008ae609 in apply_handle_update (s=0x7fff25719560) at worker.c:788
#8 0x00000000008af8b1 in apply_dispatch (s=0x7fff25719560) at worker.c:1362
#9 0x00000000008afd52 in LogicalRepApplyLoop (last_received=22926832)
at worker.c:1570
#10 0x00000000008b0c3a in ApplyWorkerMain (main_arg=0) at worker.c:2114
#11 0x0000000000869c15 in StartBackgroundWorker () at bgworker.c:813
#12 0x000000000087d28f in do_start_bgworker (rw=0x20a07a0) at postmaster.c:5852
#13 0x000000000087d63d in maybe_start_bgworkers () at postmaster.c:6078
#14 0x000000000087c685 in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5247
#15 <signal handler called>
#16 0x00007ff2a0d458d3 in __select_nocancel () from /lib64/libc.so.6
#17 0x0000000000878153 in ServerLoop () at postmaster.c:1691
#18 0x0000000000877b42 in PostmasterMain (argc=3, argv=0x2079120) at
postmaster.c:1400
#19 0x000000000077f256 in main (argc=3, argv=0x2079120) at main.c:210

To reproduce this issue
run start1.sh
then execute below commands on publishers.
insert into pgbench_accounts values(1,2);
update pgbench_accounts set b=30 where a=1;

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
start1.sh text/x-sh 938 bytes
bugfix_replica_identity_full_on_subscriber.patch application/octet-stream 581 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message yuzuko 2020-04-16 09:16:45 Re: Autovacuum on partitioned table (autoanalyze)
Previous Message Richard Guo 2020-04-16 08:44:10 Re: sqlsmith crash incremental sort