Re: BUG #18897: Logical replication conflict after using pg_createsubscriber under heavy load

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: Zane Duffield <duffieldzane(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18897: Logical replication conflict after using pg_createsubscriber under heavy load
Date: 2025-04-24 06:25:13
Message-ID: CANhcyEXNGYzFe+JqUNZh0tT0aBV-0HOWJrcUYrUyduzFsy-Xeg@mail.gmail.com
Lists: pgsql-bugs

On Wed, 23 Apr 2025 at 08:43, Zane Duffield <duffieldzane(at)gmail(dot)com> wrote:
>
> Hi Euler, thanks for your reply.
>
> On Wed, Apr 23, 2025 at 11:58 AM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>>
>> On Wed, Apr 16, 2025, at 8:14 PM, PG Bug reporting form wrote:
>>
>> I'm in the process of converting our databases from pglogical logical
>> replication to the native logical replication implementation on PostgreSQL
>> 17. One of the bugs we encountered and had to work around with pglogical
>> was the plugin dropping records while converting a streaming replica to a
>> logical one via pglogical_create_subscriber (reported at
>> https://github.com/2ndQuadrant/pglogical/issues/349). I was trying to
>> confirm that the native logical replication implementation did not have
>> this problem, and I've found that it might have a different problem.
>>
>>
>> pg_createsubscriber uses a different approach than pglogical. While
>> pglogical uses a restore point, pg_createsubscriber uses the LSN from the
>> latest replication slot as a replication start point. The restore point
>> approach is usually suitable for physical replication but might not cover
>> all scenarios for logical replication (such as when there are in-progress
>> transactions). Since creating a logical replication slot does find a
>> consistent decoding start point, it is a natural choice for starting
>> logical replication (which also needs to find a decoding start point).
>>
>> I should say that I've been operating under the assumption that
>> pg_createsubscriber is designed for use on a replica of a *live* primary
>> database; if this isn't correct, then someone please let me know.
>>
>>
>> pg_createsubscriber expects a physical replica, preferably one that is
>> stopped before it is run.
>
>
> I think pg_createsubscriber actually gives you an error if the replica is not stopped. I was talking about the primary.
>
>>
>> Your script is not waiting long enough for the backlog to be applied.
>> Unless you are seeing a different symptom, there is no bug.
>>
>> You should have used something similar to the wait_for_subscription_sync
>> routine (Cluster.pm) before counting the rows. That's what is used in the
>> pg_createsubscriber tests. It guarantees the subscriber has caught up.
>>
>
> It may be true that the script doesn't wait long enough for all systems, but when I reproduced the issue on my machine(s) I confirmed that the logical replication apply worker was genuinely stuck on a primary key conflict, rather than just catching up.
>
> From the log file
>>
>> 2025-04-16 09:17:16.090 AEST [3845786] port=5341 ERROR: duplicate key value violates unique constraint "test_table_pkey"
>> 2025-04-16 09:17:16.090 AEST [3845786] port=5341 DETAIL: Key (f1)=(20700) already exists.
>> 2025-04-16 09:17:16.090 AEST [3845786] port=5341 CONTEXT: processing remote data for replication origin "pg_24576" during message type "INSERT" for replication target relation "public.test_table" in transaction 1581, finished at 0/3720058
>> 2025-04-16 09:17:16.091 AEST [3816845] port=5341 LOG: background worker "logical replication apply worker" (PID 3845786) exited with exit code 1
>
>
> wait_for_subscription_sync sounds like a better solution than what I have, but you might still be able to reproduce the problem if you increase the sleep interval on line 198.
>
> I wonder if Shlok could confirm whether they found the conflicting primary key in their reproduction?
>

I have analysed the issue and found that it appears when
'consistent_lsn' is set to a COMMIT lsn while pg_createsubscriber is
run.

pg_waldump on primary node:
rmgr: Standby     len (rec/tot): 102/102, tx: 0, lsn: 0/06AEB2B0, prev 0/06AEB270, desc: RUNNING_XACTS nextXid 1128 latestCompletedXid 1114 oldestRunningXid 1115; 13 xacts: 1121 1118 1115 1116 1117 1119 1120 1124 1125 1122 1126 1123 1127
rmgr: Transaction len (rec/tot): 46/46, tx: 1116, lsn: 0/06AEB318, prev 0/06AEB2B0, desc: COMMIT 2025-04-17 19:22:22.789463 IST
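
The same check can be done from SQL with the pg_walinspect extension,
which also reports the record's end_lsn directly. A minimal sketch,
assuming pg_walinspect (PostgreSQL 15+) can be installed on the primary:

-- Inspect the WAL record that 'consistent_lsn' points at.
CREATE EXTENSION IF NOT EXISTS pg_walinspect;
SELECT start_lsn, end_lsn, resource_manager, description
FROM pg_get_wal_record_info('0/6AEB318');
-- Expected here: resource_manager = Transaction, a description starting
-- with COMMIT, and end_lsn just past the commit record.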

logs for replica node:
2025-04-17 19:22:34.759 IST [2410569] port=5341 LOG: logical replication apply worker for subscription "sub" has started
2025-04-17 19:22:34.831 IST [2410569] port=5341 ERROR: duplicate key value violates unique constraint "test_table_pkey"
2025-04-17 19:22:34.831 IST [2410569] port=5341 DETAIL: Key (f1)=(402000) already exists.
2025-04-17 19:22:34.831 IST [2410569] port=5341 CONTEXT: processing remote data for replication origin "pg_24576" during message type "INSERT" for replication target relation "public.test_table" in transaction 1116, finished at 0/6AEB318
2025-04-17 19:22:34.835 IST [2410562] port=5341 LOG: background worker "logical replication apply worker" (PID 2410569) exited with exit code 1

logs for primary node:
2025-04-17 19:22:34.788 IST [2410570] port=5340 ubuntu(at)test_db/sub DETAIL: Waiting for transactions (approximately 8) older than 1115 to end.
2025-04-17 19:22:34.788 IST [2410570] port=5340 ubuntu(at)test_db/sub STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/6AEB318 (proto_version '4', origin 'any', publication_names '"pub"')
2025-04-17 19:22:34.825 IST [2410570] port=5340 ubuntu(at)test_db/sub LOG: logical decoding found consistent point at 0/6AEB2B0
2025-04-17 19:22:34.825 IST [2410570] port=5340 ubuntu(at)test_db/sub DETAIL: There are no old transactions anymore.
2025-04-17 19:22:34.825 IST [2410570] port=5340 ubuntu(at)test_db/sub STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/6AEB318 (proto_version '4', origin 'any', publication_names '"pub"')
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu(at)test_db/sub LOG: could not send data to client: Connection reset by peer
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu(at)test_db/sub CONTEXT: slot "sub", output plugin "pgoutput", in the change callback, associated LSN 0/695BAA8
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu(at)test_db/sub STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/6AEB318 (proto_version '4', origin 'any', publication_names '"pub"')
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu(at)test_db/sub LOG: released logical replication slot "sub"

logs for the pg_createsubscriber run:
2025-04-17 19:22:34.361 IST [2410552] port=5341 ubuntu(at)test_db/pg_createsubscriber LOG: statement: SELECT pg_catalog.pg_replication_origin_advance('pg_24576', '0/6AEB318')
2025-04-17 19:22:34.361 IST [2410552] port=5341 ubuntu(at)test_db/pg_createsubscriber LOG: duration: 0.640 ms
2025-04-17 19:22:34.362 IST [2410552] port=5341 ubuntu(at)test_db/pg_createsubscriber LOG: statement: ALTER SUBSCRIPTION "sub" ENABLE

Here '0/06AEB318' is a COMMIT lsn. During the pg_createsubscriber run,
the standby's 'recovery_target_lsn' is set to '0/06AEB318', so after
the standby restarts it already contains the changes of the
transaction committed at '0/06AEB318'. After the subscription is
created, the replication origin is advanced to lsn '0/06AEB318'. When
the subscription is then enabled, the apply worker tries to apply that
transaction again, which results in the primary key conflict.
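
The resulting state is visible on the subscriber through the built-in
origin status view (the origin name 'pg_24576' is taken from the logs
above):

-- On the subscriber: remote_lsn is where the origin was advanced to.
SELECT external_id, remote_lsn
FROM pg_replication_origin_status;
-- In this failure case remote_lsn is 0/6AEB318, the COMMIT lsn itself, so
-- the transaction "finished at 0/6AEB318" is streamed and applied again.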

In normal logical replication, by contrast, the replication origin is
advanced to the record following the COMMIT record (referred to in the
code as end_lsn), so a committed transaction is never replayed.
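
To make end_lsn concrete for this case: a hypothetical manual recovery
for an already-failing subscriber would be to advance the origin past
the commit record by hand. A sketch only; '0/6AEB348' is an assumed
end_lsn ('0/6AEB318' plus the MAXALIGNed 46-byte commit record) and
should really be read from pg_waldump or pg_walinspect:

-- On the subscriber. The origin must not be in use, so disable first.
ALTER SUBSCRIPTION "sub" DISABLE;
SELECT pg_replication_origin_advance('pg_24576', '0/6AEB348');  -- assumed end_lsn
ALTER SUBSCRIPTION "sub" ENABLE;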

This failure is rare because it occurs only when 'consistent_lsn'
happens to point exactly at a COMMIT record.
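
As an aside, regarding the wait_for_subscription_sync suggestion above:
for a plain SQL reproduction script, roughly the same check can be
approximated by polling the subscriber until every relation reports a
synced or ready state (a sketch; the Cluster.pm routine additionally
waits for the apply worker to catch up to a given LSN):

-- Run on the subscriber; poll until this returns true.
-- srsubstate: 'r' = ready, 's' = synchronized.
SELECT count(*) = 0 AS tables_synced
FROM pg_subscription_rel
WHERE srsubstate NOT IN ('r', 's');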

Thanks and Regards,
Shlok Kyal
