Re: Proposal: Conflict log history table for Logical Replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Dilip Kumar <DilipBalaut(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Proposal: Conflict log history table for Logical Replication
Date: 2026-06-08 09:40:20
Message-ID: CAJpy0uBSY7zTH=4TvAOS=kj9vivBUc9NO+Vp6KNw-Na9RiAsMg@mail.gmail.com
Lists: pgsql-hackers

v46-0002:

1)
I was trying to verify the TRY-CATCH block of ProcessPendingConflictLogTuple().

When I force InsertConflictLogTuple() to fail with a debugger attached,
I see a new error raised from the CATCH block. See this dump:

-----------------
[27532] WARNING: could not log conflict to table for subscription
"sub1": cannot open relation "pg_conflict_log_16391"
[27532] ERROR: errstart was not called
[27102] LOG: background worker "logical replication apply worker"
(PID 27532) exited with exit code 1
[27548] LOG: logical replication apply worker for subscription "sub1"
has started
[27548] ERROR: conflict detected on relation "public.tab1":
conflict=insert_exists
[27548] DETAIL: Could not apply remote change: remote row (4).
Key already exists in unique index "tab1_pkey", modified in
transaction 793: key (i)=(4), local row (4).
[27548] CONTEXT: processing remote data for replication origin "pg_16391...
-----------------

'ERROR: errstart was not called' is perhaps raised because
FlushErrorState() sets errordata_stack_depth to -1. If I get rid of
FlushErrorState(), the internal ERROR is not cleared, which results in
the worker exiting (which we are trying to avoid), as in this second
dump:

-------------------------
[30031] WARNING: could not log conflict to table for subscription
"sub1": cannot open relation "pg_conflict_log_16391"
[30031] ERROR: cannot open relation "pg_conflict_log_16391"
------------->this needs to be handled.
[30031] DETAIL: This operation is not supported for tables.
[30011] LOG: background worker "logical replication apply worker"
(PID 30031) exited with exit code 1
[30043] LOG: logical replication apply worker for subscription "sub1"
has started
[30043] ERROR: conflict detected on relation "public.tab1":
conflict=insert_exists
[30043] DETAIL: Could not apply remote change: remote row (12).
Key already exists in unique index "tab1_pkey", modified in
transaction 872: key (i)=(12), local row (12).
------------------------

I am still thinking about how this can be done cleanly. Meanwhile, I am
putting it here for others to review/comment.
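
To make the suspected ordering problem concrete, here is a toy model in
plain C. It is not PostgreSQL code and does not reproduce the patch. It
only assumes that (a) an error in progress occupies an entry on a stack
whose empty depth is -1, and (b) the CATCH block touches that state
after flushing it. All toy_* names are hypothetical.

```c
/* Toy model only: this is NOT PostgreSQL source code. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_STACK_SIZE 5
#define TOY_MSG_LEN    128

static char toy_stack[TOY_STACK_SIZE][TOY_MSG_LEN];
static int  toy_depth = -1;     /* -1 means no error is in progress */

/* Push an error entry; silently ignored if the toy stack is full. */
static void toy_start_error(const char *msg)
{
    if (toy_depth + 1 >= TOY_STACK_SIZE)
        return;
    toy_depth++;
    snprintf(toy_stack[toy_depth], TOY_MSG_LEN, "%s", msg);
}

/* Discard all error state (assumed analogue of flushing). */
static void toy_flush(void)
{
    toy_depth = -1;
}

/* Read the in-progress error; on an empty stack this is itself an error. */
static const char *toy_current_message(void)
{
    if (toy_depth < 0)
        return "errstart was not called";
    return toy_stack[toy_depth];
}

/* Order A: flush first, then touch the error state. */
static void toy_catch_flush_first(char *out, size_t len)
{
    toy_flush();
    snprintf(out, len, "%s", toy_current_message());
}

/* Order B: take a private copy first, then flush. */
static void toy_catch_copy_first(char *out, size_t len)
{
    snprintf(out, len, "%s", toy_current_message());
    toy_flush();
}
```

In this model, order A yields the secondary "errstart was not called"
message, while order B keeps the original message and leaves the stack
empty. Skipping toy_flush() altogether leaves toy_depth at 0, which
mirrors the uncleared ERROR in the second dump.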

2)
Also, I think InsertConflictLogTuple() in the non-error path (via
ReportApplyConflict()) should be wrapped in its own TRY-CATCH block.
When I force an error during that insert, execution falls through to
the start_apply CATCH block, which then attempts to insert the same
conflict record again via ProcessPendingConflictLogTuple(). That
insert fails again for the same reason, causing the apply worker to
error out.

Should we keep this behavior and allow the apply worker to halt on a
CLT insertion failure, or would it be better to avoid disrupting
replication by encapsulating the insertion logic in its own TRY-CATCH
block and handling the issue locally by emitting it as a WARNING?
Thoughts?
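
The second option could look roughly like the sketch below. It is a
standalone toy that uses setjmp/longjmp as a stand-in for a local
TRY-CATCH block, not backend code. Every clt_* name is hypothetical,
and the failure is triggered by a flag rather than by a real table
open.

```c
/* Toy model only: setjmp/longjmp stands in for a local TRY-CATCH. */
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf    *clt_handler = NULL;   /* innermost active handler */
static const char *clt_error_msg = NULL; /* message of the pending error */
static int         clt_warning_count = 0;

/* Stand-in for raising an ERROR: jump to the innermost handler. */
static void clt_throw(const char *msg)
{
    clt_error_msg = msg;
    if (clt_handler == NULL)
    {
        fprintf(stderr, "FATAL: unhandled error: %s\n", msg);
        abort();
    }
    longjmp(*clt_handler, 1);
}

/* Hypothetical insert that fails when the log table is unusable. */
static void clt_insert_row(bool table_ok)
{
    if (!table_ok)
        clt_throw("cannot open relation \"pg_conflict_log_16391\"");
}

/*
 * Try the insert under a local handler. On failure, report a WARNING,
 * clear the pending error, and return false so the caller can carry on.
 */
static bool clt_log_conflict_locally(bool table_ok)
{
    jmp_buf           local;
    jmp_buf *volatile saved = clt_handler;

    clt_handler = &local;
    if (setjmp(local) == 0)
    {
        clt_insert_row(table_ok);
        clt_handler = saved;    /* restore the outer handler on success */
        return true;
    }

    clt_handler = saved;        /* restore the outer handler on failure */
    fprintf(stderr, "WARNING: could not log conflict to table: %s\n",
            clt_error_msg);
    clt_warning_count++;
    clt_error_msg = NULL;
    return false;
}
```

With this shape the failure never reaches the outer handler, so the
same record is not retried there. The toy deliberately ignores resource
cleanup: in the backend, swallowing an ERROR generally also means
undoing whatever the failed code left behind, which is commonly done
with a subtransaction.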

thanks
Shveta
