Quick Links

Re: Simplify code building the LR conflict messages

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Álvaro Herrera <alvherre(at)kurilemu(dot)de>
Cc:	Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Simplify code building the LR conflict messages
Date:	2025-11-29 17:30:45
Message-ID:	666155.1764437445@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre(at)kurilemu(dot)de> writes:
> So, what we're doing here is to append further row-identifying details
> to an errdetail string that already contains some explanation of the
> problem. That is, we have something like
> DETAIL: The row to be updated was deleted.
> and then we add whatever this function produces, after a newline. I
> don't remember any other place in our code that does such a thing. The
> result would look something like
> DETAIL: The row to be updated was deleted.
> Key: (1, 2); remote row (1,2,3); replica identity full (1,2,3).
> or something like that, where each of the parts can begin with uppercase
> or not, with a semicolon or not, depending on whether there are any
> previous parts.

I looked into the postmaster log for subscription/t/035_conflicts.pl
to find actual examples of what this code produces, and now I think
it's got worse problems than whether the C code style is good.
The output is frankly a mess, and IMO it violates our message style
guidelines in multiple ways. Here's a sampling:

2025-11-29 11:43:03.782 EST logical replication apply worker[659868] ERROR: conflict detected on relation "public.conf_tab": conflict=multiple_unique_conflicts
2025-11-29 11:43:03.782 EST logical replication apply worker[659868] DETAIL: Key already exists in unique index "conf_tab_pkey", modified in transaction 770.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_b_key", modified in transaction 770.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_c_key", modified in transaction 770.
Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).

2025-11-29 11:43:04.654 EST logical replication apply worker[659907] LOG: conflict detected on relation "public.conf_tab": conflict=update_missing
2025-11-29 11:43:04.654 EST logical replication apply worker[659907] DETAIL: Could not find the row to be updated.
Remote row (6, 7, 8); replica identity (a)=(5).

2025-11-29 11:43:05.266 EST logical replication apply worker[659942] LOG: conflict detected on relation "public.tab": conflict=delete_origin_differs
2025-11-29 11:43:05.266 EST logical replication apply worker[659942] DETAIL: Deleting the row that was modified locally in transaction 790 at 2025-11-29 11:43:05.262807-05.
Existing local row (1, 3); replica identity (a)=(1).

Where to begin?

* "conflict detected" is already presuming way too much context for
someone who's looking into the postmaster log. Here we can guess
what it's about because somebody configured this test with %b in
log_line_prefix. But without the "logical replication apply worker"
cue, are users really going to know what these messages are about?

* The business about "conflict=some_kind_of_keyword" is certainly not
per our style guidelines, and I wonder how it translates. (Checks
code ... looks like we don't even try.)

* The DETAIL messages are not complete sentences or even
approximations to them, and frankly they are just about unintelligible.

* I follow "Key already exists in unique index "foo" but not "modified
in transaction 770" --- what was modified, which side's XID is this,
and why do we think an XID will be of any help whatever? The run-on
clause is very poor English in any case. "Deleting the row that was
modified locally in transaction 790 at 2025-11-29 11:43:05.262807-05."
is better in that a transaction timestamp is much more likely to be
useful than an XID, and it's marginally better English, but it's still
not a complete sentence and it reads like context not detail.

* What am I to make of this:
"Key (a)=(55); existing local row (55, 2, 3); remote row (55, 2, 3)."
From context I guess it's trying to show me the entire rows not just
the duplicated key value, but they're not labeled adequately.
Besides, I wonder how well this output scales to non-toy tables where
the rows are kilobytes or megabytes long. Index keys can be safely
assumed to not be enormously long, but it doesn't follow that it's
sane to emit the entire row. Much less several of them in one
error message.

* The three examples I show above look like the detail messages were
written by three different people who were not talking to each other.
The initial kind-of-a-sentence part is totally different style in
each case. And why do some of the outputs put the labeled key first
and others put it last?

So I think this area desperately needs significant editorial
attention, as well as some fundamental rethinking of just what
information we should show. Perhaps using errcontext would help,
but I'm not sure. I think a large part of the problem stems from
trying to cram multiple error conditions into one ereport ... is it
possible to avoid that?

regards, tom lane

In response to

Re: Simplify code building the LR conflict messages at 2025-11-28 13:08:33 from Álvaro Herrera

Responses

Re: Simplify code building the LR conflict messages at 2025-11-29 18:21:00 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Álvaro Herrera	2025-11-29 17:48:55	Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY
Previous Message	Peter Eisentraut	2025-11-29 15:36:41	Re: Migrate to autoconf 2.72?