Re: relation OID in ReorderBufferToastReplace error message

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Jeremy Schneider <schnjere(at)amazon(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: relation OID in ReorderBufferToastReplace error message
Date: 2021-09-17 05:23:09
Message-ID: CAA4eK1LraARugiEEpkjJVDXEeTukW1ihG_6=nVEZ1972Z1vTeg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 15, 2021 at 6:14 AM Jeremy Schneider <schnjere(at)amazon(dot)com> wrote:
>
> On 7/2/21 18:57, Jeremy Schneider wrote:
>
> The process of trying to understand this recent incident has given me some new insight about what information would be helpful up front in this error message for faster resolution.
>
> First off, and most importantly, the current WAL record we're processing when the error is encountered. I wonder if it could easily print the LSN?
>
> Secondly, the transaction ID. In the specific bug Bertrand found, the problem is actually not with the actual WAL record that's being processed - but rather because previous WAL records in the same transaction left the decoder process in a state where the current WAL record [a commit] generated an error. So it's the entire transaction that needs to be examined to reproduce the error. (Andres actually pointed this out on the original thread back in December 2019.) I realize that once you know the LSN you can easily get the XID with pg_waldump, but personally I'd just as soon include the XID in the error message since I think it will usually be a first step for debugging any problems with WAL decoding. Then I can go straight to filtering that XID on my first pg_waldump run.
>

I don't think it is a bad idea to print additional information as you
are suggesting, but why only for this error? It could be useful for
investigating any other error we get during decoding. I think normally
we add such additional information via error_context. We have recently
added/enhanced it for apply-workers, see commit [1].

I think here we should just print the relation name in the error
message you pointed out and then work on adding additional information
via error context as a separate patch. What do you think?

[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=abc0910e2e0adfc5a17e035465ee31242e32c4fc

--
With Regards,
Amit Kapila.
